Sat, 21 Jan 2023

RcppSimdJson 0.1.9 on CRAN: New Upstream

The RcppSimdJson package was just updated to release 0.1.9.

RcppSimdJson wraps the fantastic and genuinely impressive simdjson library by Daniel Lemire and collaborators. Via very clever algorithmic engineering to obtain largely branch-free code, coupled with modern C++ and newer compiler instructions, it results in parsing gigabytes of JSON per second, which is quite mindboggling. The best-case performance is ‘faster than CPU speed’ as use of parallel SIMD instructions and careful branch avoidance can lead to less than one CPU cycle per byte parsed; see the video of the talk by Daniel Lemire at QCon.
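To give a rough flavour of what ‘branch-free’ means here, the sketch below counts the six JSON structural characters in a buffer with a table lookup instead of an if per byte. It is only a scalar illustration and not code from simdjson, which classifies many bytes at once with SIMD instructions; the names make_structural_table and count_structural are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper: a 256-entry table with a 1 at each of the six
// JSON structural characters and a 0 everywhere else.
constexpr std::array<std::uint8_t, 256> make_structural_table() {
    std::array<std::uint8_t, 256> t{};
    t['{'] = 1; t['}'] = 1;
    t['['] = 1; t[']'] = 1;
    t[':'] = 1; t[','] = 1;
    return t;
}

// Hypothetical helper: count structural characters without a
// data-dependent branch in the loop body. Each byte adds 0 or 1
// straight from the table, so there is nothing to mispredict.
// Note: this naive version also counts such characters inside
// string values; a real parser has to mask those out.
inline std::size_t count_structural(std::string_view json) {
    static constexpr auto table = make_structural_table();
    std::size_t n = 0;
    for (unsigned char c : json) {
        n += table[c];
    }
    return n;
}
```

Calling count_structural on the text {"a":[1,2,3]} adds one for each of the braces, brackets, the colon and the two commas. The loop does the same work for every byte, which is the property that lets wider SIMD versions of the idea process many bytes per instruction.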

This release updates the underlying simdjson library to version 3.0.1, settles on C++17 as the language standard, exports a worker function for direct C(++) access, and polishes a few small things around the package and tests.

The NEWS entry for this release follows.

Changes in version 0.1.9 (2023-01-21)

  • The internal function deserialize_json is now exported at the C++ level as well as in R (Dirk in #81 closing #80).

  • simdjson was upgraded to version 3.0.1 (Dirk in #83).

  • The package now defaults to C++17 compilation; configure has been retired (Dirk closing #82).

  • The three main R access functions now use a more compact argument check via stopifnot (Dirk).

Courtesy of my CRANberries, there is also a diffstat report for this release. For questions, suggestions, or issues please use the issue tracker at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

RcppFastFloat 0.0.4 on CRAN: New Upstream

A new release of RcppFastFloat arrived on CRAN yesterday. The package wraps fast_float, another nice library by Daniel Lemire. For details, see the arXiv paper showing that one can convert character representations of ‘numbers’ into floating point at rates at or exceeding one gigabyte per second.

This release updates the underlying fast_float library version. Special thanks to Daniel Lemire for quickly accommodating a parsing use case we had encoded as a test, namely with various whitespace codes. The default in fast_float, as in C++17, is to be more narrow, but we enable the wider use case via two #define statements.
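The narrow versus wide behaviour can be illustrated with the C++17 standard library alone. The sketch below uses std::from_chars as a stand-in for fast_float (it is not the package's code, and it does not show the two #define statements): the strict variant rejects leading whitespace, the lenient one skips it first. The helper names parse_strict, skip_leading_ws and parse_lenient are hypothetical, and the example assumes a standard library that ships the floating-point overload of std::from_chars.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: narrow parse in the C++17 style. std::from_chars
// does not skip leading whitespace, so " 1.5" is reported as an error.
// Like from_chars itself, this accepts a valid numeric prefix and
// ignores whatever follows it.
inline std::optional<double> parse_strict(std::string_view s) {
    if (s.empty()) return std::nullopt;
    double value = 0.0;
    const auto result = std::from_chars(s.data(), s.data() + s.size(), value);
    if (result.ec != std::errc()) return std::nullopt;
    return value;
}

// Hypothetical helper: drop leading whitespace (space, tab, newline,
// vertical tab, form feed, carriage return).
inline std::string_view skip_leading_ws(std::string_view s) {
    const auto pos = s.find_first_not_of(" \t\n\v\f\r");
    if (pos == std::string_view::npos) return {};
    return s.substr(pos);
}

// Hypothetical helper: wider parse that removes whitespace before
// handing the rest to the strict parser.
inline std::optional<double> parse_lenient(std::string_view s) {
    return parse_strict(skip_leading_ws(s));
}
```

With these helpers, parse_strict(" 1.5") yields an empty optional while parse_lenient(" 1.5") yields 1.5, which is the kind of difference the whitespace test exercises.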

Changes in version 0.0.4 (2023-01-20)

  • Update to fast_float 3.9.0

  • Set two #define statements to re-establish prior behaviour with respect to whitespace removal prior to parsing for as.double2()

  • Small update to continuous integration actions

Courtesy of my CRANberries, there is also a diffstat report for this release.