Thu, 25 Jun 2020

RcppSimdJson 0.0.6: New Upstream, New Features!

A very exciting RcppSimdJson release with the updated upstream simdjson release 0.4.0 as well as a first set of new JSON parsing functions just hit CRAN. RcppSimdJson wraps the fantastic and genuinely impressive simdjson library by Daniel Lemire and collaborators. Via very clever algorithmic engineering to obtain largely branch-free code, coupled with modern C++ and newer compiler instructions, it parses gigabytes of JSON per second, which is quite mind-boggling. The best-case performance is ‘faster than CPU speed’ as use of parallel SIMD instructions and careful branch avoidance can lead to less than one CPU cycle per byte parsed; see the video of the recent talk by Daniel Lemire at QCon (which was also voted best talk). The very recent 0.4.0 release further improves the already impressive speed.

And this release brings a first set of actually user-facing functions thanks to Brendan, who put in a series of PRs! The full NEWS entry follows.

Changes in version 0.0.6 (2020-06-25)

  • Created C++ integer-handling utilities for safe downcasting and integer return (Brendan in #16 closing #13).

  • New JSON functions .deserialize_json and .load_json (Brendan in #16, #17, #20, #21).

  • Upgrade Travis CI to 'bionic', extract package and version from DESCRIPTION (Dirk in #23).

  • Upgraded to simdjson 0.4.0 (Dirk in #25 closing #24).
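
To give a flavour of what 'safe downcasting' can mean in this setting, here is a minimal sketch. It is not the code from the package: the names fits_r_integer and safe_downcast are hypothetical and made up for illustration. The one fact it relies on is that an R integer is a 32-bit signed value whose smallest bit pattern is reserved for NA_integer_, so a 64-bit value parsed from JSON can only be returned as an R integer if it lies strictly above that minimum and at or below the 32-bit maximum. The sketch assumes C++17 for std::optional.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper (not the package's API): does a 64-bit value fit
// into an R integer? R integers are 32-bit signed, and the minimum
// 32-bit value is reserved for NA_integer_, so it is excluded here.
constexpr bool fits_r_integer(std::int64_t x) noexcept {
    return x > std::numeric_limits<std::int32_t>::min() &&
           x <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: narrow to 32 bits only when no information is
// lost; otherwise return an empty optional so the caller can fall back
// to another representation (for example a double).
inline std::optional<std::int32_t> safe_downcast(std::int64_t x) noexcept {
    if (fits_r_integer(x)) {
        return static_cast<std::int32_t>(x);
    }
    return std::nullopt;
}
```

With these definitions, safe_downcast(42) holds the value 42, while safe_downcast(2147483648LL) comes back empty because the value exceeds the 32-bit range. What the caller does in the empty case is a design choice this sketch leaves open.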

Courtesy of CRANberries, there is also a diffstat report for this release.

For questions, suggestions, or issues please use the issue tracker at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
