RcppSimdJson wraps the fantastic and genuinely impressive simdjson library by Daniel Lemire and collaborators. Via very clever algorithmic engineering to obtain largely branch-free code, coupled with modern C++ and newer compiler instructions, it parses gigabytes of JSON per second, which is quite mindboggling. The best-case performance is ‘faster than CPU speed’, as use of parallel SIMD instructions and careful branch avoidance can lead to less than one CPU cycle per byte parsed; see the video of the talk by Daniel Lemire at QCon (also voted best talk).
Other than the upstream update, Brendan added some new utilities to check for valid UTF-8 or JSON format and to minify JSON, plus a small workaround for a clang-9 bug we encountered. We can confirm Daniel’s statement on ridiculously fast UTF-8 validation. It is so cool to work with amazing tools.
The NEWS entry follows.
Changes in version 0.1.3 (2020-11-01)
Added URLs to DESCRIPTION (Dirk closing #50).
Upgraded to simdjson 0.6.0 (Dirk in #52).
Added workaround for odd clang-9 bug (Brendan in #57).
New utility functions
fminify() (Brendan in #58).