Author: Daniel Adler
Diff between ff versions 1.0-1 dated 2007-11-03 and 2.0.0 dated 2008-08-03
Title: memory-efficient storage of large atomic vectors and arrays on disk and fast access functions
Description: The ff package provides atomic data structures that are stored on
disk but behave (almost) as if they were in RAM by transparently
mapping only a section (pagesize) into main memory; this section is the
effective virtual memory consumption per ff object. ff supports atomic data types
'double', 'logical', 'raw' and 'integer' and close-to-atomic types
'factor', 'POSIXct' and custom close-to-atomic types. ff now has native
C-support for vectors, matrices and arrays with flexible dimorder
(column-major order, row-major order and generalizations for arrays).
ff objects store raw data in binary flat files in native encoding,
and complement this with metadata stored in R as physical and virtual
attributes. ff objects have well-defined hybrid copying semantics,
which gives rise to certain performance improvements through
virtualization. ff objects can be stored and reopened across R
sessions. ff files can be shared by multiple ff R objects
(using different data en/de-coding schemes) in the same process
or from multiple R processes to exploit parallelism. A wide choice of
finalizer options allows working with 'permanent' files as well as
creating/removing 'temporary' ff files completely transparently to the
user. On certain OS/filesystem combinations, creating the ff files
works without notable delay thanks to sparse file allocation.
Several access optimization techniques such as Hybrid Index
Preprocessing and Virtualization are implemented to achieve good
performance even with large datasets, for example virtual matrix
transpose without touching a single byte on disk. Further, to reduce
disk I/O, the atomic data is stored natively and compactly in binary flat
files, e.g. logicals take up exactly 2 bits to represent TRUE, FALSE
and NA. Beyond basic access functions, the ff package also provides
compatibility functions that facilitate writing code that works for both
ff and ram objects, as well as support for batch processing on ff objects
(e.g. as.ram, as.ff, ffapply).
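
The "virtual matrix transpose without touching a single byte on disk" mentioned above can be illustrated with a small, language-neutral sketch. This is not the ff implementation or its API: the class `FlatMatrix` and its methods are hypothetical names, and the convention that `dimorder` lists axes from fastest- to slowest-varying is an assumption made for this example only. The idea is that a transpose only swaps metadata (dimensions and axis order) while the flat buffer stays untouched.

```python
from array import array


class FlatMatrix:
    """Hypothetical 2-D view over a flat buffer (illustration only, not ff)."""

    def __init__(self, data, dim, dimorder=(0, 1)):
        self.data = data                  # flat storage, shared between views
        self.dim = tuple(dim)             # (nrow, ncol) as seen by the user
        self.dimorder = tuple(dimorder)   # axes, fastest-varying first (assumed convention)

    def offset(self, i, j):
        """Map a (row, col) index to a position in the flat buffer."""
        idx = (i, j)
        off, stride = 0, 1
        for axis in self.dimorder:
            off += idx[axis] * stride
            stride *= self.dim[axis]
        return off

    def get(self, i, j):
        return self.data[self.offset(i, j)]

    def transpose(self):
        """Return a transposed view: only the metadata changes, the buffer is reused."""
        new_dim = self.dim[::-1]
        new_order = tuple(1 - axis for axis in self.dimorder)
        return FlatMatrix(self.data, new_dim, new_order)


# A 2 x 3 matrix stored in column-major order.
m = FlatMatrix(array("d", [1, 2, 3, 4, 5, 6]), dim=(2, 3))
t = m.transpose()   # 3 x 2 view over the very same buffer
```

Here `t.get(j, i)` reads the same buffer position as `m.get(i, j)`, so no element is copied or moved.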
NOTE: A professional extension is available from the authors, which
integrates additional high-performance features neatly into the ff
package. The extension allows efficient handling of symmetric matrices
and supports more packed data types: boolean (1 bit), quad (2-bit
unsigned), nibble (4-bit unsigned), byte (1-byte signed with NAs),
ubyte (1-byte unsigned), short (2-byte signed with NAs), ushort (2-byte
unsigned), single (4-byte float with NAs). For example, 'quad' allows
efficient storage of genomic data as an 'A','T','G','C' factor.
The unsigned types support 'circular' arithmetic.
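
To make the 'quad' idea concrete, the following sketch packs a sequence over the levels 'A','T','G','C' into 2 bits per element, four elements per byte. It is an illustration under stated assumptions, not the extension's actual storage format: the function names, the level-to-code mapping and the bit order within a byte are all hypothetical. `quad_add` shows one possible reading of 'circular' arithmetic, namely wrap-around modulo 4 for a 2-bit unsigned value.

```python
LEVELS = "ATGC"  # assumed factor levels, coded 0..3 in this order


def pack_quads(seq):
    """Pack a string over LEVELS into 2 bits per element (4 elements per byte)."""
    buf = bytearray((len(seq) + 3) // 4)
    for k, base in enumerate(seq):
        code = LEVELS.index(base)              # 0..3, fits in 2 bits
        buf[k // 4] |= code << (2 * (k % 4))   # assumed layout: low bits first
    return buf


def unpack_quads(buf, n):
    """Recover the first n elements from a buffer produced by pack_quads."""
    return "".join(
        LEVELS[(buf[k // 4] >> (2 * (k % 4))) & 0b11] for k in range(n)
    )


def quad_add(a, b):
    """Add two 2-bit unsigned values with wrap-around (modulo 4)."""
    return (a + b) & 0b11


seq = "GATTACA"
packed = pack_quads(seq)            # 7 elements fit into 2 bytes
restored = unpack_quads(packed, len(seq))
```

With this layout, storage shrinks to a quarter of one byte per element, which is the kind of saving the description attributes to packed types.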
ff-1.0-1/ff/R/error.R |only
ff-1.0-1/ff/R/evalmindex.R |only
ff-1.0-1/ff/R/ffm.R |only
ff-1.0-1/ff/R/ffmdataframe.r |only
ff-1.0-1/ff/R/getrange.R |only
ff-1.0-1/ff/R/indmat.R |only
ff-1.0-1/ff/R/runique.R |only
ff-1.0-1/ff/R/sample.R |only
ff-1.0-1/ff/R/seqpack.r |only
ff-1.0-1/ff/R/testfile.R |only
ff-1.0-1/ff/README.txt |only
ff-1.0-1/ff/demo |only
ff-1.0-1/ff/inst |only
ff-1.0-1/ff/man/ff.Rd |only
ff-1.0-1/ff/man/ffm.Rd |only
ff-1.0-1/ff/man/ffm.data.frame.Rd |only
ff-1.0-1/ff/man/getpagesize.Rd |only
ff-1.0-1/ff/man/runique.Rd |only
ff-1.0-1/ff/man/sample.Rd |only
ff-1.0-1/ff/man/seqpack.Rd |only
ff-1.0-1/ff/src/MultiArray.hpp |only
ff-1.0-1/ff/src/MultiIndex.cpp |only
ff-1.0-1/ff/src/MultiIndex.hpp |only
ff-1.0-1/ff/src/ffm.cpp |only
ff-1.0-1/ff/src/ffm.h |only
ff-1.0-1/ff/src/r_api.c |only
ff-2.0.0/ff/ANNOUNCEMENT.txt |only
ff-2.0.0/ff/DESCRIPTION | 67
ff-2.0.0/ff/LICENSE | 33
ff-2.0.0/ff/NAMESPACE | 541 +++
ff-2.0.0/ff/R/CFUN.R |only
ff-2.0.0/ff/R/array.R |only
ff-2.0.0/ff/R/as.ff.R |only
ff-2.0.0/ff/R/bigsample.R |only
ff-2.0.0/ff/R/ff.R | 4469 +++++++++++++++++++++++++++-
ff-2.0.0/ff/R/ffapply.R |only
ff-2.0.0/ff/R/ffreturn.R |only
ff-2.0.0/ff/R/generics.R |only
ff-2.0.0/ff/R/getpagesize.R | 74
ff-2.0.0/ff/R/hi.R |only
ff-2.0.0/ff/R/util.R |only
ff-2.0.0/ff/R/vmode.R |only
ff-2.0.0/ff/R/vt.R |only
ff-2.0.0/ff/R/zzz.R |only
ff-2.0.0/ff/README_devel.txt |only
ff-2.0.0/ff/configure | 591 +++
ff-2.0.0/ff/configure.ac | 5
ff-2.0.0/ff/exec |only
ff-2.0.0/ff/man/CFUN.rd |only
ff-2.0.0/ff/man/Extract.ff.rd |only
ff-2.0.0/ff/man/LimWarn.rd |only
ff-2.0.0/ff/man/add.rd |only
ff-2.0.0/ff/man/array2vector.rd |only
ff-2.0.0/ff/man/arrayIndex2vectorIndex.rd |only
ff-2.0.0/ff/man/as.ff.rd |only
ff-2.0.0/ff/man/as.hi.rd |only
ff-2.0.0/ff/man/as.integer.hi.rd |only
ff-2.0.0/ff/man/as.vmode.rd |only
ff-2.0.0/ff/man/bbatch.rd |only
ff-2.0.0/ff/man/bigsample.rd |only
ff-2.0.0/ff/man/clone.rd |only
ff-2.0.0/ff/man/close.ff.rd |only
ff-2.0.0/ff/man/delete.rd |only
ff-2.0.0/ff/man/dim.ff.rd |only
ff-2.0.0/ff/man/dimnames.ff_array.rd |only
ff-2.0.0/ff/man/dimorderCompatible.rd |only
ff-2.0.0/ff/man/dummy.dimnames.rd |only
ff-2.0.0/ff/man/ff.rd |only
ff-2.0.0/ff/man/ffapply.rd |only
ff-2.0.0/ff/man/ffconform.rd |only
ff-2.0.0/ff/man/ffreturn.rd |only
ff-2.0.0/ff/man/ffsuitable.rd |only
ff-2.0.0/ff/man/ffxtensions.rd |only
ff-2.0.0/ff/man/filename.rd |only
ff-2.0.0/ff/man/fixdiag.rd |only
ff-2.0.0/ff/man/geterror.ff.rd |only
ff-2.0.0/ff/man/getpagesize.rd |only
ff-2.0.0/ff/man/getset.ff.rd |only
ff-2.0.0/ff/man/hi.rd |only
ff-2.0.0/ff/man/hiparse.rd |only
ff-2.0.0/ff/man/intrle.rd |only
ff-2.0.0/ff/man/is.ff.rd |only
ff-2.0.0/ff/man/is.open.rd |only
ff-2.0.0/ff/man/is.readonly.rd |only
ff-2.0.0/ff/man/is.sorted.rd |only
ff-2.0.0/ff/man/length.ff.rd |only
ff-2.0.0/ff/man/length.hi.rd |only
ff-2.0.0/ff/man/levels.ff.rd |only
ff-2.0.0/ff/man/matcomb.rd |only
ff-2.0.0/ff/man/matprint.rd |only
ff-2.0.0/ff/man/maxffmode.rd |only
ff-2.0.0/ff/man/maxlength.rd |only
ff-2.0.0/ff/man/mismatch.rd |only
ff-2.0.0/ff/man/na.count.rd |only
ff-2.0.0/ff/man/names.ff.rd |only
ff-2.0.0/ff/man/open.ff.rd |only
ff-2.0.0/ff/man/physical.rd |only
ff-2.0.0/ff/man/print.ff.rd |only
ff-2.0.0/ff/man/ram2ffcode.rd |only
ff-2.0.0/ff/man/ramattribs.rd |only
ff-2.0.0/ff/man/readwrite.ff.rd |only
ff-2.0.0/ff/man/repfromto.rd |only
ff-2.0.0/ff/man/rlepack.rd |only
ff-2.0.0/ff/man/swap.rd |only
ff-2.0.0/ff/man/symmIndex2vectorIndex.rd |only
ff-2.0.0/ff/man/symmetric.rd |only
ff-2.0.0/ff/man/unclass_-.rd |only
ff-2.0.0/ff/man/undim.rd |only
ff-2.0.0/ff/man/unsort.rd |only
ff-2.0.0/ff/man/update.ff.rd |only
ff-2.0.0/ff/man/vecprint.rd |only
ff-2.0.0/ff/man/vector.vmode.rd |only
ff-2.0.0/ff/man/vector2array.rd |only
ff-2.0.0/ff/man/vectorIndex2arrayIndex.rd |only
ff-2.0.0/ff/man/vmode.rd |only
ff-2.0.0/ff/man/vt.rd |only
ff-2.0.0/ff/man/vw.rd |only
ff-2.0.0/ff/prebuild.sh |only
ff-2.0.0/ff/src/Array.hpp | 363 +-
ff-2.0.0/ff/src/Error.cpp | 19
ff-2.0.0/ff/src/Error.hpp | 21
ff-2.0.0/ff/src/FSInfo.hpp | 62
ff-2.0.0/ff/src/FSInfo_statfs.cpp | 18
ff-2.0.0/ff/src/FSInfo_win32.cpp | 18
ff-2.0.0/ff/src/FileMapping.hpp | 80
ff-2.0.0/ff/src/MMapFileMapping.cpp | 55
ff-2.0.0/ff/src/MMapFileMapping.hpp | 205 -
ff-2.0.0/ff/src/Win32FileMapping.cpp | 82
ff-2.0.0/ff/src/Win32FileMapping.hpp | 247 -
ff-2.0.0/ff/src/ac_config.h.in | 21
ff-2.0.0/ff/src/config.h | 10
ff-2.0.0/ff/src/ff.cpp | 353 +-
ff-2.0.0/ff/src/ff.h | 95
ff-2.0.0/ff/src/r_ff.c |only
ff-2.0.0/ff/src/r_ff.h |only
ff-2.0.0/ff/src/r_ff_addgetset.h |only
ff-2.0.0/ff/src/r_ff_makevmodes.h |only
ff-2.0.0/ff/src/r_ff_methoddeclaration.h |only
ff-2.0.0/ff/src/r_ff_methodswitch.h |only
ff-2.0.0/ff/src/types.hpp | 52
ff-2.0.0/ff/src/utk_config.hpp |only
ff-2.0.0/ff/src/utk_file_allocate_fseek.cpp |only
ff-2.0.0/ff/src/utk_file_allocate_fseek.hpp |only
ff-2.0.0/ff/src/utk_platform_macros.hpp |only
144 files changed, 6748 insertions(+), 733 deletions(-)