Sun, 03 Aug 2008

Package ff updated to version 2.0.0 with previous version 1.0-1 dated 2007-11-03

Author: Daniel Adler, Christian Gläser, Oleg Nenadic, Jens Oehlschlägel, Walter Zucchini
Title: memory-efficient storage of large atomic vectors and arrays on disk and fast access functions
Description: The ff package provides atomic data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) into main memory, which is the effective virtual memory consumption per ff object. ff supports the atomic data types 'double', 'logical', 'raw' and 'integer', the close-to-atomic types 'factor' and 'POSIXct', and custom close-to-atomic types. ff now has native C support for vectors, matrices and arrays with flexible dimorder (column-major order, row-major order and generalizations for arrays). ff objects store raw data in binary flat files in native encoding and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating and removing 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation. Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, the atomic data is stored natively and compactly in binary flat files, e.g. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply).
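The idea of a virtual transpose via a flexible dimorder can be illustrated with a small sketch. It is written in Python purely for illustration and is not the ff implementation: the class `FlatMatrix`, its 0-based `dimorder` tuple and the method names are hypothetical. It only shows how swapping dimension metadata can transpose a matrix while the underlying flat buffer stays untouched.

```python
from array import array


class FlatMatrix:
    """Hypothetical 2-D view over a flat buffer (not the ff API).

    dimorder lists the axes from fastest-varying to slowest-varying:
    (0, 1) is column-major order, (1, 0) is row-major order.
    """

    def __init__(self, data, dim, dimorder=(0, 1)):
        self.data = data
        self.dim = tuple(dim)
        self.dimorder = tuple(dimorder)

    def _offset(self, index):
        # Accumulate strides in the order given by dimorder.
        offset, stride = 0, 1
        for axis in self.dimorder:
            offset += index[axis] * stride
            stride *= self.dim[axis]
        return offset

    def __getitem__(self, index):
        return self.data[self._offset(index)]

    def transpose(self):
        # Only the metadata changes; the buffer is shared, not copied.
        flipped = tuple(1 - axis for axis in self.dimorder)
        return FlatMatrix(self.data, self.dim[::-1], flipped)


# A 2 x 3 matrix stored column-major in a flat buffer of doubles.
m = FlatMatrix(array("d", [1, 2, 3, 4, 5, 6]), dim=(2, 3))
t = m.transpose()
print(m[0, 1], t[1, 0])  # both index the same stored element
```

Because `transpose` returns a new view over the same buffer, its cost does not depend on the size of the data, which is the point of virtualization as the description uses the term.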
NOTE: A professional extension is available from the authors, which integrates additional high-performance features neatly into the ff package. The extension allows efficient handling of symmetric matrices and supports more packed data types: boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic.
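To make the 2-bit storage concrete, here is a minimal Python sketch of packing four 2-bit codes per byte. The code assignments (`FALSE = 0`, `TRUE = 1`, `NA = 2`, and the order of the bases) and the function names `pack2` and `unpack2` are assumptions chosen for illustration; the actual on-disk encoding used by ff and its extension may differ.

```python
# Assumed code assignments, for illustration only.
ENCODE = {False: 0, True: 1, None: 2}   # None stands in for NA
DECODE = {code: value for value, code in ENCODE.items()}
BASES = "ATGC"                          # a 'quad'-style four-level factor


def pack2(codes):
    """Pack a sequence of 2-bit codes (0..3) into bytes, four codes per byte."""
    out = bytearray((len(codes) + 3) // 4)
    for i, code in enumerate(codes):
        if not 0 <= code <= 3:
            raise ValueError("code does not fit in 2 bits")
        out[i // 4] |= code << (2 * (i % 4))
    return bytes(out)


def unpack2(buf, n):
    """Recover the first n 2-bit codes from a packed buffer."""
    return [(buf[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


logicals = [True, False, None, True, True]
packed = pack2([ENCODE[v] for v in logicals])
restored = [DECODE[c] for c in unpack2(packed, len(logicals))]
print(len(packed), restored == logicals)

dna = "GATTACA"
packed_dna = pack2([BASES.index(b) for b in dna])
print(len(packed_dna), "".join(BASES[c] for c in unpack2(packed_dna, len(dna))))
```

Five tri-state logicals fit in two bytes here, and a seven-base sequence also fits in two bytes. The element count has to be stored separately, since the padding bits in the last byte are indistinguishable from real codes.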

Diff between ff versions 1.0-1 dated 2007-11-03 and 2.0.0 dated 2008-08-03

 ff-1.0-1/ff/R/error.R                       |only
 ff-1.0-1/ff/R/evalmindex.R                  |only
 ff-1.0-1/ff/R/ffm.R                         |only
 ff-1.0-1/ff/R/ffmdataframe.r                |only
 ff-1.0-1/ff/R/getrange.R                    |only
 ff-1.0-1/ff/R/indmat.R                      |only
 ff-1.0-1/ff/R/runique.R                     |only
 ff-1.0-1/ff/R/sample.R                      |only
 ff-1.0-1/ff/R/seqpack.r                     |only
 ff-1.0-1/ff/R/testfile.R                    |only
 ff-1.0-1/ff/README.txt                      |only
 ff-1.0-1/ff/demo                            |only
 ff-1.0-1/ff/inst                            |only
 ff-1.0-1/ff/man/ff.Rd                       |only
 ff-1.0-1/ff/man/ffm.Rd                      |only
 ff-1.0-1/ff/man/ffm.data.frame.Rd           |only
 ff-1.0-1/ff/man/getpagesize.Rd              |only
 ff-1.0-1/ff/man/runique.Rd                  |only
 ff-1.0-1/ff/man/sample.Rd                   |only
 ff-1.0-1/ff/man/seqpack.Rd                  |only
 ff-1.0-1/ff/src/MultiArray.hpp              |only
 ff-1.0-1/ff/src/MultiIndex.cpp              |only
 ff-1.0-1/ff/src/MultiIndex.hpp              |only
 ff-1.0-1/ff/src/ffm.cpp                     |only
 ff-1.0-1/ff/src/ffm.h                       |only
 ff-1.0-1/ff/src/r_api.c                     |only
 ff-2.0.0/ff/ANNOUNCEMENT.txt                |only
 ff-2.0.0/ff/DESCRIPTION                     |   67 
 ff-2.0.0/ff/LICENSE                         |   33 
 ff-2.0.0/ff/NAMESPACE                       |  541 +++
 ff-2.0.0/ff/R/CFUN.R                        |only
 ff-2.0.0/ff/R/array.R                       |only
 ff-2.0.0/ff/R/as.ff.R                       |only
 ff-2.0.0/ff/R/bigsample.R                   |only
 ff-2.0.0/ff/R/ff.R                          | 4469 +++++++++++++++++++++++++++-
 ff-2.0.0/ff/R/ffapply.R                     |only
 ff-2.0.0/ff/R/ffreturn.R                    |only
 ff-2.0.0/ff/R/generics.R                    |only
 ff-2.0.0/ff/R/getpagesize.R                 |   74 
 ff-2.0.0/ff/R/hi.R                          |only
 ff-2.0.0/ff/R/util.R                        |only
 ff-2.0.0/ff/R/vmode.R                       |only
 ff-2.0.0/ff/R/vt.R                          |only
 ff-2.0.0/ff/R/zzz.R                         |only
 ff-2.0.0/ff/README_devel.txt                |only
 ff-2.0.0/ff/configure                       |  591 +++
 ff-2.0.0/ff/configure.ac                    |    5 
 ff-2.0.0/ff/exec                            |only
 ff-2.0.0/ff/man/CFUN.rd                     |only
 ff-2.0.0/ff/man/Extract.ff.rd               |only
 ff-2.0.0/ff/man/LimWarn.rd                  |only
 ff-2.0.0/ff/man/add.rd                      |only
 ff-2.0.0/ff/man/array2vector.rd             |only
 ff-2.0.0/ff/man/arrayIndex2vectorIndex.rd   |only
 ff-2.0.0/ff/man/as.ff.rd                    |only
 ff-2.0.0/ff/man/as.hi.rd                    |only
 ff-2.0.0/ff/man/as.integer.hi.rd            |only
 ff-2.0.0/ff/man/as.vmode.rd                 |only
 ff-2.0.0/ff/man/bbatch.rd                   |only
 ff-2.0.0/ff/man/bigsample.rd                |only
 ff-2.0.0/ff/man/clone.rd                    |only
 ff-2.0.0/ff/man/close.ff.rd                 |only
 ff-2.0.0/ff/man/delete.rd                   |only
 ff-2.0.0/ff/man/dim.ff.rd                   |only
 ff-2.0.0/ff/man/dimnames.ff_array.rd        |only
 ff-2.0.0/ff/man/dimorderCompatible.rd       |only
 ff-2.0.0/ff/man/dummy.dimnames.rd           |only
 ff-2.0.0/ff/man/ff.rd                       |only
 ff-2.0.0/ff/man/ffapply.rd                  |only
 ff-2.0.0/ff/man/ffconform.rd                |only
 ff-2.0.0/ff/man/ffreturn.rd                 |only
 ff-2.0.0/ff/man/ffsuitable.rd               |only
 ff-2.0.0/ff/man/ffxtensions.rd              |only
 ff-2.0.0/ff/man/filename.rd                 |only
 ff-2.0.0/ff/man/fixdiag.rd                  |only
 ff-2.0.0/ff/man/geterror.ff.rd              |only
 ff-2.0.0/ff/man/getpagesize.rd              |only
 ff-2.0.0/ff/man/getset.ff.rd                |only
 ff-2.0.0/ff/man/hi.rd                       |only
 ff-2.0.0/ff/man/hiparse.rd                  |only
 ff-2.0.0/ff/man/intrle.rd                   |only
 ff-2.0.0/ff/man/is.ff.rd                    |only
 ff-2.0.0/ff/man/is.open.rd                  |only
 ff-2.0.0/ff/man/is.readonly.rd              |only
 ff-2.0.0/ff/man/is.sorted.rd                |only
 ff-2.0.0/ff/man/length.ff.rd                |only
 ff-2.0.0/ff/man/length.hi.rd                |only
 ff-2.0.0/ff/man/levels.ff.rd                |only
 ff-2.0.0/ff/man/matcomb.rd                  |only
 ff-2.0.0/ff/man/matprint.rd                 |only
 ff-2.0.0/ff/man/maxffmode.rd                |only
 ff-2.0.0/ff/man/maxlength.rd                |only
 ff-2.0.0/ff/man/mismatch.rd                 |only
 ff-2.0.0/ff/man/na.count.rd                 |only
 ff-2.0.0/ff/man/names.ff.rd                 |only
 ff-2.0.0/ff/man/open.ff.rd                  |only
 ff-2.0.0/ff/man/physical.rd                 |only
 ff-2.0.0/ff/man/print.ff.rd                 |only
 ff-2.0.0/ff/man/ram2ffcode.rd               |only
 ff-2.0.0/ff/man/ramattribs.rd               |only
 ff-2.0.0/ff/man/readwrite.ff.rd             |only
 ff-2.0.0/ff/man/repfromto.rd                |only
 ff-2.0.0/ff/man/rlepack.rd                  |only
 ff-2.0.0/ff/man/swap.rd                     |only
 ff-2.0.0/ff/man/symmIndex2vectorIndex.rd    |only
 ff-2.0.0/ff/man/symmetric.rd                |only
 ff-2.0.0/ff/man/unclass_-.rd                |only
 ff-2.0.0/ff/man/undim.rd                    |only
 ff-2.0.0/ff/man/unsort.rd                   |only
 ff-2.0.0/ff/man/update.ff.rd                |only
 ff-2.0.0/ff/man/vecprint.rd                 |only
 ff-2.0.0/ff/man/vector.vmode.rd             |only
 ff-2.0.0/ff/man/vector2array.rd             |only
 ff-2.0.0/ff/man/vectorIndex2arrayIndex.rd   |only
 ff-2.0.0/ff/man/vmode.rd                    |only
 ff-2.0.0/ff/man/vt.rd                       |only
 ff-2.0.0/ff/man/vw.rd                       |only
 ff-2.0.0/ff/prebuild.sh                     |only
 ff-2.0.0/ff/src/Array.hpp                   |  363 +-
 ff-2.0.0/ff/src/Error.cpp                   |   19 
 ff-2.0.0/ff/src/Error.hpp                   |   21 
 ff-2.0.0/ff/src/FSInfo.hpp                  |   62 
 ff-2.0.0/ff/src/FSInfo_statfs.cpp           |   18 
 ff-2.0.0/ff/src/FSInfo_win32.cpp            |   18 
 ff-2.0.0/ff/src/FileMapping.hpp             |   80 
 ff-2.0.0/ff/src/MMapFileMapping.cpp         |   55 
 ff-2.0.0/ff/src/MMapFileMapping.hpp         |  205 -
 ff-2.0.0/ff/src/Win32FileMapping.cpp        |   82 
 ff-2.0.0/ff/src/Win32FileMapping.hpp        |  247 -
 ff-2.0.0/ff/src/ac_config.h.in              |   21 
 ff-2.0.0/ff/src/config.h                    |   10 
 ff-2.0.0/ff/src/ff.cpp                      |  353 +-
 ff-2.0.0/ff/src/ff.h                        |   95 
 ff-2.0.0/ff/src/r_ff.c                      |only
 ff-2.0.0/ff/src/r_ff.h                      |only
 ff-2.0.0/ff/src/r_ff_addgetset.h            |only
 ff-2.0.0/ff/src/r_ff_makevmodes.h           |only
 ff-2.0.0/ff/src/r_ff_methoddeclaration.h    |only
 ff-2.0.0/ff/src/r_ff_methodswitch.h         |only
 ff-2.0.0/ff/src/types.hpp                   |   52 
 ff-2.0.0/ff/src/utk_config.hpp              |only
 ff-2.0.0/ff/src/utk_file_allocate_fseek.cpp |only
 ff-2.0.0/ff/src/utk_file_allocate_fseek.hpp |only
 ff-2.0.0/ff/src/utk_platform_macros.hpp     |only
 144 files changed, 6748 insertions(+), 733 deletions(-)

More information about ff at CRAN

New package Read.isi with initial version 0.4
Package: Read.isi
Type: Package
Title: Access old data saved in fixed-width format based on ISI-formatted codebooks.
Version: 0.4
Date: 2008-06-27
Author: Rense Nieuwenhuis
Maintainer:
Description: Old statistical data was often stored in formats that are difficult for modern statistical software to access. One example is the data files of the 'World Fertility Survey', which are stored in fixed-width format and accompanied by codebooks in a format developed by the International Statistical Institute. The Read.isi package reads these data automatically, or converts the codebook to SPSS syntax.
License: Copyright (C) 2008 Rense Nieuwenhuis (email: contact@rensenieuwenhuis.nl).
Packaged: Sun Aug 3 16:34:26 2008; Rense
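
Codebook-driven reading of fixed-width data can be sketched as follows. This is a Python illustration, not the Read.isi implementation, and it does not parse the real ISI codebook format: the codebook here is a hypothetical simplified list of (name, start column, width) entries, and all fields are assumed to be numeric.

```python
def parse_fixed_width(line, codebook):
    """Split one fixed-width record into named integer fields.

    codebook is a hypothetical list of (name, start, width) tuples,
    with start given as a 1-based column number.
    """
    record = {}
    for name, start, width in codebook:
        field = line[start - 1:start - 1 + width].strip()
        # Treat a blank field as missing rather than failing on int("").
        record[name] = int(field) if field else None
    return record


# Invented variable names and column positions, for illustration only.
codebook = [("country", 1, 2), ("age", 3, 2), ("children", 5, 1)]

for line in ["07253", "1231 "]:
    print(parse_fixed_width(line, codebook))
```

Keeping the column layout in a separate data structure is what makes the approach reusable: the same parsing routine serves any survey file once its codebook has been read.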

More information about Read.isi at CRAN

