Introduction
W-data was designed to satisfy the following requirements:
- Binary data is stored in a conceptually simple format that can be processed with a variety of tools and languages.
- The format provides storage for data with time stepping (frames / measurements / cycles).
- The data format is suitable for parallel processing (preferably with MPI I/O).
- Data is easy to process with VisIt.
- It provides an extensible framework: new variables can be created and easily added to an existing data set.
- Data is convenient to copy between computing systems.
- It allows for easy extraction / copying of selected variables.
W-data format concept
A data set consists of a set of files, for example:
test.wtxt # metadata file, this one should be indicated when opening in VisIt
test_density_a.wdat # binary file with data
test_delta.wdat # binary file with data
test_current_a.wdat # binary file with data
Content of test.wtxt may look like:
# Comments with additional info about data set
# Comments are ignored when reading by parser
NX 24 # lattice
NY 28 # lattice
NZ 32 # lattice
DX 1 # spacing
DY 1 # spacing
DZ 1 # spacing
datadim 3 # dimension of block size: 1=NX, 2=NX*NY, 3=NX*NY*NZ
prefix test # prefix for files belonging to this data set, binary files have names prefix_variable.format
cycles 10 # number of cycles (measurements)
t0 0 # time value for the first cycle
dt 1 # time interval between cycles
# variables
# tag name type unit format
var density_a real none wdat
var delta complex none wdat
var current_a vector none wdat
# links
# tag name link-to
link density_b density_a
link current_b current_a
# consts
# tag name value
const eF 0.5
const kF 1
In our experience, three types of variables (real, complex, vector) are sufficient and cover more than 90% of applications.
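The scalar fields of a metadata file such as test.wtxt can be read with a few lines of C. The sketch below is only an illustration of the file layout, not the parser used by the wdata library: the type wsketch_metadata and the functions wsketch_parse_line and wsketch_parse_file are hypothetical names introduced here. It assumes that everything after # is a comment and that each scalar field is a "key value" pair; var, link and const entries are skipped.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the scalar fields of a .wtxt file
   (an assumption for this sketch, not the library's own type). */
typedef struct {
    int nx, ny, nz;      /* lattice size */
    double dx, dy, dz;   /* lattice spacing */
    int datadim;         /* dimensionality of a datablock */
    int cycles;          /* number of cycles */
    double t0, dt;       /* time of first cycle, time interval */
    char prefix[128];    /* prefix of binary file names */
} wsketch_metadata;

/* Parse one line. Returns 1 if a known scalar field was stored, 0 otherwise
   (comments, blank lines and var/link/const entries are skipped here). */
int wsketch_parse_line(const char *line, wsketch_metadata *md)
{
    char buf[512], key[64], val[128];

    strncpy(buf, line, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    char *hash = strchr(buf, '#');
    if (hash != NULL) *hash = '\0';                 /* drop the comment part */
    if (sscanf(buf, "%63s %127s", key, val) != 2) return 0;

    if      (strcmp(key, "NX") == 0)      md->nx = atoi(val);
    else if (strcmp(key, "NY") == 0)      md->ny = atoi(val);
    else if (strcmp(key, "NZ") == 0)      md->nz = atoi(val);
    else if (strcmp(key, "DX") == 0)      md->dx = atof(val);
    else if (strcmp(key, "DY") == 0)      md->dy = atof(val);
    else if (strcmp(key, "DZ") == 0)      md->dz = atof(val);
    else if (strcmp(key, "datadim") == 0) md->datadim = atoi(val);
    else if (strcmp(key, "cycles") == 0)  md->cycles = atoi(val);
    else if (strcmp(key, "t0") == 0)      md->t0 = atof(val);
    else if (strcmp(key, "dt") == 0)      md->dt = atof(val);
    else if (strcmp(key, "prefix") == 0) {
        strncpy(md->prefix, val, sizeof(md->prefix) - 1);
        md->prefix[sizeof(md->prefix) - 1] = '\0';
    }
    else return 0;                                  /* not a scalar field */
    return 1;
}

/* Parse a whole file. Returns the number of scalar fields found,
   or -1 if the file cannot be opened. */
int wsketch_parse_file(const char *path, wsketch_metadata *md)
{
    char line[512];
    int nfields = 0;
    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;
    while (fgets(line, sizeof(line), f) != NULL)
        nfields += wsketch_parse_line(line, md);
    fclose(f);
    return nfields;
}
```

Calling wsketch_parse_file("test.wtxt", &md) on a zero-initialized md would fill in the lattice size, spacing, datadim, prefix, cycles, t0 and dt from the example above.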
Binary files store data as raw arrays, called datablocks.
The size of a datablock depends on the variable type and the data dimensionality:
- real: blocksize=blocklength*8 Bytes
- complex: blocksize=blocklength*16 Bytes
- vector: blocksize=blocklength*8*3 Bytes
where blocklength is
- for datadim=3: blocklength=NX*NY*NZ
- for datadim=2: blocklength=NX*NY
- for datadim=1: blocklength=NX
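The size rules above translate directly into two small helper functions. This is a minimal sketch; the names wsketch_blocklength and wsketch_blocksize are hypothetical and are not part of the wdata library.

```c
#include <stddef.h>
#include <string.h>

/* Number of grid points in one datablock.
   Returns 0 for an unsupported datadim. */
size_t wsketch_blocklength(int datadim, size_t nx, size_t ny, size_t nz)
{
    switch (datadim) {
    case 1:  return nx;
    case 2:  return nx * ny;
    case 3:  return nx * ny * nz;
    default: return 0;
    }
}

/* Size of one datablock in bytes for the given variable type.
   Returns 0 for an unknown type. */
size_t wsketch_blocksize(const char *type, size_t blocklength)
{
    if (strcmp(type, "real") == 0)    return blocklength * 8;
    if (strcmp(type, "complex") == 0) return blocklength * 16;
    if (strcmp(type, "vector") == 0)  return blocklength * 8 * 3;
    return 0;
}
```

For the example metadata file (NX=24, NY=28, NZ=32, datadim=3) this gives blocklength=21504, so one datablock takes 172032 bytes for a real variable, 344064 bytes for a complex variable and 516096 bytes for a vector variable.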
Note that for vector variables we use the following storage pattern:
To compute the time associated with a given cycle, use: time=t0+cycleid*dt
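As a short illustration of the formula above, the following sketch fills an array with the times of all cycles in a data set. The function names are hypothetical, and cycleid is assumed to count from 0, so that the first cycle has time t0.

```c
/* Time associated with a single cycle (cycleid counted from 0). */
double wsketch_cycle_time(double t0, double dt, int cycleid)
{
    return t0 + cycleid * dt;
}

/* Fill times[0..cycles-1] with the time of each cycle.
   The caller provides an array with room for `cycles` values. */
void wsketch_cycle_times(double t0, double dt, int cycles, double *times)
{
    for (int i = 0; i < cycles; i++)
        times[i] = wsketch_cycle_time(t0, dt, i);
}
```

With the values from the example metadata file (t0=0, dt=1, cycles=10) the cycles are assigned the times 0, 1, ..., 9.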
W-data format allows for the representation of the following elements:
- variable:
  Each variable is represented by a binary file named prefix_varname.format. The variable description has the following format:
  var name type unit format
  The following formats are allowed:
  - wdat: default format for WSLDA codes. Binary files contain raw data (no header).
  - dpca: (deprecated) previous format of cold atomic codes. The binary file contains a header of size 68 B where additional information about the file content is stored. For this format the wdata library provides only reading functionality.
  - npy: binary files are NumPy arrays. Functionality under construction. To switch WSLDA codes to writing in this format, add to the input file:
    dataformat npy
- link:
  An alternative name for a given variable. WSLDA codes are often used for systems that exhibit symmetries, such as spin symmetry. In that case the densities for spin-a and spin-b particles are exactly the same, and to save disk space only one of them needs to be stored:
  var density_a real none wdat
  while the other one is defined as a link:
  link density_b density_a
- constant:
  Besides variables, there are typically some constants that are useful during data analysis. For example, when making plots it is convenient to express variables in dimensionless form, like delta/eF. To inform the user about the values of selected constants, use the const field:
  const eF 0.5
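To illustrate how variables and links fit together, the sketch below resolves a requested name through a table of links and then builds the name of the binary file as prefix_varname.format. The type wsketch_link and both functions are hypothetical names introduced for this example. The sketch assumes that a link always points directly to a variable and not to another link.

```c
#include <stdio.h>
#include <string.h>

/* One "link name link-to" entry from the metadata file (hypothetical type). */
typedef struct {
    const char *name;    /* alternative name, e.g. "density_b" */
    const char *target;  /* variable it points to, e.g. "density_a" */
} wsketch_link;

/* Return the variable that `name` refers to: the link target if `name`
   is a link, otherwise `name` itself. Only one level of links is followed. */
const char *wsketch_resolve(const char *name,
                            const wsketch_link *links, size_t nlinks)
{
    for (size_t i = 0; i < nlinks; i++)
        if (strcmp(links[i].name, name) == 0)
            return links[i].target;
    return name;
}

/* Build "prefix_varname.format" in `out`.
   Returns 0 on success, -1 if the buffer is too small. */
int wsketch_file_name(char *out, size_t outsize, const char *prefix,
                      const char *varname, const char *format)
{
    int n = snprintf(out, outsize, "%s_%s.%s", prefix, varname, format);
    return (n < 0 || (size_t) n >= outsize) ? -1 : 0;
}
```

With the links from the example metadata file, a request for density_b resolves to density_a, so the data is read from test_density_a.wdat.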
Low-level reading and writing of data
Code snippet demonstrating how to load a datablock from a wdat file:
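// Requires <stdio.h> and <stdlib.h>; the statements below belong inside a function.
// NX, NY, NZ are the lattice sizes taken from the metadata file.
int cycleid = 0; // id of cycle for loading (counted from 0)
char file_name[128]="test_density_a.wdat";
size_t blocklength = NX*NY*NZ; // for datadim=3
size_t blocksize = blocklength*sizeof(double); // for real variable (8 bytes per point)
double *data = (double *) malloc(blocksize); // buffer for real data
if(data==NULL) { printf("ERROR: Cannot allocate memory\n"); exit(EXIT_FAILURE); }
FILE *pFile = fopen(file_name, "rb"); // open file for binary reading
if(pFile==NULL) { printf("ERROR: Cannot open file\n"); exit(EXIT_FAILURE); }
// compute offset of the requested datablock
size_t ptr_shift = 0;
// in case of other formats, like npy:
// ptr_shift += size_of_header; // skip header
ptr_shift += blocksize*cycleid; // skip datablocks of earlier cycles
// fseek takes a long offset; files larger than LONG_MAX bytes need a 64-bit variant
if(fseek(pFile, (long) ptr_shift, SEEK_SET) != 0) printf("ERROR: Cannot set pointer to datablock\n");
size_t test_ele = fread(data, blocksize, 1, pFile); // read datablock
if(test_ele != 1) printf("ERROR: Cannot read datablock\n");
fclose(pFile); // close file
// ... use data ...
free(data); // release buffer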
Code snippet demonstrating how to add a new datablock to a wdat file:
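// Requires <stdio.h> and <stdlib.h>; the statements below belong inside a function.
// NX, NY, NZ are the lattice sizes taken from the metadata file.
double *data; // pointer to real data; must point to blocklength values before writing
char file_name[128]="test_density_a.wdat";
FILE *pFile = fopen(file_name, "ab"); // open file in binary append mode
if(pFile==NULL) { printf("ERROR: Cannot open file\n"); exit(EXIT_FAILURE); }
size_t blocklength = NX*NY*NZ; // for datadim=3
size_t blocksize = blocklength*sizeof(double); // for real variable (8 bytes per point)
size_t test_ele = fwrite(data, blocksize, 1, pFile); // add datablock at the end of the file
if(test_ele != 1) printf("ERROR: Cannot add datablock\n");
// in case of other formats, like npy, additional change of header is required.
// ...
fclose(pFile); // close file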
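The two snippets above can be wrapped into small reusable functions. The sketch below does this for real variables stored in the headerless wdat format and adds a helper that derives the number of stored datablocks from the file size. All function names are hypothetical and are not part of the wdata library. The sketch also assumes that the file size fits into a long and that seeking to the end of a binary stream is supported, which holds on common POSIX systems.

```c
#include <stdio.h>

/* Append one datablock of `blocklength` doubles to a headerless file.
   Returns 0 on success, -1 on failure. */
int wsketch_append_block(const char *path, const double *data, size_t blocklength)
{
    FILE *f = fopen(path, "ab");
    if (f == NULL) return -1;
    size_t n = fwrite(data, sizeof(double), blocklength, f);
    /* fclose is always called; it can fail when buffered data cannot be flushed */
    return (fclose(f) == 0 && n == blocklength) ? 0 : -1;
}

/* Read the datablock of cycle `cycleid` (counted from 0) into `data`.
   Returns 0 on success, -1 on failure (e.g. the cycle does not exist). */
int wsketch_read_block(const char *path, double *data, size_t blocklength, int cycleid)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;
    long offset = (long) (blocklength * sizeof(double)) * cycleid;
    int rc = -1;
    if (fseek(f, offset, SEEK_SET) == 0 &&
        fread(data, sizeof(double), blocklength, f) == blocklength)
        rc = 0;
    fclose(f);
    return rc;
}

/* Number of complete datablocks stored in a headerless file, or -1 on error. */
long wsketch_count_blocks(const char *path, size_t blocklength)
{
    if (blocklength == 0) return -1;
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;
    long nblocks = -1;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);   /* file size in bytes */
        if (size >= 0) nblocks = size / (long) (blocklength * sizeof(double));
    }
    fclose(f);
    return nblocks;
}
```

Because a wdat file has no header, the value returned by wsketch_count_blocks can be compared with the cycles field of the metadata file as a simple consistency check.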
W-data C library
The folder lib-wdata contains a library that provides support for W-data processing.
The following examples demonstrate how to use this library:
- example-write.c: creates an artificial set of variables and writes them to wdat files.
- example-read.c: reads data from wdat files (created by example-write.c).
- example-addvar.c: adds a new variable to an existing data set (created by example-write.c).
To compile the examples (you may need to modify the Makefile):
[gabrielw@dell cold-atoms]$ cd lib-wdata/
[gabrielw@dell lib-wdata]$ make