
wdata format concept

Introduction
W-data format concept
Tags

Introduction

W-data was designed to satisfy the following requirements:

Binary data is stored in a conceptually easy format that allows a variety of tools/languages to be used.
Format provides storage for data with time stepping (frames/measurements/cycles).
Data format is suitable for parallel processing (preferably with MPI I/O).
Data is easy to process via VisIt.
It provides an extensible framework - new variables can be created and easily added to the existing dataset.
Data is convenient for copying between computing systems.
It allows for easy extraction/copying of selected variables.

The W-data format is not a library; it is only a concept that specifies how the data should be written and read. You therefore do not need any external library to read or write it: standard I/O functions are sufficient. For convenience, this repository provides libraries written in C and in Python. Example code demonstrating the concept can be found in c-examples/example-write-low-level.c. The concept is described below.

W-data format concept

The data set will consist of a set of files, for example:

test.wtxt           # metadata file, this one should be indicated when opening in VisIt
test_density_a.wdat # binary file with data
test_delta.wdat     # binary file with data
test_current_a.wdat # binary file with data
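
The listing above suggests that each binary file is named after the data set prefix and the variable name. A minimal sketch of that naming rule follows; the pattern prefix_name.wdat is inferred from the example file names only, and the function name example_data_filename is hypothetical (it is not part of the W-data libraries).

```c
#include <stdio.h>

/* Hypothetical helper: build "<prefix>_<name>.wdat" into buf.
   The naming pattern is an assumption inferred from the example listing.
   Returns 0 on success, -1 if the buffer is too small. */
int example_data_filename(char *buf, size_t buflen,
                          const char *prefix, const char *name)
{
    int n = snprintf(buf, buflen, "%s_%s.wdat", prefix, name);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For prefix test and variable density_a this would produce test_density_a.wdat, matching the listing.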

The content of test.wtxt may look like this:

# Comments with additional info about the data set
# Comments are ignored by the parser

nx                       24   # lattice
ny                       28   # lattice
nz                       32   # lattice
dx                        1   # spacing
dy                        1   # spacing
dz                        1   # spacing
x0                      -12   # origin of x-axis
y0                      -14   # origin of y-axis
z0                      -16   # origin of z-axis
datadim                   3   # dimension of block size: 1=nx, 2=nx*ny, 3=nx*ny*nz
prefix                 test   # prefix for files belonging to this data set
cycles                   10   # number of cycles (measurements)
t0                        0   # time value for the first cycle
dt                        1   # time interval between cycles

# variables
# tag                  name                    type                    unit                  format
var               density_a                    real                    none                    wdat
var                   delta                 complex                    none                    wdat
var               current_a                  vector                    none                    wdat

# links
# tag                  name                 link-to
link              density_b               density_a
link              current_b               current_a

# consts
# tag                  name                   value                    unit
const                    eF                     0.5                     MeV
const                    kF                       1                    1/fm
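
Because the metadata file is plain text, it can be read with standard C string functions. The sketch below handles a single line: it drops everything after '#' and extracts the tag and the first value. It assumes that '#' always starts a comment and that fields are separated by whitespace, as in the example above; the function name example_wtxt_parse_line and the fixed buffer sizes are hypothetical choices, not taken from the W-data libraries.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split one metadata line into its tag and first value.
   Returns 1 if a "tag value" pair was found, 0 for blank or comment-only lines.
   tag and value must each point to at least 64 bytes. */
int example_wtxt_parse_line(const char *line, char *tag, char *value)
{
    char buf[256];
    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    char *hash = strchr(buf, '#');   /* assumption: '#' starts a comment */
    if (hash)
        *hash = '\0';

    /* sscanf skips leading whitespace and reads two whitespace-separated words */
    return sscanf(buf, "%63s %63s", tag, value) == 2;
}
```

For a line such as "nx 24 # lattice" this yields the tag nx and the value 24; for var, link and const lines the remaining columns would still have to be read in the same way.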

In our experience, three types of variables (real, complex, vector) are sufficient and cover more than 90% of applications.

Binary files store data as row arrays called datablocks:

The variables can be stored either in double or float precision.

type	float	double
real	`real4`	`real`, `real8`
complex	`complex8`	`complex`, `complex16`
vector	`vector4`	`vector`, `vector8`

The size of a datablock depends on the variable type and the data dimensionality, and is computed according to the formula (result in bytes):
blocksize = blocklength * sizeB

The blocklength depends on the data dimensionality (datadim)

datadim	blocklength
1	`nx`
2	`nx*ny`
3	`nx*ny*nz`

The sizeB depends on the data type

type	sizeB	comment
`real`, `real8`	8	`sizeof(double)`
`real4`	4	`sizeof(float)`
`complex`, `complex16`	16	= 8*2, double (re, im)
`complex8`	8	= 4*2, float (re, im)
`vector(d)`, `vector8(d)`	8*d	d = vector dimensionality, d = 1, 2, 3 (default)
`vector4(d)`	4*d	d = vector dimensionality, d = 1, 2, 3 (default)
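
The two tables above can be turned into a small lookup. The following is a minimal sketch, not the implementation used by the W-data libraries; the function names example_blocklength and example_sizeB are hypothetical, and the parsing of the optional "(d)" suffix is one possible approach.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Number of elements in one datablock for datadim = 1, 2 or 3; 0 if invalid. */
size_t example_blocklength(int datadim, int nx, int ny, int nz)
{
    switch (datadim) {
    case 1:  return (size_t)nx;
    case 2:  return (size_t)nx * ny;
    case 3:  return (size_t)nx * ny * nz;
    default: return 0;
    }
}

/* Bytes per element for a type name from the table; 0 for an unknown type. */
size_t example_sizeB(const char *type)
{
    if (!strcmp(type, "real") || !strcmp(type, "real8")) return 8;
    if (!strcmp(type, "real4")) return 4;
    if (!strcmp(type, "complex") || !strcmp(type, "complex16")) return 16;
    if (!strcmp(type, "complex8")) return 8;

    /* Vector types: base name with an optional "(d)" suffix, default d = 3. */
    const char *paren = strchr(type, '(');
    size_t len = paren ? (size_t)(paren - type) : strlen(type);
    int d = 3;
    if (paren && sscanf(paren, "(%d)", &d) != 1) return 0;
    if (d < 1 || d > 3) return 0;

    if ((len == 6 && !strncmp(type, "vector", 6)) ||
        (len == 7 && !strncmp(type, "vector8", 7)))
        return 8 * (size_t)d;
    if (len == 7 && !strncmp(type, "vector4", 7))
        return 4 * (size_t)d;
    return 0;   /* unknown type */
}
```

With the example metadata (nx=24, ny=28, nz=32, datadim=3), blocklength is 21504, so one datablock of the real variable density_a occupies 21504*8 = 172032 bytes.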

Scalars: real

Real variables are stored as one-dimensional arrays. Lattice indices map to coordinates and to the array index as follows:

// lattice indices
int ix=...; // ix in [0,nx)
int iy=...; // iy in [0,ny)
int iz=...; // iz in [0,nz)

// coordinate decoding
double x = x0 + dx*ix;
double y = y0 + dy*iy;
double z = z0 + dz*iz;

// array index
int ixyz = iz + nz*iy + nz*ny*ix; 
double *var; // pointer to real array
var[ixyz]=...; // value for the given coordinate
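
The indexing rule above can be wrapped into a loop that fills a complete real datablock and writes it with standard I/O. This is only a sketch: the lattice_t structure, the function names and the sample function example_field are hypothetical, and appending one datablock per cycle to the file is an assumption about how cycles are laid out.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for the lattice parameters from the metadata file. */
typedef struct {
    int nx, ny, nz;
    double dx, dy, dz;
    double x0, y0, z0;
} lattice_t;

/* Linear index of lattice point (ix, iy, iz); iz runs fastest. */
size_t lattice_index(const lattice_t *l, int ix, int iy, int iz)
{
    return (size_t)iz + (size_t)l->nz * iy + (size_t)l->nz * l->ny * ix;
}

/* Sample scalar field used only for illustration. */
double example_field(double x, double y, double z)
{
    return x + 10.0 * y + 100.0 * z;
}

/* Fill one real datablock with f(x, y, z) evaluated on the lattice. */
void fill_real_block(const lattice_t *l, double *var,
                     double (*f)(double, double, double))
{
    for (int ix = 0; ix < l->nx; ix++)
        for (int iy = 0; iy < l->ny; iy++)
            for (int iz = 0; iz < l->nz; iz++) {
                double x = l->x0 + l->dx * ix;
                double y = l->y0 + l->dy * iy;
                double z = l->z0 + l->dz * iz;
                var[lattice_index(l, ix, iy, iz)] = f(x, y, z);
            }
}

/* Write one datablock to an open binary file; returns 0 on success.
   Assumption: blocks of consecutive cycles are appended one after another. */
int write_real_block(FILE *fp, const double *var, size_t blocklength)
{
    return fwrite(var, sizeof(double), blocklength, fp) == blocklength ? 0 : -1;
}
```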

Scalars: complex

Complex variables are stored as pairs of real numbers: the real part followed by the imaginary part. To store them, we use the native C types double complex or float complex, which have the same memory layout as an array of two double/float values.

// array index
int ixyz = iz + nz*iy + nz*ny*ix; 
double complex *varC; // pointer to complex array
varC[ixyz]=1.0 + 2.0*I; // complex value for the given coordinate

// or you can cast it to a real array of size 2*nx*ny*nz
double *var = (double*)varC;
var[2*ixyz+0]=1.0; // real part
var[2*ixyz+1]=2.0; // imaginary part
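
The interleaved (re, im) layout also makes it easy to process a complex datablock without complex arithmetic. The sketch below splits a block into separate real and imaginary arrays by reading it through a double pointer, as in the cast shown above; the function name split_complex_block is hypothetical.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: split a complex datablock into separate real and
   imaginary arrays by reading it as interleaved (re, im) pairs of doubles. */
void split_complex_block(const double complex *varC, size_t blocklength,
                         double *re, double *im)
{
    const double *raw = (const double *)varC;
    for (size_t i = 0; i < blocklength; i++) {
        re[i] = raw[2 * i + 0];   /* real part */
        im[i] = raw[2 * i + 1];   /* imaginary part */
    }
}
```

The same loop works for data read from other tools, as long as the values are stored as consecutive (re, im) pairs of doubles.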

Vectors

For vector variables, we do not introduce a new structure. Instead, we store the components of a vector variable in separate arrays, placed one after another. The number of components is given in parentheses at the end of the type name, vector(d). If the number of components is not given, the default value d=3 is assumed. Thus, vector is equivalent to vector(3). A variable of type vector(1) is equivalent to real.

A vector(3) variable is stored as three consecutive arrays of length blocklength: all v_x values first, then all v_y values, then all v_z values.

An example of element decoding is given below:

// For 3d vector v = [v_x(x,y,z), v_y(x,y,z), v_z(x,y,z)]
double *dataV; // pointer to real array of size 3*blocklength
dataV[ixyz + 0 * blocklength] = v_x; // x component
dataV[ixyz + 1 * blocklength] = v_y; // y component
dataV[ixyz + 2 * blocklength] = v_z; // z component
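
Reading a vector back follows the same component-by-component layout. As a minimal sketch (the function names are hypothetical), the helpers below fetch one component at a lattice point and compute the squared length of the vector there:

```c
#include <stddef.h>

/* Component c (0 = x, 1 = y, 2 = z) of a vector datablock at point ixyz. */
double vector_component(const double *dataV, size_t blocklength,
                        size_t ixyz, int c)
{
    return dataV[ixyz + (size_t)c * blocklength];
}

/* Squared length |v|^2 of a d-component vector at point ixyz. */
double vector_norm2(const double *dataV, size_t blocklength,
                    size_t ixyz, int d)
{
    double sum = 0.0;
    for (int c = 0; c < d; c++) {
        double v = vector_component(dataV, blocklength, ixyz, c);
        sum += v * v;
    }
    return sum;
}
```

Keeping each component contiguous means that a single component can be extracted as an ordinary real datablock, without touching the other components.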

Time

To compute the time associated with a given icycle, we use the formula

time = t0 + dt * icycle;

If dt is negative, the time of each cycle has to be read from an additional binary file named prefix__t.wdat. For more information, see the implementation of the function wdata_get_time(...) in our C library.
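
The two cases can be combined in one helper. The sketch below is not the library function wdata_get_time; the name example_cycle_time is hypothetical, and the layout of the time file (one double per cycle, stored consecutively) is an assumption that should be checked against the C library before relying on it.

```c
#include <stdio.h>

/* Hypothetical helper: time associated with cycle icycle.
   Returns 0 and stores the result in *time, or -1 on failure.
   tfile is the path of the time file (prefix__t.wdat); it is used only
   when dt is negative. */
int example_cycle_time(double t0, double dt, int icycle,
                       const char *tfile, double *time)
{
    if (dt >= 0.0) {
        *time = t0 + dt * icycle;   /* uniform time stepping */
        return 0;
    }

    /* Assumption: the time file holds one double per cycle, in cycle order. */
    if (!tfile)
        return -1;
    FILE *fp = fopen(tfile, "rb");
    if (!fp)
        return -1;
    int ok = fseek(fp, (long)icycle * (long)sizeof(double), SEEK_SET) == 0
          && fread(time, sizeof(double), 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```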

Let's get back to the *wtxt file.
