Create W data format, authored Dec 10, 2020 by Gabriel Wlazłowski
### W-data format
#### Introduction
W-data was designed to satisfy the following requirements:
* The format provides storage for data with time stepping (frames / measurements / cycles).
* Binary data is stored in a conceptually simple format that allows a variety of tools and languages to be used.
* The data format is suitable for parallel processing (preferably with MPI I/O).
* Data is easy to process with VisIt.
* It provides an extensible framework: new variables can be created and easily added to an existing data set.
* Data is convenient to copy between computing systems.
* It allows easy extraction or copying of selected variables.
#### W-data format concept
A data set consists of a set of files, for example:
```bash
test.wtxt # metadata file, this one should be indicated when opening in VisIt
test_density_a.wdat # binary file with data
test_delta.wdat # binary file with data
test_current_a.wdat # binary file with data
```
The content of `test.wtxt` may look like:
```bash
# Comments with additional info about data set
# Comments are ignored when reading by parser
NX 24 # lattice
NY 28 # lattice
NZ 32 # lattice
DX 1 # spacing
DY 1 # spacing
DZ 1 # spacing
datadim 3 # dimension of block size: 1=NX, 2=NX*NY, 3=NX*NY*NZ
prefix test # prefix for files belonging to this data set, binary files have names prefix_variable.format
cycles 10 # number of cycles (measurements)
t0 0 # time value for the first cycle
dt 1 # time interval between cycles
# variables
# tag name type unit format
var density_a real none wdat
var delta complex none wdat
var current_a vector none wdat
# links
# tag name link-to
link density_b density_a
link current_b current_a
# consts
# tag name value
const eF 0.5
const kF 1
```
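The metadata file above is line oriented: everything after `#` is a comment, and each remaining line starts with a tag followed by values. A minimal sketch of a reader for the scalar tags could look like the code below. The names `wtxt_meta`, `wtxt_parse_line` and `wtxt_read` are hypothetical and are not part of lib-wdata; `var` lines are only counted here, and `link` and `const` lines are skipped.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the scalar tags of a wtxt file (not the lib-wdata type). */
typedef struct {
    int nx, ny, nz;    /* NX, NY, NZ */
    int datadim;       /* datadim */
    int cycles;        /* cycles */
    double t0, dt;     /* t0, dt */
    char prefix[128];  /* prefix */
    int nvars;         /* number of "var" lines seen */
} wtxt_meta;

/* Parse one line. Returns 1 if a known tag was consumed, 0 otherwise. */
int wtxt_parse_line(const char *line, wtxt_meta *m)
{
    char buf[512], tag[64], val[128];
    char *hash;

    assert(line != NULL && m != NULL);
    snprintf(buf, sizeof(buf), "%s", line);
    hash = strchr(buf, '#');              /* strip the comment, if any */
    if (hash != NULL) *hash = '\0';
    if (sscanf(buf, "%63s %127s", tag, val) != 2) return 0; /* blank or comment-only line */

    if      (strcmp(tag, "NX") == 0)      m->nx = atoi(val);
    else if (strcmp(tag, "NY") == 0)      m->ny = atoi(val);
    else if (strcmp(tag, "NZ") == 0)      m->nz = atoi(val);
    else if (strcmp(tag, "datadim") == 0) m->datadim = atoi(val);
    else if (strcmp(tag, "cycles") == 0)  m->cycles = atoi(val);
    else if (strcmp(tag, "t0") == 0)      m->t0 = atof(val);
    else if (strcmp(tag, "dt") == 0)      m->dt = atof(val);
    else if (strcmp(tag, "prefix") == 0)  snprintf(m->prefix, sizeof(m->prefix), "%s", val);
    else if (strcmp(tag, "var") == 0)     m->nvars++;
    else return 0;                        /* DX/DY/DZ, link, const, unknown tags: skipped in this sketch */
    return 1;
}

/* Read a whole metadata file. Returns 0 on success, -1 if the file cannot be opened. */
int wtxt_read(const char *path, wtxt_meta *m)
{
    char line[512];
    FILE *f;

    assert(path != NULL && m != NULL);
    f = fopen(path, "r");
    if (f == NULL) return -1;
    memset(m, 0, sizeof(*m));
    while (fgets(line, sizeof(line), f) != NULL) wtxt_parse_line(line, m);
    fclose(f);
    return 0;
}
```

Calling `wtxt_read("test.wtxt", &m)` on the example file would fill `m.nx`, `m.ny`, `m.nz`, `m.datadim`, `m.cycles`, `m.t0`, `m.dt` and `m.prefix`. For real applications the library in `lib-wdata` is the intended route.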
In our experience, three variable types (`real`, `complex`, `vector`) are sufficient and cover more than 90% of applications.
Binary files store data as raw arrays, called datablocks:
![datablocks](http://git2.if.pw.edu.pl/gabrielw/cold-atoms/uploads/0c3f3e8cf8d2d2a4630da138c390166c/datablocks.png)
The size of a datablock depends on the variable type and the data dimensionality.
* *real*: `blocksize=blocklength*8 Bytes`
* *complex*: `blocksize=blocklength*16 Bytes`
* *vector*: `blocksize=blocklength*8*3 Bytes`
where `blocklength` is
* for datadim=3: `blocklength=NX*NY*NZ`
* for datadim=2: `blocklength=NX*NY`
* for datadim=1: `blocklength=NX`
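The size rules above can be sketched as a few small helpers. The function names `wd_bytes_per_point`, `wd_blocklength` and `wd_blocksize` are hypothetical, introduced only for illustration; the byte counts are taken directly from the lists above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes stored per lattice point for a given variable type; 0 for an unknown type. */
size_t wd_bytes_per_point(const char *type)
{
    assert(type != NULL);
    if (strcmp(type, "real") == 0)    return 8;      /* one double */
    if (strcmp(type, "complex") == 0) return 16;     /* two doubles */
    if (strcmp(type, "vector") == 0)  return 8 * 3;  /* three doubles */
    return 0;
}

/* Number of lattice points in one datablock; 0 for an unsupported datadim. */
size_t wd_blocklength(int datadim, size_t nx, size_t ny, size_t nz)
{
    switch (datadim) {
    case 1:  return nx;
    case 2:  return nx * ny;
    case 3:  return nx * ny * nz;
    default: return 0;
    }
}

/* Size of one datablock in bytes; 0 signals an unknown type or datadim. */
size_t wd_blocksize(const char *type, int datadim, size_t nx, size_t ny, size_t nz)
{
    return wd_blocklength(datadim, nx, ny, nz) * wd_bytes_per_point(type);
}
```

For the example metadata (NX=24, NY=28, NZ=32, datadim=3) a `real` datablock occupies 24*28*32*8 = 172032 bytes, and a `complex` one twice as much.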
Note that for vector variables we use the following storage pattern:
![vecvar](http://git2.if.pw.edu.pl/gabrielw/cold-atoms/uploads/28a263ab139358c5160d8fe38890a234/vecvar.png)
To compute the time associated with a given cycle, use: `time=t0+cycleid*dt`
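As a small illustration of the time formula, the sketch below computes the time of a cycle and, as an addition that is not part of the format description, the inverse lookup of the cycle closest to a requested time. Both function names are hypothetical.

```c
#include <assert.h>

/* Time associated with a cycle: time = t0 + cycleid*dt. */
double wd_cycle_time(double t0, double dt, int cycleid)
{
    return t0 + cycleid * dt;
}

/* Inverse lookup (added for illustration): id of the cycle closest to the
   requested time, clamped to the valid range [0, cycles-1]. */
int wd_nearest_cycle(double t0, double dt, int cycles, double time)
{
    double x;

    assert(dt > 0.0 && cycles > 0);
    x = (time - t0) / dt;                 /* fractional cycle index */
    if (x <= 0.0) return 0;
    if (x >= (double)(cycles - 1)) return cycles - 1;
    return (int)(x + 0.5);                /* round to nearest; x is positive here */
}
```

With the example metadata (`t0 0`, `dt 1`, `cycles 10`), the cycle with id 3 corresponds to time 3, and a requested time beyond the last cycle maps to cycle id 9.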
The W-data format can represent the following elements:
* *variable*:
Each variable is represented by a binary file named `prefix_varname.format`. A variable description has the following format:
`var name type unit format`
The following formats are allowed:
* `wdat`: default format for WSLDA codes. Binary files contain raw data (no header).
* `dpca`: (*deprecated*) previous format of the cold atomic codes. Binary files contain a 68-byte header that stores additional information about the file content. For this format the *wdata* library provides only reading functionality.
* `npy`: binary files are *numpy* arrays. **Functionality under construction**
To switch WSLDA codes to writing in this format, add to the input file:
`dataformat npy`
* *link*:
A link is an alternative name for a given variable. With WSLDA codes we often compute systems that exhibit symmetries, such as spin symmetry. The densities for spin-a and spin-b particles are then exactly the same. To save disk space we can store only one of them:
`var density_a real none wdat`
and define the other one as a link:
`link density_b density_a`
* *constant*:
Typically, besides variables, there are constants that are useful during data analysis. For example, when making plots it is convenient to express variables in dimensionless form, like delta/eF. To inform the user of the values of selected constants we use the `const` field:
`const eF 0.5`
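To illustrate how variables and links fit together, the sketch below resolves a requested name to the binary file that actually stores the data, using the `prefix_varname.format` rule. The types `wd_var` and `wd_link` and both functions are hypothetical, not the lib-wdata API, and the sketch assumes a link always points directly at a variable (no chains of links).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory form of the "var" and "link" lines of a wtxt file. */
typedef struct { const char *name, *type, *unit, *format; } wd_var;
typedef struct { const char *name, *target; } wd_link;

/* Find a variable by name; if the name is a link, follow it once.
   Returns NULL when nothing matches. */
const wd_var *wd_find_var(const char *name,
                          const wd_var *vars, int nvars,
                          const wd_link *links, int nlinks)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < nlinks; i++) {
        if (strcmp(links[i].name, name) == 0) {
            name = links[i].target;       /* use the linked variable instead */
            break;
        }
    }
    for (i = 0; i < nvars; i++) {
        if (strcmp(vars[i].name, name) == 0) return &vars[i];
    }
    return NULL;
}

/* Build "prefix_varname.format" for the variable behind `name`.
   Returns 0 on success, -1 if the name is unknown or `out` is too small. */
int wd_file_name(char *out, size_t outsize, const char *prefix, const char *name,
                 const wd_var *vars, int nvars,
                 const wd_link *links, int nlinks)
{
    const wd_var *v = wd_find_var(name, vars, nvars, links, nlinks);
    int n;

    if (v == NULL) return -1;
    n = snprintf(out, outsize, "%s_%s.%s", prefix, v->name, v->format);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

With the example metadata, asking for `density_b` yields the file name `test_density_a.wdat`, because `density_b` is a link to `density_a`.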
#### Low level reading and writing of data
The following code snippet demonstrates how to load a datablock from a `wdat` file:
```c
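// requires <stdio.h> and <stdlib.h>; NX, NY, NZ are the lattice sizes from the wtxt file
int cycleid = 0; // id of the cycle to load (the first cycle has id 0)
char file_name[128]="test_density_a.wdat";
size_t blocklength = (size_t)NX*NY*NZ; // for datadim=3
size_t blocksize = blocklength*8; // for real variable
double *data = (double *)malloc(blocksize); // buffer for one datablock of real data
if(data==NULL) { printf("ERROR: Cannot allocate memory\n"); exit(EXIT_FAILURE); }
FILE *pFile = fopen(file_name, "rb"); // open file for binary reading
if(pFile==NULL) { printf("ERROR: Cannot open file\n"); exit(EXIT_FAILURE); }
// compute the offset of the requested datablock
size_t ptr_shift = 0;
// in case of other formats, like npy:
// ptr_shift += size_of_header; // skip header
ptr_shift += blocksize*cycleid; // skip datablocks of earlier cycles
// fseek takes a long offset; very large files may need a 64-bit seek such as POSIX fseeko
if(fseek(pFile, (long)ptr_shift, SEEK_SET) != 0) { printf("ERROR: Cannot set pointer to datablock\n"); exit(EXIT_FAILURE); }
size_t test_ele = fread(data, blocksize, 1, pFile); // read datablock
if(test_ele != 1) { printf("ERROR: Cannot read datablock\n"); exit(EXIT_FAILURE); }
fclose(pFile); // close file
// ... use data ...
free(data); // release buffer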
```
The following code snippet demonstrates how to add a new datablock to a `wdat` file:
```c
// requires <stdio.h> and <stdlib.h>; NX, NY, NZ are the lattice sizes from the wtxt file
double *data; // pointer to real data; must point to blocklength values before writing
char file_name[128]="test_density_a.wdat";
FILE *pFile = fopen(file_name, "ab"); // open file in binary append mode
if(pFile==NULL) { printf("ERROR: Cannot open file\n"); exit(EXIT_FAILURE); }
size_t blocklength = (size_t)NX*NY*NZ; // for datadim=3
size_t blocksize = blocklength*8; // for real variable
size_t test_ele = fwrite(data, blocksize, 1, pFile); // add datablock at the end of the file
if(test_ele != 1) { printf("ERROR: Cannot add datablock\n"); exit(EXIT_FAILURE); }
// in case of other formats, like npy, additional change of header is required.
// ...
fclose(pFile); // close file
```
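Because `wdat` files have no header and every cycle occupies exactly one datablock, the number of datablocks in a file follows from its size. The sketch below uses this as a consistency check after reading or appending. The function name `wd_count_blocks` is hypothetical, and the approach assumes a platform where seeking to the end of a binary stream reports the file size (true on POSIX systems) and where the size fits in a `long`.

```c
#include <assert.h>
#include <stdio.h>

/* Number of complete datablocks in a headerless wdat file.
   Returns -1 if the file cannot be opened or its size is not a multiple of blocksize. */
long wd_count_blocks(const char *file_name, size_t blocksize)
{
    FILE *f;
    long size;

    assert(file_name != NULL && blocksize > 0);
    f = fopen(file_name, "rb");
    if (f == NULL) return -1;
    if (fseek(f, 0L, SEEK_END) != 0) {    /* move to the end of the file */
        fclose(f);
        return -1;
    }
    size = ftell(f);                      /* position at the end = size in bytes */
    fclose(f);
    if (size < 0) return -1;
    if ((unsigned long)size % blocksize != 0) return -1; /* truncated or wrong blocksize */
    return (long)((unsigned long)size / blocksize);
}
```

For a consistent data set the returned value should match the `cycles` entry of the metadata file; a result of -1 for an existing file suggests a truncated file or a blocksize computed for the wrong variable type or `datadim`.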
#### W-data C library
The folder [lib-wdata](https://gitlab.fizyka.pw.edu.pl/gabrielw/wslda/-/tree/public/lib-wdata) contains a library that provides support for W-data processing.
The following examples demonstrate how to use this library:
* [example-write.c](https://gitlab.fizyka.pw.edu.pl/gabrielw/wslda/-/tree/public/lib-wdata/example-write.c): creates an artificial set of variables and writes them to wdat files.
* [example-read.c](https://gitlab.fizyka.pw.edu.pl/gabrielw/wslda/-/tree/public/lib-wdata/example-read.c): reads data from wdat files (created by [example-write.c](https://gitlab.fizyka.pw.edu.pl/gabrielw/wslda/-/tree/public/lib-wdata/example-write.c)).
* [example-addvar.c](https://gitlab.fizyka.pw.edu.pl/gabrielw/wslda/-/tree/public/lib-wdata/example-addvar.c): adds a new variable to an existing data set (created by [example-write.c](https://gitlab.fizyka.pw.edu.pl/gabrielw/wslda/-/tree/public/lib-wdata/example-write.c)).
To compile the examples (you may need to modify the Makefile):
```bash
[gabrielw@dell cold-atoms]$ cd lib-wdata/
[gabrielw@dell lib-wdata]$ make
```