[[_TOC_]]

# Diagonalization engine

To diagonalize the BdG matrix, the W-SLDA Toolkit can use the following libraries:

* [ScaLAPACK](https://www.netlib.org/scalapack/scalapack_home.html) (routines: [pzheevr](https://www.netlib.org/scalapack/explore-html/d1/d81/pzheevr_8f.html) or [pzheevd](https://www.netlib.org/scalapack/explore-html/da/d75/pzheevd_8f.html))
* [ELPA](https://elpa.mpcdf.mpg.de/)

The diagonalization library is selected in the `machine.h` file:

```c
/**
 * Select diagonalization routine
 * ELPA demonstrates the best performance; use it if the target system supports this library.
 * Otherwise use standard ScaLapack lib (PZHEEV?).
 * In case of ScaLapack, it is recommended to use PZHEEVR, unless this routine does not work correctly (it may happen on some systems)
 * For more info, see: Wiki -> Setting up diagonalization engine
 * */
#define DIAGONALIZATION_ROUTINE PZHEEVR
// #define DIAGONALIZATION_ROUTINE PZHEEVD
// #define DIAGONALIZATION_ROUTINE ELPA
```
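
How the toolkit consumes `DIAGONALIZATION_ROUTINE` internally is not shown on this page. As a minimal sketch of the general technique, a compile-time switch of this kind can be dispatched with the preprocessor. The sketch assumes each routine name expands to a distinct integer tag; the tag values and the helper `diag_engine_name` are hypothetical and not part of W-SLDA.

```c
#include <assert.h>

/* Hypothetical integer tags, so that the preprocessor can compare the selection. */
#define PZHEEVR 1
#define PZHEEVD 2
#define ELPA    3

/* Same default as in the machine.h listing; override with -DDIAGONALIZATION_ROUTINE=... */
#ifndef DIAGONALIZATION_ROUTINE
#define DIAGONALIZATION_ROUTINE PZHEEVR
#endif

/* Reject unknown selections at compile time (C11 static_assert). */
static_assert(DIAGONALIZATION_ROUTINE == PZHEEVR ||
              DIAGONALIZATION_ROUTINE == PZHEEVD ||
              DIAGONALIZATION_ROUTINE == ELPA,
              "Unknown DIAGONALIZATION_ROUTINE");

/* Hypothetical helper: name of the engine that was selected at compile time. */
const char *diag_engine_name(void)
{
#if DIAGONALIZATION_ROUTINE == ELPA
    return "ELPA";
#elif DIAGONALIZATION_ROUTINE == PZHEEVD
    return "PZHEEVD";
#else
    return "PZHEEVR";
#endif
}
```

Because the choice is resolved by the preprocessor, only the code path for the selected library is compiled, so the other library does not need to be installed on the target system.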

# ELPA Library

For static calculations, it is recommended to use the [ELPA](https://elpa.mpcdf.mpg.de/) library, which performs better than ScaLAPACK. In particular, ELPA supports GPUs, which provide a significant boost to calculations. To activate the ELPA library, set:

```c
// select diagonalization routine
// #define DIAGONALIZATION_ROUTINE PZHEEVR
// #define DIAGONALIZATION_ROUTINE PZHEEVD
#define DIAGONALIZATION_ROUTINE ELPA
```

Once the ELPA library is activated, you also need to carefully inspect the following settings:

```c
/**
 * ---------------------- ELPA SETTINGS ---------------------------
 * Fill this part only if the ELPA library is used for diagonalization
 *
 * Default settings are: ELPA_SOLVER_1STAGE
 * but you can override them using the options below
 * */

/**
 * Uncomment it if you want to activate GPUs for diagonalizations
 * */
// #define ELPA_USE_GPU

// Fraction of eigenvectors to be extracted in each cycle.
// 1.0 corresponds to extraction of all eigenvectors (USE IT IF YOU ARE NOT SURE)
// NOTE: the value of this parameter should ensure that all eigenstates below the requested Ec are extracted.
// NOTE: For the 3D case this value can typically be set to 0.78
#define ELPA_NEV_FRACTION 1.0

/**
 * Select ELPA kernels,
 * for more info, see the documentation of the ELPA lib
 * */
// #define ELPA_USE_SOLVER ELPA_SOLVER_2STAGE
// #define ELPA_USE_COMPLEX_KERNEL ELPA_2STAGE_COMPLEX_GPU
// #define ELPA_USE_REAL_KERNEL ELPA_2STAGE_REAL_GPU
```
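
The note on `ELPA_NEV_FRACTION` can be checked against a reference spectrum. The sketch below makes two assumptions: the fraction selects the lowest `ceil(fraction * n)` eigenpairs of an `n`-dimensional matrix, and a full, ascending list of eigenvalues is available (for example from one run with the fraction set to 1.0). The rounding convention and both helper names are hypothetical and not part of the toolkit's API.

```c
#include <assert.h>
#include <stddef.h>

/* Number of eigenpairs requested for a matrix of dimension n.
 * ASSUMPTION: the lowest ceil(fraction * n) eigenpairs are extracted. */
size_t nev_from_fraction(size_t n, double fraction)
{
    assert(fraction > 0.0 && fraction <= 1.0);
    size_t nev = (size_t)(fraction * (double)n);      /* truncate ... */
    if ((double)nev < fraction * (double)n) nev++;    /* ... then round up */
    return nev > n ? n : nev;
}

/* eigenvalues: all n eigenvalues in ascending order (reference data).
 * Returns 1 if the lowest nev eigenvalues include every eigenvalue below ec,
 * and 0 if at least one state below ec would be skipped. */
int fraction_covers_cutoff(const double *eigenvalues, size_t n,
                           double fraction, double ec)
{
    size_t nev = nev_from_fraction(n, fraction);
    /* Safe if nothing is skipped, or if the first skipped eigenvalue is already >= ec. */
    return nev == n || eigenvalues[nev] >= ec;
}
```

If `fraction_covers_cutoff` returns 0 for the reference spectrum, the fraction is too small for the requested cutoff and should be increased; 1.0 is always safe.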

## Further materials for ELPA Library

### Documentation

1. [Eigenvalue SoLvers for Petaflop-Applications (ELPA)](https://elpa.mpcdf.mpg.de/)
2. [Wiki: Eigenvalue SoLvers for Petaflop-Applications (ELPA)](https://gitlab.mpcdf.mpg.de/elpa/elpa/-/wikis/home)
3. [ELPA installation guide](ELPA installation guide)

### Publications about ELPA performance

1. [GPU-Acceleration of the ELPA2 Distributed Eigensolver for Dense Symmetric and Hermitian Eigenproblems](https://arxiv.org/abs/2002.10991)
2. [Fermionic quantum turbulence: Pushing the limits of high-performance computing](https://doi.org/10.1093/pnasnexus/pgae160)

# ScaLAPACK library

If the target system does not provide the ELPA library, you can use the standard diagonalization library, [ScaLAPACK](http://www.netlib.org/scalapack/). The W-SLDA Toolkit can use the following ScaLAPACK diagonalization engines:

```c
#define DIAGONALIZATION_ROUTINE PZHEEVR
```

or

```c
#define DIAGONALIZATION_ROUTINE PZHEEVD
```

It is recommended to use `PZHEEVR`. This engine exploits the fact that, in practice, only a fraction of the eigenstates is extracted. However, in some rare, system-dependent cases this routine does not work correctly. In such cases, use `PZHEEVD` instead.

# Benchmarks & Scalings

All tests correspond to the extraction of **all** eigenvectors.

(2-CPU): `ELPA_SOLVER_2STAGE`, `ELPA_2STAGE_COMPLEX_DEFAULT` or `ELPA_2STAGE_REAL_DEFAULT`

## Plots

These scalings are derived empirically: points correspond to **real** measurements on the target system, while the line shows a fit of the ideal scaling for level-3 routines ($`\sim N^3`$).
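
One simple way to obtain such a fit line is a one-parameter least-squares fit of the model $`t = a N^3`$, where minimizing $`\sum_i (t_i - a N_i^3)^2`$ gives $`a = \sum_i t_i N_i^3 / \sum_i N_i^6`$. The sketch below illustrates this; it is not necessarily the procedure used in the Gnuplot script linked on this page, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Least-squares prefactor a for the model t = a * N^3.
 * n[i]: matrix dimension of measurement i, t[i]: measured time. */
double fit_cubic_prefactor(const double *n, const double *t, size_t count)
{
    assert(count > 0);
    double num = 0.0, den = 0.0;
    for (size_t i = 0; i < count; i++) {
        double n3 = n[i] * n[i] * n[i];
        num += t[i] * n3;   /* accumulates sum of t_i * N_i^3 */
        den += n3 * n3;     /* accumulates sum of N_i^6 */
    }
    assert(den > 0.0);
    return num / den;
}

/* Time predicted by the fitted model for a matrix of dimension n. */
double predict_time(double a, double n)
{
    return a * n * n * n;
}
```

With the prefactor fitted to a few measured sizes, `predict_time` gives a rough estimate of the diagonalization time for a larger matrix, valid only as long as the ideal cubic scaling holds on the target system.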

### [Summit](https://docs.olcf.ornl.gov/systems/summit_user_guide.html)

The scaling was derived within the ALCC grant [Quantum Turbulence in Fermi Superfluids](https://www.olcf.ornl.gov/web-project/quantum-turbulence-in-fermi-superfluids/).

* Raw data: [summit-scaling.txt](uploads/703387968474185a5966643ceb5591c6/summit-scaling.txt)
* Gnuplot script: [summit-scaling.gp](uploads/7fc8391f39d6b9d48a6fee8b38c7b3a5/summit-scaling.gp)

![summit-scaling](uploads/43fbb4f1e6e21be8fb0bbc07d20c7e88/summit-scaling.png)