
Last edited by Gabriel Wlazłowski May 20, 2021

Setting up diagonalization engine

ELPA Library

For static calculations, we recommend the ELPA library, which performs better than ScaLAPACK. In particular, ELPA can use GPUs, which significantly speeds up the diagonalization. To activate ELPA, set the following in predefines.h:

// select diagonalization routine
#define DIAGONALIZATION_ROUTINE ELPA

In addition, carefully review the following part of the file:

// ---------------------- ELPA SETTINGS ---------------------------
// Fill in this part only if the ELPA library is used for diagonalization

// Uncomment the line below to activate GPUs for diagonalization
#define ELPA_USE_GPU

// Select ELPA kernels
#define ELPA_USE_SOLVER ELPA_SOLVER_1STAGE
#define ELPA_USE_COMPLEX_KERNEL ELPA_2STAGE_COMPLEX_DEFAULT
#define ELPA_USE_REAL_KERNEL ELPA_2STAGE_REAL_DEFAULT

// Fraction of eigenvectors to be extracted in each cycle.
// 1.0 corresponds to extraction of all eigenvectors (USE IT IF YOU ARE NOT SURE)
// NOTE: the value of this parameter must ensure that all eigenstates below the requested Ec are extracted.
// NOTE: for the 3D case, this value can typically be set to 0.78
#define ELPA_NEV_FRACTION 1.0
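
The meaning of ELPA_NEV_FRACTION can be illustrated with a short sketch. It assumes that the number of requested eigenvectors is the fraction times the matrix dimension, rounded up; the helper name `nev_from_fraction` and the rounding rule are hypothetical and are not taken from the W-SLDA sources.

```c
#include <assert.h>

/* Hypothetical helper (not part of W-SLDA): number of eigenvectors to
 * request for a matrix of dimension n, given a fraction in (0, 1].
 * Assumption: the count is fraction * n, rounded up and capped at n. */
static long nev_from_fraction(long n, double fraction)
{
    assert(n > 0);
    assert(fraction > 0.0 && fraction <= 1.0);

    double exact = fraction * (double)n;
    long nev = (long)exact;          /* truncates toward zero */
    if ((double)nev < exact)
        nev += 1;                    /* round up to the next integer */
    if (nev > n)
        nev = n;                     /* never request more than n */
    return nev;
}
```

Under these assumptions, a fraction of 1.0 requests every eigenvector, while 0.78 for a 128,000 x 128,000 matrix requests roughly 99,840 of them. Whether that covers all states below Ec has to be checked for the problem at hand.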

Documentation

  1. Eigenvalue SoLvers for Petaflop-Applications (ELPA)
  2. Wiki: Eigenvalue SoLvers for Petaflop-Applications (ELPA)
  3. ELPA installation guide

Publications about ELPA performance

  1. GPU-Acceleration of the ELPA2 Distributed Eigensolver for Dense Symmetric and Hermitian Eigenproblems

ScaLAPACK library

If the target system does not provide the ELPA library, you can use the standard diagonalization library, ScaLAPACK. The W-SLDA Toolkit can use the following ScaLAPACK diagonalization engines:

#define DIAGONALIZATION_ROUTINE PZHEEVR

or

#define DIAGONALIZATION_ROUTINE PZHEEVD

We recommend PZHEEVR. This engine takes advantage of the fact that typically only a fraction of the eigenstates is extracted. However, in some rare, system-dependent cases this routine does not work correctly. In such cases, use PZHEEVD instead.

Benchmarks & Scalings

All tests correspond to the extraction of all eigenvectors.

| matrix size | p | q | mb | nb | prec. | routine | system | time [sec] | cost |
|---|---|---|---|---|---|---|---|---|---|
| 32,768 = 2x128^2 | 6 | 8 | 16 | 16 | real | ELPA (2-GPU) | Cygnus | 93 | 0.05 nh |
| 65,536 = 2x32^3 | 24 | 28 | 8 | 8 | complex | ELPA (1-GPU) | Summit | 118 | 0.52 nh |
| 128,000 = 2x40^3 | 20 | 20 | 32 | 32 | complex | ELPA (1-GPU) | Daint | 220 | 24.4 nh |
| 128,000 | 54 | 64 | 32 | 32 | complex | ELPA (2-CPU) | Daint | 677 | 54.1 nh |
| 128,000 | 54 | 64 | 32 | 32 | complex | PZHEEVR | Daint | 945 | 75.6 nh |
| 147,456 = 4x64x24^2 | 24 | 25 | 32 | 32 | complex | ELPA (1-GPU) | Daint | 375 | 62.5 nh |
| 147,456 = 2x768x96 | 18 | 18 | 16 | 16 | double | ELPA (1-GPU) | Daint | 395 | 35.6 nh |
| 221,184 = 2x48^3 | 46 | 84 | 16 | 16 | complex | ELPA (1-GPU) | Summit | 736 | 18.8 nh |
| 221,184 | 46 | 84 | 16 | 16 | complex | ELPA (2-GPU) | Summit | 3098 | 79.2 nh |
| 221,184 | 46 | 84 | 16 | 16 | complex | PZHEEVD | Summit | 5995 | 153.2 nh |
| 524,288 = 2x64^3 | 96 | 112 | 16 | 16 | complex | ELPA (1-GPU) | Summit | 2,217 | 157.7 nh |
| 746,496 = 2x72^3 | 112 | 192 | 16 | 16 | complex | ELPA (1-GPU) | Summit | 3,436 | 488.7 nh |
| 1,769,472 = 2x96^3 | 300 | 560 | 32 | 32 | complex | ELPA (1-GPU) | Summit | 52,024 | 57,804 nh |

(1-GPU): ELPA_SOLVER_1STAGE, ELPA_2STAGE_COMPLEX_GPU or ELPA_2STAGE_REAL_GPU
(2-GPU): ELPA_SOLVER_2STAGE, ELPA_2STAGE_COMPLEX_GPU or ELPA_2STAGE_REAL_GPU
(2-CPU): ELPA_SOLVER_2STAGE, ELPA_2STAGE_COMPLEX_DEFAULT or ELPA_2STAGE_REAL_DEFAULT
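
To relate the table columns to per-process memory, the sketch below computes the local size of a block-cyclically distributed matrix. It assumes that p and q denote the process grid dimensions and mb and nb the distribution block sizes, as is conventional in ScaLAPACK; the page itself does not define these columns. The function names are illustrative and are not part of W-SLDA. `local_extent` follows the logic of ScaLAPACK's NUMROC with the source process fixed at 0.

```c
#include <assert.h>

/* Number of rows (or columns) of a dimension of length n, distributed
 * block-cyclically with block size nb, that land on process iproc out
 * of nprocs. Same logic as ScaLAPACK's NUMROC with source process 0. */
static long local_extent(long n, long nb, long iproc, long nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    assert(iproc >= 0 && iproc < nprocs);

    long nblocks = n / nb;                    /* number of full blocks   */
    long extent  = (nblocks / nprocs) * nb;   /* complete rounds         */
    long extra   = nblocks % nprocs;          /* leftover full blocks    */

    if (iproc < extra)
        extent += nb;                         /* one more full block     */
    else if (iproc == extra)
        extent += n % nb;                     /* trailing partial block  */
    return extent;
}

/* Bytes needed on process (0,0) for the local part of one n x n matrix.
 * bytes_per_element is an assumption of the caller, for example 16 for
 * double-precision complex or 8 for double-precision real. */
static double local_matrix_bytes(long n, long p, long q,
                                 long mb, long nb, long bytes_per_element)
{
    long rows = local_extent(n, mb, 0, p);
    long cols = local_extent(n, nb, 0, q);
    return (double)rows * (double)cols * (double)bytes_per_element;
}
```

For the 128,000 case with p = q = 20 and mb = nb = 32, process (0,0) holds a 6400 x 6400 block. With 16 bytes per element this is about 655 MB per matrix, not counting eigenvector storage or solver workspace.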

Plots

These scalings are derived empirically: points correspond to actual measurements on the target system, while the line shows a fit of the ideal scaling for level-3 routines ($\sim N^3$).
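
The ideal cubic scaling can be used for rough planning. The sketch below extrapolates a cost from a single reference measurement, assuming cost proportional to N^3. The function name is hypothetical, and a one-point extrapolation is a simplification of the fit shown in the plots.

```c
#include <assert.h>

/* Extrapolate a cost (time or node budget) from a reference measurement
 * at matrix size n_ref to matrix size n, assuming cost ~ N^3. */
static double extrapolate_cubic(double cost_ref, double n_ref, double n)
{
    assert(cost_ref >= 0.0);
    assert(n_ref > 0.0 && n > 0.0);

    double ratio = n / n_ref;
    return cost_ref * ratio * ratio * ratio;
}
```

For example, starting from the Summit entry of 18.8 nh at N = 221,184, the call `extrapolate_cubic(18.8, 221184.0, 524288.0)` gives roughly 250 nh, while the benchmark table reports 157.7 nh for N = 524,288. Such estimates should therefore be treated as order-of-magnitude guidance only.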

Summit

The scaling was derived within the ALCC grant "Quantum Turbulence in Fermi Superfluids".

[Figure: summit-scaling]
