Last edited by Gabriel Wlazłowski Dec 17, 2020

Common failures of static codes

Signal: Segmentation fault

Segmentation fault & ELPA

[gabrielw@node2067 test]$ mpirun -np 8 ./st-wslda-2d input.txt 
# CODE: ST-WSLDA-2D
# LATTICE: 64 x 64 x 16
# USING ELPA.
...
# AUTOMATIC DIVISION OF WORK - MAY NOT BE OPTIMAL!
# SETTINGS 8 KZGROUPS, EACH WITH GRID PROCESSES OF SIZE [1 x 1]
...
# SETTING UP ELPA...
# ELPA: ACTIVATING GPUs
# ELPA: SETTINGS COMPLEX KERNEL: `ELPA_2STAGE_COMPLEX_DEFAULT`
# ELPA: SETTINGS SOLVER: `ELPA_SOLVER_1STAGE`
# SETTING UP OF ELPA  DONE.
 ...
# DIAGONALIZATION 0 0...
[node2067:27224] *** Process received signal ***
[node2067:27224] Signal: Segmentation fault (11)
[node2067:27224] Signal code: Invalid permissions (2)
[node2067:27224] Failing at address: 0x7fd86ec00000
[node2067:27224] [ 0] /lib64/libpthread.so.0(+0xf710)[0x7fd9359e4710]
[node2067:27224] [ 1] /usr/local/elpa202005-openmpi311-gcc721-cuda90-lapack391/lib/libelpa.so.15(__elpa1_compute_MOD_tridiag_complex_double+0x96c)[0x7fd935f30edc]
[node2067:27224] [ 2] /usr/local/elpa202005-openmpi311-gcc721-cuda90-lapack391/lib/libelpa.so.15(__elpa1_impl_MOD_elpa_solve_evp_complex_1stage_double_impl+0x79d)[0x7fd935f958bd]

A segmentation fault is a typical error raised by the diagonalization routines when there is not enough memory. In the example above, the calculation was executed with parameters p=1 and q=1 (the log reports a process grid of size [1 x 1]). In this case, it was sufficient to set in the input file

p                       2
q                       2

to solve the problem. For more information, see Memory usage of st-wslda.
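
To illustrate why a larger process grid can help, here is a rough back-of-the-envelope sketch. It assumes that the diagonalized matrix is a dense n x n complex double matrix spread evenly over the p x q process grid. The function name, the `copies` parameter and the value of n are hypothetical and are not taken from the code or from the log above. The real memory footprint also includes eigenvectors and solver workspace, so treat the numbers only as an order-of-magnitude guide.

```python
def bytes_per_process(n: int, p: int, q: int, copies: int = 1) -> float:
    """Approximate bytes held by each process for `copies` dense n x n
    complex double matrices spread evenly over a p x q process grid."""
    bytes_per_element = 16  # complex double: two 8-byte floats
    return copies * bytes_per_element * n * n / (p * q)


if __name__ == "__main__":
    n = 8192  # hypothetical matrix dimension, not taken from the log
    for p, q in [(1, 1), (2, 2)]:
        gib = bytes_per_process(n, p, q) / 2**30
        print(f"p={p} q={q}: {gib:.2f} GiB per process per matrix")
```

With these assumed numbers, going from p=1, q=1 to p=2, q=2 reduces the share of each matrix held by a single process from 1.00 GiB to 0.25 GiB.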

Segmentation fault & ScaLAPACK

# LATTICE: 128 x 128 x 32
# USING SCALAPACK WITH PZHEEVR.
 ...
# DIAGONALIZATION 0 15...
srun: error: nid00111: tasks 0-17: Segmentation fault
srun: Terminating job step 512966.0
slurmstepd: error: *** STEP 512966.0 ON nid00111 CANCELLED AT 2020-06-25T17:02:46 ***

This is again an error related to insufficient memory; see Segmentation fault & ELPA.
