Signal: Segmentation fault (11)
Segmentation fault & ELPA
[gabrielw@node2067 test]$ mpirun -np 8 ./st-wslda-2d input.txt
# CODE: ST-WSLDA-2D
# LATTICE: 64 x 64 x 16
# USING ELPA.
...
# AUTOMATIC DIVISION OF WORK - MAY NOT BE OPTIMAL!
# SETTINGS 8 KZGROUPS, EACH WITH GRID PROCESSES OF SIZE [1 x 1]
...
# SETTING UP ELPA...
# ELPA: ACTIVATING GPUs
# ELPA: SETTINGS COMPLEX KERNEL: `ELPA_2STAGE_COMPLEX_DEFAULT`
# ELPA: SETTINGS SOLVER: `ELPA_SOLVER_1STAGE`
# SETTING UP OF ELPA DONE.
...
# DIAGONALIZATION 0 0...
[node2067:27224] *** Process received signal ***
[node2067:27224] Signal: Segmentation fault (11)
[node2067:27224] Signal code: Invalid permissions (2)
[node2067:27224] Failing at address: 0x7fd86ec00000
[node2067:27224] [ 0] /lib64/libpthread.so.0(+0xf710)[0x7fd9359e4710]
[node2067:27224] [ 1] /usr/local/elpa202005-openmpi311-gcc721-cuda90-lapack391/lib/libelpa.so.15(__elpa1_compute_MOD_tridiag_complex_double+0x96c)[0x7fd935f30edc]
[node2067:27224] [ 2] /usr/local/elpa202005-openmpi311-gcc721-cuda90-lapack391/lib/libelpa.so.15(__elpa1_impl_MOD_elpa_solve_evp_complex_1stage_double_impl+0x79d)[0x7fd935f958bd]
A segmentation fault is a typical error generated by the diagonalization routines when there is an insufficient amount of memory. In the example above, the calculations were executed with the parameters p=1 and q=1. In this case, it was sufficient to set in the input file
p 2
q 2
to solve the problem. For more information, see Memory usage of st-wslda.
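To get a feeling for why a larger p x q grid helps, the following is a minimal sketch that estimates how much memory one dense complex matrix takes per process when it is split evenly over a p x q process grid. It rests on assumptions that are not stated above: that the matrix dimension is 2*NX*NY for the 2D code, that elements are double-precision complex numbers (16 bytes), and that the distribution is perfectly even. The function names are hypothetical and are not part of st-wslda or ELPA.

```python
# Rough per-process memory estimate for one dense complex matrix
# distributed over a p x q process grid.
# ASSUMPTIONS (not taken from the st-wslda documentation):
#   - matrix dimension n = 2 * NX * NY for the 2D code,
#   - double-precision complex elements (16 bytes each),
#   - the matrix is split evenly among the p * q processes.

BYTES_PER_COMPLEX_DOUBLE = 16


def matrix_bytes_per_process(n, p, q):
    """Approximate bytes held by one process for a single n x n matrix."""
    if n <= 0 or p <= 0 or q <= 0:
        raise ValueError("n, p and q must be positive integers")
    return n * n * BYTES_PER_COMPLEX_DOUBLE // (p * q)


def to_gib(num_bytes):
    """Convert bytes to GiB."""
    return num_bytes / 1024**3


if __name__ == "__main__":
    nx, ny = 64, 64      # lattice size in x and y from the log above
    n = 2 * nx * ny      # ASSUMED matrix dimension
    for p, q in [(1, 1), (2, 2)]:
        size = to_gib(matrix_bytes_per_process(n, p, q))
        print(f"p={p} q={q}: about {size:.2f} GiB per matrix per process")
```

A diagonalization typically needs several arrays of this size per process (the input matrix, the eigenvectors, and work space), so the real footprint is a multiple of this figure. The estimate is only meant to show the scaling: going from p=1, q=1 to p=2, q=2 reduces the per-process share of each matrix by a factor of four.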