# Introduction
`td` codes require machines equipped with GPUs. The standard scenario assumes that the number of parallel MPI processes is equal to the number of GPUs. It is the user's responsibility to provide a prescription that uniquely assigns GPU devices to MPI processes. This step depends on the architecture of the target machine. To set up the profile of the target machine correctly, you need to modify `predefines.h`. To print the applied mapping of MPI processes to GPUs to the screen, use:
```c
/**
* Activate this flag in order to print to stdout
* the applied mapping mpi-process <==> device-id.
* */
#define PRINT_GPU_DISTRIBUTION
```
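The assignment itself boils down to each MPI process selecting its device before any CUDA work starts. The sketch below is purely illustrative and not code from the repository (the function name `bind_and_report_gpu` and the printed format are assumptions); it shows the typical binding-and-reporting pattern that the flag above refers to:
```c
/* Illustrative sketch only (not repository code): once a device id has been
 * chosen for an MPI rank, binding and reporting it typically looks like this. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

static void bind_and_report_gpu(MPI_Comm comm, int deviceid)
{
    int rank, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Comm_rank(comm, &rank);
    MPI_Get_processor_name(name, &name_len);

    /* bind this MPI process to the selected GPU */
    if (cudaSetDevice(deviceid) != cudaSuccess)
        fprintf(stderr, "rank %d: failed to bind to device %d\n", rank, deviceid);

    /* the kind of information a mapping printout conveys */
    printf("# mpi-process %d on %s <==> device-id %d\n", rank, name, deviceid);
}
```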
# Machine with uniformly distributed GPU cards of the same type
This is the most common case. Here it is sufficient to use the default settings by leaving the following flag commented out:
```c
/**
* Activate this flag if the target machine has a non-standard distribution of GPUs.
* In such a case you need to provide the body of the function `assign_deviceid_to_mpi_process`.
* If this flag is commented out, it is assumed that the code is running on a machine
* with GPU cards uniformly distributed across the nodes,
* and each node has `gpuspernode` (input file parameter) cards.
* */
// #define CUSTOM_GPU_DISTRIBUTION
```
and by setting in the input file the number of GPUs each node is equipped with:
```bash
gpuspernode 1 # number of GPUs per node (resource set), default=1
```
You need to execute the code with the number of MPI processes equal to the number of GPUs. For example, if each node is equipped with one GPU and you plan to run the code on 512 nodes, you should call it as (schematic notation):
```bash
mpirun -n 512 --ntasks-per-node=1 ./td-wslda-3d input.txt
```
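To make the uniform case concrete: assuming MPI ranks are placed consecutively on the nodes (an assumption about the launcher, not a statement about the code's internals), the device id is effectively the rank's index within its node. A minimal sketch of that mapping:
```c
/* Sketch, not repository code: with `gpuspernode` GPUs per node and ranks
 * placed consecutively node by node, the local index of a rank within its
 * node can serve as the device id. */
int uniform_deviceid(int rank, int gpuspernode)
{
    return rank % gpuspernode;  /* e.g. gpuspernode=1 => every rank gets device 0 */
}
```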
# Machine with non-uniform distribution of GPUs
In this case you need to define the GPU distribution yourself. As an example, consider a machine that has 7 nodes with the cards distributed as follows (content of the file `nodes.txt`):
```bash
node2061.grid4cern.if.pw.edu.pl slots=8
node2062.grid4cern.if.pw.edu.pl slots=8
node2063.grid4cern.if.pw.edu.pl slots=4
node2064.grid4cern.if.pw.edu.pl slots=4
node2065.grid4cern.if.pw.edu.pl slots=4
node2066.grid4cern.if.pw.edu.pl slots=4
node2067.grid4cern.if.pw.edu.pl slots=8
```
The GPU distribution is then defined as follows:
```c
/**
* Activate this flag if the target machine has a non-standard distribution of GPUs.
* In such a case you need to provide the body of the function `assign_deviceid_to_mpi_process`.
* If this flag is commented out, it is assumed that the code is running on a machine
* with GPU cards uniformly distributed across the nodes,
* and each node has `gpuspernode` (input file parameter) cards.
* */
#define CUSTOM_GPU_DISTRIBUTION

/**
* This function is used to assign a unique device-id to each MPI process.
* @param comm MPI communicator
* @return device-id assigned to the process identified by MPI_Comm_rank(...)
* DO NOT REMOVE THE `#if ...` STATEMENT BELOW !!!
* */
#if defined(CUSTOM_GPU_DISTRIBUTION) && defined(TDWSLDA_MAIN)
int assign_deviceid_to_mpi_process(MPI_Comm comm)
{
    int np, ip;
    MPI_Comm_size(comm, &np);
    MPI_Comm_rank(comm, &ip);

    // assign here the deviceid for the process with rank ip
    int deviceid = 0;

    if(ip == 0) printf("# CUSTOM GPU DISTRIBUTION FOR MACHINE: DWARF\n");

    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // number of processes (GPUs) hosted by this node: 8 on the large nodes, 4 otherwise
    int *ompi_local_rank = (int *)malloc(sizeof(int) * np);
    int ompi_ppn = 4;
    if(strcmp(processor_name, "node2061.grid4cern.if.pw.edu.pl") == 0) ompi_ppn = 8;
    if(strcmp(processor_name, "node2062.grid4cern.if.pw.edu.pl") == 0) ompi_ppn = 8;
    if(strcmp(processor_name, "node2067.grid4cern.if.pw.edu.pl") == 0) ompi_ppn = 8;

    // gather the per-node process counts from all ranks
    MPI_Allgather(&ompi_ppn, 1, MPI_INT, ompi_local_rank, 1, MPI_INT, MPI_COMM_WORLD);

    // convert the per-node counts into local ranks 0..ppn-1 within each node
    int ompi_i = 0, ompi_j;
    while(ompi_i < np)
    {
        if(ompi_local_rank[ompi_i] == 8)
        {
            for(ompi_j = 0; ompi_j < 8; ompi_j++) ompi_local_rank[ompi_i + ompi_j] = ompi_j;
            ompi_i += 8;
        }
        else
        {
            for(ompi_j = 0; ompi_j < 4; ompi_j++) ompi_local_rank[ompi_i + ompi_j] = ompi_j;
            ompi_i += 4;
        }
    }

    // the local rank within the node serves as the device id
    deviceid = ompi_local_rank[ip];
    free(ompi_local_rank);

    return deviceid;
}
#endif
```
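As a side note, hard-coding hostnames as above works but does not transfer to other machines. A more portable way to obtain a process's local rank within its node is MPI-3's `MPI_Comm_split_type`; the sketch below is an alternative illustration, not part of the wiki example or the repository, and assumes that every process on a node should own a distinct GPU:
```c
/* Alternative sketch (not repository code): compute the local rank of this
 * process within its node and use it as a candidate device id. */
#include <mpi.h>

int local_rank_as_deviceid(MPI_Comm comm)
{
    MPI_Comm nodecomm;
    int local_rank;

    /* group together all ranks that can share memory, i.e. ranks on the same node */
    MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, &nodecomm);
    MPI_Comm_rank(nodecomm, &local_rank);
    MPI_Comm_free(&nodecomm);

    return local_rank;  /* candidate device id for this process */
}
```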