This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
namd2023 [2024/02/28 21:25] root gp |
namd2023 [2024/03/04 19:55] (current) root |
||
---|---|---|---|
Line 84: | Line 84: | ||
==GPU== | ==GPU== | ||
- | Here using the number of cores available on the node (24/32/64) and one GPU (two or more GPUs ``devices 0,1,2,3`` scale poorly, not recommended or approved for AHPCC public use partitions). | + | Here we use all of the CPU cores available on the node (24/32/64) and one GPU. Two or more GPUs (``devices 0,1,2,3``) scale poorly and are not recommended or approved for AHPCC public use partitions. In this benchmark simulation, performance improved significantly as more CPU cores were used, up to the number of cores present. |
- | On the ``gpu72`` nodes with Intel 6130 and single NVidia V100, it's about 5 times faster than the best CPU version, so are a good use case. On ``agpu72`` nodes with AMD7543 and single A100, it's only about 10% faster than 6130/V100, so that's not a good use case for the more expensive AMD/A100 nodes. | + | On the ``gpu72`` nodes with Intel 6130 and a single NVIDIA V100, it's about 5 times faster than the best CPU version, so these nodes are a good use case. On ``agpu72`` nodes with AMD 7543 and a single A100, it's only about 10% faster than 6130/V100, so that's not a good use case for the more expensive AMD/A100 nodes, unless the job's GPU memory needs require the newer GPU. The even more expensive multi-GPU ``qgpu72`` nodes also don't scale well over a single GPU and are not a good use case. |
< | < | ||
+ | # | ||
module load namd/3.0a7 | module load namd/3.0a7 | ||
namd3 +p32 +setcpuaffinity +isomalloc_sync +devices 0 step7.2_production_colvar.inp | namd3 +p32 +setcpuaffinity +isomalloc_sync +devices 0 step7.2_production_colvar.inp | ||
Info: Benchmark time: 32 CPUs 0.0393942 s/step 0.227976 days/ns 0 MB memory | Info: Benchmark time: 32 CPUs 0.0393942 s/step 0.227976 days/ns 0 MB memory | ||
- | + | # | |
- | namd3 +p64 +setcpuaffinity +isomalloc_sync step7.24_production.inp | + | namd3 +p64 +setcpuaffinity +isomalloc_sync |
Info: Benchmark time: 64 CPUs 0.0344332 s/step 0.199266 days/ns 0 MB memory | Info: Benchmark time: 64 CPUs 0.0344332 s/step 0.199266 days/ns 0 MB memory | ||
</ | </ | ||
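The ``days/ns`` figures in the benchmark lines above are easier to compare as ns/day or as a ratio. The following is a minimal sketch, not part of the original page: the helper names ``bench_ns_per_day`` and ``speedup`` are hypothetical, and it assumes the log line layout matches the two ``Info: Benchmark time:`` lines shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not from the wiki page) for reading NAMD benchmark lines.

# Read NAMD log text on stdin and print "<cores> <ns/day>" for each
# "Info: Benchmark time:" line. Field 4 is the core count and field 8 is
# days/ns, assuming the line layout shown in the log excerpts above.
bench_ns_per_day() {
    awk '/^Info: Benchmark time:/ { printf "%d %.2f\n", $4, 1 / $8 }'
}

# Ratio of two days/ns values: how many times faster the second run is.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# The two benchmark lines quoted above, fed through the helper.
printf '%s\n' \
  'Info: Benchmark time: 32 CPUs 0.0393942 s/step 0.227976 days/ns 0 MB memory' \
  'Info: Benchmark time: 64 CPUs 0.0344332 s/step 0.199266 days/ns 0 MB memory' \
  | bench_ns_per_day

speedup 0.227976 0.199266
```

For the two lines shown, this works out to roughly 4.39 and 5.02 ns/day, a ratio of about 1.14. If those lines are the 32-core V100 run and the 64-core A100 run (the extracted page does not label them), that is in line with the "about 10% faster" estimate in the text.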