Update Uncoordinated multi process GPU usage authored by Alex Wiens
@@ -12,3 +12,16 @@
* check MPI-rank-GPU affinity/visibility
* deploy MPS (https://docs.nvidia.com/deploy/mps/index.html)
```
Input:
* Metric acc_utilization in the accelerator scope
* Metadata field numAcc
* Parameter threshold for a single accelerator
Rule:
load_mean = acc_utilization.mean('all')
load_threshold = job.numAcc * threshold
lowload = load_mean < load_threshold
Output: lowload is True
```
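
The rule above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the actual rule engine: the function name `is_low_load`, the dictionary layout of the samples, and the percent scale are hypothetical. In particular, the sketch reads `mean('all')` as the sum of the per-accelerator time averages, because that is the reading under which a comparison against `numAcc * threshold` is consistent; the real semantics of `mean('all')` may differ.

```python
from statistics import fmean


def is_low_load(acc_utilization, num_acc, threshold):
    """Return True if the job's accelerator load is below the threshold.

    acc_utilization: dict mapping an accelerator id to its list of
                     utilization samples (assumed to be in percent).
    num_acc:         number of accelerators allocated to the job (numAcc).
    threshold:       minimum expected utilization for a single accelerator.
    """
    # Assumption: mean('all') is interpreted as the sum of the
    # per-accelerator time averages, so it scales with num_acc.
    per_acc_mean = [fmean(samples) for samples in acc_utilization.values()]
    load_mean = sum(per_acc_mean)

    # Scale the per-accelerator threshold to the whole job.
    load_threshold = num_acc * threshold

    return load_mean < load_threshold


# Hypothetical job: two GPUs allocated, but only one is doing work.
samples = {
    "gpu0": [90.0, 80.0, 85.0],
    "gpu1": [0.0, 5.0, 1.0],
}
lowload = is_low_load(samples, num_acc=2, threshold=50.0)
```

With these made-up samples the per-accelerator averages are 85 and 2, so the combined load of 87 falls below the job threshold of 100 and the job would be flagged.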
\ No newline at end of file