From 7e52a5f6ea0c08716606de10cd750b12663f437c Mon Sep 17 00:00:00 2001
From: Jannis Klinkenberg <j.klinkenberg@itc.rwth-aachen.de>
Date: Fri, 6 Dec 2024 10:53:07 +0100
Subject: [PATCH] updated README.md

---
 tensorflow/cifar10_distributed/README.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tensorflow/cifar10_distributed/README.md b/tensorflow/cifar10_distributed/README.md
index 296a18d..a54fcc9 100644
--- a/tensorflow/cifar10_distributed/README.md
+++ b/tensorflow/cifar10_distributed/README.md
@@ -1,8 +1,9 @@
 # TensorFlow - Distributed Training
 
-This folder contains the following 2 example versions for distributed training:
-- **Version 1:** A TensorFlow native version, that requires a bit more preparation
-- **Version 2:** A version that is using Horovod ontop of TensorFlow
+This folder contains the following 3 example versions for distributed training:
+- **Version 1 (`submit_job_container_single-node.sh`):** A TensorFlow native version that is constrained to a single compute node with multiple GPUs. A single process serves all GPUs via `tf.distribute.MirroredStrategy`.
+- **Version 2 (`submit_job_container.sh`):** A TensorFlow native version that uses multiple processes (1 process per GPU) that work together using a `tf.distribute.MultiWorkerMirroredStrategy`. Although this version is not constrained to a single node, it requires a bit more preparation to set up the distributed environment (via the `TF_CONFIG` environment variable).
+- **Version 3 (`submit_job_container_horovod.sh`):** A version that uses Horovod on top of TensorFlow to perform the distributed training and the communication of, e.g., model weights/updates. Typically, this approach also uses 1 process per GPU.
 
 More information and examples concerning Horovod can be found under:
 - https://horovod.readthedocs.io/en/stable/tensorflow.html
--
GitLab
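The `TF_CONFIG` preparation mentioned for Version 2 can be sketched roughly as follows. This is a minimal illustration only, not taken from the repository's submit scripts: the worker list and the use of `SLURM_PROCID` as the task index are assumptions about how a SLURM-launched job might derive its role; real scripts would build the host list from the scheduler environment.

```python
import json
import os

# Hypothetical cluster description: one entry per worker process, each
# reachable at host:port. In a real SLURM job this list would be derived
# from the allocated node list (assumption, not the repository's logic).
workers = ["node01:12345", "node02:12345"]

# Each process needs to know its own index in the worker list. Using
# SLURM_PROCID for this is an assumption for illustration purposes.
task_index = int(os.environ.get("SLURM_PROCID", "0"))

# tf.distribute.MultiWorkerMirroredStrategy reads TF_CONFIG at startup to
# learn the cluster layout and this process's role within it.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": workers},
    "task": {"type": "worker", "index": task_index},
})

print(os.environ["TF_CONFIG"])
```

Each process in the job would run this with its own `task_index` before constructing the strategy, so that all workers agree on the cluster layout while each knows which slot it occupies.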