Model Architecture: GPT-NeoX-20B | Fine-tuning (Epoch 4/10)
Active Tensor Cores: 4,096 | Cluster A-100 (x8)
Total VRAM Usage: 624 GB | Optimized / Zero-Redundancy
Current Loss: 0.4219 | ▼ 0.004 last step
[SYS] Initializing checkpoint save...
[TRAIN] Batch 49201 processed. Loss: 0.4231
Recent Checkpoints