Integrate with GPT-NeoX

GPT-NeoX is a library for efficiently training large language models with tens of billions of parameters in a multimachine distributed context. This library is currently maintained by EleutherAI.

Instrument your runs with Comet to start managing experiments, create dataset versions and track hyperparameters for faster and easier reproducibility and collaboration.

Comet SDK     Minimum SDK version     Minimum GPT-NeoX version
Python-SDK    3.45.0                  master

Start logging

Add the following settings to your existing GPT-NeoX config, or put them in a separate configuration file:

{
  "use_comet": true,
  "comet_project": "<your-project-name>",
  "comet_experiment_name": "<your-experiment-name>",
  "comet_tags": ["<experiment-tag>"],
  "comet_others": { "<experiment-other-name>": "<experiment-other-value>" },
}
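If you generate configs from a script, the settings above can be written programmatically. The following is a minimal sketch, not part of GPT-NeoX or Comet: build_comet_config and write_config are hypothetical helper names, and the sketch assumes that plain JSON output is acceptable because JSON of this shape is also valid YAML.

```python
import json
from pathlib import Path


def build_comet_config(project, experiment_name, tags=(), others=None):
    """Return the Comet settings from the config above as a plain dict (hypothetical helper)."""
    return {
        "use_comet": True,
        "comet_project": project,
        "comet_experiment_name": experiment_name,
        "comet_tags": list(tags),
        "comet_others": dict(others or {}),
    }


def write_config(config, path):
    """Write the dict as indented JSON and return the path that was written."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder values; substitute your own project, name, tags and extras.
    config = build_comet_config(
        "my-project", "baseline-run", tags=["enwik8"], others={"owner": "me"}
    )
    print(json.dumps(config, indent=2))
```

To save the result next to the other recipe configs, call write_config(config, "gpt-neox/configs/comet.yml") and pass that file to the launcher alongside your model config.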

Tip

A full list of GPT-NeoX recipe configs is available in the configs directory of the GPT-NeoX repository.

Log automatically

With the integration enabled, Comet logs the following items automatically, with no additional configuration:

  • Training metrics like train/lm_loss, timers/forward and runtime/flops_per_sec_per_gpu.
  • All hyperparameters like data_path, DeepSpeed configuration and anything else included in the config file.

End-to-end example

The following basic example uses Comet with GPT-NeoX to train the model defined in the 1-3B.yml recipe config.

Clone the repo

git clone https://github.com/EleutherAI/gpt-neox/

Install dependencies

python -m pip install -r gpt-neox/requirements/requirements.txt -r gpt-neox/requirements/requirements-comet.txt
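After installing, you may want to confirm that the installed Comet SDK meets the minimum version listed for this integration (3.45.0). The following is a rough sketch under stated assumptions: parse_version and meets_minimum are hypothetical helpers, the comparison ignores pre-release suffixes, and the distribution name passed to importlib.metadata is assumed to be comet_ml.

```python
from importlib import metadata

MIN_COMET_SDK = (3, 45, 0)  # minimum SDK version stated for this integration


def parse_version(text):
    """Turn '3.45.0' into (3, 45, 0); trailing non-digits in a part are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '4.0'
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_COMET_SDK):
    """Return True if the installed version string is at least the minimum."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("comet_ml")  # assumed distribution name
    except metadata.PackageNotFoundError:
        print("comet_ml is not installed")
    else:
        print(installed, "ok" if meets_minimum(installed) else "too old")
```

For strict handling of pre-releases, a dedicated version parser would be a better choice than this tuple comparison.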

Log in to Comet

comet login
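Before starting a long training job, it can help to check that Comet credentials are visible to the process. The sketch below is an illustration only: comet_credentials_hint is a hypothetical helper, and it assumes the two common places the Comet SDK reads an API key from, the COMET_API_KEY environment variable and a .comet.config file in the home directory. It does not verify that the key is valid.

```python
import os
from pathlib import Path


def comet_credentials_hint(env=None, home=None):
    """Return 'env', 'file', or None depending on where credentials appear to live.

    Assumption: the SDK picks up COMET_API_KEY from the environment or a
    .comet.config file in the user's home directory.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if env.get("COMET_API_KEY"):
        return "env"
    if (home / ".comet.config").is_file():
        return "file"
    return None


if __name__ == "__main__":
    hint = comet_credentials_hint()
    if hint is None:
        print("No Comet credentials found; log in before launching training.")
    else:
        print("Comet credentials found via:", hint)
```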

Download the training dataset

cd gpt-neox && python prepare_data.py enwik8

Write the GPT-NeoX config with Comet

Write the following config file to gpt-neox/configs/comet.yml:

{ "use_comet": true }

Run the example on a single node

The data-download step changed into the gpt-neox directory, so the paths below are relative to the repository root:

python ./deepy.py ./train.py ./configs/1-3B.yml ./configs/slurm_local.yml ./configs/comet.yml
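If you launch runs from a script, the same command can be assembled in Python and checked for missing files before anything starts. This is a sketch only: build_launch_command and missing_files are hypothetical helpers, and repo is assumed to point at your GPT-NeoX checkout ("." if you are already inside it).

```python
import shlex
import sys
from pathlib import Path


def build_launch_command(repo, configs, python=sys.executable):
    """Return the launcher command as an argument list (hypothetical helper)."""
    repo = Path(repo)
    cmd = [python, str(repo / "deepy.py"), str(repo / "train.py")]
    # Each config name is resolved inside the checkout's configs directory.
    cmd += [str(repo / "configs" / name) for name in configs]
    return cmd


def missing_files(cmd):
    """Return the script and config paths (all arguments after the interpreter) that do not exist."""
    return [p for p in cmd[1:] if not Path(p).is_file()]


if __name__ == "__main__":
    cmd = build_launch_command(".", ["1-3B.yml", "slurm_local.yml", "comet.yml"])
    print(shlex.join(cmd))
    print("missing:", missing_files(cmd))
```

Once missing_files(cmd) returns an empty list, the command can be started with subprocess.run(cmd, check=True).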
Dec. 17, 2024