How to Use the Trainer API in Hugging Face for Custom Training Loops



Image by Editor | Midjourney

 

Let’s learn how to define custom training loops with Hugging Face’s Trainer API.

 

Preparation

 
First, install the packages below for this tutorial:

pip install transformers datasets

 

You also need to install PyTorch. The appropriate package differs based on your environment (CPU-only or a specific CUDA version), so check the official PyTorch installation instructions and make sure you have the proper build installed.
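
In many environments the default package is enough; if you need a specific CUDA (or CPU-only) build, copy the exact command from the official PyTorch installation page instead of the generic one below:

pip install torch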

With all of your libraries installed, let’s get on with it.

 

Custom Training Loops with Trainer API

 
If you have ever fine-tuned a Transformer model the standard way, think about how the process works under the hood and how you could tweak it for your own purposes. When your use case is not straightforward and requires specific behavior, you can build custom training loops with the Trainer API to accomplish it.

We can use the Trainer API as it is, but we can also subclass the Trainer to customize how the training loop behaves.

Let’s start by preparing the standard fine-tuning requirements: the pre-trained model, the tokenizer, and the dataset.

from transformers import BertForSequenceClassification, BertTokenizer
from datasets import load_dataset

model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

dataset = load_dataset('imdb')

 

We will use BERT to train a binary text classification model.

Next, we will preprocess the data. We will also create smaller subsets of the train and test splits, which you can swap into the trainer later if you want to cut down the training time.

def preprocess_function(examples):
    return tokenizer(examples['text'], truncation=True, padding=True)

tokenized_datasets = dataset.map(preprocess_function, batched=True)

small_train_dataset = tokenized_datasets['train'].shuffle(seed=42).select(range(100))
small_eval_dataset = tokenized_datasets['test'].shuffle(seed=42).select(range(50))

 

Now we set up the training arguments. We will use only a single epoch and a larger batch size to keep the training time reasonable.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    logging_dir="./logs",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=1,
    logging_steps=10,
    save_total_limit=2,
)

 

Now we can develop our custom training loop with the help of Transformers. Here is an example of a custom Trainer that overrides how the optimizer, the scheduler, and the training loop are set up.

from torch.optim import AdamW
from transformers import get_scheduler
from transformers import Trainer

class CustomTrainer(Trainer):
    def create_optimizer_and_scheduler(self, num_training_steps):
        if self.optimizer is None:
            self.optimizer = AdamW(self.model.parameters(), lr=self.args.learning_rate)
        if self.lr_scheduler is None:
            self.lr_scheduler = get_scheduler(
                name="linear",
                optimizer=self.optimizer,
                num_warmup_steps=0,
                num_training_steps=num_training_steps,
            )

    def train(self, resume_from_checkpoint=None, trial=None, ignore_keys_for_eval=None, **kwargs):
        # Initialize the optimizer and the learning rate scheduler
        num_training_steps = int(len(self.get_train_dataloader()) * self.args.num_train_epochs)
        self.create_optimizer_and_scheduler(num_training_steps)

        model = self.model
        model.train()  # put the model into training mode
        for epoch in range(int(self.args.num_train_epochs)):
            print(f"Starting epoch {epoch + 1}")

            for step, batch in enumerate(self.get_train_dataloader()):
                # Move the batch to the same device as the model
                batch = {k: v.to(self.args.device) for k, v in batch.items()}

                outputs = model(**batch)
                loss = outputs.loss
                loss.backward()

                self.optimizer.step()
                self.lr_scheduler.step()
                self.optimizer.zero_grad()

                if step % self.args.logging_steps == 0:
                    print(f"Step {step}: Loss = {loss.item()}")

        print("Training is done")

 

So what happens in the code above? There are a few things that we customize:

  1. We use the AdamW optimizer to update the model weights during training
  2. We set up a linear learning rate scheduler that decays the learning rate over the course of training
  3. We write the training loop ourselves, moving each batch to the model's device and logging the loss at regular steps

This is how we customize our Trainer, and you can tweak it even further if you need something more specific, for example by overriding how the loss is computed, as sketched below.
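
As an illustration only (this class is not part of the tutorial's code, and the class weights are made-up values), here is a minimal sketch of that kind of customization: overriding compute_loss to apply a weighted cross-entropy loss.

import torch
from torch.nn import CrossEntropyLoss
from transformers import Trainer

class WeightedLossTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # Separate the labels from the rest of the model inputs
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        # Illustrative class weights: penalize errors on class 1 twice as much
        loss_fct = CrossEntropyLoss(weight=torch.tensor([1.0, 2.0], device=logits.device))
        loss = loss_fct(logits.view(-1, model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss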

Lastly, we train and evaluate the model.

trainer = CustomTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
)

trainer.train()

evaluation_results = trainer.evaluate()

print(evaluation_results)

 

Output:

{'eval_loss': 0.15452663600444794, 'eval_model_preparation_time': 0.0038, 'eval_runtime': 765.5939, 'eval_samples_per_second': 32.654, 'eval_steps_per_second': 1.021}
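
The evaluation output above only reports the loss and runtime statistics. If you also want a task metric such as accuracy, you can pass a compute_metrics function when constructing the trainer. Here is a minimal sketch (the function and the accuracy metric are my own additions, not part of the tutorial's original code):

import numpy as np

def compute_metrics(eval_pred):
    # eval_pred contains the model's logits and the reference labels
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": (predictions == labels).mean()}

# Pass it when building the trainer, e.g. CustomTrainer(..., compute_metrics=compute_metrics),
# and trainer.evaluate() will then also report eval_accuracy.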

 

Master training loop customization to improve your training workflow.

 

Additional Resources

 

 
 

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips through social media and writing. Cornellius writes on a variety of AI and machine learning topics.
