Using Hugging Face Transformers with PyTorch and TensorFlow




 

Generative AI now has a prominent presence in many business areas. Since products such as ChatGPT and Midjourney were released, many companies have rushed to adopt Generative AI to gain a competitive advantage.

With so many companies trying to gain an advantage from AI, many professionals have realized that learning to use and develop Generative AI models can advance their careers. One way to start is by working with closed-source or open-source models.


Hugging Face is a platform where the community shares machine learning models, datasets, notebooks, and more. It is best known as a place to share open-source models, especially state-of-the-art Generative AI models, and its Transformers library makes these models easy to download and use.

This article will show how to use Hugging Face Transformers with two popular deep learning frameworks: PyTorch and TensorFlow.

Let’s get into it.

 

Preparation

 
For this tutorial, you need to install the Transformers library. We will also install the datasets library to download a sample dataset. You can install both with the following command.

pip install transformers datasets

 

To install the deep learning frameworks, select the PyTorch and TensorFlow versions appropriate for your environment and install them, for example as shown below. When everything is installed correctly, let's begin the tutorial.
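
As a rough starting point, a plain CPU-only install of both frameworks can be done with pip, as in the command below; GPU builds and pinned versions depend on your hardware and drivers, so check each framework's official installation guide for your setup.

pip install torch tensorflow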

 

Hugging Face Transformers with PyTorch and TensorFlow

 
As mentioned, PyTorch and TensorFlow are two of the most popular deep learning frameworks. Both were developed by well-known companies: PyTorch by Meta and TensorFlow by Google. Each is popular for different reasons, and the two frameworks have their own advantages and disadvantages.

You can start with either framework, as both are useful in their own ways. PyTorch is often said to be easier to use than TensorFlow, although TensorFlow 2.x has narrowed that gap. PyTorch is more common in research and academia, while TensorFlow has strong industry adoption. In the context of Hugging Face Transformers, the PyTorch integration is more seamless, as TensorFlow receives less attention in the library.

Whichever framework you prefer, let's try Hugging Face Transformers with both. In this tutorial, we will fine-tune a pre-trained language model for binary text classification on the IMDB dataset.

First, we import the required libraries and download the IMDB dataset.

import torch
import tensorflow as tf
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TFAutoModelForSequenceClassification, TrainingArguments, Trainer
from datasets import load_dataset

dataset = load_dataset("imdb")

 

Then, we download the tokenizer and preprocess the dataset with it, using the code below.

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)

def preprocess_data(examples):
    return tokenizer(examples['text'], padding='max_length', truncation=True)

encoded_dataset = dataset.map(preprocess_data, batched=True)
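
As an optional sanity check (not part of the original workflow), you can inspect one encoded example to confirm that the tokenizer added the input_ids and attention_mask columns:

# Optional sanity check: inspect one tokenized example
sample = encoded_dataset["train"][0]
print(sample.keys())
print(sample["input_ids"][:10])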

 

With the dataset preprocessed, we prepare it for each deep learning framework. Here is how we format the dataset for PyTorch and split it into train and test sets.

# PyTorch
encoded_dataset.set_format(type="torch", columns=['input_ids', 'attention_mask', 'label'])
train_dataset_pt = encoded_dataset['train']
test_dataset_pt = encoded_dataset['test']

 

For TensorFlow, we need to explicitly convert the splits into TensorFlow datasets.

# TensorFlow
train_dataset_tf = encoded_dataset['train'].to_tf_dataset(
    columns=['input_ids', 'attention_mask'],
    label_cols=["label"],
    shuffle=True,
    batch_size=16
)

test_dataset_tf = encoded_dataset['test'].to_tf_dataset(
    columns=['input_ids', 'attention_mask'],
    label_cols=["label"],
    shuffle=False,
    batch_size=16
)

 

Next, we load the pre-trained model from Hugging Face Transformers with PyTorch.

# PyTorch
model_pt = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

 

In contrast, here is how you do it for the TensorFlow framework.

# TensorFlow
model_tf = TFAutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

 

To fine-tune the model with PyTorch, the Transformers library provides the TrainingArguments class to configure training and the Trainer class to run the training process.

#PyTorch
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=1,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model_pt,
    args=training_args,
    train_dataset=train_dataset_pt,
    eval_dataset=test_dataset_pt,
)

trainer.train()

 

The TensorFlow fine-tuning process uses the standard Keras compile and fit workflow.

#TensorFlow
model_tf.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
                 loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                 metrics=['accuracy'])

model_tf.fit(train_dataset_tf, epochs=1)

 

For PyTorch, model evaluation can be done simply by passing the test dataset to the Trainer's evaluate method.

#PyTorch
trainer.evaluate(test_dataset_pt)

 

Output>>
{'eval_loss': 0.17904508113861084,
 'eval_runtime': 369.1378,
 'eval_samples_per_second': 67.725,
 'eval_steps_per_second': 4.234,
 'epoch': 1.0}

 

The process is similar for TensorFlow, although the output format is quite different.

#TensorFlow
model_tf.evaluate(test_dataset_tf)

 

Output>>
1563/1563 [==============================] - 477s 303ms/step - loss: 0.1789 - accuracy: 0.9313

[0.17890998721122742, 0.9312800168991089]

 

Lastly, model inference with PyTorch is shown in the code below.

test_texts = ["We are in love with this movie!", "I think you can watch something better. Don't waste your time."]
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_pt.to(device)

inputs = tokenizer(test_texts, padding=True, truncation=True, return_tensors="pt")
inputs = {key: value.to(device) for key, value in inputs.items()}

with torch.no_grad():
    outputs = model_pt(**inputs)
    predictions = torch.argmax(outputs.logits, dim=-1)
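
To make the predictions readable, you can map the predicted class indices back to sentiment labels. A minimal sketch, assuming the usual IMDB convention of 0 for negative and 1 for positive:

# Map predicted class indices to readable sentiment labels
# (assumes the IMDB label order: 0 = negative, 1 = positive)
label_names = ["negative", "positive"]
for text, pred in zip(test_texts, predictions):
    print(f"{text} -> {label_names[pred.item()]}")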

 

Unlike PyTorch, TensorFlow does not require you to disable gradient calculation during inference, and it handles device placement automatically. You only need to pass the inputs to the model, as in the code below.

test_texts = ["We are in love with this movie!", "I think you can watch something better. Don't waste your time."]
inputs = tokenizer(test_texts, padding=True, truncation=True, return_tensors="tf")
outputs = model_tf(inputs)
predictions = tf.argmax(outputs.logits, axis=-1)

 

That's how you can use both PyTorch and TensorFlow with Hugging Face Transformers. The overall process differs little between the frameworks, although the code for each is distinct.
 

Conclusion

 
Hugging Face Transformers is a Python library from Hugging Face that provides easy access to open-source pre-trained models and supporting tools. Both PyTorch and TensorFlow can be used with Hugging Face Transformers following the same overall workflow, although the code differs between frameworks. In this article, we explored the basic usage of Hugging Face Transformers with both PyTorch and TensorFlow.

I hope it helps!
 
 

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.
