How to Use Pre-Trained Language Models for Regression | by Aden Haussmann | Jan, 2025


Why and how to convert mT5 into a regression metric for numerical prediction

Towards Data Science
Screenshot of https://huggingface.co/google/mt5-large

My undergraduate honours dissertation was a Natural Language Processing (NLP) research project on multilingual text generation in under-represented languages. Existing metrics performed very poorly at evaluating the outputs of models trained on my dataset, so I needed to train a learned regression metric.

Regression is useful for many text tasks, such as:

  • Sentiment analysis: Predict the strength of positive or negative sentiment instead of simple binary classification.
  • Writing quality estimation: Predict a numerical quality score for a piece of writing.

For my use case, I needed the model to score how good another model’s prediction was for a given task. My dataset’s rows consisted of the textual input and a label, 0 (bad prediction) or 1 (good prediction).

  • Input: Text
  • Label: 0 or 1
  • The task: Predict a numerical probability between 0 and 1
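
To make that setup concrete, here is a minimal sketch in plain Python, independent of mT5 or any other model. It assumes the model emits one raw score (a logit) per input, that a sigmoid maps the score into the 0 to 1 range, and that binary cross-entropy compares it against the 0/1 label. The example rows, the logit values, and the choice of loss are all illustrative assumptions, not details taken from my dataset or training code.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw score (logit) into the open interval (0, 1), stably."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def binary_cross_entropy(prob: float, label: int, eps: float = 1e-12) -> float:
    """Loss for one row: small when prob agrees with the 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


# Hypothetical rows: (textual input, label), with 1 = good and 0 = bad prediction.
rows = [
    ("source text ... candidate output A", 1),
    ("source text ... candidate output B", 0),
]

# Hypothetical raw scores that a regression model might emit for those rows.
logits = [2.0, -1.5]

probs = [sigmoid(z) for z in logits]
losses = [binary_cross_entropy(p, label) for p, (_, label) in zip(probs, rows)]

for (text, label), p, loss in zip(rows, probs, losses):
    print(f"label={label} predicted={p:.3f} loss={loss:.3f}")
```

The labels are binary, but the prediction is continuous, so at inference time the sigmoid output can be read directly as a score between 0 and 1.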

But transformer-based models are usually used for generation tasks. Why would you use a pre-trained LM for…
