
Notes on Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning

This is a summary of an important research paper that provides a 9:1 time savings. It was crafted by humans working with several AIs. The goal is to save time and curate good ideas.

Published
4 min read

Link to paper: https://arxiv.org/abs/2307.03692

Paper published on: 2023-07-05

Paper's authors: Waseem AlShikh, Manhal Daaboul, Kirk Goddard, Brock Imel, Kiran Kamble, Parikshith Kulkarni, Melisa Russak

GPT3 API Cost: $0.02

GPT4 API Cost: $0.08

Total Cost To Write This: $0.10

Time Savings: 9:1

TLDR:

  • The Instruction Following Score (IFS) measures how well a language model can follow instructions.
  • The IFS can distinguish between base models and instruct models.
  • The IFS can be used to stop the training process once a model can follow instructions well enough.
  • The ObjecQA metric measures how objective a model's predictions are.
  • The IFS can track the stages of format-infusion and knowledge-infusion in training.
  • Many publicly available models have low instruction following scores.
  • Prompt engineering can improve instruction following scores.
  • Models' instruction-tuning abilities stabilize after seeing around 8k examples.
  • Composable feature blocks can optimize model performance.
  • The research can lead to more user-friendly AI chatbots.

DEEPER DIVE:

Introducing the Instruction Following Score

The research paper we're diving into today introduces a new metric called the Instruction Following Score (IFS). This metric measures a language model's ability to follow instructions, and it's a significant development because it gives the field a quantifiable measure of that ability.

For instance, consider an AI model that's given the instruction to "describe the weather in San Francisco." The IFS would measure how well the model understands and executes this instruction. If the AI responds with a detailed weather report for San Francisco, it would score highly. If it talks about the weather in New York instead, its score would be lower.
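The paper defines IFS as the ratio of well-formatted responses to all responses. A minimal sketch of that computation might look like the following, where `is_well_formatted` is a hypothetical heuristic standing in for the paper's actual response classifier:

```python
def is_well_formatted(response: str) -> bool:
    # Hypothetical heuristic: treat an answer as well formatted if it
    # starts with a capital letter and ends with sentence-final punctuation.
    response = response.strip()
    return bool(response) and response[0].isupper() and response[-1] in ".!?"

def instruction_following_score(responses):
    # IFS = fraction of responses judged well formatted.
    if not responses:
        return 0.0
    return sum(is_well_formatted(r) for r in responses) / len(responses)

responses = [
    "The weather in San Francisco is foggy with a high of 63°F.",
    "weather san francisco like",  # partial, poorly formatted continuation
]
print(instruction_following_score(responses))  # 0.5
```

A real evaluation would replace the heuristic with the paper's trained response-tone classifier, but the ratio-based scoring stays the same.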

Differentiating Models with IFS

The IFS can be used to distinguish between base models and instruct models. Base models are the initial, pre-trained models, while instruct models are fine-tuned specifically to follow instructions. The paper benchmarks publicly available models of both types and shows that the ratio of well-formatted responses to partial and full sentences can effectively differentiate between the two.

In other words, instruct models, which have been fine-tuned to follow instructions, are more likely to produce well-formatted responses than base models. This distinction can be quantified using the IFS, providing a clear, numerical way to compare the performance of different models.

Using IFS for Early Stopping in Instruct Tuning

Another interesting application of the IFS is as an early stopping criterion for instruct tuning. Instruct tuning is the process of fine-tuning a model to follow instructions. The IFS can be used to monitor this process and stop it once the model's ability to follow instructions reaches a satisfactory level.

The paper computes IFS for Supervised Fine-Tuning (SFT) of 7B and 13B LLaMA models and shows that models learn to follow instructions relatively early in the training process. This suggests that the IFS could be a useful tool for optimizing the training process, as it could prevent unnecessary overfitting by indicating when a model has reached its peak performance in terms of instruction following.
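The early stopping idea can be sketched as a simple loop that evaluates IFS on held-out instructions at intervals and halts once a threshold is cleared. The `train_step` and `compute_ifs` callables here are hypothetical stand-ins for a real SFT update and IFS evaluation:

```python
def train_with_ifs_early_stopping(train_step, compute_ifs, max_steps,
                                  eval_every=100, ifs_threshold=0.9):
    """Stop fine-tuning once held-out IFS clears the threshold."""
    for step in range(1, max_steps + 1):
        train_step()  # one supervised fine-tuning update (assumed)
        if step % eval_every == 0 and compute_ifs(step) >= ifs_threshold:
            return step  # early stop: instructions followed well enough
    return max_steps

# Toy simulation: IFS rises with training and plateaus near 0.95,
# mirroring the paper's observation that models learn to follow
# instructions early in SFT.
simulated_ifs = lambda step: min(0.95, step / 1000)
stopped_at = train_with_ifs_early_stopping(lambda: None, simulated_ifs,
                                           max_steps=5000)
print(stopped_at)  # 900 — stops well before the 5000-step budget
```

The design choice worth noting is that the stopping signal is behavioral (does the model follow instructions?) rather than loss-based, which is what lets it cut training short before further tuning erodes the base model's knowledge.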

Introducing the ObjecQA Metric

The paper also introduces an auxiliary metric called ObjecQA, which quantifies the objectivity of a model's predictions. This metric is designed to contrast the tone learning curve with the acquisition of semantic and domain-specific knowledge. For example, if a model is asked a question about a controversial topic, ObjecQA could measure how objectively it presents the information, as opposed to expressing a particular viewpoint.
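As a rough illustration of what an objectivity score could look like, here is a hypothetical marker-based sketch; the paper's actual ObjecQA classifier is almost certainly more sophisticated than this keyword check:

```python
SUBJECTIVE_MARKERS = ("i think", "i believe", "in my opinion", "personally")

def objectivity_score(answers):
    # Hypothetical ObjecQA-style score: fraction of answers that avoid
    # first-person opinion markers.
    def is_objective(answer):
        lowered = answer.lower()
        return not any(marker in lowered for marker in SUBJECTIVE_MARKERS)
    return sum(is_objective(a) for a in answers) / len(answers)

answers = [
    "The boiling point of water at sea level is 100 °C.",
    "I think pineapple belongs on pizza.",
]
print(objectivity_score(answers))  # 0.5
```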

Identifying Stages of Training with IFS

The research aims to identify stages of "format-infusion" and "knowledge-infusion" in the training process for better control over instruct tuning. Format-infusion refers to the stage where the model learns the correct format for responses, while knowledge-infusion refers to the stage where it learns the actual content of the responses. The IFS can be used to track these stages and provide insights into the training process.

Evaluating Models with IFS and ObjecQA

The researchers evaluated several publicly available models and found that their instruction following scores were below 50%. This indicates that there is significant room for improvement in the field of instruction following. They also used prompt engineering, which involves adding extra prompt suffixes or wrappers around instructions, to encourage models to follow instructions. This technique significantly improved instruction following scores.
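The prompt-engineering technique described above amounts to wrapping a raw instruction in extra scaffolding before sending it to the model. A minimal sketch, using a common instruct-style template as a stand-in (the exact suffixes and wrappers used in the paper may differ):

```python
def wrap_instruction(instruction: str) -> str:
    # Hypothetical wrapper in the style of common instruct templates;
    # the trailing "### Response:" cue nudges the model to answer rather
    # than continue the text.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )

prompt = wrap_instruction("Describe the weather in San Francisco.")
print(prompt)
```

Base models in particular benefit from this kind of wrapper, since the scaffolding makes the expected instruction-response structure explicit.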

Supervised Fine-Tuning and Performance Plateau

The researchers conducted supervised fine-tuning on two LLaMA models using an instruct dataset. They found that the models' instruction-tuning capabilities stabilized at around 0.9-0.95 after seeing approximately 8k examples. This suggests that there is a limit to how much a model can improve its instruction following abilities through supervised fine-tuning.

Future Directions and Applications

The research suggests future work on composable feature blocks to achieve desired alignment aspects without unexpected downgrades. These are modular components of a model that can be adjusted independently to optimize performance.

The response tone classifier developed in the study can serve as a starting point for designing chat interfaces for foundation models. This could lead to the development of more effective and user-friendly AI chatbots.

In conclusion, the Instruction Following Score and the ObjecQA metric introduced in this research paper provide valuable tools for evaluating and improving the performance of AI models. These metrics can guide the training process, optimize model performance, and pave the way for future advancements in AI technology.
