Research · 2w ago

AI Training Reveals Surprising Patterns in Learning and Forgetting

arXiv CS.LG

In brief

  • AI researchers have found that LoRA, a popular method for fine-tuning large language models, causes the models to "unlearn" certain training examples.
    • The effect is concentrated on contested items: those where human annotators disagreed on the correct answer.
  • During training, the loss on these contested examples increased, meaning the model became less accurate at predicting them over time (a rough sketch of this bookkeeping follows this list).
    • This matters because it exposes a blind spot in how models are trained and evaluated on ambiguous data.
  • The study examined six models, four encoder-based and two decoder-only, and found consistent patterns across all of them.
  • The effect was strongest in the decoder-only models, which showed the tightest correlation between annotation disagreement and training loss.
  • The researchers tentatively attribute the effect to noise introduced during fine-tuning, but note that more investigation is needed to fully explain it.
  • For now, developers and researchers should watch how their models handle contested data to avoid unintended forgetting of important information.
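
To make the finding concrete, here is a minimal sketch, not the paper's code, of the kind of bookkeeping involved: tracking each example's loss before and after LoRA fine-tuning and correlating the change with an annotator-disagreement score. The model name, toy data, LoRA settings, and the disagreement measure are all illustrative assumptions; it uses the Hugging Face transformers and peft libraries.

    import torch
    from torch.nn.functional import cross_entropy
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model
    from scipy.stats import spearmanr

    model_name = "gpt2"  # assumption: a stand-in for any decoder-only model
    tok = AutoTokenizer.from_pretrained(model_name)
    tok.pad_token = tok.eos_token
    base = AutoModelForCausalLM.from_pretrained(model_name)
    model = get_peft_model(base, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

    # Toy data: each example carries a disagreement score in [0, 1], e.g.
    # 1 minus the share of annotators who chose the majority label.
    examples = [
        {"text": "Review: it was fine, I suppose. Sentiment: positive", "disagreement": 0.45},
        {"text": "Review: absolutely loved it. Sentiment: positive",    "disagreement": 0.05},
        {"text": "Review: not sure how I feel. Sentiment: negative",    "disagreement": 0.50},
    ]
    texts = [ex["text"] for ex in examples]

    def per_example_loss(batch_texts):
        """Mean next-token cross-entropy for each example (no batch reduction)."""
        enc = tok(batch_texts, return_tensors="pt", padding=True)
        labels = enc.input_ids.masked_fill(enc.attention_mask == 0, -100)
        logits = model(**enc).logits[:, :-1].transpose(1, 2)   # (B, vocab, T-1)
        targets = labels[:, 1:]                                # (B, T-1)
        tok_loss = cross_entropy(logits, targets, ignore_index=-100, reduction="none")
        n_tok = (targets != -100).sum(dim=1).clamp(min=1)
        return tok_loss.sum(dim=1) / n_tok                     # one loss per example

    with torch.no_grad():
        loss_before = per_example_loss(texts)

    opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=2e-4)
    for epoch in range(3):                                     # tiny loop for illustration
        opt.zero_grad()
        per_example_loss(texts).mean().backward()
        opt.step()

    with torch.no_grad():
        delta = per_example_loss(texts) - loss_before          # > 0 means "unlearned"

    rho, p = spearmanr([ex["disagreement"] for ex in examples], delta.tolist())
    print(f"disagreement vs. loss change: Spearman rho = {rho:.2f} (p = {p:.3f})")

A positive correlation here would mirror the paper's reported pattern: the more annotators disagreed on an example, the more its loss rose during fine-tuning.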

Terms in this brief

LoRA
Low-Rank Adaptation: a method for efficiently fine-tuning large language models by freezing the original weights and training only small low-rank matrices added alongside them, which makes fine-tuning much cheaper in compute and memory. This lets a model be adapted to a specific task without retraining all of its parameters; a minimal sketch follows.
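
The sketch below shows the core mechanism in PyTorch. The layer size, rank, and scaling factor are illustrative assumptions; real implementations (such as Hugging Face's peft library) add dropout, weight merging, and per-module targeting on top of this.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """A frozen linear layer plus a trainable low-rank update:
        y = base(x) + x (B A)^T * (alpha / r). Only A and B are trained."""
        def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False                  # original weights stay frozen
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # BA = 0 at init
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    layer = LoRALinear(nn.Linear(768, 768), r=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable: {trainable:,} of {total:,} parameters")   # about 2% of the layer

Because B is initialized to zero, the adapted layer starts out identical to the original one; training then moves only the roughly 12 thousand adapter parameters rather than the roughly 590 thousand in the full layer.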

Read full story at arXiv CS.LG
