mt5-small-finetuned-inshorts-news-summary

This model is a fine-tuned version of google/mt5-small on the [inshorts-news-summary dataset](https://huggingface.co/datasets/sandeep16064/news_summary). It achieves the following results on the evaluation set:

  • Loss: 1.5399
  • Rouge1: 54.613
  • Rouge2: 31.1543
  • Rougel: 50.7709
  • Rougelsum: 50.7907
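
As a rough illustration of how a checkpoint like this is typically used for summarization, here is a minimal sketch built on the standard `transformers` seq2seq classes. The repository ID in `MODEL_ID` is an assumption (the card does not state it; it is inferred from the model name and the dataset owner), and the generation settings (`max_length=512`, `num_beams=4`, `max_new_tokens=64`) are illustrative choices, not values taken from training.

```python
# ASSUMPTION: the Hub repository ID is not stated in this card; adjust as needed.
MODEL_ID = "sandeep16064/mt5-small-finetuned-inshorts-news-summary"


def summarize(text: str, max_new_tokens: int = 64) -> str:
    """Generate a summary of `text`; downloads the checkpoint on first call."""
    # Imported lazily so that merely importing this module does not need transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long articles; 512 tokens is an illustrative limit, not a documented one.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

    # Beam search with 4 beams is an illustrative decoding choice.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` on a news article string should return a short summary string; output quality on text unlike the Inshorts news data is not documented here.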

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
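
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step using the step counts reported under Training results (5511 steps per epoch, 55110 in total). It assumes zero warmup steps, a single device, and no gradient accumulation; none of these are stated in the card, so treat the numbers as approximate.

```python
BASE_LR = 5.6e-05        # learning_rate from the list above
TRAIN_BATCH_SIZE = 8     # train_batch_size from the list above
STEPS_PER_EPOCH = 5511   # step count at epoch 1.0 in the results table
TOTAL_STEPS = 55110      # step count at epoch 10.0 in the results table
WARMUP_STEPS = 0         # ASSUMPTION: warmup is not listed in the card


def linear_lr(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Upper bound on the training-set size implied by the step count
# (ASSUMPTION: one device, no gradient accumulation; the last batch may be partial).
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

for epoch in (0, 5, 10):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2}: step {step:>5}, lr = {linear_lr(step):.2e}")
print("training examples (at most):", max_train_examples)
```

Under these assumptions the rate falls from 5.6e-05 at step 0 to half that value at the end of epoch 5 and reaches zero at step 55110.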

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|
| 3.3244        | 1.0   | 5511  | 1.8904          | 51.0778 | 28.3112 | 47.4136 | 47.404    |
| 2.2747        | 2.0   | 11022 | 1.7450          | 51.8372 | 28.9814 | 48.0917 | 48.0965   |
| 2.0745        | 3.0   | 16533 | 1.6567          | 52.518  | 29.7276 | 48.727  | 48.7504   |
| 1.9516        | 4.0   | 22044 | 1.6210          | 54.2404 | 30.8927 | 50.4042 | 50.3996   |
| 1.8714        | 5.0   | 27555 | 1.5971          | 53.8556 | 30.6665 | 50.112  | 50.1177   |
| 1.8112        | 6.0   | 33066 | 1.5649          | 54.179  | 31.0178 | 50.407  | 50.4281   |
| 1.7644        | 7.0   | 38577 | 1.5605          | 54.3104 | 30.7997 | 50.4555 | 50.4861   |
| 1.7265        | 8.0   | 44088 | 1.5447          | 54.5593 | 31.0283 | 50.6343 | 50.6605   |
| 1.7013        | 9.0   | 49599 | 1.5440          | 54.7385 | 31.3073 | 50.9111 | 50.9334   |
| 1.6864        | 10.0  | 55110 | 1.5399          | 54.613  | 31.1543 | 50.7709 | 50.7907   |
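
The evaluation results reported at the top of this card match the epoch 10 row, which has the lowest validation loss, while the ROUGE scores peak one epoch earlier. The short sketch below, with the per-epoch values copied from the table, shows one way to check this; the card does not say which criterion was used to pick the published checkpoint.

```python
# (epoch, validation_loss, rouge1, rouge2, rougeL, rougeLsum), copied from the table
ROWS = [
    (1, 1.8904, 51.0778, 28.3112, 47.4136, 47.404),
    (2, 1.7450, 51.8372, 28.9814, 48.0917, 48.0965),
    (3, 1.6567, 52.518, 29.7276, 48.727, 48.7504),
    (4, 1.6210, 54.2404, 30.8927, 50.4042, 50.3996),
    (5, 1.5971, 53.8556, 30.6665, 50.112, 50.1177),
    (6, 1.5649, 54.179, 31.0178, 50.407, 50.4281),
    (7, 1.5605, 54.3104, 30.7997, 50.4555, 50.4861),
    (8, 1.5447, 54.5593, 31.0283, 50.6343, 50.6605),
    (9, 1.5440, 54.7385, 31.3073, 50.9111, 50.9334),
    (10, 1.5399, 54.613, 31.1543, 50.7709, 50.7907),
]

best_by_loss = min(ROWS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(ROWS, key=lambda row: row[2])  # highest ROUGE-1

print("lowest validation loss: epoch", best_by_loss[0])   # epoch 10
print("highest ROUGE-1:        epoch", best_by_rouge1[0])  # epoch 9
```

The gap between the two epochs is small (about 0.13 ROUGE-1 points), so either checkpoint would likely behave similarly in practice.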

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.1.0
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train sandeep16064/inshorts-news-summary