The fine-tuning datasets span news articles, scientific documents, patents, stories, emails, legal documents, and how-to guidance, demonstrating that the model adapts to a wide variety of topics. While PEGASUS showed notable performance on large datasets, it is surprising to find that the model did not require a large number of fine-tuning examples to achieve near state-of-the-art performance.
Summaries of human quality
While automatic metrics like ROUGE are useful for measuring progress during model development, they provide only limited information and don't tell the whole story. For example, they don't tell us whether the text is fluent or whether it compares favorably with human writing. To this end, a human evaluation was conducted in which raters were asked to compare summaries from the PEGASUS model with those written by humans, without knowing whether each summary was the work of a person or a machine.
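To make the limitation concrete: ROUGE only measures n-gram overlap with a reference, so a fluent summary and a garbled one can score identically if they contain the same words. A minimal sketch of ROUGE-1 (unigram overlap, F1) in plain Python illustrates what the metric does and does not capture; this is a simplified illustration, not the official ROUGE implementation:

```python
from collections import Counter

def rouge1_f(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall.

    Note: counts word overlap only; it is blind to word order,
    fluency, and factual correctness.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())  # clipped matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# A shuffled candidate scores the same as a fluent one,
# which is why human evaluation is still needed.
print(rouge1_f("the cat sat on the mat", "the cat lay on the mat"))
print(rouge1_f("the cat sat on the mat", "mat the on sat cat the"))
```

The second call returns a perfect score despite the candidate being ungrammatical, which is exactly the gap the human evaluation described below is designed to close.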
The experiment was carried out on 3 different datasets, and it was found that human raters do NOT always prefer human summaries to those produced by PEGASUS. Furthermore, models trained with only 1,000 examples performed almost as well: on the much-studied XSum and CNN/DailyMail datasets, the model achieved human-like performance using only 1,000 fine-tuning examples.