
Automatic text summarization of scientific articles using transformers—A brief review

Seema Aswani, Kabita Choudhary, Sujala Shetty, Nasheen Nur

Abstract


Reading research papers is a skill in its own right. A researcher must work through many published articles in the course of a study, which is a challenging and tedious task. Automatic summarization of scientific publications would speed up the research process and aid researchers in their investigation. However, automatic text summarization of scientific research articles is difficult because of their distinct structure. Various text summarization approaches have been proposed for research article summarization in the past. The introduction of the transformer architecture created a major shift in Natural Language Processing, and transformer-based models now achieve state-of-the-art results in text summarization. This paper provides a brief review of transformer-based approaches for summarizing scientific research articles, along with the available corpora and the evaluation methods that can be used to assess model-generated summaries. The paper also discusses the limitations and future directions of this field.
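To make the reviewed workflow concrete, the sketch below shows how a pre-trained transformer summarizer might be applied to article text and how the output could be scored with ROUGE. It is a minimal illustration only, assuming the Hugging Face transformers and rouge-score Python packages; the model checkpoint, input text, and reference abstract are placeholders chosen for illustration, not artifacts of this review.

    # Minimal sketch: abstractive summarization with a pre-trained transformer,
    # evaluated against a reference abstract using ROUGE.
    # Assumes: pip install transformers rouge-score (checkpoint name is illustrative;
    # a PEGASUS or LongT5 checkpoint could be substituted for long documents).
    from transformers import pipeline
    from rouge_score import rouge_scorer

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

    # Placeholder inputs: in practice, article_text would be a (possibly truncated)
    # section of the paper and reference_abstract the author-written abstract.
    article_text = (
        "Transformer-based models have achieved state-of-the-art results in "
        "abstractive summarization. Scientific articles are long and highly "
        "structured, which makes them harder to summarize than news text."
    )
    reference_abstract = (
        "Transformers achieve strong summarization results, but long, structured "
        "scientific articles remain challenging."
    )

    # Generate a candidate summary; length limits are expressed in model tokens.
    candidate = summarizer(
        article_text, max_length=60, min_length=15, do_sample=False
    )[0]["summary_text"]

    # ROUGE-1/2/L measure n-gram and longest-common-subsequence overlap
    # between the candidate and the reference summary.
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    print(scorer.score(reference_abstract, candidate))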


Keywords


natural language processing; long document summarization; transformers; multi-headed attention; scientific article summarization






DOI: https://doi.org/10.32629/jai.v7i5.1331



Copyright (c) 2024 Seema Aswani, Kabita Choudhary, Sujala Shetty, Nasheen Nur

License URL: https://creativecommons.org/licenses/by-nc/4.0/