
A comparative analysis of lexical-based automatic evaluation metrics for different Indic language pairs

Kiranjeet Kaur, Shweta Chauhan

Abstract


With the rise of machine translation (MT) systems, it has become essential to evaluate the quality of the translations they produce. However, existing evaluation metrics designed for English and other European languages are not always suitable for Indic languages because of their complex morphology and syntax. Machine translation evaluation (MTE) is the process of assessing the quality and accuracy of machine-translated text: the MT output is compared against a reference translation to measure their similarity and correctness. This study therefore compares several lexical automatic MT evaluation metrics, namely BLEU, METEOR, and TER, to identify the most suitable metric for Indic languages. For the analysis, we selected parallel corpora for five low-resource language pairs: English-Hindi, English-Punjabi, English-Gujarati, English-Marathi, and English-Bengali. All of these languages belong to the Indo-Aryan language family and are resource-poor. We also present a comparison of state-of-the-art MT systems, showing which translator performs best on these language pairs. The Natural Language Toolkit (NLTK) tokenizers are used in the experimental analysis, and results are reported on two different datasets for each language pair using fully automatic MT evaluation metrics. The study thus provides insights into the effectiveness of these metrics for assessing the quality of machine translations for Indic languages.
Additionally, the datasets and analysis presented here will facilitate future research on Indian MT evaluation.
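The lexical metrics compared in this study all reduce to string-level comparisons between an MT hypothesis and a reference translation. As a minimal illustration (not the authors' implementation, which uses NLTK tokenizers), the following sketch computes BLEU's modified unigram precision with a brevity penalty, and a simplified TER as word-level edit distance without the shift operation:

```python
from collections import Counter
import math

def bleu_unigram(hypothesis: str, reference: str) -> float:
    """Modified unigram precision with brevity penalty (BLEU's building block)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    ref_counts = Counter(ref)
    # Clip each hypothesis word's count by its count in the reference.
    clipped = sum(min(c, ref_counts[w]) for w, c in Counter(hyp).items())
    precision = clipped / len(hyp)
    # Brevity penalty discourages overly short hypotheses.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * precision

def ter(hypothesis: str, reference: str) -> float:
    """Simplified Translation Edit Rate: word edit distance / reference length.
    (Full TER also allows block shifts; omitted here for brevity.)"""
    hyp, ref = hypothesis.split(), reference.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        d[i][0] = i
    for j in range(len(ref) + 1):
        d[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(hyp)][len(ref)] / len(ref)

ref = "the cat sat on the mat"
hyp = "the cat is on the mat"
print(round(bleu_unigram(hyp, ref), 3))  # 0.833 (5 of 6 unigrams match)
print(round(ter(hyp, ref), 3))           # 0.167 (1 substitution / 6 words)
```

Both scores depend entirely on surface word overlap, which is exactly why such metrics can penalize valid free translations in morphologically rich Indic languages.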


Keywords


automatic machine evaluation; evaluation metrics; Indic languages; machine translation; natural language processing


References


1. Andrabi SAB, Wahid A. Machine Translation System Using Deep Learning for English to Urdu. Computational Intelligence and Neuroscience. 2022, 2022: 1-11. doi: 10.1155/2022/7873012

2. Khan NJ, Anwar W, Durrani N. Machine translation approaches and survey for Indian languages. arXiv. 2017, arXiv:1701.04290.

3. Hendy A, Abdelrehim M, Sharaf A, et al. How good are GPT models at machine translation? A comprehensive evaluation. arXiv. 2023, arXiv:2302.09210.

4. Rivera-Trigueros I. Machine translation systems and quality assessment: a systematic review. Language Resources and Evaluation. 2021, 56(2): 593-619. doi: 10.1007/s10579-021-09537-5

5. Rei R, Guerreiro NM, Treviso M, et al. The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Published online 2023. doi: 10.18653/v1/2023.acl-short.94

6. Sahaya V, Singh P. Evaluation of Performance Metric of Automatic Machine Translation. International Journal of Computer Science and Software Engineering. 2015, 1: 49-57.

7. Mrinalini K, Vijayalakshmi P, Nagarajan T. SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2022, 30: 1396-1406. doi: 10.1109/taslp.2022.3161160

8. Garje GV, Bansode A, Gandhi S, et al. Marathi to English Sentence Translator for Simple Assertive and Interrogative Sentences. International Journal of Computer Applications. 2016, 138(5): 42-45. doi: 10.5120/ijca2016908837

9. Ramesh A, Parthasarathy VB, Haque R, et al. Comparing Statistical and Neural Machine Translation Performance on Hindi-To-Tamil and English-To-Tamil. Digital. 2021, 1(2): 86-102. doi: 10.3390/digital1020007

10. Hasler E, de Gispert A, Stahlberg F, et al. Source sentence simplification for statistical machine translation. Computer Speech & Language. 2017, 45: 221-235. doi: 10.1016/j.csl.2016.12.001

11. Xia Y. Research on statistical machine translation model based on deep neural network. Computing. 2019, 102(3): 643-661. doi: 10.1007/s00607-019-00752-1

12. Choudhary H, Rao S, Rohilla R. Neural Machine Translation for Low-Resourced Indian Languages. arXiv. 2020, arXiv:2004.13819.

13. Papineni K, Roukos S, Ward T, et al. BLEU: a Method for Automatic Evaluation of Machine Translation. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics—ACL ’02. Published online 2002. doi: 10.3115/1073083.1073135

14. Ananthakrishnan R, Bhattacharyya P, Sasikumar M, Shah RM. Some issues in automatic evaluation of English-Hindi MT: More blues for BLEU. 2007.

15. Denkowski M, Lavie A. Meteor Universal: Language Specific Translation Evaluation for Any Target Language. Proceedings of the Ninth Workshop on Statistical Machine Translation. Published online 2014. doi: 10.3115/v1/w14-3348

16. Banerjee S, Lavie A. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In: Goldstein J, Lavie A, Lin CY, Voss C. Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. Association for Computational Linguistics; 2005. pp. 65-72.

17. Snover MG, Madnani N, Dorr B, et al. TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate. Machine Translation. 2009, 23(2-3): 117-127. doi: 10.1007/s10590-009-9062-9

18. Kandimalla A, Lohar P, Maji SK, et al. Improving English-to-Indian Language Neural Machine Translation Systems. Information. 2022, 13(5): 245. doi: 10.3390/info13050245

19. Dewangan S, Alva S, Joshi N, et al. Experience of neural machine translation between Indian languages. Machine Translation. 2021, 35(1): 71-99. doi: 10.1007/s10590-021-09263-3

20. Philip J, Namboodiri VP, Jawahar CV. A baseline neural machine translation system for Indian languages. arXiv. 2019, arXiv:1907.12437.

21. Sai BA, Dixit T, Nagarajan V, et al. IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Published online 2023. doi: 10.18653/v1/2023.acl-long.795

22. Google Translate. Available online: https://translate.google.com/ (accessed on 28 July 2023).

23. Bing Microsoft Translator. Available online: https://www.bing.com/translator (accessed on 29 July 2023).

24. Yandex Translator. Available online: https://translate.yandex.com/ (accessed on 31 July 2023).

25. ImTranslator. Available online: http://imtranslator.com/ (accessed on 31 July 2023).

26. AI4Bharat Open-Source Dataset. Available online: https://ai4bharat.iitm.ac.in/datasets/ (accessed on 26 July 2023).




DOI: https://doi.org/10.32629/jai.v7i4.1393



Copyright (c) 2024 Kiranjeet Kaur, Shweta Chauhan

License URL: https://creativecommons.org/licenses/by-nc/4.0/