
New Combined Method to Improve Arabic POS Tagging

Mohamed Labidi

Abstract


Part-of-speech (POS) tagging is one of the important tasks in natural language processing. Many taggers exist for Arabic, but their performance does not reach the required level, owing to the complexity of the task and the characteristics of the Arabic language. In this work we study a combination of two different approaches to Arabic POS tagging: the first is maximum entropy-based, and the second is statistical/rule-based. Furthermore, we add a knowledge-based method to annotate Arabic particles. This combination improves the accuracy rate: we went from almost 85% to almost 90% using our combined method, which seems promising.

Keywords


POS-Tagger, Natural language processing, Arabic language






DOI: https://doi.org/10.32629/jai.v1i2.30



Copyright (c) 2019 Mohamed Labidi

License URL: https://creativecommons.org/licenses/by-nc/4.0