Employing Dependency Tree in Machine Learning Based Indonesian Factoid Question Answering System

  • Irfan Afif Professional
  • Ayu Purwarianti

Abstract

We propose using dependency tree information to increase the accuracy of Indonesian factoid question answering. We used MSTParser and the Universal Dependencies corpus to build an Indonesian dependency parser. The dependency trees produced by this parser are used in the answer finder component of an Indonesian factoid question answering system. We use the dependency tree information in two ways: 1) as one of the features in a machine learning based answer finder, which classifies each term in the retrieved passage as part of a correct answer or not; 2) as an additional heuristic rule applied after the machine learning step. For the machine learning step, the complete feature set combines word based, phrase based, and dependency relation similarity based calculations. Using 203 data items, we improved the accuracy of the Indonesian factoid QA system over related work that used only phrase information. The best accuracy was 84.34% for correct answer classification, and the best MRR was 0.954.
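The MRR figure reported above is the mean reciprocal rank: for each question, the score is 1 divided by the rank of the first correct answer in the system's ranked candidate list (0 if no candidate is correct), averaged over all questions. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names and the toy Indonesian questions are hypothetical, and exact string matching stands in for whatever answer-matching criterion the paper actually used.

```python
from typing import Sequence


def reciprocal_rank(ranked: Sequence[str], gold: str) -> float:
    """Return 1/rank of the first candidate equal to the gold answer, else 0."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == gold:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]], golds: Sequence[str]
) -> float:
    """Average the reciprocal rank over all questions."""
    if len(rankings) != len(golds):
        raise ValueError("each question needs exactly one gold answer")
    if not golds:
        return 0.0
    total = sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds))
    return total / len(golds)


# Hypothetical toy data: three questions with ranked answer candidates.
toy_rankings = [
    ["Jakarta", "Bandung"],      # gold at rank 1 -> 1.0
    ["1950", "1945", "1942"],    # gold at rank 2 -> 0.5
    ["Soeharto", "Habibie"],     # gold absent    -> 0.0
]
toy_golds = ["Jakarta", "1945", "Soekarno"]

toy_mrr = mean_reciprocal_rank(toy_rankings, toy_golds)
print(f"MRR = {toy_mrr:.3f}")
```

On this toy input the three reciprocal ranks are 1, 0.5, and 0, so the mean is 0.5. An MRR near 1, such as the 0.954 reported, means the correct answer was ranked first for most questions.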

Published
2019-03-31
How to Cite
AFIF, Irfan; PURWARIANTI, Ayu. Employing Dependency Tree in Machine Learning Based Indonesian Factoid Question Answering System. Jurnal Linguistik Komputasional, [S.l.], v. 2, n. 1, p. 28 - 33, mar. 2019. ISSN 2621-9336. Available at: <http://inacl.id/journal/index.php/jlk/article/view/9>. Date accessed: 21 oct. 2019. doi: https://doi.org/10.26418/jlk.v2i1.9.