Analysis of Combining Corpora from the Prophet's Hadith and the Quran for Statistical Machine Translation

  • Hafidz Ardhi, Pontianak
  • Herry Sujaini, Universitas Tanjungpura
  • Arif Bijaksana Putra, Universitas Tanjungpura

Abstract

Each region communicates in its own language, and communication succeeds only when the parties understand the language being used. Machine translation is the automatic translation of text from one language to another. Its output is assessed by automatic evaluation, which measures translation quality with an automatic metric. The metric scores quality in various ways and yields a percentage as the final result. Evaluating a machine translation system with an automatic metric is faster, easier, and cheaper than human evaluation. BLEU is a metric commonly used by researchers to evaluate machine translation. This research uses the Arabic language. The corpora used are an Al-Quran corpus, a hadith corpus, and a combined corpus. The corpora are tested by sentence type and at four levels of sentence count. Testing is done twice: first without MADAMIRA, then with MADAMIRA. Without MADAMIRA, the BLEU score is 10.56% for the Al-Quran corpus, 27.65% for the hadith corpus, and 15.41% for the combined corpus. With MADAMIRA, the BLEU score is 1.44% for the Al-Quran corpus, 32.90% for the hadith corpus, and 41.46% for the combined corpus.
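
To make the BLEU percentages above concrete, the following is a minimal Python sketch of how a corpus-level BLEU score can be computed from modified n-gram precisions and a brevity penalty. It is an illustration only and not the scoring tool used in this research, which the abstract does not name. The function names `ngrams` and `corpus_bleu`, the single-reference setup, the 4-gram limit, and the absence of smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Corpus-level BLEU as a percentage (hypothetical helper, one reference each).

    candidates, references: equally long lists of token lists.
    """
    matched = [0] * max_n  # clipped n-gram matches, per n-gram order
    total = [0] * max_n    # candidate n-gram counts, per n-gram order
    cand_len = ref_len = 0

    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand, n)
            ref_counts = ngrams(ref, n)
            # Clip each candidate n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
            total[n - 1] += sum(cand_counts.values())

    # Assumption: no smoothing, so any order with zero matches gives BLEU = 0.
    if cand_len == 0 or min(matched) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: penalize output that is shorter than the references.
    brevity_penalty = 1.0 if cand_len >= ref_len else math.exp(1 - ref_len / cand_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyp = ["the cat sat on the mat".split()]
    ref = ["the cat sat on the mat".split()]
    print(corpus_bleu(hyp, ref))  # identical output and reference score 100.0
```

Because the score is a geometric mean over n-gram orders, a system that gets many single words right but few four-word sequences can still score low, which is one reason BLEU values differ so much across corpora and preprocessing setups.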

Published
2018-04-30
How to Cite
ARDHI, Hafidz; SUJAINI, Herry; PUTRA, Arif Bijaksana. Analisis Penggabungan Korpus dari Hadits Nabi dan Alquran untuk Mesin Penerjemah Statistik. Jurnal Linguistik Komputasional, [S.l.], v. 1, n. 1, p. 31 - 37, apr. 2018. ISSN 2621-9336. Available at: <http://inacl.id/journal/index.php/jlk/article/view/1>. Date accessed: 31 mar. 2020. doi: https://doi.org/10.26418/jlk.v1i1.1.
Section
Articles