Trial of the BPPT Speech Data Corpus as Training Data for an Indonesian Speech Recognition System

  • Made Gunawan, Pusat Teknologi Informasi dan Komunikasi, Badan Pengkajian dan Penerapan Teknologi
  • Elvira Nurfadhilah, Pusat Teknologi Informasi dan Komunikasi, Badan Pengkajian dan Penerapan Teknologi
  • Lyla Ruslana Aini, Pusat Teknologi Informasi dan Komunikasi, Badan Pengkajian dan Penerapan Teknologi
  • M. Teduh Uliniansyah, Pusat Teknologi Informasi dan Komunikasi, Badan Pengkajian dan Penerapan Teknologi
  • Gunarso, Pusat Teknologi Informasi dan Komunikasi, Badan Pengkajian dan Penerapan Teknologi
  • Agung Santosa, Pusat Teknologi Informasi dan Komunikasi, Badan Pengkajian dan Penerapan Teknologi
  • Juliati Junde, Pusat Teknologi Informasi dan Komunikasi, Badan Pengkajian dan Penerapan Teknologi

Abstract

We present the results of speech recognition experiments on the BPPT Speech Data Corpus (Korpus Data Wicara BPPT), which was developed in 2013 (KDW-BPPT-2013) with funding from the 2013 DIPA budget. The corpus served as both training data and test data. It contains utterances from 200 speakers: 50 adult males, 50 adolescent males, 50 adult females, and 50 adolescent females, each of whom read 250 sentences. The total duration of the speech data is about 92 hours. The experiments were run with Kaldi and yielded a Word Error Rate (WER) of 2.52% for the GMM model and 1.64% for the DNN model.
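For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition; it is not Kaldi's scoring code, and the function name `word_error_rate` and the example sentences are illustrative assumptions rather than material from the corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); not taken from Kaldi.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one deleted word ("ke") and one substituted word
# ("ini" -> "itu") against a six-word reference.
example_wer = word_error_rate(
    "saya pergi ke pasar pagi ini",
    "saya pergi pasar pagi itu",
)
print(f"WER = {example_wer:.2%}")
```

Under this definition, a WER of 1.64% corresponds to roughly 16 word errors per 1,000 reference words.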

Published
2018-09-24
How to Cite
GUNAWAN, Made et al. Uji Coba Korpus Data Wicara BPPT sebagai Data Latih Sistem Pengenalan Wicara Bahasa Indonesia. Jurnal Linguistik Komputasional, [S.l.], v. 1, n. 2, p. 45 - 50, sep. 2018. ISSN 2621-9336. Available at: <http://inacl.id/journal/index.php/jlk/article/view/8>. Date accessed: 26 oct. 2020. doi: https://doi.org/10.26418/jlk.v1i2.8.