Short Answer Grading Using Contextual Word Embedding and Linear Regression
Abstract—One obstacle to running a MOOC efficiently is the evaluation of student answers, including short answer grading, which demands considerable effort from instructors when done manually. NLP research on short answer grading has therefore sought to automate the task, using techniques such as rule-based and machine learning-based methods. Here, we conducted experiments on deep learning-based short answer grading to compare the answer representation and the answer assessment method. For answer representation, we compared word embedding and sentence embedding models such as BERT and its modifications. For answer assessment, we used linear regression. We used two datasets: an available English short answer grading dataset with 80 questions and 2442 answers, used to find the best model configuration, and an Indonesian short answer grading dataset with 36 questions and 9165 short answers, used as testing data. We collected the Indonesian short answers for Biology and Geography subjects from 534 respondents, and the answers were graded by 7 experts. The best root mean squared error on both datasets was achieved using pretrained BERT: 0.880 for the English dataset and 1.893 for the Indonesian dataset.
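As a rough illustration of the pipeline summarized above (answer vectors fed to a linear regression and scored by root mean squared error), the following minimal sketch may help. It is not the configuration used in the experiments: the random vectors stand in for BERT embeddings, the grades are synthetic, and the function names `fit_linear_regression`, `predict`, and `rmse` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def add_bias(features):
    # Append a constant column so the regression can learn an intercept.
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_linear_regression(features, grades):
    # Ordinary least squares fit; returns the weights with the bias last.
    weights, *_ = np.linalg.lstsq(add_bias(features), grades, rcond=None)
    return weights


def predict(weights, features):
    # Predicted grade for each answer vector.
    return add_bias(features) @ weights


def rmse(predicted, actual):
    # Root mean squared error between predicted and expert grades.
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


# ASSUMPTION: synthetic 8-dimensional vectors stand in for answer embeddings
# (a real BERT embedding would be much larger), and the grades are simulated.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 8))
true_weights = rng.normal(size=8)
grades = embeddings @ true_weights + 2.5 + rng.normal(scale=0.1, size=200)

# Hold out the last 50 answers for evaluation.
train_x, test_x = embeddings[:150], embeddings[150:]
train_y, test_y = grades[:150], grades[150:]

weights = fit_linear_regression(train_x, train_y)
test_rmse = rmse(predict(weights, test_x), test_y)
print(f"held-out RMSE: {test_rmse:.3f}")
```

In practice, each row of `embeddings` would be replaced by the representation of a student answer produced by the chosen embedding model, and `grades` by the scores assigned by the human graders.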