Shared Task

Shared Task for Chinese Grammatical Error Diagnosis (CGED)

Registration

Participants need to register in order to obtain the training and test data.
To register, please send the following information to Lung-Hao Lee ([email protected]):

Team name (an identifying abbreviation for your organization, e.g., NTNU)
Organization (affiliation, e.g., National Taiwan Normal University)
Contact person (name and email)

Important Dates

Registration open: June 25, 2016
Release of training data: June 25, 2016
Registration close: September 25, 2016
Release of test data: October 3, 2016
Testing results submission due: October 5, 2016
Release of evaluation results: October 7, 2016
Technical report submission due: October 20, 2016
Report reviews returned: October 25, 2016
Camera-ready due: October 30, 2016
Workshop date: December 12, 2016

Task Description

The goal of this shared task is to develop computer-assisted systems that automatically diagnose grammatical errors in Chinese sentences written by learners of Chinese as a foreign language. An input sentence may contain one or more of the defined error types: redundant word (denoted by the capital letter ‘R’), missing word (‘M’), word selection error (‘S’), and word ordering error (‘W’). The developed system should indicate which error types occur in the given sentence and the positions at which they occur.

If an input sentence contains no grammatical errors, the system should return: sid, correct. If an input sentence contains a grammatical error, the output should be a quadruple: sid, start_off, end_off, error_type. Here, sid is the unique sentence identifier; start_off and end_off are the positions of the first and last characters of the error, where each character or punctuation mark counts as one position; and error_type is one of the defined error types: R, M, S, W. As the examples show, positions are counted from 1, and a sentence with several errors yields one quadruple per error. Examples are shown below.

Examples

TOCFL (Traditional Chinese)

Example 1:
Input: (sid=A2-0007-2) 聽說妳打算開一個慶祝會。可惜我不能參加。因為那個時候我有別的事。當然我也要參加給你慶祝慶祝。
Output: A2-0007-2, 38, 39, R
(Note: “參加” is a redundant word)

Example 2:
Input: (sid=A2-0007-3) 我要送給你一個慶祝禮物。要是兩、三天晚了，請別生氣。
Output: A2-0007-3, 15, 20, W
(Note: "兩、三天晚了" should be "晚了兩、三天")

Example 3:
Input: (sid=A2-0011-1) 我聽到你找到工作。恭喜恭喜！
Output: A2-0011-1, 2, 3, S
A2-0011-1, 9, 9, M
(Notes: "聽到" should be "聽說". Besides, a word "了" is missing. The correct sentence should be "我聽說你找到工作了")

Example 4:
Input: (sid=A2-0011-3) 我覺得對你很抱歉。我也很想去，可是沒有辦法。
Output: A2-0011-3, correct

HSK (Simplified Chinese)

Example 1:
Input: (sid=00038800481) 我根本不能了解这妇女辞职回家的现象。在这个时代，为什么放弃自己的工作，就回家当家庭主妇？
Output: 00038800481, 6, 7, S
00038800481, 8, 8, R
(Notes: “了解” should be "理解". In addition, "这" is a redundant word.)

Example 2:
Input: (sid=00038800464) 我真不明白。她们可能是追求一些前代的浪漫。
Output: 00038800464, correct

Example 3:
Input: (sid=00038801261) 人战胜了饥饿，才努力为了下一代作更好的、更健康的东西。
Output: 00038801261, 9, 9, M
00038801261, 16, 16, S
(Notes: "能" is missing. The word "作" should be "做". The correct sentence is "才能努力为了下一代做更好的")

Example 4:
Input: (sid=00038801320) 饥饿的问题也是应该解决的。世界上每天由于饥饿很多人死亡。
Output: 00038801320, 19, 25, W
(Notes: "由于饥饿很多人" should be "很多人由于饥饿")
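The offset convention used in these examples (1-based, inclusive, one position per character or punctuation mark) can be checked with a short Python sketch. The function names error_span and format_result are hypothetical helpers introduced here for illustration; they are not part of any official tool for this task.

```python
ERROR_TYPES = {"R", "M", "S", "W"}


def error_span(sentence: str, start_off: int, end_off: int) -> str:
    """Return the characters covered by a 1-based, inclusive offset pair."""
    if not 1 <= start_off <= end_off <= len(sentence):
        raise ValueError("offsets fall outside the sentence")
    # Python slices are 0-based and end-exclusive, hence the shift by one.
    return sentence[start_off - 1:end_off]


def format_result(sid: str, errors: list[tuple[int, int, str]]) -> list[str]:
    """Render one sentence's diagnosis as output lines in the task format."""
    if not errors:
        return [f"{sid}, correct"]
    lines = []
    for start_off, end_off, error_type in sorted(errors):
        if error_type not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {error_type}")
        lines.append(f"{sid}, {start_off}, {end_off}, {error_type}")
    return lines
```

For instance, error_span applied to the sentence of TOCFL Example 3 with offsets 2 and 3 yields "聽到", the word flagged as a selection error. For a missing-word error, the span is simply the character at the reported position.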

Evaluation Metrics

The criteria for judging correctness are:

Detection level: binary classification of a given sentence as correct or incorrect. The judgment must be identical to the gold standard. A sentence containing any error type is regarded as incorrect.
Identification level: this level can be treated as a multi-class categorization problem. In addition to correct instances, all error types should be clearly identified.
Position level: in addition to identifying the error types, this level also judges the positions of the erroneous ranges. That is, the system results must be identical to the quadruples of the gold standard.
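As a rough illustration of the detection level only, the Python sketch below tallies confusion-matrix counts per sentence. The function name detection_counts, the data layout (a dict mapping each sid to a set of (start_off, end_off, error_type) tuples, with an empty set meaning "correct"), and the choice to treat a sid missing from the system output as "correct" are all assumptions made for this example; the official evaluation tool may differ, and the identification and position levels are not covered here.

```python
from collections import Counter


def detection_counts(gold: dict, system: dict) -> Counter:
    """Count TP/FP/TN/FN at the detection level (sentence has an error or not)."""
    counts = Counter(TP=0, FP=0, TN=0, FN=0)
    for sid, gold_errors in gold.items():
        gold_positive = bool(gold_errors)
        # Assumption: a sentence absent from the system output counts as "correct".
        system_positive = bool(system.get(sid))
        if gold_positive and system_positive:
            counts["TP"] += 1
        elif gold_positive:
            counts["FN"] += 1
        elif system_positive:
            counts["FP"] += 1
        else:
            counts["TN"] += 1
    return counts
```

Note that at this level a system is credited for flagging an erroneous sentence even if the reported type or position is wrong; those details only matter at the identification and position levels.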

The following metrics are measured at all three levels with the help of the confusion matrix.

False Positive Rate = FP / (FP+TN)
Accuracy = (TP+TN) / (TP+TN+FP+FN)
Precision = TP / (TP+FP)
Recall = TP / (TP+FN)
F-Score = 2*Precision*Recall / (Precision+Recall)

Confusion matrix:

                              System Results
                              Positive               Negative
Gold Standard   Positive      TP (True Positive)     FN (False Negative)
                Negative      FP (False Positive)    TN (True Negative)
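The five metrics follow directly from the four confusion-matrix counts. The Python sketch below computes them; the function name cged_metrics is hypothetical, and returning 0.0 when a denominator is zero is an assumption of this example rather than a rule stated by the task.

```python
def cged_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute the task's metrics from confusion-matrix counts."""

    def ratio(numerator: float, denominator: float) -> float:
        # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "false_positive_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f_score": ratio(2 * precision * recall, precision + recall),
    }
```

With TP=6, FP=2, TN=6, and FN=2, for example, the false positive rate is 2/8 = 0.25, and accuracy, precision, recall, and F-score all equal 0.75.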

Data Sets

We will provide mutually exclusive data sets selected from the TOCFL Learner Corpus (Traditional Chinese) and the HSK Learner Corpus (Simplified Chinese).

Training Set: Sentences containing grammatical errors, together with their corrections, will be provided for training purposes. The training set will be released in the SGML format shown below.

<DOC>
<TEXT id="A2-0005-1">
我聽說你打算開一個慶祝會。對不起，我要參加，可是沒有空。你開一個慶祝會的時候我不能會參加，是因為我在外國做工作。
</TEXT>
<CORRECTION>
我聽說你打算開一個慶祝會。對不起，我要參加，可是沒有空。你開慶祝會的時候我不能參加，是因為我在外國工作。
</CORRECTION>
<ERROR start_off="31" end_off="32" type="R"></ERROR>
<ERROR start_off="42" end_off="42" type="R"></ERROR>
<ERROR start_off="53" end_off="53" type="R"></ERROR>
</DOC>
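A record in this format can be read with a few regular expressions. The Python sketch below is one possible approach, assuming each record follows exactly the layout of the sample above (attribute order included). The function name parse_training_data and the returned dictionary keys are hypothetical, and the patterns accept both straight and typographic quotation marks around attribute values as a precaution.

```python
import re

Q = r'["“”]'  # straight or typographic double quote around attribute values
DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.S)
TEXT_RE = re.compile(rf'<TEXT id={Q}([^"“”]+){Q}>\s*(.*?)\s*</TEXT>', re.S)
CORRECTION_RE = re.compile(r"<CORRECTION>\s*(.*?)\s*</CORRECTION>", re.S)
ERROR_RE = re.compile(
    rf"<ERROR start_off={Q}(\d+){Q} end_off={Q}(\d+){Q} type={Q}([RMSW]){Q}>"
)


def parse_training_data(sgml: str) -> list[dict]:
    """Extract sid, text, correction, and error triples from each <DOC> record."""
    records = []
    for doc in DOC_RE.findall(sgml):
        text = TEXT_RE.search(doc)
        correction = CORRECTION_RE.search(doc)
        if text is None or correction is None:
            continue  # skip records that do not match the assumed layout
        errors = [(int(s), int(e), t) for s, e, t in ERROR_RE.findall(doc)]
        records.append({
            "sid": text.group(1),
            "text": text.group(2),
            "correction": correction.group(1),
            "errors": errors,
        })
    return records
```

Applied to the sample record, the first error (31, 32, R) corresponds to text[30:32], which is the redundant "一個".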

Test Set: We will provide at least 3000 test instances, selected to cover the different error types, for the official performance evaluation.

Policy: Participating teams are allowed to use other publicly available data for system development. Use of other data should be specified in the final system report. The data sets of the two previous editions of this shared task can be downloaded from the following links.

NLP-TEA 2015 CGED Shared Task: http://ir.itc.ntnu.edu.tw/lre/nlptea15cged.htm
NLP-TEA 2014 CFL Shared Task: http://ir.itc.ntnu.edu.tw/lre/nlptea14cfl.htm

Technical Report

Each participating team must submit a technical report describing its method and testing results. Please follow the COLING 2016 template to prepare the report. Non-conforming submissions will not be considered for review. Accepted reports that conform to the specified length and formatting requirements will be included in the NLP-TEA 2016 workshop proceedings. At least one author of each accepted report will be required to register to present the developed system. This is the most valuable part of participation, as authors will be able to engage attendees in extended conversations about their work.

References

Lee, Lung-Hao, Liang-Chih Yu, and Li-Ping Chang. 2015. Overview of the NLP-TEA 2015 shared task for Chinese grammatical error diagnosis. In Proceedings of the 2nd Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA 2015). 1-6.
Yu, Liang-Chih, Lung-Hao Lee, and Li-Ping Chang. 2014. Overview of grammatical error diagnosis for learning Chinese as a foreign language. In Proceedings of the 1st Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA 2014). 42-47.
