Data

 The Training and Test Data

1. Lexical Sample Task: Training Data and the Dic; Test Data.

2. All Words Task: the Dic; Test Data.

3. ADV Task: Training Data and the Dic; Test Data.

 

For the Lexical Sample and All Words tasks, the data are annotated with HowNet (http://www.keenage.com). We also extracted a subset of HowNet to form a Dic file for each task.

Lexical Sample Task. In the training data, the "senseid" attribute of the "answer" tag holds the correct sense; in the test data it is empty. Participants must fill in the empty attribute with the correct sense (the DEF value in the Dic).

All Words Task. There is no training data. We list only the words that are ambiguous in HowNet, each marked with the XML tag "head". The "senseid" attribute of each "head" tag is empty. Participants must fill in the empty attribute with the correct sense (the DEF value in the Dic).
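As a rough illustration of filling in the empty "senseid" attributes, here is a minimal Python sketch. Only the tag names "answer" and "head" and the attribute "senseid" come from the description above. The sample XML layout, the function name fill_senseids, and the choose_sense callback are assumptions made for illustration, and the released files may be structured differently.

```python
import xml.etree.ElementTree as ET


def fill_senseids(xml_text, tag, choose_sense):
    """Fill every empty "senseid" attribute on <tag> elements.

    choose_sense is a hypothetical callback supplied by the participant:
    it receives the element and returns the DEF string picked from the Dic.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter(tag):
        # Leave attributes that already carry a sense untouched.
        if not elem.get("senseid"):
            elem.set("senseid", choose_sense(elem))
    return ET.tostring(root, encoding="unicode")


# Made-up sample; the element names other than "answer" are assumptions.
sample = (
    '<corpus><instance id="w.1">'
    '<answer instance="w.1" senseid=""/>'
    '<context>some text</context>'
    '</instance></corpus>'
)

# A real system would look up candidate DEF values in the Dic here.
filled = fill_senseids(sample, "answer", lambda elem: "DEF_PLACEHOLDER")
print(filled)
```

For the All Words Task the same function could be called with "head" as the tag name, assuming the "head" elements carry the attribute in the same way.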

ADV Task. We annotated the data using the Chinese Function Word Usages' Knowledge Base (CFKB). We also provide a Dic. This task is similar to the Lexical Sample Task, except that it uses a different Dic.

According to the schedule, the test data and the all-words data will be released on 15 Feb 2013. To download the test data, you must register first.