SemEval-2010 Word Sense Induction & Disambiguation Task

Download Trial Data & Task Description

The trial data & task description files can be downloaded by clicking here.

Download Training and Testing Datasets & Evaluation Scripts

  • Evaluation scripts & keys available here.
  • Training dataset available here.
  • Testing dataset available here.
  • WSI systems' results available here.

Testing Dataset Information

The testing dataset is part of OntoNotes (Hovy et al., 2006). Each test instance consists of at most three sentences. The texts come from various news sources, including the Wall Street Journal, CNN, ABC, and others.

Verbs' Test Dataset Information
Lemma Instances ITA Senses
accommodate.v 12 0.75 3
sniff.v 15 0.93 3
cheat.v 16 0.81 2
presume.v 16 0.81 2
reap.v 16 0.94 2
haunt.v 17 0.82 2
cultivate.v 17 0.82 4
frame.v 19 0.89 4
level.v 20 0.75 4
regain.v 20 0.9 2
bow.v 22 0.82 5
root.v 23 0.78 4
shave.v 26 0.96 2
owe.v 29 0.9 3
analyze.v 29 0.9 2
swim.v 31 0.9 2
mount.v 32 0.94 5
signal.v 34 0.91 2
assemble.v 37 0.81 2
assert.v 37 0.81 3
straighten.v 37 0.84 3
deploy.v 40 0.78 2
expose.v 41 0.9 2
swear.v 44 0.98 5
weigh.v 46 0.98 6
pour.v 47 0.89 4
separate.v 51 0.9 2
relax.v 53 0.87 3
divide.v 58 0.91 5
slow.v 59 0.9 2
appeal.v 66 0.85 4
commit.v 71 0.9 3
pursue.v 73 0.92 2
observe.v 76 0.78 4
conclude.v 76 0.8 4
figure.v 78 0.81 5
stick.v 79 0.8 4
question.v 82 0.8 2
violate.v 83 0.96 2
defend.v 94 0.91 2
lay.v 107 0.77 6
reveal.v 122 0.88 2
apply.v 123 0.93 4
insist.v 124 0.87 2
deny.v 133 0.86 3
introduce.v 142 0.87 3
operate.v 190 0.81 2
lie.v 208 0.97 4
wait.v 346 0.97 2
happen.v 581 0.97 4

Nouns' Test Dataset Information
Lemma Instances ITA Senses
access.n 48 1 8
accounting.n 31 0.94 5
address.n 37 0.92 10
air.n 174 0.89 8
body.n 190 0.89 10
camp.n 33 1 8
campaign.n 148 0.88 5
cell.n 84 0.99 8
challenge.n 72 0.89 7
chip.n 112 0.93 13
class.n 132 1 9
commission.n 50 0.9 7
community.n 189 1 8
dealer.n 67 0.94 5
display.n 40 0.97 6
edge.n 32 0.97 6
entry.n 45 1 6
failure.n 66 1 7
field.n 155 0.88 11
flight.n 107 1 11
foundation.n 52 0.9 8
function.n 35 1 7
gap.n 51 1 5
gas.n 123 0.98 6
guarantee.n 58 1 5
house.n 162 0.91 14
idea.n 200 1 7
innovation.n 33 0.91 3
legislation.n 70 1 4
margin.n 60 1 6
mark.n 70 0.96 9
market.n 865 0.78 7
mind.n 111 1 7
moment.n 143 0.99 6
movement.n 63 0.94 7
note.n 96 0.93 8
office.n 332 1 7
officer.n 187 0.97 4
origin.n 23 1 5
park.n 43 0.93 7
promotion.n 27 1 5
rally.n 46 1 5
reputation.n 28 1 4
road.n 138 0.89 6
screen.n 28 0.93 10
shape.n 46 0.85 6
speed.n 52 0.87 9
television.n 161 1 4
threat.n 140 0.98 4
tour.n 30 1 5
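
The per-lemma tables above are whitespace-separated, one lemma per row, so they are easy to load for quick summary statistics. The following is a minimal sketch, assuming the rows have been copied into a string exactly as shown (header line followed by "lemma instances ITA senses"). The names LemmaStats and parse_rows are hypothetical helpers introduced here for illustration; they are not part of the task's evaluation scripts. The sample uses the first three rows of the noun table.

```python
from typing import NamedTuple


class LemmaStats(NamedTuple):
    lemma: str      # target word with part-of-speech suffix, e.g. "access.n"
    instances: int  # number of test instances for the lemma
    ita: float      # value of the ITA column
    senses: int     # value of the Senses column


def parse_rows(text: str) -> list[LemmaStats]:
    """Parse whitespace-separated 'Lemma Instances ITA Senses' rows."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        # Skip blank or malformed lines and the header row.
        if len(parts) != 4 or parts[0] == "Lemma":
            continue
        lemma, instances, ita, senses = parts
        rows.append(LemmaStats(lemma, int(instances), float(ita), int(senses)))
    return rows


# First three rows of the noun table, copied verbatim.
SAMPLE = """\
Lemma Instances ITA Senses
access.n 48 1 8
accounting.n 31 0.94 5
address.n 37 0.92 10
"""

rows = parse_rows(SAMPLE)
total_instances = sum(r.instances for r in rows)
mean_senses = sum(r.senses for r in rows) / len(rows)
print(total_instances, round(mean_senses, 2))
```

Pasting the full verb or noun table into SAMPLE gives the corresponding totals for the whole test set.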

Acknowledgements

We gratefully acknowledge the support of the EU FP7 INDECT project, Grant No. 218086, the National Science Foundation Grant NSF-0715078, Consistent Criteria for Word Sense Disambiguation, and the GALE program of the Defense Advanced Research Projects Agency, Contract No. HR0011-06-C-0022, a subcontract from the BBN-AGILE Team.

Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw, and Ralph Weischedel. 2006. OntoNotes: The 90% Solution. In Proceedings of NAACL, Companion Volume: Short Papers, pages 57-60. ACL.