Word Sense Induction (WSI) seeks to identify the different senses (or uses) of a target word in a given text in an automatic and fully unsupervised manner. It is a key enabling technology that aims to overcome the limitations of traditional knowledge-based and supervised Word Sense Disambiguation (WSD) methods, such as:

  • their limited adaptation to new languages and domains
  • the fixed granularity of senses
  • their inability to detect new senses (uses) not present in a given dictionary




Past evaluations (Agirre & Soroa, 2007; Manandhar et al., 2010) have focused on evaluating WSI methods in (1) a clustering task, i.e. by comparing how well induced clusters match the Gold Standard (GS) labelings, and (2) a WSD setting, by mapping induced clusters to GS senses and then measuring their WSD performance.
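The WSD-style evaluation in (2) can be illustrated with a minimal sketch. It assumes a simple majority-vote mapping from induced clusters to GS senses, which is a simplification and not necessarily the exact procedure used in the cited evaluations. The function names, instance ids, cluster ids and sense labels below are all hypothetical.

```python
from collections import Counter, defaultdict


def map_clusters_to_senses(induced, gold):
    """Map each induced cluster to the gold sense it co-occurs with most often."""
    counts = defaultdict(Counter)
    for instance_id, cluster in induced.items():
        counts[cluster][gold[instance_id]] += 1
    return {cluster: senses.most_common(1)[0][0]
            for cluster, senses in counts.items()}


def wsd_accuracy(induced, gold, mapping):
    """Fraction of instances whose mapped cluster equals the gold sense."""
    correct = sum(1 for instance_id, cluster in induced.items()
                  if mapping.get(cluster) == gold[instance_id])
    return correct / len(induced)


# Hypothetical toy data: instance id -> induced cluster / gold sense.
mapping_induced = {"i1": "c1", "i2": "c1", "i3": "c2", "i4": "c2", "i5": "c2"}
mapping_gold = {"i1": "win.1", "i2": "win.1", "i3": "win.2", "i4": "win.2", "i5": "win.1"}
test_induced = {"i6": "c1", "i7": "c2", "i8": "c2"}
test_gold = {"i6": "win.1", "i7": "win.2", "i8": "win.1"}

# Learn the mapping on one portion of the data, then score the rest.
mapping = map_clusters_to_senses(mapping_induced, mapping_gold)
accuracy = wsd_accuracy(test_induced, test_gold, mapping)
print(mapping, accuracy)
```

Note that this sketch assigns exactly one sense per instance, which is the single-sense assumption the rest of this description questions.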

Both evaluation approaches have assumed that each occurrence of a word is best labeled with a single sense. However, human annotators often disagree about which sense is present (Passonneau et al., 2010), especially in cases where some of the possible senses are closely related (Chugur et al., 2002; McCarthy, 2006; Palmer et al., 2007).

Erk et al. (2009) have recently shown that in cases of sense ambiguity, a graded notion of sense labeling may be most appropriate. While a single sense classification may capture the most salient meaning, related senses may also be perceived by a reader. Consider the following sentence and the target word win:

  • The athlete won the gold medal due to her hard work and dedication.

 WordNet 3.0 senses:

  1. win (be the winner in a contest or competition; be victorious)
  2. acquire, win, gain (win something through one’s efforts)
  3. gain, advance, win, pull ahead, make headway, get ahead, gain ground (obtain advantages, such as points, etc.)
  4. succeed, win, come through, bring home the bacon, deliver the goods (attain success or reach a desired goal)

Sense 1 appears to be the most appropriate label. However, sense 2 also seems present due to the athlete acquiring the medal itself, though not to the same degree as sense 1. The relatedness of the senses combined with the ambiguity of the context creates the opportunity to perceive multiple senses concurrently.


Aims & Goals


The aim of this task is to fully explore the possibility of perceiving multiple senses in a single contextual instance by evaluating systems that:

  1. induce the number and meaning of the different senses of a target word
  2. label each instance of a word with the senses that are present by assigning to each sense a perceptibility score.
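As a rough illustration of item 2, a graded labeling can be represented as a mapping from senses to perceptibility scores. The sketch below uses the win example from earlier. The raw scores, the sense identifiers and the choice to normalize scores so they sum to 1 are assumptions made for illustration, not part of the task definition.

```python
def normalize(scores):
    """Rescale non-negative sense scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one sense must have a positive score")
    return {sense: value / total for sense, value in scores.items()}


def most_salient(scores):
    """Return the sense with the highest perceptibility score."""
    return max(scores, key=scores.get)


# Hypothetical raw scores for "won" in the example sentence:
# sense 1 (be the winner) dominates, sense 2 (acquire) is weaker but present.
raw = {"win.1": 4.0, "win.2": 1.0}

graded = normalize(raw)        # graded labeling: several senses, each weighted
single = most_salient(graded)  # collapses to a single-sense labeling
print(graded, single)
```

Keeping the full score mapping preserves the weaker sense, while taking only the top-scoring sense recovers the traditional single-sense label.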

To facilitate a broader comparison of WSI systems and evaluation methodologies, the full task will consist of three subtasks:

  1. Non-graded Word Sense Induction Subtask
  2. Graded Word Sense Induction Subtask
  3. Lexical Substitution Subtask

The description of each subtask can be found here.


References


Eneko Agirre and Aitor Soroa. 2007. SemEval-2007 task 02: Evaluating word sense induction and discrimination systems. In Proceedings of the Fourth International Workshop on Semantic Evaluations, pages 7–12. ACL, June.


Suresh Manandhar, Ioannis P. Klapaftis, Dmitriy Dligach, and Sameer S. Pradhan. 2010. SemEval-2010 task 14: Word sense induction & disambiguation. In Proceedings of the 5th International Workshop on Semantic Evaluation, pages 63–68. Association for Computational Linguistics.


Rebecca J. Passonneau, Ansaf Salleb-Aouissi, Vikas Bhardwaj, and Nancy Ide. 2010. Word sense annotation of polysemous words by multiple annotators. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC-7).


Irina Chugur, Julio Gonzalo, and Felisa Verdejo. 2002. Polysemy and sense proximity in the Senseval-2 test suite. In Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions - Volume 8, WSD ’02, pages 32–39, Stroudsburg, PA, USA. Association for Computational Linguistics.


Diana McCarthy. 2006. Relating WordNet senses for word sense disambiguation. Making Sense of Sense: Bringing Psycholinguistics and Computational Linguistics Together, page 17.


Martha Palmer, Hoa Trang Dang, and Christiane Fellbaum. 2007. Making fine-grained and coarse-grained sense distinctions, both manually and automatically. Natural Language Engineering, 13(02):137–163.


Katrin Erk, Diana McCarthy, and Nicholas Gaylord. 2009. Investigations on word senses and word usages. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, pages 10–18. Association for Computational Linguistics.


Contact Info


David Jurgens
School of Computer Science, McGill University

Ioannis P. Klapaftis
Microsoft Search Technology Centre, Bing, London, UK

Other Info


  • Trial Data have been released.
  • All data and system submissions are now available here.
  • The Task 13 paper is online here.
  • Please consult this errata for the official scores in the multi-sense experiment.