Word2Sense: Sparse Interpretable Word Embeddings

Published at Association for Computational Linguistics (ACL), 2019

We present an unsupervised method to generate Word2Sense word embeddings that are interpretable: each dimension of the embedding space corresponds to a fine-grained sense, and the non-negative value of the embedding along the j-th dimension represents the relevance of the j-th sense to the word. The underlying LDA-based generative model can be extended to refine the representation of a polysemous word in a short context, allowing us to use the embeddings in contextual tasks. On computational NLP tasks, Word2Sense embeddings compare well with other word embeddings generated by unsupervised methods. Across tasks such as word similarity, entailment, sense induction, and contextual interpretation, Word2Sense is competitive with the state-of-the-art method for each task. Word2Sense embeddings are at least as sparse and fast to compute as prior art.
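To make the idea of a sparse, non-negative, sense-indexed embedding concrete, here is a minimal Python sketch. It is not the paper's implementation: the sense labels, the words, and the weights are invented toy values, and cosine similarity is used only as a convenient stand-in for whatever similarity measure the paper actually uses.

```python
import math

# Hypothetical sense inventory: dimension index -> human-readable label.
# These labels and all weights below are invented for illustration.
SENSE_LABELS = {0: "finance", 1: "river", 2: "fruit", 3: "technology"}

# Sparse embeddings stored as {sense index: non-negative relevance}.
# Dimensions that are absent are implicitly zero.
EMBEDDINGS = {
    "bank": {0: 0.7, 1: 0.3},
    "apple": {2: 0.6, 3: 0.4},
    "loan": {0: 1.0},
}


def top_senses(word, k=2):
    """Return the k most relevant (label, weight) pairs for a word."""
    ranked = sorted(EMBEDDINGS[word].items(), key=lambda kv: kv[1], reverse=True)
    return [(SENSE_LABELS[j], w) for j, w in ranked[:k]]


def cosine(u, v):
    """Cosine similarity between two sparse vectors (an assumed measure)."""
    dot = sum(w * v[j] for j, w in u.items() if j in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    print(top_senses("bank"))
    print(cosine(EMBEDDINGS["bank"], EMBEDDINGS["loan"]))
    print(cosine(EMBEDDINGS["bank"], EMBEDDINGS["apple"]))
```

In this toy setup, a polysemous word such as "bank" carries weight on more than one sense dimension, and two words are similar only to the extent that they share active senses.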

The paper was accepted for oral presentation at the conference (oral acceptance rate: 270/3000 submissions ≈ 9%).

Please find the resources below:

  1. Proceedings and paper.
  2. Video Presentation.