Upcoming lecture: Gábor Berend (SZTE)

9th April 2019

Title: Towards cross-lingual and human-interpretable word representations

Date: 9th April 2019, 1PM

Venue: HAS CSS RECENS, Meeting Room (T.1.40)

Address: H-1097 Budapest, Tóth Kálmán street 4, T building, 1st floor, room 40.

Abstract:

The use of distributed word embeddings has become standard in natural
language processing applications, including question answering,
syntactic/semantic parsing, named entity recognition and other
structured prediction tasks. Despite their intriguing properties and
widespread popularity, human interpretation of the emerging word
embeddings is typically cumbersome. We argue that by performing a
special form of matrix decomposition, dense word embeddings can be
efficiently turned into sparse vectors with increased interpretability.
Our empirical investigation has shown that these sparse word
representations are not only more interpretable than their dense
counterparts, but often also deliver better performance when integrated
into applications. We shall also introduce an extension of the proposed
algorithm for facilitating the determination of cross-lingually
comparable sparse word representations based on a (potentially small)
set of translated word pairs between languages.
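
Illustration (not part of the abstract): the abstract does not say which
matrix decomposition is used. As a rough sketch of the general idea, the
snippet below assumes sparse coding against a fixed dictionary, solved with
iterative soft thresholding (ISTA). The names sparse_code, atoms and lam, and
the toy data, are hypothetical and chosen only for this example; a real
setup would also learn the dictionary from the embedding matrix.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(x, atoms, lam=0.1, iters=200):
    """Approximate argmin_a 0.5 * ||x - sum_k a[k] * atoms[k]||^2 + lam * ||a||_1.

    x     : dense embedding (list of floats)
    atoms : fixed dictionary, one list of floats per atom (assumed given here)
    lam   : sparsity weight; larger values zero out more coefficients
    """
    k, dim = len(atoms), len(x)
    # The squared Frobenius norm bounds the Lipschitz constant of the gradient,
    # so its reciprocal is a safe (conservative) step size.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    a = [0.0] * k
    for _ in range(iters):
        # Reconstruction of x from the current coefficients.
        recon = [sum(a[j] * atoms[j][i] for j in range(k)) for i in range(dim)]
        residual = [recon[i] - x[i] for i in range(dim)]
        # Gradient of the squared-error term with respect to each coefficient.
        grad = [sum(atoms[j][i] * residual[i] for i in range(dim)) for j in range(k)]
        a = [soft_threshold(a[j] - step * grad[j], step * lam) for j in range(k)]
    return a


# Toy example: four orthonormal atoms and one dense 4-dimensional "embedding".
atoms = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
dense = [0.9, 0.05, -0.02, 0.0]
codes = sparse_code(dense, atoms, lam=0.1)
print(codes)
```

In this toy case only the first coefficient stays nonzero: the small entries
of the dense vector fall below the sparsity threshold and are set exactly to
zero, so each surviving coordinate can be read as "this word uses this atom".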