Talk by Gábor Berend (SZTE): Towards cross-lingual and human-interpretable word representations

April 9, 2019

Everyone is warmly invited to the next session of the MTA TK "Lendület" RECENS network lecture series on Tuesday, April 9, 2019, where Gábor Berend (SZTE) will give a talk titled "Towards cross-lingual and human-interpretable word representations". The talk will be given in English.

The talk will take place in the meeting room of the MTA TK "Lendület" RECENS Research Group (MTA Humán Tudományok Kutatóháza, 1097 Budapest, Tóth Kálmán utca 4., Building T, 1st floor, Room 40), starting at 13:00.

Abstract:

The application of distributed word embeddings has become standard in natural language processing applications, including question answering, syntactic/semantic parsing, named entity recognition and other structured prediction tasks. Despite their intriguing properties and widespread popularity, human interpretation of the resulting word embeddings is typically cumbersome. We argue that by performing a special form of matrix decomposition, dense word embeddings can be efficiently turned into sparse vectors with increased interpretability. Our empirical investigation has shown that these sparse word representations are not only more interpretable than their dense counterparts, but also often deliver better performance when integrated into applications. We shall also introduce an extension of the proposed algorithm that facilitates the determination of cross-lingually comparable sparse word representations based on a (potentially small) set of translated word pairs between languages.
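
To give a rough idea of what turning dense embeddings into sparse vectors can look like, here is a minimal sketch. The abstract does not name the decomposition used, so this example assumes L1-regularized sparse coding solved with ISTA (iterative soft-thresholding) against a fixed, randomly generated dictionary. The function names, the random data standing in for word embeddings, and the values of `lam` and `n_iter` are all hypothetical; a realistic setup would learn the dictionary from actual embeddings.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t; entries within [-t, t] become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def objective(X, D, A, lam):
    """Value of 0.5 * ||X - A D||_F^2 + lam * ||A||_1."""
    return 0.5 * np.sum((X - A @ D) ** 2) + lam * np.sum(np.abs(A))


def sparse_codes(X, D, lam=0.5, n_iter=200):
    """ISTA for the sparse codes A, with the dictionary D held fixed.

    X: (n_words, dim) dense embeddings
    D: (n_atoms, dim) dictionary (assumed given here, not learned)
    Returns A: (n_words, n_atoms) sparse coefficients.
    """
    # Step size 1/L, where L is the largest eigenvalue of D D^T.
    L = np.linalg.norm(D @ D.T, 2)
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T  # gradient of the squared-error term
        A = soft_threshold(A - grad / L, lam / L)
    return A


# Hypothetical data: random vectors stand in for real word embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))                   # 50 "words", 20 dimensions
D = rng.normal(size=(40, 20))                   # 40 dictionary atoms
D /= np.linalg.norm(D, axis=1, keepdims=True)   # unit-norm atoms

A = sparse_codes(X, D, lam=0.5)
print("fraction of zero coefficients:", np.mean(A == 0))
```

Each row of `A` is a sparse representation of the corresponding row of `X`; raising `lam` yields more zero coefficients at the cost of a looser reconstruction.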