Investigating the cross-lingual translatability of VerbNet-style classification
Murakami A.; Laippala V.; Majewska O.; McCarthy D.; Korhonen A.; Vulić I.; Huang Y.
https://urn.fi/URN:NBN:fi-fe2021042717968
Abstract
VerbNet—the most extensive online verb lexicon currently available for
English—has proved useful in supporting a variety of NLP tasks. However,
its exploitation in multilingual NLP has been limited by the fact that
such classifications are available for only a few languages. Since manual
development of VerbNet is a major undertaking, researchers have recently
translated VerbNet classes from English to other languages. However, no
systematic investigation has been conducted into the applicability and
accuracy of such a translation approach across different, typologically
diverse languages. Our study aims to fill this gap. We develop a
systematic method for translating VerbNet classes from English to
other languages, which we first apply to Polish and subsequently to
Croatian, Mandarin, Japanese, Italian, and Finnish. Our results on
Polish demonstrate high translatability across all classes (96% of
English member verbs successfully translated into Polish) and strong
inter-annotator agreement, revealing a promising degree of overlap in
the resultant classifications. The results on other languages are
equally promising. This demonstrates that VerbNet classes have strong
cross-lingual potential and that the proposed method could be applied to
obtain gold standards for automatic verb classification in different
languages. We make our annotation guidelines and the six
language-specific verb classifications available with this paper. © 2017
The Author(s)