TY - GEN
T1 - Identification of multi-word expressions by combining multiple linguistic information sources
AU - Tsvetkov, Yulia
AU - Wintner, Shuly
PY - 2011
Y1 - 2011
N2 - We propose an architecture for expressing various linguistically-motivated features that help identify multi-word expressions in natural language texts. The architecture combines various linguistically-motivated classification features in a Bayesian network. We introduce novel ways for computing many of these features, and manually define linguistically-motivated interrelationships among them, which the Bayesian network models. Our methodology is almost entirely unsupervised and completely language-independent; it relies on few language resources and is thus suitable for a large number of languages. Furthermore, unlike much recent work, our approach can identify expressions of various types and syntactic constructions. We demonstrate a significant improvement in identification accuracy, compared with less sophisticated baselines.
AB - We propose an architecture for expressing various linguistically-motivated features that help identify multi-word expressions in natural language texts. The architecture combines various linguistically-motivated classification features in a Bayesian network. We introduce novel ways for computing many of these features, and manually define linguistically-motivated interrelationships among them, which the Bayesian network models. Our methodology is almost entirely unsupervised and completely language-independent; it relies on few language resources and is thus suitable for a large number of languages. Furthermore, unlike much recent work, our approach can identify expressions of various types and syntactic constructions. We demonstrate a significant improvement in identification accuracy, compared with less sophisticated baselines.
UR - http://www.scopus.com/inward/record.url?scp=80053230451&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:80053230451
SN - 1937284115
SN - 9781937284114
T3 - EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
SP - 836
EP - 845
BT - EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
T2 - Conference on Empirical Methods in Natural Language Processing, EMNLP 2011
Y2 - 27 July 2011 through 31 July 2011
ER -