Abstract
We address the task of native language identification in social media content, where authors are highly fluent, advanced nonnative speakers of English. Using both linguistically motivated features and the characteristics of the social media outlet, we obtain high accuracy on this challenging task. We provide a detailed analysis of the features that sheds light on differences between native and nonnative speakers, and among nonnative speakers with different backgrounds.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 |
| Editors | Ellen Riloff, David Chiang, Julia Hockenmaier, Jun'ichi Tsujii |
| Publisher | Association for Computational Linguistics |
| Pages | 3591-3601 |
| Number of pages | 11 |
| ISBN (Electronic) | 9781948087841 |
| State | Published - 2018 |
| Event | 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 - Brussels, Belgium. Duration: 31 Oct 2018 → 4 Nov 2018 |
Publication series
| Name | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 |
|---|---|
Conference
| Conference | 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 |
|---|---|
| Country/Territory | Belgium |
| City | Brussels |
| Period | 31/10/18 → 4/11/18 |
Bibliographical note
Publisher Copyright: © 2018 Association for Computational Linguistics
ASJC Scopus subject areas
- Computational Theory and Mathematics
- Computer Science Applications
- Information Systems