Identifying the L1 of non-native writers: The CMU-Haifa system

Yulia Tsvetkov, Naama Twitto, Nathan Schneider, Noam Ordan, Manaal Faruqui, Victor Chahuneau, Shuly Wintner, Chris Dyer

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

We show that it is possible to learn to identify, with high accuracy, the native language of English test takers from the content of the essays they write. Our method uses standard text classification techniques based on multiclass logistic regression, combining individually weak indicators to predict the most probable native language from a set of 11 possibilities. We describe the various features used for classification, as well as the settings of the classifier that yielded the highest accuracy.
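To make the idea of combining individually weak indicators concrete, the following is a minimal sketch of multiclass logistic regression over bag-of-words counts, written in plain Python. It is an illustration only, not the system's implementation: the abstract does not specify the features or classifier settings, so the featurizer, the hyperparameters (`epochs`, `lr`, `l2`), the placeholder labels `L1_A`, `L1_B`, `L1_C`, and the toy sentences are all assumptions invented for the example, and three classes stand in for the 11 native languages.

```python
import math
from collections import Counter

def featurize(text):
    """Bag-of-words counts; each token acts as one weak indicator (assumed feature set)."""
    return Counter(text.lower().split())

def softmax(scores):
    """Turn per-class scores into a probability distribution."""
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

def predict_proba(weights, feats):
    """Score each class as a weighted sum of feature values, then normalize."""
    scores = {
        c: sum(w.get(f, 0.0) * v for f, v in feats.items())
        for c, w in weights.items()
    }
    return softmax(scores)

def train(examples, classes, epochs=200, lr=0.1, l2=0.01):
    """Stochastic gradient ascent on the L2-regularized log-likelihood.

    The hyperparameter values are illustrative assumptions.
    """
    weights = {c: {} for c in classes}
    for _ in range(epochs):
        for text, gold in examples:
            feats = featurize(text)
            probs = predict_proba(weights, feats)
            for c in classes:
                # Gradient of the log-likelihood: (indicator - probability) * feature value.
                err = (1.0 if c == gold else 0.0) - probs[c]
                w = weights[c]
                for f, v in feats.items():
                    old = w.get(f, 0.0)
                    w[f] = old + lr * (err * v - l2 * old)
    return weights

def predict(weights, text):
    """Return the most probable class for a text."""
    probs = predict_proba(weights, featurize(text))
    return max(probs, key=probs.get)

# Hypothetical toy data: invented sentences with placeholder labels,
# not drawn from the essays used in the paper.
TOY_DATA = [
    ("i am agree with this opinion", "L1_A"),
    ("he explained me the problem", "L1_B"),
    ("the informations are very useful", "L1_C"),
    ("i am agree that travel is good", "L1_A"),
    ("she explained me the rules", "L1_B"),
    ("these informations are wrong", "L1_C"),
]
CLASSES = ["L1_A", "L1_B", "L1_C"]

model = train(TOY_DATA, CLASSES)
```

Calling `predict(model, "i am agree")` returns the single most probable label, and `predict_proba(model, featurize(...))` exposes the full distribution over classes. No single token decides the outcome; the prediction comes from summing many small per-token weights, which is the sense in which weak indicators are combined.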

Original language: English
Title of host publication: Proceedings of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013
Editors: Joel Tetreault, Jill Burstein, Claudia Leacock
Publisher: Association for Computational Linguistics (ACL)
Pages: 279-287
Number of pages: 9
ISBN (Electronic): 9781937284473
State: Published - 2013
Event: 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013 - Atlanta, United States
Duration: 13 Jun 2013 → …

Publication series

Name: Proceedings of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013

Conference

Conference: 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013
Country/Territory: United States
City: Atlanta
Period: 13/06/13 → …

Bibliographical note

Funding Information:
This research was supported by a grant from the Israeli Ministry of Science and Technology.

Publisher Copyright:
© 2013 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Computational Theory and Mathematics
