Abstract
We show that it is possible to learn to identify, with high accuracy, the native language of English test takers from the content of the essays they write. Our method uses standard text classification techniques based on multiclass logistic regression, combining individually weak indicators to predict the most probable native language from a set of 11 possibilities. We describe the various features used for classification, as well as the settings of the classifier that yielded the highest accuracy.
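The abstract names multiclass logistic regression over many individually weak indicators but does not list the features or settings here. As a rough illustration only, the following is a minimal sketch of that general technique, assuming word unigram and character trigram counts as features and plain stochastic gradient ascent for training. The function names, the hyperparameters, the placeholder labels `A`, `B`, `C`, and the toy sentences are all hypothetical and are not taken from the paper or its data.

```python
import math
from collections import Counter


def featurize(text):
    """Count word unigrams and character trigrams (assumed, illustrative features)."""
    lowered = text.lower()
    feats = Counter("w=" + tok for tok in lowered.split())
    padded = " " + lowered + " "
    feats.update("c3=" + padded[i:i + 3] for i in range(len(padded) - 2))
    return feats


def softmax(scores):
    """Turn per-class scores into probabilities, subtracting the max for stability."""
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def predict_proba(weights, labels, feats):
    """Score each class as a weighted sum of feature counts, then normalize."""
    scores = {
        y: sum(weights[y].get(f, 0.0) * v for f, v in feats.items())
        for y in labels
    }
    return softmax(scores)


def train(examples, epochs=200, lr=0.05, l2=0.01):
    """Fit multiclass logistic regression with per-example gradient steps."""
    labels = sorted({y for _, y in examples})
    weights = {y: {} for y in labels}
    data = [(featurize(text), y) for text, y in examples]
    for _ in range(epochs):
        for feats, gold in data:
            probs = predict_proba(weights, labels, feats)
            for y in labels:
                # Gradient of the log-likelihood: indicator minus predicted probability.
                err = (1.0 if y == gold else 0.0) - probs[y]
                w = weights[y]
                for f, v in feats.items():
                    old = w.get(f, 0.0)
                    # L2 shrinkage is applied only to features active in this example.
                    w[f] = old + lr * (err * v - l2 * old)
    return weights, labels


def predict(weights, labels, text):
    """Return the most probable label for a text."""
    probs = predict_proba(weights, labels, featurize(text))
    return max(probs, key=probs.get)


# Invented toy sentences with placeholder labels; not real learner data.
TOY = [
    ("i am agree with the opinion of the author", "A"),
    ("i am agree that students must to study hard", "A"),
    ("in my country the people is very friendly", "B"),
    ("the people is thinking that travel is good", "B"),
    ("according to me advertisement informations are useful", "C"),
    ("according to me young persons have more informations", "C"),
]

if __name__ == "__main__":
    weights, labels = train(TOY)
    print(predict(weights, labels, "i am agree with this idea"))
```

In a setting like the one the abstract describes, the label set would contain the 11 candidate native languages, and no single feature would be decisive; the classifier's job is to sum many small pieces of evidence into one probability distribution over the labels.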
Original language | English |
---|---|
Title of host publication | Proceedings of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013 |
Editors | Joel Tetreault, Jill Burstein, Claudia Leacock |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 279-287 |
Number of pages | 9 |
ISBN (Electronic) | 9781937284473 |
State | Published - 2013 |
Event | 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013 - Atlanta, United States. Duration: 13 Jun 2013 → … |
Publication series
Name | Proceedings of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013 |
---|---|
Conference
Conference | 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013 |
---|---|
Country/Territory | United States |
City | Atlanta |
Period | 13/06/13 → … |
Bibliographical note
Funding Information: This research was supported by a grant from the Israeli Ministry of Science and Technology.
Publisher Copyright: © 2013 Association for Computational Linguistics.
ASJC Scopus subject areas
- Computer Science Applications
- Information Systems
- Computational Theory and Mathematics