Abstract
In contrast to many decades of research on oral code-switching, the study of written multilingual productions has only recently enjoyed a surge of interest. Many open questions remain regarding the sociolinguistic underpinnings of written code-switching, and progress has been limited by a lack of suitable resources. We introduce a novel, large, and diverse dataset of written code-switched productions, curated from topical threads of multiple bilingual communities on the Reddit discussion platform, and explore questions that were mainly addressed in the context of spoken language thus far. We investigate whether findings in oral code-switching concerning content and style, as well as speaker proficiency, are carried over into written code-switching in discussion forums. The released dataset can further facilitate a range of research and practical activities.
Original language | English |
---|---|
Title of host publication | EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference |
Publisher | Association for Computational Linguistics |
Pages | 4776-4786 |
Number of pages | 11 |
ISBN (Electronic) | 9781950737901 |
State | Published - 2019 |
Externally published | Yes |
Event | 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019 - Hong Kong, China; Duration: 3 Nov 2019 → 7 Nov 2019 |
Publication series
Name | EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference |
---|
Conference
Conference | 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019 |
---|---|
Country/Territory | China |
City | Hong Kong |
Period | 3 Nov 2019 → 7 Nov 2019 |
Bibliographical note
Publisher Copyright: © 2019 Association for Computational Linguistics
ASJC Scopus subject areas
- Computational Theory and Mathematics
- Computer Science Applications
- Information Systems