Do not have enough data? Deep learning to the rescue!

Ateret Anaby Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Based on recent advances in natural language modeling and in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning. We mainly focus on cases with scarce labeled data. Our method, referred to as language-model-based data augmentation (LAMBADA), involves fine-tuning a state-of-the-art language generator to a specific task through an initial training phase on the existing (usually small) labeled data. Using the fine-tuned model and given a class label, new sentences for the class are generated. Our process then filters these new sentences by using a classifier trained on the original data. In a series of experiments, we show that LAMBADA improves classifiers' performance on a variety of datasets. Moreover, LAMBADA significantly improves upon the state-of-the-art techniques for data augmentation, specifically those applicable to text classification tasks with little data.
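The filtering step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the language generator is not modeled at all (the `generated` list is hand-written to stand in for its output), the classifier is a tiny word-count Naive Bayes chosen only so the example runs without external libraries, and the names `train_bow_classifier`, `filter_generated`, and `per_class` are hypothetical. The selection rule shown (keep candidates whose predicted class matches the requested class, ranked by confidence) is one plausible reading of "filters these new sentences by using a classifier trained on the original data".

```python
import math
from collections import Counter, defaultdict


def train_bow_classifier(examples):
    """Train a toy multinomial Naive Bayes on (text, label) pairs.

    Hypothetical stand-in for the classifier trained on the original data.
    Returns a function mapping a text to {label: probability}.
    """
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    total = sum(label_counts.values())

    def predict_proba(text):
        tokens = text.lower().split()
        log_scores = {}
        for label in label_counts:
            # Laplace smoothing so unseen words do not zero out a class.
            denom = sum(word_counts[label].values()) + len(vocab)
            score = math.log(label_counts[label] / total)
            for tok in tokens:
                score += math.log((word_counts[label][tok] + 1) / denom)
            log_scores[label] = score
        # Normalize log scores into probabilities.
        top = max(log_scores.values())
        exp = {lab: math.exp(s - top) for lab, s in log_scores.items()}
        z = sum(exp.values())
        return {lab: v / z for lab, v in exp.items()}

    return predict_proba


def filter_generated(candidates, predict_proba, per_class):
    """Keep up to per_class candidates per label that the classifier
    assigns to the same label they were generated for, most confident first."""
    by_label = defaultdict(list)
    for text, label in candidates:
        probs = predict_proba(text)
        predicted = max(probs, key=probs.get)
        if predicted == label:
            by_label[label].append((probs[label], text))
    kept = []
    for label, scored in by_label.items():
        scored.sort(reverse=True)
        kept.extend((text, label) for _, text in scored[:per_class])
    return kept


# Small labeled set standing in for the scarce original data.
original = [
    ("book a flight to paris", "travel"),
    ("reserve a hotel room", "travel"),
    ("what is my account balance", "banking"),
    ("transfer money to savings", "banking"),
]
# Hand-written stand-ins for sentences a fine-tuned generator might emit.
generated = [
    ("book a hotel in paris", "travel"),
    ("transfer my balance", "travel"),  # generated for the wrong class
    ("check my savings account", "banking"),
]

clf = train_bow_classifier(original)
kept = filter_generated(generated, clf, per_class=1)
augmented = original + kept
```

In this toy run the sentence generated under the wrong label is discarded because the classifier assigns it to a different class, and the remaining candidates are appended to the original data to form the augmented training set.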

Original language: English
Title of host publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Pages: 7383-7390
Number of pages: 8
ISBN (Electronic): 9781577358350
State: Published - 2020
Event: 34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
Duration: 7 Feb 2020 to 12 Feb 2020

Publication series

Name: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence

Conference

Conference: 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/Territory: United States
City: New York
Period: 7/02/20 to 12/02/20

Bibliographical note

Publisher Copyright:
© 2020 The Twenty-Fifth AAAI/SIGAI Doctoral Consortium (AAAI-20). All Rights Reserved.

ASJC Scopus subject areas

  • Artificial Intelligence
