Automating a framework to extract and analyse transport related social media content: The potential and the challenges

Tsvi Kuflik, Einat Minkov, Silvio Nocera, Susan Grant-Muller, Ayelet Gal-Tzur, Itay Shoor

Research output: Contribution to journal › Article › peer-review


Harnessing the potential of new generation transport data and increasing public participation are high on the agenda for transport stakeholders and the broader community. The initial phase in the program of research reported here proposed a framework for mining transport-related information from social media, and demonstrated and evaluated it using transport-related tweets associated with three football matches as case studies. The goal of this paper is to extend and complement the previously published studies. It reports an extended analysis of the research results, highlighting and elaborating the challenges that need to be addressed before a large-scale application of the framework can take place. The focus is specifically on the automatic harvesting of relevant, valuable information from Twitter. The results from automatically mining transport-related messages are presented for two scenarios: one with a small-scale labelled dataset and one with a large-scale dataset of 3.7 million tweets. Tweets authored by individuals that mention a need for transport, express an opinion about transport services or report an event, with respect to different transport modes, were mined. The challenges faced in automatically analysing Twitter messages, written in Twitter's specific language, are illustrated. The results presented show a strong degree of success in the identification of transport-related tweets, with similar success in identifying tweets that expressed an opinion about transport services. The identification of tweets that expressed a need for transport services or reported an event was more challenging, a finding mirrored during the human-based message annotation process. Overall, the results demonstrate the potential of automatic extraction of valuable information from tweets while pointing to areas where challenges were encountered and additional research is needed.
The impact of a successful solution to these challenges (thereby creating efficient harvesting systems) would be to enable travellers to participate more effectively in the improvement of transport services.

Original language: English
Pages (from-to): 275-291
Number of pages: 17
Journal: Transportation Research Part C: Emerging Technologies
State: Published - 1 Apr 2017

Bibliographical note

Funding Information:
The research was partially funded through the Worldwide Universities Network (WUN) scheme and by the University of Haifa.

Publisher Copyright:
© 2017 Elsevier Ltd


Keywords

  • Mining Twitter for transport information
  • Opinion mining
  • Social media
  • Text mining
  • Twitter

ASJC Scopus subject areas

  • Transportation
  • Automotive Engineering
  • Civil and Structural Engineering
  • Management Science and Operations Research

