Data Mining and Social Sciences in Spain

Conclusions on the state-of-the-art methods for political forecasting with Twitter

Author: Xavier Arque

Director TFM: Dr. Josep Cobarsí Morales

Quality socioeconomic information is crucial for any social scientist, and because the content of Twitter messages (or those of any SNS) plausibly reflects the offline political landscape, it is important for social scientists to be able to tap this information. But in a digital world, where data is generated massively and in unstructured formats, the classical tools for capturing information for socio-political analysis have become partly outdated. In this work I aim to introduce Social Science researchers to the opportunities and challenges of using Twitter data to analyze political trends. I review the literature and the researchers’ profiles, discuss the use of microblogging message content as a valid indicator of Spanish political sentiment, and assess the relationship between Social Sciences and Computer Science.

The main contributions of this paper are twofold:

First, traditional polls are still more accurate. Although the gap is narrowing in some areas, such as the detection of major trends, there is a need for better tools and better theoretical frameworks.

Second, the results display a research landscape occupied by Computer Science researchers and few Social Science researchers, mostly working in the USA. This is a problem because it atomizes the research. It appears that most Social Scientists, the ones who have to create the theoretical framework, do not have the tools or the knowledge to work with Big Data.

Submitted in partial fulfillment of the requirements for the degree of MASTER GEICO
[Gestió Estratégica de la Informació i el Coneixement] University – UOC


Twitter, politics, Social Sciences, forecast, meta-review

This was intended to be a three-article paper, but I couldn’t finish it, and some parts are little more than drafts.

If I get some spare time, I will try to finish the three articles properly.

Download Final Paper

All questions and answers

Summary of all results

Results in CSV format