A Long-Term Analysis of Polarization on Twitter


Social media has played an important role in shaping political discourse over the last decade. It is often perceived to have increased political polarization, thanks to the scale of discussions and their public nature. Here, we try to answer whether political polarization in the US on Twitter has increased over the last eight years. We analyze a large longitudinal Twitter dataset of 679,000 users and look at signs of polarization in their (i) network - how people follow political and media accounts, (ii) tweeting behavior - whether they retweet content from both sides, and (iii) content - how partisan the hashtags they use are.


  • Our data collection starts with a set of seed users. This set consists of (i) presidential and vice-presidential candidates and their parties over the last 8 years, and (ii) popular partisan media accounts, taken from the Pew Research report on media consumption habits. Based on these seed users, we collected two datasets for the project:
  • Followers data: For each seed user, we obtained all their followers. The combined set of followers across all seed accounts gave us a total of 140M users. We estimated the time when a user followed a particular seed account using the method proposed by Meeder et al. The estimated follow times date back to 2009.
  • Retweeters data: For the set of seed politicians, we obtained all their public historical tweets. The earliest tweets in this collection date back to 2006. For each collected tweet, we used the Twitter API to collect up to 100 retweets. This gave us a set of 1.3M unique users who had retweeted a political entity since 2006. We randomly sampled 50% of these users (679,000) and, in December 2016, used the Twitter API to get 3,200 of their most recent tweets. This gave us more than 2 billion tweets, dating back to 2007.
  • The raw dataset is a few hundred gigabytes compressed. If you would like access to all or part of it, or have comments or ideas on what could be done with it, please get in touch with Kiran at kiran.garimella XX aalto.fi
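
The follow-time estimation mentioned for the followers data can be illustrated with a minimal sketch. It is not the project's actual code. It assumes our reading of the idea behind Meeder et al.: a follower list is available in the order in which users followed, and a user cannot follow anyone before creating an account, so the latest account-creation time seen so far is a lower bound on each follow time. The function name `estimate_follow_times` and the input layout are hypothetical, chosen only for this illustration.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def estimate_follow_times(
    followers_oldest_first: List[Tuple[str, datetime]],
) -> List[Tuple[str, datetime]]:
    """Return a lower-bound follow time for each follower of one seed account.

    Assumption: `followers_oldest_first` lists (user_id, account_created_at)
    pairs in the order the users followed the seed account, oldest first.
    """
    estimates: List[Tuple[str, datetime]] = []
    running_max: Optional[datetime] = None
    for user_id, created_at in followers_oldest_first:
        # A follow cannot predate the follower's own account, nor can it
        # predate any follow that happened earlier in the list.
        if running_max is None or created_at > running_max:
            running_max = created_at
        estimates.append((user_id, running_max))
    return estimates


# Toy input with made-up user ids and dates, for illustration only.
example = estimate_follow_times(
    [
        ("a", datetime(2009, 1, 10)),
        ("b", datetime(2008, 5, 1)),
        ("c", datetime(2010, 3, 15)),
    ]
)
```

In this toy input, user "b" created an account in 2008 but appears after "a" in the follow order, so the estimate for "b" is raised to the creation date of "a". The bound is tighter for accounts with many followers, where newly created accounts join the list frequently.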