News Propagation During the 2018 Brazilian Presidential Election

The political debate was highly polarized and led to the victory of the candidate with the most "negative" stories in the news.

Prepared by Aécio Santos, Anas Elghafari, and Sonia Castelo. December 2018.

In 2018, Brazil held its 8th direct democratic presidential election since the end of the military dictatorship in 1985. It was arguably one of the most controversial elections in the history of the young Brazilian democracy. During the pre-election period, we monitored hundreds of Brazilian news sources and compiled a database with thousands of election-related news stories that mentioned the presidential candidates. In this article, we showcase various aspects of the election through the lens of data.

The Data

Our database comprises web pages from 132 Brazilian news websites. Those include various major news websites (such as g1.globo.com, terra.com.br, and uol.com.br) as well as a sample of not-so-mainstream news sites.

132 news websites

The election had 2 rounds. The first round was on October 7th and the second round (the run-off between the top two candidates) was on October 28th. Our data cover the final 3 months of the campaign.

3 months: August, September, October

While our data collection yielded hundreds of thousands of web pages, we focused on those that had text content (which excludes video pages), covered the election and the candidates, and were part of a story. A total of 56,172 article web pages matched those criteria.

56,172 web pages

The same story, about the same event, is often covered by many websites. While an article is simply a web page from a specific site, our goal is to understand clusters of articles: collections of articles that report on the same event using similar language. We used text clustering to group similar articles from different websites. The result was 16,409 clusters, ranging in size from 29 articles in the biggest cluster to 2 in the smallest.

16,409 article clusters
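The article does not say which clustering algorithm or similarity measure was used. As a rough illustration only, the sketch below groups texts whose word sets overlap strongly, using Jaccard similarity and a union-find structure. The function names, the 0.6 threshold, and the sample articles are all assumptions made for this example, not the authors' pipeline.

```python
import re
from collections import defaultdict
from itertools import combinations

def tokens(text):
    """Lowercased set of words in a text (keeps accented letters)."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_articles(articles, threshold=0.6):
    """Group (site, text) pairs whose word sets overlap by at least `threshold`.

    Returns a sorted list of clusters, each a sorted list of article indices.
    Singletons are dropped, so the smallest possible cluster has 2 articles.
    """
    parent = list(range(len(articles)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    token_sets = [tokens(text) for _, text in articles]
    for i, j in combinations(range(len(articles)), 2):
        if jaccard(token_sets[i], token_sets[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(articles)):
        groups[find(i)].append(i)
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)

# Made-up sample data (hypothetical sites and texts), for illustration only.
articles = [
    ("site-a.example", "Bolsonaro é esfaqueado durante ato de campanha em Juiz de Fora"),
    ("site-b.example", "Bolsonaro é esfaqueado durante ato de campanha em Minas"),
    ("site-c.example", "TSE rejeita candidatura de Lula"),
]
clusters = cluster_articles(articles)
```

The pairwise comparison is quadratic in the number of articles, so a collection of tens of thousands of pages would likely need an approximate method (for example, hashing-based candidate selection) rather than this brute-force loop.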

The news-mention medalists

According to our news database, Jair Bolsonaro was by far the most popular candidate in terms of mentions in article headlines. His polarizing and controversial views have been described as far-right and conservative, drawing both praise and criticism in Brazil.

The second most popular was Lula, Brazil's 35th president (2003-2011). Shortly after announcing his candidacy, he was disqualified from running by the Superior Electoral Court (TSE) on 31 August 2018 because of a previous conviction on corruption charges in a controversial court ruling.

Fernando Haddad was the 3rd most frequently mentioned. He was named as Lula's replacement on the Workers' Party coalition ticket on September 1st, and his candidacy was officially approved on September 25th. His campaign has been criticized for its strategy of trying to inherit Lula's popularity.

Chart legend: PSL (Social Liberal Party), PT (Workers' Party), Others

Number of presidential candidate mentions in article headlines during the pre-election period of the 2018 Brazilian elections.
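A count like the one in this chart could be produced along the following lines. This is a minimal sketch, assuming a mention means that a candidate's surname appears as a whole word in the headline; the pattern table, function name, and sample headlines are invented for illustration, and the article does not describe how mentions were actually detected.

```python
import re
from collections import Counter

# Hypothetical pattern table covering only the three candidates discussed above.
CANDIDATE_PATTERNS = {
    "Bolsonaro": re.compile(r"\bbolsonaro\b", re.IGNORECASE),
    "Lula": re.compile(r"\blula\b", re.IGNORECASE),
    "Haddad": re.compile(r"\bhaddad\b", re.IGNORECASE),
}

def count_headline_mentions(headlines):
    """Count how many headlines mention each candidate at least once."""
    counts = Counter()
    for headline in headlines:
        for name, pattern in CANDIDATE_PATTERNS.items():
            if pattern.search(headline):
                counts[name] += 1
    return counts

# Made-up headlines, for illustration only.
headlines = [
    "Bolsonaro lidera pesquisa",
    "Haddad diz que é Lula",
    "TSE barra candidatura de Lula",
]
mention_counts = count_headline_mentions(headlines)
```

A headline that names two candidates counts once for each of them, which is one of several reasonable conventions.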

What was the news talking about?

To gain insight into the topics that drove the candidates' news mentions, we delved into the textual content of the stories. The following chart shows the most frequent words in the article headlines from our database. Consistent with the mention counts, Bolsonaro is the most common word.

Word cloud of most frequent words in article headlines.
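The word frequencies behind a word cloud like this one can be computed with a simple counter. The sketch below is an assumption about the general approach, not the authors' code: the stopword list is a tiny hand-picked sample of Portuguese function words, and the headlines are made up.

```python
import re
from collections import Counter

# Tiny, hand-picked sample of Portuguese stopwords (an assumption; a real list is much longer).
STOPWORDS = {"a", "o", "e", "de", "do", "da", "em", "que", "para", "com", "é"}

def headline_word_counts(headlines, stopwords=STOPWORDS):
    """Count word occurrences across headlines, ignoring stopwords and pure numbers."""
    counts = Counter()
    for headline in headlines:
        for word in re.findall(r"\w+", headline.lower()):
            if word not in stopwords and not word.isdigit():
                counts[word] += 1
    return counts

# Made-up headlines, for illustration only.
sample_headlines = [
    "Bolsonaro lidera pesquisa",
    "Bolsonaro é esfaqueado em Juiz de Fora",
    "Haddad sobe em pesquisa",
]
word_counts = headline_word_counts(sample_headlines)
top_words = word_counts.most_common(2)
```

The resulting counts can be passed to any word cloud renderer, where the font size of each word is scaled by its count.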

We also examined frequent words that are associated exclusively with each of the top 3 candidates. For instance, in Bolsonaro-related articles, his tragic stabbing on September 6, 2018 explains a large fraction of the most frequent terms (e.g., albert, einstein, adélio, cirurgia, agressor, intestino, UTI, esfaqueado, lesão, sangue, estável, and others). We can also see words such as racismo, gays, and negras, which allude to his heinous statements concerning minorities.

Word cloud of most frequent words in articles that mention Bolsonaro.
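One simple way to find terms tied exclusively to a candidate is to keep only the words that never occur in the other candidates' articles. The sketch below assumes that definition of "exclusive"; the function name and sample texts are hypothetical, and the article does not state the exact criterion that was applied.

```python
import re
from collections import Counter

def exclusive_terms(articles_by_candidate, top_n=10):
    """For each candidate, list the most frequent words found only in that candidate's articles."""
    counts = {
        candidate: Counter(
            word for text in texts for word in re.findall(r"\w+", text.lower())
        )
        for candidate, texts in articles_by_candidate.items()
    }
    result = {}
    for candidate, candidate_counts in counts.items():
        # Vocabulary of every other candidate's articles.
        other_words = set()
        for other, other_counts in counts.items():
            if other != candidate:
                other_words.update(other_counts)
        exclusive = Counter(
            {w: c for w, c in candidate_counts.items() if w not in other_words}
        )
        result[candidate] = [w for w, _ in exclusive.most_common(top_n)]
    return result

# Made-up article snippets, for illustration only.
sample = {
    "Bolsonaro": ["cirurgia no intestino", "agressor preso após cirurgia"],
    "Lula": ["habeas corpus negado", "preso em Curitiba"],
}
terms = exclusive_terms(sample)
```

In this toy input, "preso" appears for both candidates and is therefore dropped from both lists, while "cirurgia" ranks first for Bolsonaro.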

When we look at the left-leaning candidates, Lula and Haddad, we see a different picture. For Lula, most terms seem to be names of judges (e.g., Favreto, Fachin, Gebran, Lewandowski, Zanin), legal terms related to his court case (habeas, corpus, TRF, apartamento, triplex, soltura), or terms related to the rejection of his candidacy (rejeitou).

Word cloud of most frequent words in articles that mention Lula.

For Haddad, the terms seem to be related to particular cases that involved him. For example, UTC refers to an engineering company involved in a corruption proceeding in which Haddad was accused of receiving bribes while he was mayor of São Paulo. The term desinformação (disinformation) is linked to articles about an accusation that the campaign slogan "Haddad is Lula" is misleading and produces disinformation.

Word cloud of most frequent words in articles that mention Haddad.

The spread of news online

Websites do not publish original content only. To measure the extent to which articles propagate on the Web, we tracked news stories that were re-published on sites other than their original source. We then grouped these re-published articles into clusters, which are shown in the next chart. The colors represent the main candidate mentioned in the headlines of the cluster's stories. The height and size of each bubble represent the cluster size, i.e., the number of different sites that re-published the story. Cluster size can be read as a measure of a story's reach. Clearly, some stories got more traction than others and therefore became more popular.

Clusters of news articles that were re-published on websites other than their original source. Each bubble represents a cluster of similar articles. The color represents the candidate that each article cluster mentions in its headlines.
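The two quantities encoded in this chart, cluster size and main candidate, could be derived as sketched below. This is an illustrative assumption: sites are identified by URL hostname, the main candidate is the one named in the most headlines of the cluster, and the URL paths and headlines are invented (only the two domains come from the site list above).

```python
import re
from collections import Counter
from urllib.parse import urlparse

def cluster_reach(urls):
    """Number of distinct sites (URL hostnames) that published a story in the cluster."""
    return len({urlparse(url).hostname for url in urls})

def main_candidate(headlines, candidates=("Bolsonaro", "Lula", "Haddad")):
    """Candidate named in the most headlines of the cluster, or None if nobody is named."""
    counts = Counter()
    for headline in headlines:
        for name in candidates:
            if re.search(rf"\b{re.escape(name)}\b", headline, re.IGNORECASE):
                counts[name] += 1
    return counts.most_common(1)[0][0] if counts else None

# Hypothetical cluster: URL paths and headlines are made up for illustration.
cluster = [
    ("https://g1.globo.com/politica/story-1", "ONU critica Bolsonaro"),
    ("https://www.terra.com.br/noticias/story-1", "Alta comissária da ONU critica Bolsonaro"),
    ("https://g1.globo.com/politica/story-1-update", "ONU critica Bolsonaro; Haddad comenta"),
]
reach = cluster_reach(url for url, _ in cluster)
candidate = main_candidate(headline for _, headline in cluster)
```

Counting hostnames rather than pages means that two articles from the same site add only one to the reach, which matches the definition of cluster size as the number of different sites.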

News traveled fast

News spread quickly after publication. In fact, most stories were re-published on other sites within a few minutes. The next chart illustrates the speed at which news stories spread to other sites by showing how the number of publications within a cluster grows over time. Each line represents a cluster of stories, and the colors represent the main candidate mentioned in the story headline. Most clusters follow a pattern of quick growth at the beginning and slow growth toward the end.

Top-100 clusters
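Each line in a chart of this kind is a cumulative count of publications over time. A minimal sketch of that computation follows, assuming each article in a cluster has a known publication timestamp; the function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime

def growth_curve(timestamps):
    """Return (minutes since first publication, cumulative article count) pairs."""
    ordered = sorted(timestamps)
    if not ordered:
        return []
    first = ordered[0]
    return [
        ((ts - first).total_seconds() / 60.0, count)
        for count, ts in enumerate(ordered, start=1)
    ]

# Made-up publication times for one cluster, for illustration only.
published = [
    datetime(2018, 9, 6, 16, 5),
    datetime(2018, 9, 6, 16, 0),
    datetime(2018, 9, 6, 18, 0),
    datetime(2018, 9, 6, 16, 12),
]
curve = growth_curve(published)
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives one line per cluster; a steep start followed by a flat tail corresponds to the quick-then-slow pattern described above.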

News mentions and the opinion polls

We now turn to the relationship between candidate mentions in the news and public opinion polls. The next charts show how poll results and candidate headline mentions fluctuated over the course of the campaign. There appears to be some correlation between the two series: most significant increases in the poll results for Bolsonaro and Haddad are preceded by changes in their news mentions. The most prominent spikes in mentions occurred on September 6-7, around Bolsonaro's stabbing, and on October 29, the day after the run-off vote. After the first spike, there was also a slight increase in Bolsonaro's poll results.

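The claim that poll movements tend to follow changes in news mentions can be checked with a lagged correlation. The sketch below is one possible way to do that, not the analysis performed for this article: it assumes both series are sampled on the same days and uses made-up numbers in which the poll value follows the previous day's mentions exactly.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

def lagged_correlation(mentions, polls, lag):
    """Correlate mentions on day t with poll values on day t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(mentions, polls)
    return pearson(mentions[:-lag], polls[lag:])

# Made-up daily series, for illustration only: polls[t + 1] = 20 + 0.1 * mentions[t].
mentions = [10, 40, 20, 15, 30]
polls = [21.0, 21.0, 24.0, 22.0, 21.5]
r_lag1 = lagged_correlation(mentions, polls, lag=1)
```

Real polls are published irregularly rather than daily, so in practice the poll series would first need to be aligned with the mention counts, for example by carrying the latest poll forward. A high lagged correlation would still not show that coverage caused the change in the polls.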

Conclusion

We can see that Bolsonaro dominated the news coverage throughout the campaign period. What is interesting is that much of that coverage was not positive. For example, two of the biggest news stories we found were critical of Bolsonaro: one concerned a Bolsonaro misinformation campaign targeting Haddad, and the other was about the UN high commissioner criticizing Bolsonaro on the issue of human rights. This tells us that there was no shortage of information about his positions and his actions, and that the media tried to do its part in exposing fake news. Despite this, the opinion polls show that Bolsonaro never dropped from 1st place at any point during the final 3 months of the campaign. This is an interesting phenomenon that has also been seen in other countries and other elections. Perhaps we are coming closer to a world where winning elections is mainly about generating controversy and going viral. In such a world, the old saying would hold true: there is no such thing as bad publicity.