Abstract
The extent of media bias determines the information available to the public and can affect public opinion and decision-making. Social media such as blogs, powered by the growth of the Internet and related technologies, are envisioned as a form of grassroots journalism that blurs the line between producers and consumers and changes how information and opinions are distributed. They are often seen as democratic entities that allow more voices to be heard than conventional mainstream media, and as a balancing force against the arguably slanted mainstream media.
Do social media exhibit more or less bias than mainstream media, and to what extent? A systematic comparison between social and mainstream media is critical but challenging due to the scale and dynamic nature of modern communication.
Our major contribution is a set of empirical measures that quantify the extent and dynamics of "bias" in mainstream and social media (hereafter referred to as "News" and "Blogs", respectively). Our measurements are not normative judgments; rather, they examine bias by comparing the attributes of those being mentioned against a null model of "unbiased" coverage.
We focus on the number of times a member of the 111th US Congress was "referenced", and study the distribution and dynamics of these references across a large set of media outlets. We treat "unbiased" coverage as a configurable baseline distribution and measure how the observed coverage deviates from this baseline, taking the measurement uncertainty of the observations into account. We demonstrate bias measures for slants in favor of specific political parties, popular front-runners, or certain geographical regions.
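As an illustration of measuring deviation from a configurable baseline, the following minimal sketch computes a party slant as a log-odds difference with a standard error. It is not the measure defined in this work: the function name `party_slant`, the log-odds form, the binomial standard error, and the counts in the usage note are all assumptions made for illustration.

```python
import math


def party_slant(d_refs, r_refs, baseline_d_share):
    """Return (slant, std_err) for one outlet.

    Hypothetical measure, assumed for illustration only:
    slant = log-odds of the observed Democrat share of references
            minus log-odds of the baseline ("unbiased") Democrat share.
    Positive values lean toward Democrats, negative toward Republicans.
    """
    if d_refs <= 0 or r_refs <= 0:
        raise ValueError("both reference counts must be positive")
    if not 0.0 < baseline_d_share < 1.0:
        raise ValueError("baseline_d_share must lie strictly between 0 and 1")

    observed_log_odds = math.log(d_refs / r_refs)
    baseline_log_odds = math.log(baseline_d_share / (1.0 - baseline_d_share))
    slant = observed_log_odds - baseline_log_odds

    # Delta-method standard error of a log-odds estimate from two counts;
    # this stands in for "measurement uncertainty" in the sketch.
    std_err = math.sqrt(1.0 / d_refs + 1.0 / r_refs)
    return slant, std_err
```

With made-up counts, `party_slant(80, 20, 0.5)` yields a positive slant, while an outlet whose references match the baseline share exactly yields a slant of zero. Changing `baseline_d_share` (for example, to a party's share of seats rather than an even split) changes what counts as "unbiased" without changing the observed data.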
Applying these measures to newly collected data, we observe distinct characteristics in how News and Blogs cover the US Congress. Our analysis of party and ideological bias indicates that Blogs are not significantly less slanted than News. However, their slant orientations are more sensitive to exogenous factors such as national elections. In addition, the interests of Blogs are less concentrated on particular front-runners or regions than those of News.
While our measures are independent of content, we further investigate two aspects of the content related to our measures: the hyperlinks embedded in articles and the sentiments detected in them. The hyperlink patterns suggest that outlets with a Democrat-slant (D-slant for short) are more likely to cite each other than outlets with a Republican-slant (R-slant). The sentiment analysis suggests a weak correlation between negative sentiments and our measures.
To better understand the distinct slant structures of the two media, we propose a simple "wealth allotment" model to explain how legislators gain references from different media. The results on Blogs' inclination toward a rich-get-richer mechanism indicate that they are more likely to echo what others have mentioned. This observation does not contradict our measures of bias: compared with News, Blogs are weaker adherents to particular parties, front-runners, or regions but are more susceptible to network and exogenous factors. This simple generative model helps reveal differences in the process of coverage selection between the two media.
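The rich-get-richer idea can be illustrated with a generic preferential-attachment simulation. The sketch below is not the "wealth allotment" model itself, whose details are not given here; the names `simulate_references`, `top_share`, and `echo_prob` are hypothetical, and the sketch shows only the mechanism, not any finding about News or Blogs.

```python
import random


def simulate_references(n_legislators, n_references, echo_prob, seed=0):
    """Allot references one at a time under an assumed echo mechanism.

    With probability echo_prob, a new reference copies the target of a
    uniformly chosen earlier reference (so legislators are picked in
    proportion to the references they already hold); otherwise the
    target is chosen uniformly at random among all legislators.
    """
    rng = random.Random(seed)
    counts = [0] * n_legislators
    history = []  # target legislator of every reference made so far
    for _ in range(n_references):
        if history and rng.random() < echo_prob:
            chosen = rng.choice(history)  # rich-get-richer step
        else:
            chosen = rng.randrange(n_legislators)  # independent step
        counts[chosen] += 1
        history.append(chosen)
    return counts


def top_share(counts, k=1):
    """Fraction of all references held by the k most-referenced legislators."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(sorted(counts, reverse=True)[:k]) / total
```

Comparing `top_share(simulate_references(100, 5000, 0.9))` with `top_share(simulate_references(100, 5000, 0.0))` shows how a stronger echo tendency alone reshapes the distribution of references, even though no legislator is favored in advance.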