Data Science: What the Facebook Controversy is Really About


Facebook has always 'manipulated' the results shown in its users' News Feeds by filtering and personalizing for relevance. But this weekend, the social giant seemed to cross a line, when it announced that it engineered emotional responses two years ago in an 'emotional contagion' experiment, published in the Proceedings of the National Academy of Sciences (PNAS).


Since then, critics have examined many facets of the experiment, including its research design, methodology, approval process, and ethics. Each of these tacks tacitly accepts something important, though: the validity of Facebook's science and scholarship. There is a more fundamental question in all this: What does it mean when we call proprietary data research data science?


As a society, we haven't fully established how we ought to think about data science in practice. It's time to start hashing that out.


Data by definition is something that is taken as 'given,' but somehow we've taken for granted the terms under which we came to agree on that fact. Once, the professional practice of 'data science' was called business analytics. The field has now rebranded as a science in the context of buzzwordy 'Big Data,' but unlike other scientific disciplines, most data scientists don't work in academia. Instead, they're employed in commercial or governmental settings.


The Facebook Data Science team is a prototypical data science operation. In the company's own words, it collects, manages, and analyzes data to 'drive informed decisions in areas critical to the success of the company, and conduct social science research of both internal and external interest.' Last year, for example, it studied self-censorship, in which users input but do not post status updates. Facebook's involvement with data research goes beyond its in-house team. The company is actively recruiting social scientists with the promise of conducting research on 'recording social interaction in real time as it occurs completely naturally.' So what does it mean for Facebook to have a Core Data Science Team, describing their work (on their own product) as data science?


Contention about just what constitutes science has been around since the start of scientific practice. By claiming that what it does is data science, Facebook benefits from the imprimatur of an established body of knowledge. It looks objective, authoritative, and legitimate, built on the backs of the scientific method and peer review. By publishing in a prestigious journal, Facebook legitimizes its data collection and analysis activities by demonstrating their contribution to scientific discourse, as if to say, 'this is for the good of society.'


'A data scientist is a statistician who lives in San Fransisco' #monkigras http://ift.tt/1cze3UE


- Jeremy Jarvis (@jeremyjarvis) January 30, 2014

So it may be true that Facebook offers one of the largest samples of social and behavioral data ever compiled, but all of its studies, including this one on social contagion, only describe things that happen on Facebook. The data is structured by Facebook, entered in a status update field created by Facebook, produced by users of Facebook, analyzed by Facebook researchers, with outputs that will affect Facebook's future News Feed filters, all to build the business of Facebook. As research, it is an over-determined and completely constructed object of study, and its outputs are not generalizable.


Ultimately, Facebook has only learned something about Facebook.


Means and Ends

For-profit companies have long conducted applied science research. But the reaction to this study seems to suggest there is something materially different in the way we perceive commercial data science research's impacts. Why is that?


At GE or Boeing, two long-time applied science leaders, the incentives for research scientists are the same as they are for those at Facebook. Employee-scientists at all three companies hope to produce research that directly informs product development and leads to revenue. However, the outcomes of their research are different. When Boeing does research, it contributes to humanity's ability to fly. When Facebook does research, it serves its own ideological agenda and perpetuates Facebooky-ness.


But data scientists don't just produce knowledge about observable, naturally occurring phenomena; they shape outcomes. A/B testing and routinized experimentation in real time are done on just about every major website in order to optimize for certain desired behaviors and interactions. Google designers infamously tested up to 40 shades of blue. Facebook has already experimented with the effects of social pressure in getting out the vote, raising concerns about selective digital gerrymandering. What might Facebook do with its version of this research? Perhaps it could design the News Feed to show us positive posts from our friends in order to make us happier and encourage us to spend more time on the site? Or might Facebook show us more sad posts, encouraging us to spend more time on the site because we have more to complain about?


Should we think of commercial data science as science? When we conflate the two, we assume companies are accountable for producing generalizable knowledge and we risk according their findings undue weight and authority. Yet when we don't, we risk absolving practitioners from the rigor and ethical review that grants authority and power to scientific knowledge.


We need to be more critical of the production of data science, especially in commercial settings. The firms that use our data have asymmetric power over us. We do them a favor when we unquestioningly accept their claims to the prestige, expertise, and authority of science as well.


Ultimately, society's greatest concerns with science and technology are ethical: Do we accept or reject the means by which knowledge is produced and the ends to which it is applied? It's a question we ask of nuclear physics and genetic modification, and one we should ask of data science.

