Tuesday, May 26, 2015

Kind of a Big Fake

Some cool looking data from LaCour and Green's study
By Finbarr Curtis 

In a scene from Anchorman: The Legend of Ron Burgundy, the journalist Brian Fantana anoints himself with a special cologne made with "bits of real panther." The cologne's pungent gasoline aroma does not shake Fantana's confidence in its seductive powers.  As he explains, "They've done studies, you know. 60% of the time, it works every time."  Fantana's data make no sense, of course, but this is beside the point.  What matters is that "they" have done "studies."

The seductive magic of studies hit the interwebs this past week when it was revealed that a graduate student named Michael LaCour faked the data in an article entitled "When Contact Changes Minds: An Experiment on Transmission of Support for Gay Equality."  The study showed that canvassers working on behalf of marriage equality could change people's minds after relatively short conversations.  The essay also compared the persuasive power of straight and gay activists, suggesting that contact with gay canvassers produced longer and more sustainable changes in political attitudes.

LaCour co-authored the article with a professor of political science named Donald Green.  While Green helped to write the study, LaCour gathered all of the data and snookered his co-author into thinking it was real.  Green was not the only one fooled.  The findings made their way to Ira Glass's This American Life, which discussed the article in a story entitled "The Incredible Rarity of Changing Your Mind." The study was appealing because it confirmed liberal ideas about the sources of social conflict: that social divisions are caused by personal prejudices that could be dispelled if only people got to know each other.  In addition, LaCour's data assured us that people are persuadable.  The takeaway from the study is that voters might be a lot nicer and more reasonable than we thought.

None of this necessarily means that the findings have been proven wrong.  Ironically, activists who worked to pass a recent referendum for marriage equality in Ireland used the LaCour and Green study as a template for their own political strategy.  If LaCour had not been a quantitative social scientist, he could have simply written the study without the data.  If he were delivering a TED talk or writing an op-ed column, he could have said the same thing and possibly received critical acclaim and invitations to lucrative speaking engagements.

But LaCour inhabits an academic universe in which faking data is a cardinal sin.  Some have concluded that the current scandal proves that the system worked and confirms the importance of reliable data gathering.  As David Brookman, one of two UC Berkeley graduate students who discovered the fake data when they tried to craft a similar study, explains:
The nature of the work that we do as quantitative researchers is that you allow the data to tell you what you think the truth should be. You don’t take your views and then apply those to the data; you let the data inform your views.
Brookman's faith in data is itself an interesting datum.  The LaCour affair seems to show that data themselves aren't what persuade people.  LaCour recognized that he just needed to have some data; if he could produce sophisticated charts, graphs, and numbers, it was unlikely that anyone would check.

For his part, Green affirms Brookman's commitment to the purity of data: 
I don’t really care how the study comes out, I just want to know how the experiment comes out. It comes out the way it comes out — I just want it to come out the same way twice, however it comes out, so that other people will find the same thing I’m finding. Then they can do replications and extensions and new directions.  I guess there was this view that maybe you had to make the findings especially spicy for people to sit up and take notice, but I don’t think I — I hope I never conveyed that view to him. That’s one of the things that’s a real head-scratcher now for me.
While this is an honest description of the professional ethos of a social scientist, Green's statement makes perfectly clear that he knows why his co-author did what he did.  Like any talented confidence man, LaCour astutely exploited the gap between what people say they want and what they really want.  Political scientists say they just want data.  What they really want is a compelling story, especially with findings that have practical political consequences.  A rising academic star like LaCour understood that few co-authors sign on to projects that have accurate numbers but little significance.  This is not to single out Green.  LaCour hoodwinked far more than a single political science professor.  His ability to "make the findings especially spicy for people to sit up and take notice" nearly got him a PhD and landed him a tenure-track job at Princeton University within six years of graduating from college.

LaCour understood that the data themselves were less important than who authorized the data.  This is why Green's name was invaluable.  Invoking the legacy of Ron Burgundy, Ira Glass reports that Green is "kind of a big deal":
LaCour is a grad student but Donald Green is kind of a big deal. Columbia Professor. Meticulous and respected. One professor told us “I trust anything Don Green publishes.”
In fairness, it is impossible not to depend upon this sort of trust to some extent or another.  When I accept the scientific consensus about global climate change or the efficacy of vaccines, for example, I trust the authority of scientists who are gathering and analyzing the data.  There is good reason to do this because the data would be entirely unintelligible to me.  I also marvel at the amazing things people can do with numbers. Anyone who remembers the predictions of 538 in the last couple of national elections can attest to the accuracy of statistical analysis of polling data.

But as someone who dabbles in the study of American politics, I can also think of counter-anecdotes to the LaCour and Green study.  One could point out that their findings conflict with much of what we know about American history, which is full of examples where racial and sexual discrimination persisted among populations where people interacted closely with those they feared or exploited.  That is not to say this is always the case.  If I were to draw on my own powers of anecdotal reasoning, I would say that some people might change their minds after a short conversation while others might dig in their heels.  I would also hazard a guess that some canvassers are more persuasive than others for a variety of personal reasons.  If you were to ask me what I thought of LaCour and Green's findings, then, I would probably say something wishy-washy like: "It would depend on the context."  I would need more information about particular people in particular situations talking about particular issues.

But a quantitative study like LaCour and Green's isn't interested in distinguishing among contexts.  They seek to control for contexts through numbers.  Before you can let the data tell you what to think, you need to decide what sorts of personal data you are going to ignore.  To this end, the faked study's abstract describes the methodological work of counting gay or straight messengers as nothing other than gay or straight messengers:
A randomized placebo-controlled trial assessed whether gay (n = 22) or straight (n = 19) messengers were effective at encouraging voters (n = 972) to support same-sex marriage and whether attitude change persisted and spread to others in voters’ social networks. 
For people to become statistical averages, they need to be randomized and anonymized.  The ability to reduce messengers to gay or straight identities depends upon eliminating lots of relevant data so that you have something consistent you can measure.  This invents data like "gay messenger" or "straight messenger" in ways that conceal the anecdotal nature of these categories.  Examinations of how short conversations change attitudes do not assess the cultural histories of systems of classification that produce singular data like "straight" or "gay" or "conservative" or "liberal" or "conversation."

To critical theorists who analyze these sorts of categories, LaCour and Green's data seem bizarrely naive, untheorized, and lacking in rigor.  I'm not saying that these categories are naive, untheorized, or unrigorous; I'm saying they will look that way to people whose disciplinary orientation leads them to seek out complexity, nuance, subtlety, and difference.  For scholars who seek to understand the cultural logic of systems of classification, measuring already existing classifications misses the most interesting part of the analysis.

For the kind of scholarship that I do as a critical theorist and occasional historian, a collection of anecdotes is more useful than a data set because anecdotes contain a lot more information.  I have the luxury of working with sources that are not randomized or anonymized.  Because this approach offends the axiom that "the plural of anecdote is not data," social scientists would conclude that people like me have no data at all.  They are largely correct; I have no measurable data I could separate from my own analysis.  To quantitative social scientists, therefore, critical theorists just string together a bunch of words signifying nothing; we are all LaCour.

I would accept the critique that I lack stable systems of measurement.  But this is not because my work is any less rigorous or precise than quantitative social science.  There are different measures for academic rigor because learning an academic discipline means deciding what sorts of things you are going to be imprecise about.  This is why I am not so sure about Brookman's assertion that "you let the data inform your views."  Professions of faith that the data tell their own story ignore the culturally specific choices that inform what counts as a datum in the first place.  Part of why LaCour was successful was because he was able to take advantage of uncritical beliefs that ignored how disciplinary knowledge is produced and authorized. But then again, I'm just making this up.


  1. Quite apart from the bogus data, Curtis gives us a really nice scientific critique of the methods of the study – namely he raises the methodological question of “validity” (Engler/Stausberg, 8): are the categories “gay” and “straight” empirically valid? What version of reality do they capture something about? Do the categories capture something about reality that the researchers intend, or do they have a naïve view of the nature of language and identity? These are scientific questions as much as they are questions implied by a humanist historian.

    In that sense, I disagree with what I take to be one of the conclusions Curtis draws. Both humanists (or post-humanists) and scientists think data is important. Virtuous methodologies from both sides of the academic spectrum will recognize that all data is constructed and that some constructions are better than others. Both should also recognize that constructions are real, or at least that some are more real than others. Of course, data informs the view of humanists. Humanists tend to shy away from quantitative data. I am not sure if this is a happy or unhappy accident or something more essential about the humanities. So humanists use different types of data to “inform their views.” The problem is when either “side” does not recognize the constructedness of the data. Both “sides” have standards of truth-telling. The disagreement is sometimes about whose standards are better, and this is perhaps where Curtis comes in.

    (more later, I hope)

    1. I'm not sure if Gabe is saying that I'm saying that data are important or unimportant. In any case, I think data are important. What I'm interested in is the investment that Brookman, Green, and others have in the idea that quantitative data themselves (as opposed to who authorizes the data) are what convinces them to accept an argument. So they're the ones saying that data are real because they're not constructed. I think constructions are real. This is why I spend so much time analyzing textual evidence. But where I differ from the quantitative political scientists is that I think anecdotes are real too. Anecdotes are interesting to me because they contain a lot more empirical information about the real world than do the kinds of data sets that LaCour and Green produce. After all, people who tell anecdotes are often making super-empirical claims (i.e., when someone says "I know this because I saw it with my own eyes"). What interests me is how much empirical information has to be eliminated in order for something to be quantified. It is in the elimination of data, and not the gathering of data, that the datum takes shape.