Random samples, convenience samples, and moral hazards

Does MTurk provide ‘crappy sampling’? This post by Andrew Gelman notes that some people use the ‘moral hazard’ argument to claim that adjusting opt-in samples is sloppy, or that accepting such adjustment will discourage people from even trying to get representative samples.

(The moral hazard argument) goes like this. If it becomes widely accepted that properly adjusted opt-in samples can give reasonable results, then there’s a motivation for survey organizations to not even try to get representative samples, to simply go with the sloppiest, easiest, most convenient thing out there. Just put up a website and have people click. Or use Mechanical Turk. Or send a couple of interviewers with clipboards out to the nearest mall to interview passersby. Whatever. Once word gets out that it’s OK to adjust, there goes all restraint.

It’s the same reason why we shouldn’t put air bags and bumpers on cars—it just encourages people to drive recklessly.

I don’t find the moral hazard argument particularly convincing—for one thing, I worry about the framing in terms of bad opt-in samples and good probability samples, as I think it encourages a dangerous complacency with probability samples.

And, for that matter, I’m not a fan of crappy sampling: the worse the sampling, the more of a burden it puts on the adjustment. That’s why I think we should be emphasizing sampling design, practical sampling, careful measurement, and comprehensive adjustment as complementary tools in surveys. You want to do all four of these as best you can.

And this post by Thomas Lumley suggests that we need ways to ‘signal’ to people that our samples are not bogus:

Today, response rates are much lower, cell-phones are common, links between phone number and geographic location are weaker, and the correspondence between random selection of phones and random selection of potential respondents is more complicated. Random-digit dialling, while still helpful, is much less important to survey accuracy than it used to be. It still has a lot of value as a signalling mechanism, distinguishing Gallup and Pew Research from Honest Joe’s Sample Emporium and website clicky polls.

Signalling is valuable to the signaller and to consumer, but it’s harmful to people trying to innovate. If you’re involved with a serious endeavour in public opinion research that recruits a qualitatively representative panel and then spends its money on modelling rather than on sampling, you’re going to be upset with the spreading of fear, uncertainty, and doubt about opt-in sampling.

If you’re a panel-based survey organisation, the challenge isn’t to maintain your principles and avoid doing bogus polling, it’s to find some new way for consumers to distinguish your serious estimates from other people’s bogus ones. They’re not going to do it by evaluating the quality of your statistical modelling.

It seems like the sheer number of studies using MTurk should legitimize the sampling technique, yet we probably need a few more comparison studies to support its validity.
