Why we like Mechanical Turk for short experiments
If you haven’t yet checked out Amazon Mechanical Turk for doing research with human subjects, you should! We like Turk for running short, simple studies. For instance, if you want to find out how several hundred people would describe a single picture, or survey a big crowd to find out which name they prefer for a new product you’re releasing, and you want the data fast, Turk is a really useful tool.
When I did my thesis work in cognitive science, one question I was interested in was how people interpreted words like “tall” and “long” and “big” in different contexts. I was able to show thousands of people one drawing each and ask them to pick out the tall or long items in each picture. It was a fast and inexpensive way to get data, compared to bringing people into the laboratory.
So does it work well? Earlier this year, Language Log had a nice writeup of a linguistic sentiment experiment on Turk that they ran primarily to test Turk’s reliability for simple linguistics experiments. They found the results to be quite good. Another great analysis, courtesy of Dolores Labs, concludes that Turk is “fast, cheap, and good for machine learning data.”
Other questions that researchers have are who participates in tasks posted to Turk, and why. Panos Ipeirotis has some great data on why people participate in tasks on Turk. FloozySpeak provides more detailed information on what entices people to participate, and how much time they spend on Turk. And Ipeirotis (a great resource on Turk in general) does a nice job analyzing the demographics of Turkers.
Mechanical Turk was not designed with scientists or researchers in mind, nor was it built with human subjects’ rights in mind, so it does not watch out for participants the way a research platform would. The system can often be a bit frustrating for research, especially for longer or more complex studies. But it can be a great tool for many things!