Introducing Amazon’s mighty Mechanical Turk

September 20th, 2011 by Yali

Many companies, both inside and outside the tech industry, invest significant resources in using technology to automate business processes. Some processes, however, are much better done by people than by machines, which makes them difficult to automate and to scale. For some of these processes, Amazon’s Mechanical Turk provides a way to effectively “automate” the manual step, making it an incredibly powerful tool for building scalable systems and processes that rely on human input.

Amazon’s Mechanical Turk has been around for some time (it was first launched in 2005). However, we’re surprised by the number of businesses we encounter that are still not aware of Mechanical Turk and of how using it could save them time and money, whilst opening up new product development opportunities. In this blog post series, we look at what Mechanical Turk is, how it works, where it works best, how to start using it, and how to build scalable systems around it that yield accurate results.

So what is Mechanical Turk?

Mechanical Turk is two things:

  1. A huge market of remote workers eager to do well-defined tasks
  2. A set of tools for engaging with that worker community, making it easy to (i) set them tasks, (ii) collect the results, (iii) assess the accuracy of those results, and (iv) pay them for their time

Mechanical Turk is perfect for situations where a company has a big job that needs doing and:

  1. The big job can be broken down into a large number of independent steps
  2. Those independent steps are better performed by a human than by a computer. Typically this means applying a value-based judgement, for instance: “is this suitable for work?” (content moderation), “is this in Italian?” (language detection), “is this current affairs?” (categorisation)
  3. Those steps are all of a particular type (or a set number of types). In other words, a single set of instructions is suitable for a large volume of tasks
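
To make the three conditions concrete, here is a minimal sketch in Python of how one big job might be broken into independent tasks that all share a single set of instructions. It does not call Mechanical Turk at all; the `Task` class, the `split_into_tasks` function, the instruction wording and the sample content are all hypothetical names and data chosen purely for illustration.

```python
from dataclasses import dataclass

# One shared set of instructions for every task (hypothetical wording).
INSTRUCTIONS = "Is the text below written in Italian? Answer yes or no."


@dataclass(frozen=True)
class Task:
    """A single, independent unit of work for one human worker."""
    task_id: int
    instructions: str
    content: str


def split_into_tasks(items, instructions=INSTRUCTIONS):
    """Turn one big job (a list of content items) into independent tasks."""
    return [Task(i, instructions, item) for i, item in enumerate(items)]


# Illustrative sample data, not taken from a real data set.
items = ["Buongiorno a tutti", "Good morning everyone", "Bonjour à tous"]
tasks = split_into_tasks(items)

for task in tasks:
    print(task.task_id, task.instructions, "|", task.content)
```

Because each task carries everything a worker needs (the instructions and one content item), the tasks can be handed out in any order and to any number of workers, which is the property that makes this kind of job a good fit.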

Amazon gives several examples of potential tasks, including categorising items and moderating content. Categorising items is famously difficult for machines (it often requires sophisticated semantic analysis), but it is something that people do very naturally. (Wittgenstein noted as much when considering how we work out what does and does not constitute, for example, a “game”.)

At Keplar, we have been using Mechanical Turk to check the language of specific items of content in a large data set, which itself is being used to train a semantic engine. Whilst it is possible to use computer programs (Google Translate, for example) to ascertain the language of each content item, these systems are not accurate enough for our purposes. Only humans can provide the accuracy we require, and Mechanical Turk is one of the few places where we can easily find enough people with the requisite language skills to check, quickly and efficiently, the large number of data points needed to build a clean data set.

In the remaining three blog posts in this series, we will explore:

  1. How to get started with Mechanical Turk
  2. Approaches to ensuring accuracy
  3. Best practices, tools and technologies to build scalable systems that leverage Mechanical Turk
