Saturday, January 5, 2013

Detecting #controversy through a ratings histogram

   While looking at book reviews, I realized that the five-star rating histogram reveals whether the product being reviewed is a controversial one.

   The 5-point scale can be interpreted in four ways. Quite obviously, if the product is generally well-received, then the histogram is heaviest at the top, on the five-star end. If the bars reveal a downward-swelling trend, with most votes at the one-star end, then the product is generally disliked. An approximately normal distribution across the five bars might suggest that the product is so-so.

   But the really interesting interpretation is this: when the two outermost bars on the histogram (one star and five stars) have the highest values, it implies that we're dealing with a truly controversial product.
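
   To make the four readings concrete, here is a rough sketch in Python. The function name classify_histogram, the labels, and the exact thresholds (for example, treating a peak at four or five stars as "well-received") are my own assumptions, not anything a review site actually implements. The input is a list of five vote counts, from one star up to five stars.

```python
def classify_histogram(counts):
    """Label a five-bar rating histogram with one of four rough shapes.

    counts: vote counts for 1, 2, 3, 4 and 5 stars, in that order.
    The rules below are illustrative assumptions, not an established method.
    """
    if len(counts) != 5:
        raise ValueError("expected exactly five bars (1 to 5 stars)")
    if sum(counts) == 0:
        raise ValueError("no ratings to classify")

    one, two, three, four, five = counts
    tallest_inner = max(two, three, four)

    # Controversial: both outer bars are taller than every inner bar.
    if one > tallest_inner and five > tallest_inner:
        return "controversial"

    # Otherwise, look at where the tallest bar sits (first one wins on ties).
    peak_star = counts.index(max(counts)) + 1
    if peak_star >= 4:
        return "well-received"
    if peak_star <= 2:
        return "disliked"
    return "so-so"
```

   With made-up counts, classify_histogram([150, 20, 15, 25, 180]) lands in the "controversial" bucket, because both the one-star and the five-star bars are taller than anything in between.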

This seems to hold for any book that's even slightly controversial, like Hitchens's The Missionary Position, E. L. James' Fifty Shades of Grey or Naomi Klein's The Shock Doctrine.

   I'm sure this observation can be applied to algorithmically fish out books that polarize the audience. Controversy attracts attention, so presumably sorting books by controversy in descending order could be a useful feature for literary critics who seek to review controversial books. Just a thought.
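
   As a minimal sketch of such a sort, assuming each book comes with its five vote counts: the score below is simply one formula I picked for illustration (twice the smaller of the two outer bars, divided by the total number of votes), so it is 1.0 when the votes are split evenly between one star and five stars and 0.0 when either extreme is empty. The names controversy_score and rank_by_controversy, and the placeholder titles with their counts, are all made up.

```python
def controversy_score(counts):
    """Return a value in [0, 1]; higher means more polarized.

    counts: vote counts for 1, 2, 3, 4 and 5 stars, in that order.
    Assumed formula: 2 * min(one-star, five-star) / total votes.
    """
    total = sum(counts)
    if total == 0:
        return 0.0  # no votes, nothing to be polarized about
    return 2 * min(counts[0], counts[4]) / total


def rank_by_controversy(histograms):
    """Sort a {title: counts} mapping by score, most controversial first."""
    scored = [(title, controversy_score(c)) for title, c in histograms.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder titles and invented counts, purely for illustration.
books = {
    "Book A": [150, 20, 15, 25, 180],  # two tall outer bars
    "Book B": [5, 10, 30, 80, 200],    # generally well-received
    "Book C": [10, 40, 100, 40, 10],   # so-so, bell-shaped
}

for title, score in rank_by_controversy(books):
    print(f"{title}: {score:.2f}")
```

   A real ranking would probably also want a minimum number of votes per book, since a title with one five-star and one one-star review would otherwise score a perfect 1.0.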

