Teaching the Computer to Read--in Stages


So many times, I've heard that (pick one) only humans can identify sentiment in text, or that the software is now very good. I don't call it a debate, because I don't see the sides talking to each other. Setting aside the question of software maturity, what is it we want the computer to do, and how far along are the tools?

When I need to explain the concept of text analytics for the first time, I usually summarize it as "teaching the computer to read," which—no surprise—isn't an original phrase. It goes back at least to the early '90s. But reading for meaning is still more of a goal than the current reality. Today, the tools are somewhere on a continuum, which I think looks something like this:

  1. Content discovery
    The challenge of social media analysis starts with the attempt to "read the Internet" (all of it). The simple approach to selecting the part we care about is a keyword match or Boolean query, but probabilistic and semantic approaches are out there.

    Success criteria: recall (completeness) and speed

  2. Filtering for relevance
    Source data is cleaned, removing spam, duplicates, and off-topic items. Company names that are also words make relevance filtering important and a point of differentiation for some vendors.

    Success criteria: precision (% relevant content)

  3. Extracting concepts
    Natural-language processing (NLP) yields a list of key words and phrases, which generates those brand-association and leading-topics reports. It's also very useful for grouping items for end users of the system. More advanced approaches group related topics and synonyms.

    Success criteria: usefulness (low noise), accuracy

  4. Extracting facts
    NLP identifies factual statements based on grammatical analysis of content. This is helpful for understanding the reason behind sentiment and potentially huge for competitive intelligence and finance applications.

    Success criteria: accuracy, useful summarization

  5. Determining opinion
    If you want to start a good argument, bring up sentiment (although I never seem to find opposing viewpoints on human vs. machine analysis in the same place). It's popular as a PR metric and useful as a filter, so it's one of the usual metrics in social media tools. Some vendors go beyond tonality (positive/negative) and provide an analysis of the emotional content of the text.

    Success criteria: accuracy, consistency, depth

  6. Reading for meaning
    What we really want: the computer reads mountains of text and, after accounting for source reliability and influence, delivers an accurate summary and metrics, cross-references sources, and synthesizes an accurate view of the situation.

    Success criteria: not holding my breath.

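To make the early stages concrete, here's a toy sketch of stages 1, 3, and 5 in Python. This is not any vendor's method, just the "simple approach" named above: a keyword match for discovery, crude frequency counting for concepts, and a tiny tonality lexicon. The query, stopwords, lexicons, and sample posts are all invented for illustration.

```python
import re
from collections import Counter

# Stage 1: content discovery -- a naive keyword/Boolean match.
QUERY = {"acme"}

def matches_query(text):
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & QUERY)

# Stage 3: concept extraction -- frequency counting after dropping
# stopwords; real NLP tools add grammar and synonym grouping.
STOPWORDS = {"the", "a", "an", "is", "are", "and", "or", "of",
             "to", "in", "i", "my", "about"}

def key_terms(texts, top_n=5):
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS and word not in QUERY:
                counts[word] += 1
    return [term for term, _ in counts.most_common(top_n)]

# Stage 5: tonality -- count hits against a tiny sentiment lexicon.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"terrible", "hate", "slow"}

def tonality(text):
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

posts = [
    "Acme support is great, I love the fast response",
    "My acme router is terrible and slow",
    "Unrelated post about gardening",
]
relevant = [p for p in posts if matches_query(p)]
print(len(relevant))          # 2 of 3 posts match the query
print(key_terms(relevant))    # e.g. ['support', 'great', 'love', 'fast', 'response']
print(tonality(relevant[0]))  # positive
print(tonality(relevant[1]))  # negative
```

Even this toy version shows why stage 2 matters: a bare keyword match would happily pull in a post about a different "acme," which is exactly the company-names-that-are-also-words problem.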
All of these—well, one through five—are in the market today. The debate, such as it is, centers on how well current tools perform these analyses, and frankly, I'm not sure anyone really knows. There's not much demand for a competitive test, and not much incentive for vendors to participate in one.

I hear claims of 90% and better accuracy on sentiment, but a test would require comparison with imperfect human coding. In any case, one text analytics provider I talked with said that specific accuracy rates are not a client concern. Their focus is on the value of the resulting analysis, and good enough is good enough.

I'm not a scientist, and somewhere out there is a computational linguist whose left eye is twitching over some mistake I've made. Comments are open—go for it.


Brilliant stuff Nathan!

On the part about interpretation and meaning (i.e., context), I've been on both sides of the machine-vs.-human analysis argument, and I know which one I would choose to back with my own reputation.

We hear about influence barometers often when people come to us after using other services, and while those services have been successful in convincing their clients that links in/out, references/blogrolls, page views, and Compete stats do the trick, I personally would only ever feel comfortable qualifying the precision of influence as a metric through human review. I think part of the issue here is that most are following the flow, partly because it's a metric that is constantly changing, subjective, unique to each client, and one of the most time-consuming aspects (aggregating and reporting). However, it is also one of the most fundamental aspects of determining whether listening needs to be taken to the next stage (i.e., engagement). If everyone else is embracing it as the norm, few will think to question it in an objective way (this leads into the whole comments, not page views, as a proxy for engagement).

On the point regarding synthesis of machine analysis, we have made some significant advancements in this area in the last few years, specifically in fine-tuning our own risk radar. And a large part of that evolution has been the incidents themselves. One example I'd like to use speaks really well to the limitations of machine analysis.

We had an online incident in which a blog author described the colour of his dog (which happened to be a black Lab) using the word "bomb". Ironically, the dog happened to share the same name as our client. While such an incident would normally have been marked with a medium level of risk by the machine portion of our analysis (to be later reviewed in queue by a human for precision), a matter of timing, and a very serious incident of activism leading to terrorism, factored into the machine's heightened risk assessment. In a nutshell, the real value of analysis becomes evident when we are able to use the machine portion for content discovery, fact extraction, and surface-level interpretation; however, IMHO the human review is the part that meets and exceeds clients' expectations.
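The flagging behaviour in that anecdote can be sketched in a few lines. This is a toy illustration, not Joseph's actual system: a watchword match earns a baseline score, and an elevated-alert period raises every hit, which is how an innocuous post about a dog can end up queued for human review. All words, weights, and the threshold are invented.

```python
# Invented watchword weights and review threshold, for illustration only.
WATCHWORDS = {"bomb": 2, "attack": 3}
REVIEW_THRESHOLD = 4

def risk_score(text, heightened_alert=False):
    """Score a post by summing weights of watchwords it contains."""
    score = sum(w for word, w in WATCHWORDS.items() if word in text.lower())
    if heightened_alert:
        score *= 2  # recent real-world incidents raise the weight of every hit
    return score

post = "The blog author's dog, a black Lab, was described as 'bomb'"
print(risk_score(post))                         # 2: medium risk, below threshold
print(risk_score(post, heightened_alert=True))  # 4: queued for human review
```

The machine can only see the watchword and the timing; only the human reviewer can see that the "bomb" in question has four legs.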

That's a good point, Joseph, and reminds me that the computer-analysis crowd usually recommends some degree of sanity checking by humans. In any case, the end result of analysis is a decision, which is still in the human realm. So the human vs. computer question happens upstream of the inevitable handoff to a person who wants the information and will act on it.

Unless you want to go in the ad targeting direction, where all kinds of automation fun happens.


About Nathan Gilliatt

  • Voracious learner and explorer. Analyst tracking technologies and markets in intelligence, analytics and social media. Advisor to buyers, sellers and investors. Writing my next book.
  • Principal, Social Target