Human vs. machine analysis


How do you like your social media analysis? Do you want the speed and scalability of an automated process, or do you prefer the subtlety and insight of a human analyst? Companies offering these services disagree on whether software or people are better at the task, and they're taking different approaches to answer the question.

First, let's define analysis. For this discussion, I'm talking about the process of rating individual items—posts, comments, messages, articles—on things like topic, sentiment and influence. Summarizing the data from many items and creating charts and reports come later.

The extremes
One end of the spectrum is fully automated analysis. Some companies have invested significant time and capital in systems that automate text analysis. They use terms like patent-pending and natural-language processing to describe software that "reads" and scores social media. Automated processes are usually—but not always—behind client dashboards.

The other end of the spectrum is human analysis, for those not convinced that computers can accurately rate written materials. These companies talk about human insight, subtlety and the ability to identify sarcasm. Some make a big deal of the quality of their employees, which makes sense, since their services are the product of their analysts' thought processes.

That's a fairly clear contrast, but it's not that simple. Human analysts benefit from the speed of computers, and automated processes benefit from occasional oversight. Which brings us to two hybrid forms that a number of companies have adopted.

Software-assisted human analysis
The essence of human analysis is the decision making. It's not necessary to make the analyst do all the work when insight is the critical component. So some companies use software that organizes items and provides a user interface for the analyst. The system may even suggest preliminary scores for the analyst to confirm. Software-assisted human analysis uses the computer's speed to increase the efficiency of the human analyst.
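
As a rough illustration of that workflow, here is a minimal Python sketch. Everything in it is an assumption made for the example: the word lists, the `Item` record and the `suggest_sentiment` and `review` functions are hypothetical and do not describe any vendor's product. The software proposes a preliminary sentiment score, and the analyst either accepts it or overrides it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical word lists, for illustration only.
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"hate", "broken", "terrible"}


@dataclass
class Item:
    text: str
    suggested: str = "neutral"   # preliminary score proposed by the software
    final: Optional[str] = None  # score after the analyst's decision


def suggest_sentiment(text: str) -> str:
    """Naive keyword count; a stand-in for whatever scoring a real system uses."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def review(item: Item, analyst_decision: Optional[str] = None) -> Item:
    """The software suggests; the analyst confirms (None) or overrides."""
    item.suggested = suggest_sentiment(item.text)
    item.final = analyst_decision or item.suggested
    return item


# The analyst accepts the suggestion for a straightforward post...
easy = review(Item("I love this phone!"))
# ...and overrides it for a sarcastic one that the keyword count misreads.
sarcastic = review(Item("Oh great, it's broken again."), analyst_decision="negative")
```

The decision stays with the analyst. The software only saves the time spent on the easy cases.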

Human-assisted software analysis
Software analysis is about speed, scale and predictability. The question is whether the resulting analysis is accurate enough to be useful. So some companies have human analysts audit the results. The process provides confirmation and feeds into machine-learning processes. Human-assisted software analysis uses human insight to check and improve the accuracy of the software.
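
A minimal Python sketch of such an audit might look like the following. It assumes the software's output is a list of (text, label) pairs and that the analyst can be modeled as a function returning a label. The `audit` function and its parameters are hypothetical names chosen for this example, not anyone's actual process.

```python
import random


def audit(scored_items, analyst_label, sample_size, seed=0):
    """Spot-check a random sample of machine-scored items.

    scored_items:  list of (text, machine_label) pairs produced by the software
    analyst_label: function mapping text to the label a human analyst assigns
    Returns the agreement rate on the sample and the list of corrections,
    which could later be used as additional training examples.
    """
    rng = random.Random(seed)
    sample = rng.sample(scored_items, min(sample_size, len(scored_items)))
    corrections = []
    agreed = 0
    for text, machine_label in sample:
        human_label = analyst_label(text)
        if human_label == machine_label:
            agreed += 1
        else:
            corrections.append((text, human_label))
    agreement = agreed / len(sample) if sample else 0.0
    return agreement, corrections


# Toy data: the software mislabeled post "c".
machine_scores = [("a", "positive"), ("b", "negative"),
                  ("c", "positive"), ("d", "neutral")]
analyst_view = {"a": "positive", "b": "negative",
                "c": "negative", "d": "neutral"}

agreement, corrections = audit(machine_scores, analyst_view.get, sample_size=4)
```

The agreement rate speaks to whether the automated results are accurate enough to be useful, and the corrections are the raw material for the machine-learning step.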

In practice, most companies hedge their bets. Those with major software investments sell the benefit of automated analysis in their dashboards while offering human analysis and interpretation as separate services. In essence, they offer computer speed and scale and human subtlety and insight in separate packages. The human-analysis companies tend toward the software-assisted model for its efficiency benefits. Almost everyone offers custom research based on the combination of human insight and analytical software. When it's time to crunch the data, everyone seems to agree on that particular combination.




What's your prediction for the lifespan of these "web 2.0" media tools and services? Ultimately it boils down to some kind of search to gather the content, and then software, people or a combination of the two to sort it out and make sense of it.

If we take a step back and look at the lifecycle of the blogosphere, and the need to understand it, how's it going to look going forward?

Compare it to the dot-com bubble of 1998-2002. In the early stages, tech companies with real products were hot. Then everyone that was on the web or about the web was hot, and money was flowing. Then, pop. Suddenly all the talk of a "new economy" fell away. Investors came back to consider what kind of tangible product or service these companies really provided, or were ever likely to provide, and how much need there was going to be for it. They took their money out en masse. Reality disposed of many of the players.

Now consider that bell-shaped curve with 2000 at the pinnacle. Compare something like that to the growing awareness and commercialism of blogs. The sheer volume on the upward slope creates so much content opportunity that all these companies are springing up to help understand the content. With the content, awareness is growing, so the list of potential clients is growing. At some point near the top, blog content will become so saturated that, I predict, people will withdraw somewhat. It won't be sustainable. There will be disenfranchisement of some who have struggled to manage anything beyond the most macro of the trends: the groundswell of public response around major product, media or social events. On the downward slope, as some people drop out and the content of many sites becomes little more than compilations of feeds from other sites and mass aggregations, there will be less unique content to mine. The major players will consolidate as the number of clients that want to invest in understanding diminishes, or as those clients that still want to participate run through their own learning curve and evolve to a reduced but sustainable level of investment, commensurate with their ability to apply the information meaningfully to their respective businesses.

So, that's my thought. Do you agree there will be a lifecycle? How do you see things playing out?

I think that Gartner's hype cycle is a useful model for this sort of thing. Social media are on the initial upslope, which is followed by a crash in the interest level. The really interesting part is the slow build after the crash, when the real growth and lasting opportunity show up. The New Economy talk disappeared, but how many people use online shopping and banking now? How many people go to online sources first when they want information? The hype faded, but reality incorporated the new tools.

The growth of social media analysis follows the growth of social media, which is complicated in itself. I think the generational aspect gives social media staying power, though of course the details will change. The monitoring/analysis/research services will become normal, and in many cases will be a line item among companies' services. There's hype this year, but the hype cycle suggests that the real growth is still ahead.

Very interesting post, Nathan, and an interesting first reply as well. One element to keep in mind, I think, is that the blogosphere is but one platform where consumers are sharing their beliefs, opinions and experiences.

Frankly, I am not at all surprised that blogs are now being called into question with regard to their true realm of influence across the virtual chasm into what I would call "the wider and more mainstream online conversing consumers." With this slowly progressing trend in thinking, it is important to remember that not all social media analysis is tied directly and only to blogs.

Threaded message boards have been around for a very long time, truly represent the most massive deposit of consumer insights online, and also truly represent an ongoing, mixed-constituency, consumer conversation. I think we cannot tie the future of this still-new form of research to the future of the blogosphere.

Also, to the original post, I think that in the overall balance between Human-Assisted Software, and Software-Assisted Humans, it is worth applying the same view to the ways in which any given vendor approaches the identification, qualification, and collection of their source and sample data. As the old saying goes, "garbage in, garbage out," and to claim that there is a lot of garbage (noise) out there across the global online consumer conversation would be an understatement.

What is the real signal-to-noise ratio out there in these consumer exchanges, and how is that being addressed? Where does the automation start and stop with regard to this source identification process? And then of course, the larger question, what is the current state of the balance between Automation and Accuracy in digitally-directed research?

Right with you, Josh. I was already planning a post on blogs vs other online sources, so I guess I'll have to quote your comment there. I'm careful not to describe this as blog monitoring, unless that accurately describes what a company does (not usually the case).

On source selection, I don't think there's much controversy in your statement. Some companies make more of a point of eliminating splogs or selecting sources for relevance, and they vary on using algorithms or people in the selection process. Similar categories probably apply, though the tools are necessarily different.

Nathan, I agree with Joshua that most social media content is not in blogs, but is probably to be found in forums, which have been around for a long time on the web.

While blogs don't have the amount of content, and some would argue the quality of content, that forums have, blogs are important because they have a unique place among social media technologies. Where a forum has its members hold a discussion within the website, blogs form part of a community. People post articles and link back to other people's blog articles. That linking helps with search engine ranking. And because the starting point for many people searching for answers on the web is a search engine, blogs are important: a community of blogs can help a discussion among them rise to the top of the search engine rankings. Dell Hell from Jeff Jarvis is an excellent example of this in action.

Nathan, I really think your image makes a lot of sense. From the discussions I've had with people using some of these tools, it's a matter of using both software and human narrative analysis.

Thanks, John.

It's worth pointing out that the degree of automation in the analysis varies not only by company but by type of analysis. In the big grey area between the black-and-white extremes, companies are applying software where it works better and human analysts where the technology isn't as helpful. So, for example, topic extraction and link analysis might be more automated, while sentiment could be more manual.


About Nathan Gilliatt

  • Voracious learner and explorer. Analyst tracking technologies and markets in intelligence, analytics and social media. Studying complexity and futures.
  • Principal, Social Target

