Can Analytics Be Taught?


I've pointed out some of the elements of the learning curve for social media analysts. In the middle of looking at almost 30 social media analysis platforms for my recent report, I realized that the software itself isn't the main challenge—developing the analytical mindset to know what to do with the tool is. The question is, how much of that mindset can be taught? How do we teach people to ask penetrating questions using a simple set of analytical tools?

How's your logic?
Here's an example of the challenge. Most social media analysis tools use keyword searches to define topics or to segment the data with subtopics. The query typically takes one of three forms: a simple search, a Boolean query, or an advanced search that simplifies the process of building the query. (Boolean logic isn't the only technique used to define topics, but other methods are more complex, and the companies that use them set them up for their clients.)

Search using Boolean logic seems simple. You use operators like AND, OR, and NOT to include or exclude keywords from your results. Some tools let you get fancy with proximity operators (x within n words of y), and you can nest your statements for finer control. Most of us think we understand how it works.
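To make those operators concrete, here is a minimal sketch in Python of how include, exclude, and proximity tests might be evaluated against a single post. It is an illustration only, not how any particular monitoring tool works; the function names (tokenize, has_term, within, matches) and the parameters all_of, any_of, and none_of are made up for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def has_term(tokens, term):
    """True if the word or multi-word phrase appears in the token list."""
    words = tokenize(term)
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))

def within(tokens, x, y, n):
    """Proximity test: True if single words x and y occur within n words of each other."""
    xs = [i for i, t in enumerate(tokens) if t == x.lower()]
    ys = [i for i, t in enumerate(tokens) if t == y.lower()]
    return any(abs(i - j) <= n for i in xs for j in ys)

def matches(text, all_of=(), any_of=(), none_of=()):
    """AND over all_of, OR over any_of, NOT over none_of (hypothetical parameter names)."""
    tokens = tokenize(text)
    return (all(has_term(tokens, t) for t in all_of)
            and (not any_of or any(has_term(tokens, t) for t in any_of))
            and not any(has_term(tokens, t) for t in none_of))

# Usage: a toy version of "Orange AND (SIM OR handset) NOT juice"
print(matches("My Orange SIM stopped working",
              all_of=["orange"], any_of=["sim", "handset"], none_of=["juice"]))
```

Nesting falls out of ordinary function composition here: the result of one matches or within call can be combined with another using Python's own and, or, and not.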

So it could be a bit of a shock to see the queries presented by Integrasco's Aleksander Stensby at Monitoring Social Media Bootcamp last week. This little one finds the telephone company Orange in English-language content:

(Orange OR subject:Orange -subject:light -light -"Clockwork Orange" -subject:"Clockwork Orange" -"orange box" -subject:"orange box" -juice -subject:juice -fruit -subject:fruit -peel -subject:peel -"Orange Wednesday" -subject:"Orange Wednesday" -"orange county" -subject:"orange county" -"clock work orange" -subject:"clock work orange" -"orange ink" -subject:"orange ink" -"bright orange" -subject:"bright orange" -"dark orange" -subject:"dark orange" -"light orange" -subject:"light orange" -("color orange"~3) -subject:("color orange"~3) - ("style orange"~3) -subject:("style orange"~3)) AND ( (SMS OR MMS OR HDSPA OR "Mobile Phone" OR GSM OR GPRS OR 3G OR SIM OR handset OR "Sony Ericsson" OR Nokia OR HTC OR Motorola OR BlackBerry OR iPhone OR PAYG OR "pay-as-you-go" OR "Network Provider" OR UMTS OR WAP OR PDA OR "PAC Code" OR Cellphone OR OFCOM OR phones4u OR voda OR vodafone OR tmobile OR tmob OR "T-mobile" OR T-Mob) OR subject:(SMS OR MMS OR HDSPA OR "Mobile Phone" OR GSM OR GPRS OR 3G OR SIM OR handset OR "Sony Ericsson" OR Nokia OR HTC OR Motorola OR BlackBerry OR iPhone OR PAYG OR "pay-as-you-go" OR "Network Provider" OR UMTS OR WAP OR PDA OR "PAC Code" OR Cellphone OR OFCOM OR phones4u OR voda OR vodafone OR tmobile OR tmob OR "T-mobile" OR T-Mob) )

He showed another one, about nine times as long, that finds discussions in multiple languages of the form factor of a particular mobile phone. You can see that endless query in Aleksander's presentation.

So, yeah, we know Boolean logic, but wow.

It's not difficult, just hard
These intensely focused queries illustrate the difference between the two learning curves. A query like this could be pasted into many—maybe most—of the available tools for social media analysis. Working out the nested Boolean logic is the trick.

Eric Garland puts a competitive spin on things with this note from a discussion of the future of intelligence at GWU:

Asymmetry of analysis will be more important than asymmetry of information—it’s not who collects the most data, but who is the best at deriving insights who will be most effective.

The question is, how easily can we develop the right combination of logic, curiosity, and perseverance in those who would analyze social media? Is it teachable, and how much of it depends on existing inclinations in future analysts? Or is there really, as someone at MSMBC suggested, a business opportunity in crafting complex queries as a service?

I wonder how many on-topic posts include exclusion keywords: "I called Orange to complain about my phone while eating a piece of fruit."
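One rough way to put a number on that worry is to check which posts pass the include terms but are dropped only because of an exclusion keyword. The sketch below assumes a drastically simplified version of the Orange query; the word lists are tiny samples, "phone" as a standalone context word is my own addition for illustration, and the function names are hypothetical.

```python
import re

# A few exclusion keywords taken from the Orange query above.
EXCLUDE = {"juice", "fruit", "peel"}
# Context words; "phone" on its own is an assumption added for this example.
CONTEXT = {"phone", "sim", "handset"}

def words(text):
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def passes_includes(text):
    """True if the post mentions Orange plus at least one context word."""
    w = words(text)
    return "orange" in w and bool(w & CONTEXT)

def wrongly_excluded(posts):
    """Posts that satisfy the include terms but are dropped by an exclusion keyword."""
    return [p for p in posts if passes_includes(p) and words(p) & EXCLUDE]

posts = [
    "I called Orange to complain about my phone while eating a piece of fruit.",
    "Orange juice is great",
    "My Orange handset arrived",
]
for post in wrongly_excluded(posts):
    print(post)
```

Reading through whatever such a check returns is still a manual job, but it shows what each exclusion term is costing you.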


Great post. I'm very familiar with this kind of Boolean logic. It can drive one round the bend! There's only one company I had to give up on, and that was mobile phone operator "3". OK, people say just look for "3" within so many words of telecom-related words. Yeah, we tried that: "Nokia released 3 new mobile phones today..." That was a very tricky one!


[shaking head slowly]

Whatever you charged 3, it wasn't enough. Maybe we should add findability to .com availability in the brand evaluation guidelines...

Nice post Nathan - I've been chewing over the same question recently.

My view: analytical skills for social media can be taught, but with two big challenges:

1. There is no instruction manual. As the Orange example shows, there isn't one right way of doing things - just a lot of iteration, trial and error, and, somewhere down the line, hopefully some inspiration. This is at the heart of my view that searching for a "standard" for social media analytics is a wild goose chase - it's a lot easier to build a meaningful method for social media analysis in the specific case than in the general.

2. Statistical science isn't much support. The idiosyncrasies and vagaries of social media, compared to other channels, really get in the way. It's often impossible to validate findings with confidence (and sometimes impossible even to get the data you would need for validation). This can be a huge challenge for individuals and organisations used to working within the comfort zone of traditional intention analysis.

So these factors aren't deal-breakers, but they call for a very wide range of skills - to learn by trial and error, and have the juice to implement those findings; to effect culture change in the way research validation is viewed in the organisation; to be flexible and responsive to constant, fast-moving changes in the landscape of tools and resources available for analytics; to build a streamlined, flexible and adaptive methodology for collecting and analysing content.

I think it sounds like a great job! Realistically it doesn't exist in corporations right now, although I think it should. Feels to me that it's naturally a part of the brand insight function - and it makes more sense in the long run (because of the cumulative value of the time spent on iteration and experimentation) for it to sit here, than to be an agency outsource service.

I've thought about blogging on the topic of Boolean logic myself. I learned it during my undergraduate studies in computer engineering - but the type of personality attracted to learning structured logic may not be the typical sort attracted to using social media tools. The task of teaching it is a tough one, but it's still preferable to relying on the various "wizards" offered by many analytical systems, all of which are inherently limiting. At least one can experiment with Boolean queries in Google or BlogPulse until one feels comfortable with the results.

Boolean logic search profiles have taught me the value of a unique brand name; it is infinitely easier to search for something like "Verizon" than "National Phone Company".

The more metadata we have to work with, the less Boolean logic we will need.


That (Orange) is a good one. We did a project for a major US rental car company, and guess what the competitive set looks like:

Hertz (easy)

Building the Boolean logic to include the relevant data and exclude the irrelevant was a bit of a challenge.

But if you get the initial data wrong, the rest of the analysis is meaningless.


Ah, yes. GIGO still applies. :-)


About Nathan Gilliatt

  • Voracious learner and explorer. Analyst tracking technologies and markets in intelligence, analytics and social media. Advisor to buyers, sellers and investors. Writing my next book.
  • Principal, Social Target