I'm working on a new research project, and I need your help. If monitoring or analyzing social media is part of your job—and you don't work for one of the software providers—please take part in a survey on the state of social media analysis at https://www.surveymonkey.com/r/sma-2015.

I'm looking into current trends and issues in the practice of social media analysis, as well as the technologies. Topics include adoption by industry and business function, successes and failures, and opinions about the state of available software tools. Results of the survey will be available later this year in a free report, which will also include observations from a series of in-depth interviews and trends from Social Media Analysis.

The survey has 21 questions and should take around 5 minutes to complete. Your response is anonymous. I'm looking for a cross-section of brands, corporate social media listeners, agencies and consultants, in all regions of the world. If you have practical experience with listening to social media, please take a few minutes to share your observations.

You can find more on the project here and here.

Thanks for your help.

Shortly after last month's announcement of the new Facebook topic data service from DataSift, another kind of change showed up in my inbox: the impending disappearance of Facebook post data from social media monitoring tools. The search functions of the API that developers use to monitor public posts in Facebook are going away at the end of April, and the notices and workarounds are going out to customers now. I'm also hearing from software companies looking for alternative sources, though I have not heard of any such alternatives.

The vendor announcements say to expect fewer results with the switch to version 2.x of Facebook's Graph API (optional until April 30). The new version restricts access to information about users' friends, and it eliminates the Public Post search and News Feed search options from the Graph API. Monitoring of posts and comments on specified Facebook pages (including competitors' pages) is still supported, which creates a partial workaround.

As for the broader set of Facebook post data, DataSift's new PYLON API is the so-far exclusive source for most developers in the social media analysis business. The data includes private posts, but everything is anonymized and aggregated, and it doesn't include verbatim text. It's meant for broad analysis, not monitoring or engagement. Access is limited to the US and UK for now, so the answer for the European software developer who emailed me appears to be “no.”

Finding information about what a company doesn't do is tricky, especially with a segment of the industry that likes its trade secrets, but Facebook's announcement makes it pretty clear how they're thinking about user privacy in the data market:

We are not disclosing personally identifying information to anyone, including our partners and marketers. And, the results delivered to marketers are analyses and interpretations of the information, not actual topic data.

Facebook does offer more data through the Public Feed API and Keyword Insights API, but access is limited to high-profile mass media and a short list of developers who support them. For everyone else, it looks like Facebook doesn't want anyone monitoring their users' public posts but Facebook.

One of the nicest compliments I've received over the years came from a company founder who read one of my reports and said I'd summarized his company's work better than they did. It's just one of the things I do—take a pile of information and figure out what it's about. I summarize. So if you need to tease out the short version of something complicated, call me. But I've also been accumulating data on an industry for years, which gives me the material for a different view—the annual recap. Roll tape…

The Year in M&A, Social Media Analysis 2014
I've been tracking companies that extract meaning from social media data since 2006 (it stays interesting if you let the definitions evolve with the market). One way to tell how things are changing is to watch where the money goes, and in 2014, more money flowed to consolidation. VC and PE money funded multiple acquisitions by companies staking out hoped-for prominent positions. Big companies tucked SMA into their products and portfolios, and smaller companies chose "buy" over "build" for key capabilities.

Add some actual mergers and a few acquihires, and we get more transactions than in 2013. In other news, it takes longer to write a recap of 38 deals than one with 18 deals, which is how a year-end post shows up in early January. :-)

More Than $420 Million Invested in Social Media Analysis Companies in 2014
New investments in SMA companies were slightly below 2013 levels in dollar terms, although when you consider deals of unannounced size, the difference is probably within the margin of uncertainty. Some of that money has gone to fund acquisitions, and anybody who took a round of more than $20 million bears watching, but we're also still seeing funding for interesting and innovative companies in the space where social media and data analysis intersect.

Based on the last year's investment activity, look for continued product innovation and market evolution, in addition to ongoing consolidation.

So here's a summary: The opportunities in social media analysis are evolving, and heavy bags of money are being directed toward exploiting them. For the long version and its application to your situation, contact me about becoming a client.

Where do you find books to read? Do you ask your friends, follow reviews or seller recommendations, or just go for the bestsellers? Whether you like your books on paper or downloaded, you have to know a book exists to read it, and because we're in the twenty teens, there's a social way to do it online.

Start where you are?
An obvious way to learn about books online is to ask your social networks—wherever you're connected to people online, just ask 'em. If you use different networks for different purposes, that should inform where you ask, but you have the connections. Sometimes it's just as easy as asking.

But asking doesn't always work. A discussion on Facebook about paper and ebooks this week included just such a request, but no responses. So what else can we do?

Networking for readers
How about a social network specifically for readers of books? Goodreads is exactly that, a social network built entirely upon books and the people who read them. You can look through reviews and recommendations organized by books and authors, or approach it socially, with its friends, followers and groups.

I'm getting great ideas from some very smart people I follow on Goodreads. Because of its tight focus on books, I find it easier to maintain a careful approach to connecting in Goodreads than in other networks. In addition, Goodreads' updates are tied to specific books, so it doesn't have the noise problem of other networks.

On another level, Goodreads creates yet another opportunity for public image tailoring, because its entries aren't automatic. Some of us might be a bit selective in what we choose to share—more professionally relevant titles than pop fiction, for example—but that actually improves Goodreads as a socially powered recommendation engine. If people I follow choose to share only the good stuff, they're effectively curating the recommendation lists.

Gems from Twitter
Goodreads runs on effort from people in its network; what about suggestions from people who haven't joined? BookVibe takes a different approach, pulling book mentions from a user's Twitter stream to generate its lists. It's not as far along as Goodreads, and there's some overlap, but it does have the advantages of pulling its recommendations from a network you've already assembled and using existing behavior as its raw material.

BookVibe strikes me as a worthy experiment, another startup finding useful information by applying a novel analytical lens to the flood of Twitter data. In this case, the startup is Parakweet, a natural-language processing specialist that set up BookVibe as a technology demonstration.

Remember blogs?
I've seen a few blog posts with suggested reading lists, such as these from the Oxford Martin School and Mention. If you don't have a source on a topic, try searching for "reading list" and a relevant keyword or two. It's not an unusual topic for a blog post or web page.

What about the big dog?
You can't talk about books without mentioning Amazon (I checked—it's a law). I remember an analysis years ago about the many social components of an Amazon product page, although I can't find it now. Product reviews, lists and wish lists are fairly obvious features, and it's possible to find more suggestions by following the creators of reviews and lists. Just find someone you'd like to hear more from and click through to their profile for more of their reviews, lists and tags. It's sort of social, if a bit too much effort.

Amazon has the makings of a really good social network for readers, except that it's missing the social network to run it. That may change, since it bought Goodreads last year. Until then, you can do a bit of social exploration with Amazon's existing features and some manual effort.

Old skool
If all those networks can't suggest good books faster than you read them, then you read too fast. :-) Oh, and the book I'm reading now? I found it on the New Nonfiction shelf at my local library. Curator was a word long before online sharing tools borrowed it.

It's never too late to add something to the summer reading pile. What are you reading that people should know about?

Surveillance whiteboard

As ubiquitous surveillance is increasingly the norm in our society, what are the options for limiting its scope? What are the levers that we might pull? We have more choices than you might think, but their effectiveness depends on which surveillance we might hope to limit.

One night last summer, I woke up with an idea that wouldn't leave me alone. I tried the old trick of writing it down so I could forget it, but more details kept coming, and after a couple of hours I had a whiteboard covered in notes for a book on surveillance in the private sector (this was pre-Snowden, and I wasn't interested in trying to research government intelligence activities). Maybe I'll even write it eventually.

The release of No Place to Hide, Glenn Greenwald's book on the Snowden story, provides the latest occasion to think about the challenges and complexity of privacy and freedom in a data-saturated world. I think the ongoing revelations have made clear that surveillance is about much more than closed-circuit cameras, stakeouts and hidden bugs. Data mining is a form of passive surveillance, working with data that has been created for other purposes.

Going wide to frame the question
As I was thinking about the many ways that we are watched, I wondered what mechanisms might be available to limit them. I wanted to be thorough, so I started with a framework to capture all of the possibilities. Here's what I came up with:

Constraints on personal data

The framework is meant to mimic a protocol stack, although the metaphor breaks down a bit in the higher layers. The lowest layers provide more robust protection, while the upper layers add nuance and acknowledge subtleties of different situations. Let's take a quick tour of the layers, starting at the bottom.

Hard constraints
The lowest layers represent hard constraints, which operate independently of judgment and decisions by surveillance operators:

  • Data existence
    If the data don't exist, they can't be used or abused. Cameras that are not installed and microphones that are not activated do not collect data. Unposted travel plans do not advertise absence; non-geotagged photos and posts are not used to track individual movements. At the individual level, countermeasures that prevent the generation of data exhaust will tend to defeat surveillance, as will the avoidance of known cameras and other active surveillance mechanisms.

  • Technical
    Data, once generated, can be protected, which is where much of the current discussion focuses. Operational security measures—strong passwords, access controls, malware prevention, and the like—provide the basics of protection. Encryption of stored data and communication links increases the difficulty—and cost—of surveillance, but this is an arms race. The effectiveness of technical barriers to surveillance depends substantially on who you're trying to keep out and the resources available to them.

Soft constraints
The upper layers represent soft constraints—those which depend on human judgment, decision-making and enforcement for their power. Each of these will tend to vary in its effectiveness depending on the people and organizations conducting surveillance activities.

  • Legal
    This is the second of two layers that contain most of the ongoing discussion and debate, and the default layer for those who can't join the technical discussion. The threat of enforcement may be a deterrent to some abuse. Different laws cover different actors and uses, as illustrated in the current indictment of Chinese agents for economic espionage.

  • Market
    In the private sector, there's no enforcement mechanism like market pressure—in this case, a negative reaction from disapproving customers. Companies have a strong motive to avoid activities that hurt sales and profits, and so they may be deterred from risking a perception of surveillance and data abuse. This is the layer least likely to be codified, but it has the most robust enforcement mechanism for business. In government, the equivalent constraint is political, as citizens/voters/donors/pressure groups respond to laws, policies and programs.

  • Policy
    At the organization level, policy can add limits beyond what is required by law and other obligations. Organization policy may in many cases be created in reaction to market pressure and prior hard lessons, extending the effectiveness of market pressure to limit abusive practices. In the public sector, the policy layer tends to handle the specifics of legal requirements and political pressures.

  • Ethical
    Professional and institutional ethics promise to constrain bad behavior, but the specific rules vary by industry and role, and enforcement is frequently uncertain. Still, efforts such as the Council for Big Data, Ethics, and Society are productive.

  • Personal
    Probably the weakest and certainly the least enforceable layer of all, personal values may prevent some abuse of surveillance techniques. Education and communication programs could reinforce people's sensitivity to personal privacy, but I include this layer primarily for completeness. Where surveillance operators are sensitive to personal privacy, abuses will tend not to be an issue.

Clearly, the upper layers of this framework lack some of the definitive protections of the lower layers, and they're unlikely to provide any protection from well-resourced government intelligence agencies (from multiple countries) and criminal enterprises. But surveillance (broadly construed) is also common in the private sector, where soft constraints are better than no constraints. As we consider the usefulness and desirability of the growing role of surveillance in society, we should consider all of the levers available.
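
For readers who think in code, the stack described above can be written down as a small ordered structure. This is only an illustrative sketch: the names `Kind`, `Layer`, `STACK`, and `layers_of` are hypothetical labels made up for this example, and the only things taken from the framework are the seven layer names, their bottom-to-top order, and the hard/soft split.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    HARD = "hard"  # operates independently of operator judgment
    SOFT = "soft"  # depends on human judgment and enforcement


@dataclass(frozen=True)
class Layer:
    name: str
    kind: Kind


# Ordered bottom to top, following the tour of the layers in the text.
STACK = [
    Layer("Data existence", Kind.HARD),
    Layer("Technical", Kind.HARD),
    Layer("Legal", Kind.SOFT),
    Layer("Market", Kind.SOFT),
    Layer("Policy", Kind.SOFT),
    Layer("Ethical", Kind.SOFT),
    Layer("Personal", Kind.SOFT),
]


def layers_of(kind: Kind) -> list[str]:
    """Return the names of the layers of one kind, bottom to top."""
    return [layer.name for layer in STACK if layer.kind is kind]


print(layers_of(Kind.HARD))
print(layers_of(Kind.SOFT))
```

Keeping the layers in a single ordered list preserves the protocol-stack metaphor: position in the list stands in for how low, and therefore how robust, a constraint is.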

One step at a time
This framework isn't meant to answer the big questions; it's about structuring an exploration of the tradeoffs we make between the utility and the costs of surveillance. Even there, this is only one of several dimensions worth considering. Surveillance happens in the private sector and government, both domestically and internationally. There's a meaningful distinction between data access and usage, and different value in different objectives. Take these dimensions and project them across the whole spectrum of active and passive techniques that we might call surveillance, and you see the scope of the topic.

Easy answers don't exist, or they're wrong. It's a complex and important topic. Maybe I should write that book.

If I write both the surveillance book and the Omniscience book (on the value that can be developed from available data), should I call them yin and yang?

Today's announcement that Twitter is buying Gnip raises big questions about the market for social media data. While it's too early to know how things will fall out, the deal changes the shape of the playing field for everyone involved—publishers, data resellers, software developers, and corporate customers.

Twitter has bought other companies in the social media analysis space—BackType (2011), Bluefin Labs (2013), Trendrr (2013)—but Gnip is a bigger deal. Gnip competes with other Twitter partners, and Twitter competes with other Gnip partners. If you weren't sure, things just got interesting.

As a reminder, here's my view of the social data ecosystem:

Social data ecosystem

Anyone who works with data from social media sources has an interest in how the rest of the ecosystem reacts to the Gnip acquisition. Here's my initial take on what to watch for:

  • Twitter competitors
    Twitter isn't the only data source for Gnip. Gnip's sources include full feeds from Tumblr, Foursquare, WordPress, and more. It also manages API access for Facebook, Google, and others that probably see Twitter as a competitor. How will these companies ("publishers" in the data market) react to the deal? Will access to data from Twitter competitors remain available through Gnip?

  • Gnip competitors
    Twitter has offered its data through multiple data partners; how will DataSift, Dataminr, and NTT Data fit into the revised model? What impact will that have on their customers? (In a post, DataSift says its "relationship, contract and data resyndication partnership" are unchanged.)

  • Other data providers
    There are other companies in the social data business, mainly those specializing in collecting data from blogs and forums. Will they add (or drop) services in response to the changing market?

I won't speculate on the answers to these questions today, but they're the questions I'm pondering in the wake of the announcement. Change reverberates, so these are things to watch.

I've asked Twitter for a comment, but I suspect we just have to wait for the answers.

Get the latest industry news at Social Media Analysis.

Poisoning the Online Well

Garbage in, garbage out. The latest from the ongoing Snowden/Greenwald revelation is a reminder that interested parties know how to plant false information on the Internet, and that some of them are probably doing it. It has implications for anyone looking for good information online, anyone with a reputation to protect, and—potentially—for everyone invested in the online world.

The piece itself is worth a look (How Covert Agents Infiltrate the Internet to Manipulate, Deceive, and Destroy Reputations). The details are more disturbing than surprising, but as you read it, ignore the focus on the British intelligence agency GCHQ. It doesn't matter whether you trust your own government's actions, and the common distinction between a country's own citizens and everyone else is also irrelevant. The same tactics are available to every government—and any other motivated group. If they don't do this already, the newly released document provides the suggestion.

For the government intelligence guys, this is just a continuation of the second oldest profession: Get your enemy's secrets; protect your own. Deceive your enemy; avoid deception. It's a challenge when multiple entities are simultaneously trying to (a) get useful information from open sources online and (b) plant deceptive information in the same sources. I wonder how much blue-on-blue deception happens between information operations and open-source intelligence gathering, anyway.

For everyone else, this latest report should serve as a reminder of some of the risks in social media:

  1. Data quality risk
    People tell lies online—I know, but it's true. Some of the false information out there may have been placed by a motivated adversary who wants to mislead you (maybe even you, specifically). The target may be your organization, a related organization or someone who wants to work with you.

    The information you find online can be a useful source, but it's not the only source. If you're informing significant decisions, use all of your available resources, and be alert to the possibility of intentional deception.

  2. Reputation risk
    We're familiar with the concept of online reputation risk; corporate risk managers seem to think it's almost synonymous with "social media." If your business has potential exposure to government opposition (from whatever country), your risk may come from a better organized and funded source than the usual unhappy former customer.

  3. Target risk
    As people conduct their personal and political lives online, they expose themselves to snooping and more. The threats to personal privacy and freedom by government agencies have made the ongoing revelations newsworthy, but these public and semi-public channels are equally exposed to anyone who disagrees.

  4. Collateral damage risk
    Some of these information operations happen in the same online venues as normal personal use. As competing governments start viewing the online world through the cyber battlespace lens, normal users and the platforms themselves could take some damage. Off the top of my head, I'm thinking of legal, market, and technical risks, but that's probably just a start.

    It's too much to go into in a post, but companies with significant exposure to covert online tactics would be well served to chase down the implications of those tactics, and not to limit the discussion to legal exposure. Beyond the specifics of any one program, the revelations of the last year indicate the willingness of government entities in multiple countries to use environments operated by private-sector companies in ways they weren't intended. The safe assumptions are that governments are doing more than we know, and so are other types of organizations.

Politically, it matters very much who is doing what to whom and why. As a practical matter, who and why don't much matter. It's enough to know that someone, somewhere is developing and using methods to use popular online tools against people and organizations they don't like. If you depend on online tools and don't have a basic literacy in the concept of cyberwar, it's time to learn, so you can recognize it if it comes to your neighborhood.

One of the great strengths of the Internet is the way it overcomes the limitations of distance. A side effect is that it also does away with the concept of a safe distance from danger.


Updating the Highlights Reel

In 2007—has it really been so long?—I posted a list of older posts that I thought were worth remembering. The relentless updating of the reverse-chronological blog format was hiding some good stuff, and I wanted people to find it. Over time, some of those old posts became truly outdated, and I've gotten into some new themes. It was time for an update, and in the process, I was reminded of where we've been—and where we're going.

The complete list: Highlights from the Archive

History of social media
The updated list goes all the way back to 2006, when I first sketched out the role of the social media manager. It's not quite what I would write today, but I think it holds up reasonably well, especially given that the perceived need at the time was "blogger relations." Somewhat more recently, the posts on influence and the meaning of "Like" aren't exactly what everyone else had to say on those topics.

Social media analysis
From "listening" to the latest emerging tech for analytics, I've been watching and writing about SMA for years. A 2008 post on the building blocks of social media analysis set the stage for later lists of companies offering the various pieces. I still like the three buckets of social media data framework as a way of sorting out the many tools in the market, too.

I particularly enjoyed rereading Language Support in Social Media Analysis, a detailed look at all the different ways that a vendor might check the language box. In my public speaking, I tend to go high-level and generalize a lot, and this example shows why. When you get into the specifics, they get very specific, and heavily dependent on a client's situation.

Expanding horizons
For several years, there's been some tension between the blog that started with a strong emphasis on social media and the topics I find interesting more recently. I've hinted at some of the topics with the summer reading posts and some others, and now it's time to put more emphasis on the new stuff.

The whiteboard series of posts was a step toward sharing some of the speculation that develops on the literal whiteboards in my office. The Omniscience, computer attention, and learning ecosystem ideas from that series are themes that I need to revisit, and there are others in the drafts folder.

Expect more connecting of dots from diverse sources, such as last year's Simulations, Customer Journeys, and the Link Between What Could Happen and What Did Happen. I'm not sure why I'm still surprised to find connections between the seemingly unrelated topics I dig into. The latest example crosses long-term policy analysis, simulations, wargames, the mechanics of human insight, network science, and associative memory—my sources keep citing each other. There's no social media angle, just fascinating stuff.

I've been involved in working through the meaning and implications of new technologies for a long time, and there's less for me to do once a technology reaches mass adoption and people understand it. With the social media market maturing into something that holds fewer mysteries, I plan to write more about those new topics.


Social Media Analysis is my attempt at a sort of online industry trade journal covering the companies that work with social media data. Last year, I started a recap of the financial transactions in the business, so let's catch up with 2013.

2013 Saw More, Bigger Investments in Social Media Analysis
First, where the investment money went. And boy, did it go, more than $465 million. The champion fundraiser this year—by far—was HootSuite, with $165 million added to its runway.

The Year in M&A (and an IPO), Social Media Analysis 2013
Once all those companies are funded, some of them get acquired. One even went public. The big theme seems to be consolidation, as buyers picked up companies with complementary technology, products and people. At this rate, we should finish concentrating the industry by about 2080.

SMA would be better with more content, but I need help if it's going to get it. I have ideas for new sections, including opinion columns, product reviews, how-to articles and more. Anyone interested in becoming a contributor?

I'm going to do something old-school and blog about a couple of blog posts today. Consider it a break from the latest outragefest on the 'book. Instead, let's share bright ideas about large-impact innovation and how we've been looking for it in the wrong places. It's what happens when two posts, posted months apart, cross my desktop in the same morning.

First up: Jerzy Gangi's post from August, Why Silicon Valley Funds Instagrams, not Hyperloops, runs down the reasons that venture-funded startups keep launching relatively easy web-based software applications. It's worth a read. The short version is, that's what the investment system is looking for, and [insert Willie Sutton quote here].

Next is "Killer Apps" Evolve, Vinnie Mirchandani previewing Chunka Mui and Paul Carroll’s new book, The New Killer Apps: How Large Companies Can Out-Innovate Start-Ups. Google's self-driving cars are one example (built with investment from both corporate and government sources).

We shouldn't be surprised that startups and investors play by the rules of the game. Innovation and addressing the big issues of our time, however, are not the game they're playing.

The M&A market can be characterized as a giant distributed R&D department for major corporations.
— Jerzy Gangi

Remember corporate R&D? Bell Labs, PARC, Lockheed's Skunk Works? Big companies exist to take on projects and markets that are too big for small companies, and part of what they do is large-scale innovation. Whether they invent in their own labs or build from acquired startups, big changes that take place in the physical world will happen only when somebody puts serious capital behind them.

It's interesting that the old-school sources of innovation—university, government and corporate labs—are still out there, and despite long-term reductions, they're still at work. If we're looking for the world-changing innovations, maybe we just need to put more effort into learning about them and their projects.

About Nathan Gilliatt

  • Voracious learner and explorer. Analyst tracking technologies and markets in intelligence, analytics and social media. Advisor to buyers, sellers and investors. Writing my next book.
  • Principal, Social Target