"Build your own listening tool" has been a popular topic, with suggestions usually building on free combinations of search feeds and RSS tricks. Twitter, especially, has inspired a whole constellation of free tools. But "build your own" has a deep end of the pool, too. Whether you're building a customized tool for specific, internal requirements or realizing your vision of the perfect entrant to an overcrowded market, building your own tool involves a series of build-or-buy decisions, starting with where you will get your source data.
There's a list below, but first, some background.
I first wrote about the building blocks of social media analysis in 2008. The short version is this: any system for monitoring or measuring social media has three basic components: data collection, analytics, and application. The differences are in the details, and each component can be the subject of its own build-or-buy decision. If your goal is to beat the industry standard in a particular area, you build; if the industry standard is good enough, you can buy.
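To make the three-component idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the function names, the sample posts, and the trivial "analytics" are stand-ins chosen for illustration, not anyone's actual product. The point is only that each stage sits behind a simple interface, so a bought component and a built one can be swapped without touching the others.

```python
from typing import Callable, Dict, Iterable

# Each component is just a callable, so any one of them can be
# replaced by a purchased service or an in-house build.
Collector = Callable[[], Iterable[str]]
Analyzer = Callable[[Iterable[str]], Dict[str, int]]
Application = Callable[[Dict[str, int]], str]


def run_pipeline(collect: Collector, analyze: Analyzer, present: Application) -> str:
    """Chain data collection, analytics, and application."""
    return present(analyze(collect()))


def bought_feed() -> Iterable[str]:
    """Hypothetical stand-in for a purchased aggregation service."""
    return ["love the new widget", "widget broke again", "lunch was fine"]


def count_mentions(posts: Iterable[str]) -> Dict[str, int]:
    """Hypothetical in-house analytics: count posts that mention 'widget'."""
    return {"widget": sum(1 for p in posts if "widget" in p.lower())}


def text_report(counts: Dict[str, int]) -> str:
    """Hypothetical application layer: a one-line text report."""
    return ", ".join(f"{term}: {n}" for term, n in sorted(counts.items()))


if __name__ == "__main__":
    print(run_pipeline(bought_feed, count_mentions, text_report))
```

Swapping `bought_feed` for a home-built collector changes one argument and nothing else, which is roughly what a per-component build-or-buy decision looks like in code.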
At the time of the original post, all of the components were available separately to companies that were building their own systems, whether for internal use or for commercial product development. Now more options are on the market, especially in data collection. Lots of search engines offer RSS feeds, but this is something else: services that aggregate social media data from multiple sources for business or commercial users.
More than just aggregation
The data collection step is about more than mashing together multiple search feeds. For the professional-strength aggregator, the finished product—like paid products in other categories—does something the free tools don't offer.
Social media comes in lots of flavors, and aggregators need to keep up with the introduction of new services. Commercial aggregators can also deliver content that simply isn't available without a subscription, such as full feeds from traditional media.
Once you fill the pipe with incoming content, it's time to screen out the junk. Removing duplicate items is a start; removing near-duplicates (such as syndicated content or press releases) helps, too. Spam removal is a big deal.
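For a feel of what duplicate and near-duplicate removal involves, here is a rough sketch. It assumes one common approach (hashing normalized text for exact matches, then comparing overlapping word "shingles" with Jaccard similarity for near-matches); it is not a description of how any particular aggregator does it. The function names and the 0.6 threshold are illustrative assumptions.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and keep only letters and digits, separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = normalize(text).split()
    if len(words) <= size:
        return {" ".join(words)}
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Fraction of shingles two items share, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_duplicates(items, threshold: float = 0.6):
    """Keep the first copy of each item; skip exact and near-duplicates.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    seen_hashes = set()
    kept = []  # list of (original text, shingle set)
    for text in items:
        digest = hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate once case and punctuation are ignored
        candidate = shingles(text)
        if any(jaccard(candidate, s) >= threshold for _, s in kept):
            continue  # near-duplicate of something already kept
        seen_hashes.add(digest)
        kept.append((text, candidate))
    return [text for text, _ in kept]
```

Comparing every new item against everything already kept is fine for a demo but gets slow as volume grows, and spam removal is a separate problem this sketch does not touch.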
Another kind of filtering is prescreening the content for relevance to the customer. How that works is part of the aggregator's secret sauce and will differ by provider.
- Metadata tagging
We spend a lot of time thinking about analyzing the unstructured text in social media, but most of the content also has structured data around it (such as the source, publication date, and number of comments). Aggregators can also pull information about posts from third-party sites to complete the picture.
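One way to picture that structured data is as a record that travels with each post. The sketch below is an assumption about shape only: the `Post` class, its field names, and the `enrich` helper are hypothetical, and the `inbound_links` key in the usage example is a made-up illustration of third-party information.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Post:
    """One collected item: the unstructured text plus the structured data around it."""
    url: str
    text: str
    source: str            # where the item came from, e.g. a blog or forum name
    published: datetime    # publication date reported by the source
    comment_count: int = 0
    extra: dict = field(default_factory=dict)  # metadata pulled from third-party sites


def enrich(post: Post, third_party: dict) -> Post:
    """Attach third-party metadata without overwriting values already recorded."""
    for key, value in third_party.items():
        post.extra.setdefault(key, value)
    return post


if __name__ == "__main__":
    post = Post(
        url="http://example.com/post/1",
        text="The new widget is great.",
        source="example blog",
        published=datetime(2012, 1, 30, 12, 0, tzinfo=timezone.utc),
        comment_count=4,
    )
    enrich(post, {"inbound_links": 12})
    print(post.source, post.comment_count, post.extra)
```

Keeping third-party values in a separate `extra` field makes it easy to tell what the source itself reported from what was added later.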
Oh, and do all of the above quickly, please. Financial applications go to extremes to reduce latency (the lag between when content is posted and when it shows up in the aggregator), but it's a factor in less demanding environments, too. If you're monitoring Twitter, for example, you need to know in seconds or minutes, or you'll be too late to respond in that near-real-time environment.
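Latency as defined here is simple to measure if you record both timestamps. The sketch below assumes you have a posted time and a collected time for each item; the function names and the 120-second budget in the usage example are illustrative assumptions, not an industry figure.

```python
from datetime import datetime, timedelta, timezone
from statistics import median


def latency_seconds(posted_at: datetime, collected_at: datetime) -> float:
    """Lag between when content was posted and when it showed up in the aggregator."""
    return (collected_at - posted_at).total_seconds()


def summarize(pairs, budget_seconds: float) -> dict:
    """Summarize latency for a non-empty batch of (posted_at, collected_at) pairs."""
    lags = [latency_seconds(posted, collected) for posted, collected in pairs]
    return {
        "median": median(lags),
        "worst": max(lags),
        "late": sum(1 for lag in lags if lag > budget_seconds),  # items over budget
    }


if __name__ == "__main__":
    base = datetime(2012, 1, 30, 12, 0, tzinfo=timezone.utc)
    batch = [
        (base, base + timedelta(seconds=5)),
        (base, base + timedelta(seconds=30)),
        (base, base + timedelta(minutes=10)),
    ]
    print(summarize(batch, budget_seconds=120))
```

The budget is the part that varies by application: a financial feed and a weekly brand report would plug very different numbers into the same calculation.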
Oh, yeah, this is a list post. Companies that offer social media content aggregation as a service (updated 30 Jan 2012):
- Collective Intellect
- Context Voice (UberVU)
- Dow Jones
- Effyis (Boardreader)
More posts in the "Build or Buy?" series:
- Building Blocks of Social Media Analysis
- Text Analytics in the Cloud
- The Rise of the "Influence" Peddlers
Photo by identity chris is.