Recently in Legal Category

Writing at Wired UK, Paul Wright has some concerns about the use of social media monitoring in law enforcement: Meet Prism's little brother: Socmint. I'll quote a couple of sections, but you need to read the whole piece; its tone is at least as important as its content.

For the past two years a secretive unit in the Metropolitan Police has been developing the tools for blanket surveillance of the public's social media conversations.

Operating 24 hours a day, seven days a week, a staff of 17 officers in the National Domestic Extremism Unit (NDEU) has been scanning the public's tweets, YouTube videos, Facebook profiles, and anything else UK citizens post in the public online sphere.

The intelligence gathering technique—sometimes known as Social Media Intelligence or Socmint—has been used in conjunction with an alarming array of sophisticated analytical tools. [emphasis added]

Wright has a fairly alarmist—but accurate—take on something that's obvious to anyone who thinks about it: outside of a few protected spaces, what we do in social media is public, and government security and law enforcement agencies are using that data. It's the details of what they do with it that will make some people uncomfortable.

The problem is that public is gaining new depth of meaning as information moves online, and we haven't sorted out the implications.

Nothing changes, but everything's changed
The new public information is persistent, searchable, and rich with analytic potential. I wrote about this last year (Why Government Monitoring Is Creepy), and it's still where I think we need to start. People seem to be expecting a sort of semi-privacy online, but the technology doesn't make that distinction. Data is either public or private, and the private space is shrinking.

The "alarming array" of tools refers to all the interesting stuff we've been talking about doing with social media data for years: text analytics, social network analysis, geospatial analysis… For business applications, we've mostly talked about analysis on aggregate data, but if you apply the lens toward profiling individuals and don't care about being intrusive, you can start to justify the concerns.
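To make the aggregate-versus-individual distinction concrete, here's a minimal sketch (hypothetical data, Python standard library only) of aggregate text analysis that keeps corpus-level counts and deliberately drops author identities:

```python
from collections import Counter

# Hypothetical posts as (author, text) pairs. The aggregate lens keeps
# only corpus-level term counts; the author field is deliberately
# discarded, so no individual profile is built.
posts = [
    ("alice", "new phone battery is terrible"),
    ("bob", "love the new phone camera"),
    ("carol", "battery drains too fast"),
]

term_counts = Counter(
    word for _author, text in posts for word in text.split()
)

print(term_counts["battery"])  # 2
print(term_counts["phone"])    # 2
```

The profiling variant would be the same loop keyed by author instead of discarded, which is exactly the pivot that turns familiar analytics into something people find intrusive.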

But several privacy groups and think tanks—including Big Brother Watch, Demos and Privacy International—have voiced concerns that the Met's use of Socmint lacks the proper legislative oversight to prevent abuses occurring.

It's worth noting that Wright's piece is specifically about law enforcement use of social media data, and he points to others—the privacy groups and think tanks named above—who are concerned about overreach by law enforcement agencies.

This is the social data industry's PRISM problem: the risk that the revelations of intelligence agency practices will raise broader privacy concerns that include the business use of public social media data. They're different issues, but the interest sparked by the NSA disclosures has people thinking about privacy.

In this case, Wired makes the connection explicit with their headline, calling social media intelligence "Prism's little brother." As Wright demonstrates in his article, open-source social media monitoring raises issues, too.

Legitimate questions, too
There's more going on here than a question of perception. If invasion of online privacy gains traction as an issue, the important distinction between public and private data will be only part of the debate. If we limit the topic to public data, the question becomes: what are the limits to the use of public data?

An important part of answering that question will depend on understanding why there should be limits, which goes to what is being done with the data. It's going to be worth separating the concepts of accessing the data and using it. What you do in your analysis may be even more sensitive than the data you base it on.

People are sharing more than they realize, and analysts can do more with that data than people think. As monitoring becomes pattern detection becomes predictive modeling, it becomes more likely to make people uncomfortable. Last year's pregnant daughter is this year's precrime is next year's thoughtcrime, or so the thinking goes.

Will concerns like this lead to new restrictions by governments or the companies who control the data? Will people cut back on their public sharing? Or will these concerns fade when the next topic takes the stage (squirrel!)?

What are the constraints?
The existing limits on social media monitoring and analysis boil down to this: If it is technically possible, not illegal, and potentially useful, do it (depending on your affiliations, professional ethical standards may also apply). What we're seeing is that the unrestricted use of social data has the potential to make people uncomfortable, which could have consequences for those who would use the data.

It's worth thinking about the constraints on using social data, which involves more than the ethics question. I have some thoughts, which I'll share later.

Ethics and social media monitoring: so much at stake, but the existing standards are linked to specific business functions. Can we fix that? Converseon suggested some questions for clients to use in avoiding service providers with problematic practices. Let's go a step further and think about appropriate ethical standards for companies that do the actual monitoring and analysis work, regardless of which functional silo they support.

I have a few suggestions:

  1. Obey applicable laws.
    Stay legal—always nice to include that in the code. This will be trickier than it sounds, because (a) the law that applies to online monitoring is "complicated, multi-faceted and unclear," and (b) the Internet is global. Whose laws apply in which situations should be good for generating legal fees somewhere.

  2. Match clients' regulatory obligations.
    In addition to government regulations that apply to them directly, service providers should comply with requirements that apply to their clients. Service providers shouldn't be in the business of doing work that clients are prevented from doing themselves. Yes, this requires learning about clients' regulatory environments.

    Clients should extend their own compliance standards to service providers working for them—if you can't do it, don't hire an outside company to do it for you.

  3. Honor sites' terms of service.
    Whether terms of service are enforceable is a legal question that will eventually be settled, but the strong ethical position is to monitor sites on their terms. If you need to hide your identity or play cat-and-mouse games with site admins, you're in the wrong.

  4. Be transparent in your monitoring.
    Don't conceal your identity, through either technical or non-technical means. Your IP address should map to your company. When using an individual profile to monitor or interact on a site, disclose the individual's affiliation with either the service provider or client.

  5. Respect privacy norms in closed settings.
    Blog monitoring was OK because blogs are publicly available. If an individual login is required and community norms hold that information is to be kept within the community, don't use it. These sites create an expectation of personal privacy that should be respected.

  6. Don't overburden servers with automated requests.
    Sites exist to serve their users, or to reach an audience, or to conduct business. Manage your data collection activities to minimize negative impacts on servers.

  7. Where multiple codes of ethics may apply, observe the more restrictive code.
    Existing codes from other fields may impose extra requirements that still apply. For example, entering a community to observe it is ethnography, which has its own ethical standards.

  8. Be honest with clients.
    Don't make promises that your technology can't keep or present insights that aren't supported by the data. If the client wants something you can't do, admit it. If they want something you won't do (or shouldn't), educate them. As Converseon's list suggests, your ethics protect them, too.

  9. Don't freak out the natives.
    It's not good for your business, anyway. The more people think of what you do as creepy, the more likely you are to face regulatory pressure or other challenges. Besides, it's not nice.
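Guidelines 3, 4, and 6 have straightforward technical expressions. As a minimal sketch (the bot name, URL, and two-second interval are hypothetical illustrations, not recommendations), a collector can check a site's robots.txt rules before fetching, identify itself with a descriptive User-Agent, and throttle its own requests:

```python
import time
from urllib.robotparser import RobotFileParser

# Guideline 4: a descriptive User-Agent that maps back to your company.
# The bot name and info URL here are hypothetical placeholders.
USER_AGENT = "ExampleMonitorBot/1.0 (+https://example.com/bot-info)"


def allowed_by_robots(robots_txt: str, url_path: str,
                      agent: str = USER_AGENT) -> bool:
    """Guideline 3: honor the site's published crawling rules."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url_path)


class Throttle:
    """Guideline 6: enforce a minimum delay between requests to a host."""

    def __init__(self, min_interval_seconds: float = 2.0):
        self.min_interval = min_interval_seconds
        self._last = 0.0

    def wait(self):
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()


robots = "User-agent: *\nDisallow: /private/\n"
print(allowed_by_robots(robots, "/public/post.html"))    # True
print(allowed_by_robots(robots, "/private/notes.html"))  # False
```

None of this settles the ethics question by itself, but a collector that fails these mechanical tests is almost certainly on the wrong side of the guidelines above.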
I've already heard from an industry insider who's concerned about the potential impact of others' privacy violations on his business. He's right to be concerned. Credit card companies and credit bureaus have assembled vast databases from information that consumers can't control. We can be freaked out about it, but we can't do anything about it. Scare enough people about what happens with their information in social media, though, and they could stop using social media altogether (unlike consumer credit).

Do we need an industry standard?
Incidents like the one in yesterday's WSJ, and the attitudes exhibited in some of the quotes in the article, increase the likelihood of government intervention and externally imposed rules. Who'd rather create a clear and relevant ethical standard for the listening business before that happens?

I've already heard that this topic is too sensitive for an open discussion online. If you want to pursue this, let me know, and we can decide on the right venue.

Today's Wall Street Journal had Twitter abuzz about social media monitoring and privacy in closed communities ('Scrapers' Dig Deep for Data on Web). Specifically, it described a social media analysis vendor using individual accounts on a health discussion board to access personally identifiable health information. It's obviously an ethical question, but whose ethics apply? As far as I can tell? Nobody's (yet).

People are sharing personal stuff online, sometimes sharing more than they realize. We need to be careful about how we handle this information, but from what I can see, the ethical standards are just as siloed as the measurement standards. People brought along whatever ethics they subscribed to before they started dealing with social media, but the existing standards don't really cover the new activities.

Think about the different functional roles where you might find companies using social media data:

  • Market research
    Market researchers have strong ethical standards that come from social science research. They get into things like informed consent, but does that really apply to data mining of publicly available data? Do they apply if the data is aggregated, and no personally identifiable information is preserved? What ethical standards apply to desk research?

    Jeffrey Henning wrote about the etiquette of eavesdropping and presented a webinar on consumer attitudes towards social media market research. The short version is that people persist in expecting privacy in their online conversations, despite the public nature of the forums they use. But does their expectation of privacy online translate into an ethical obligation for researchers?

    Update: IMRO and CASRO guidelines may apply to social media research.

  • Public relations
    PR ethics say a lot about being honest and transparent in public statements, and about representing the client and the profession well… but what about the ethics of monitoring and measurement? A recent discussion of ethics in PR measurement suggests that that conversation has only just begun.

  • Marketing
    WOMMA takes strong positions on its members' marketing activities, but the closest it comes to mentioning monitoring or research is when it commits to "promote an environment of trust between the consumer and marketer." Other marketing codes I found had a similar emphasis on outbound marketing over inbound information collection.

    Update: WOMMA also calls for members to "respect the rights of any online or offline communications venue (such as a web site, blog, discussion forum, traditional media, and live setting) to create and enforce its own rules as it sees fit."

  • Customer service
    Is customer service sufficiently organized as a discipline to have its own code of ethics, or does it simply inherit the company's overall standards? I'll bet that any existing ethics codes deal with one-on-one interactions with customers.

  • Human resources
    HR ethics related to personal information center on the categories of information that companies aren't supposed to use in hiring decisions. danah boyd shared some thoughts on regulating the use of social media data in hiring.

  • Strategy/intelligence
    SCIP's code of ethics doesn't commit to much more than obeying the law. Other types of intelligence organizations get some leeway even on that. If you don't want competitors spying on you, your only real defense is to learn about INFOSEC.
Bottom line? I haven't seen an existing code of ethics that applies to monitoring, measuring, or mining social media sources. If you wanted to apply an existing standard, you'd have to decide which one. So, how do you pick? Are the rules determined by:

  • The source of the data?
  • What you do with it?
  • The job title/professional affiliation of the user? What if the labels themselves lack agreed definitions?
  • No ethics, just laws?
  • Nothing—there are no rules?
I have some ideas, which I'll share tomorrow. But first, what do you think? Is there an existing standard that you apply? How did you pick it?

Update: Is it time for Ethical Standards for Listening Vendors?


Photo by Thomas Hawk.

Bruce Schneier's taxonomy of social networking data (via Tim Finin) provides a helpful starting point for thinking about the various ways that personal information finds its way online.

Most of the continuing saga of Facebook's updated terms of service (TOS) has focused on the implications for personal privacy and ownership of personal information and content. I have a different question: how many companies are considering the TOS implications when they use Facebook for marketing campaigns? Are they casually handing over rights to their intellectual property, too?

I group online TOS "agreements" with the shrink-wrap end-user license agreements (EULA) that come with commercial software. They may technically be contracts, but most customers don't read them and don't really agree to them. It's not really possible to read all the agreements that come our way, and in any case, they're not negotiable. When interesting or useful online services offer take-it-or-leave-it terms, most of us take it.

Usually, things work out. In real life—not the world described in TOS and EULA legalese—we are able to function because terms aren't enforced to the limit. Company statements, such as those coming from Facebook this week, tacitly acknowledge that rational management doesn't enforce every right that Legal tosses into license terms. So while it may be possible for Facebook to assert ownership of users' content, they're smart enough to realize that wouldn't be a good idea.

Yes, but...
Commercial contracts, though, should be different. Companies really shouldn't agree to unpleasant terms just because they're hard to read (you have professionals for that task, right?). If the standard TOS makes claims on company content that go too far, they should be negotiated. The question is, are companies really doing that, or are they clicking "accept" and moving along, just like most individual users?

I don't have the answers on this one. I suspect that big brands are negotiating real contracts with Facebook and others, while smaller companies accept the TOS. My parting thought for you is that if your company is getting into social media, your legal folks should pay attention to the terms. If something's not right, fix it before you start. If it can't be fixed—what other ideas were you working on?

Although my wife and I cross out publicity waivers in our child's permission forms, I am not a lawyer. Anything that looks like legal advice here is just my personal opinion.

Copyright is a funny business. When taking words, music or film to market required expensive manufacturing and distribution operations, it was easy to make money by selling them. Now that everything is digital, copies fly around the world on the Internet, reproducing at will. It's harder to maintain the business—in fact, it's hard to justify a business based solely on duplicating and distributing the work of others. And it turns out that, to a copyright business, creative is a noun referring to their product, not an adjective that might apply to business strategy.

If you haven't seen any blogs in the last day or two, you might think that I'm writing about the music or film distribution business, but now it's the Associated Press. In the face of a changing media environment, they've put Legal in charge of strategy.

Last time I heard about AP, they were suing Moreover over the redistribution of headlines and excerpts. Now, they're after bloggers, attempting to define away Fair Use. The new rules come down to this:

  1. Don't quote AP stories, not even a little.
  2. Don't write new headlines based on AP stories.
  3. Don't use AP headlines to link to AP articles.
  4. Unless you're willing to pay.
Never mind that companies don't make the laws, not even in 2008. Oh, wait—maybe they do.

Next, AP Sues Reader for Remembering News
The new position from the Associated Press is contrary to the public interest, not least because unlike AP v. Moreover, they started this round by asserting the primacy of commercial interest over political speech. The importance of free expression in the political sphere is sort of the point of the First Amendment. Public policy aside, though, this is just a bad move by AP, reflecting a complete failure to understand the online environment:

  1. Bloggers are not entirely unaware of copyright law, including fair use (which is not defined by AP). If one blogger calls AP's C&D bluff and it goes to court, expect the defenders of the First Amendment to line up in the blogger's support.

  2. Facts are not subject to copyright. If a blogger writes a new headline for a news event, the blogger owns the copyright.

  3. Links are valuable on the Internet, to the extent that rational businesses pay for links in an attempt to improve their position on search engines. AP should thank bloggers for linking to them with their chosen keywords.

  4. When considering the effect of the use upon the potential market, remember that AP sells advertising on its own website. Part of the impact of excerpts and links on blogs is positive, driving more viewers to AP's ads.
You will recall that I'm not a lawyer, and it has been suggested that AP could have a case. However, if AP wins an actual lawsuit, the Internet will make a point to forget that AP ever existed, creating a tidy lose-lose scenario.

Maybe putting Legal in charge of publicity was a mistake
Legal just isn't good at dealing with bloggers. They're all about protecting the company (their job) at the expense of considering the market's reaction (not their job). But, as usual, someone with a clue about Internet culture should have been involved. It's clear that they weren't this time.

Maybe someone at AP just read an article on that word of mouth marketing thing and decided to get some. Maybe they remembered hearing something about publicity...

There is no such thing as bad publicity...
...and didn't realize that wasn't all.
...except your own obituary.
—Brendan F. Behan
As it stands, AP just gave emerging media a solid reason to prefer Reuters (NYSE:TRI)—or, even better: original sources. What's the purpose of intermediaries in the age of global distribution?

Hint: not one "AP" or "Associated Press" link above leads to the AP site. No point in linking to a company that doesn't like links.

Are you paying attention to The Associated Press v. Moreover Technologies, Inc. et al? I heard about it while interviewing the founder of a different company for the Guide to Social Media Analysis, my reference to the companies who monitor and measure social media. He was telling me that his company provides summaries and links back to original sources, in order to avoid the risk of copyright infringement issues. The interesting thing is, I had just heard from another company that they selected a data vendor specifically because of the full text clips in their feed.

So, what's the deal with aggregating media content for a commercial service? Does blog aggregation with full content feeds violate copyright? Is it a question of fair use (US—fair dealing elsewhere), or is there more to consider? I asked Eric Goldman, Assistant Professor and Academic Director of the High Tech Law Institute at Santa Clara University School of Law, who started by telling me, "the law in the area is complicated, multi-faceted and unclear."

Great. So much for wrapping things up with a tidy stroll through fair use considerations.

In addition to copyright, Goldman suggested these areas of potential concern (the usual disclaimers apply: this is not legal advice; check with your own lawyers):

  • Common law trespass to chattels
  • Computer Fraud & Abuse Act
  • State computer crime laws
  • Contracts
  • Trademark
Scraping web sites for content adds its own complications. Subscribing to RSS or XML feeds may improve things (legally), but then again, it may not. The existence of a feed doesn't necessarily mean that the content is freely available for commercial purposes.
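One practical first step, whatever the legal analysis, is to look for the terms a publisher declares in the feed itself. Here's a minimal sketch (the feed content is hypothetical) using Python's standard library to read an RSS channel's optional copyright element:

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 fragment. The optional <copyright> element is
# one place a publisher can state terms for reuse of the feed.
rss = """<rss version="2.0"><channel>
  <title>Example Blog</title>
  <copyright>All rights reserved; no commercial redistribution.</copyright>
  <item><title>Post one</title><link>https://example.com/1</link></item>
</channel></rss>"""

channel = ET.fromstring(rss).find("channel")
rights = channel.findtext("copyright")

print(rights)
```

An absent copyright element tells you nothing, of course—silence is not a license—which is the point: the feed format can carry declared terms, but it can't substitute for them.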

Still with me?
So far, this is just the US perspective on an inherently international activity. My blog post was threatening to turn into a book, which I'm not even qualified to write (but I might want to read). So, let's go back to the current case that opened the topic, AP v. Moreover, or the Case of the Purloined Press. For extra credit, read the complaint (PDF).

This case isn't about social media monitoring; it's about redistributing traditional media content without a license. But it has similarities to other forms of media monitoring, in that a company is aggregating content for commercial purposes. How can a company avoid trouble while providing commercial content aggregation, and how does this translate to social media content with its millions of independent sources?

Potential solutions
One possible solution is to license the content. It's an established practice with traditional media that sets the terms of use, but it's not practical for decentralized, online media. Excerpting is another potential solution, which is being tested in the current case. The addition of metrics to raw content may help support a fair use/fair dealing argument. But with the unsettled state of the law, solutions are likely to be complicated and unclear, too.

When you get into the wilderness of social media sites, you encounter copyright, Creative Commons and terms of use that vary by site. This could be interesting. Oh, and complicated—and risky. We're not done with this topic, but for now, there's an ongoing case worth watching. I have my search feed running. Do you?

IMHO, IANAL, YMMV. I took an excellent course on communications law in grad school, and I enjoy a good conversation about policy, but I'm not a lawyer. You'd be an idiot to take anything I write or say as legal advice.

Ethics, open sources and CI

Are the ethics of listening culturally specific? A conversation with a European CEO got me wondering about the limits on using information found online, but when I talked to an American lawyer, he suggested that there are few—if any—boundaries about such information. The bottom line in the US? Once it's on the web, it's no longer secret.

The question came up in the context of using social media analysis for competitive intelligence. Should companies look for their competitors' secrets online? The SCIP code of ethics doesn't help much; all it really says on the matter is, "comply with all applicable laws, domestic and international."

So, what's the law? I am not a lawyer (IANAL), so I called one. Richard Horowitz concentrates in corporate, international, and security matters; he also teaches and writes in the area of competitive intelligence and security. He asked that I point out that his opinions here are general observations and should not be taken as legal advice.

A not-so-hypothetical example
I asked Richard about the legal limits on information found online (not including breaking into sites). To make things easier to follow, I used the recent example of an internal Wal-Mart presentation that was posted on Consumerist and later removed in response to a DMCA takedown notice. (If you're interested in the copyright angle on the story, see Jonathan Bailey's comments at Plagiarism Today.)

Consumerist took down the presentation, but it is available through other online sources. The presentation is now in the wild; anyone who wants to find a copy, can. But what of the legal and ethical considerations for companies who would find intelligence value in the presentation?

Our example features four players:

  • Wal-Mart, the company whose material was improperly released.
  • The person who gave the confidential material to Consumerist (the "Leaker").
  • Consumerist, the web-based publisher who received and published the material.
  • Our hypothetical competitor, who may find useful intelligence in the material.
The relevant areas of concern for the leaker and publisher are trade secret law and computer fraud statutes, depending on the details. But we're interested in the competitor who finds the information online.

The usual hypothetical for discovered secrets is a document found in the street, and the recommended practice is to return it to its owner, even if the company could legally keep it. Companies that receive leaked documents directly also tend to return them to their owners, in part because trade secret law addresses that situation.

Published secrets are not secret
Our example is different, because the information has been published. Generally speaking, information that has been published—and thus made public—loses its trade secret protection. If published information isn't a trade secret, trade secret law doesn't govern its use by any member of the public, including a competitor.

Richard didn't see serious legal risks that would prevent companies from using confidential information that has been published and is consequently now public. (But you'll ask your own lawyer if you need actual legal advice.)

Ethical considerations
The SCIP code focuses on issues like conflict of interest and honesty, which don't inform the practice of gathering intelligence from open sources online. With the legal concerns out of the way, the ethics question seemed to be settled, too.

I asked Richard if he could see an ethical argument against using the information, which led to a discussion of the Prisoner's Dilemma. Essentially, there's no reason to think that other competitors will refrain from using the information, so why would one choose to avoid it?

Different cultures, different laws?
After reading some papers from SCIP and talking to Richard, I was getting a clear picture that companies in the US can probably use confidential information they get from open sources online.

Still, there was that discussion with the European CEO. He felt very strongly that he should not be snooping for information leaked from competitors on behalf of his clients. Are there stronger trade secret laws in Europe, or is this a cultural thing? I know a little (very little) about Europe's privacy laws. Is there something beyond social norms that presents legal or ethical constraints on the use of secrets found online?

I'm still looking for informed opinions to address that one.


The ethics of listening

Ethics is such a fun subject, or perhaps it keeps coming up because we're not quite sure which rules apply. Mom's rules (be honest, be nice) don't seem adequate in the commercial sphere, and so we have ethics guidelines. Lots to choose from, actually, depending on who you are and what you're up to. As it turns out, even reading blogs can have ethical implications.

The ethics of writing
Usually when people talk about ethics and social media, they're talking about writing, or creating, online content. Around the time that flog entered the lexicon as a contraction of fake blog, the Word of Mouth Marketing Association (WOMMA) came out with their ethical blogger contact guidelines, and we all talked about ethics for a while. Last month, the UK's Chartered Institute of Public Relations (CIPR) published their own social media guidelines—wordier than WOMMA's list and with a different slant, but another good source. Flogs, by the way, are still on the naughty list.

Bloggers may or may not have ethical standards, too, depending on who you ask. Reach back in time, and you'll find the CyberJournalist Bloggers' Code of Ethics (2003), although it's clear that not all bloggers are journalists. Caveat lector is the general rule, though some bloggers spell out their own personal codes of conduct.

The occasionally Wild West character of the online universe inspired the discussion of PR ethics and Wikipedia. The guidelines may be a little vague, and the enforcement uneven, but the warning signs are clear.

The ethics of listening
Listening to social media is one of my pet themes, because I'm convinced of the value that people and companies can find online. Listening online, like speaking online, takes many forms, from simple web browsing to high-end social media analysis. What they have in common is that you can collect useful information for a variety of purposes from open sources.

As it turns out, listening has ethical boundaries, too. Maybe.

Katie Paine reported some of Don Wright and Michelle Hinson's research from the Summit on Measurement, including this challenging bit:

While in 2005 79% thought employee blog monitoring was ethical, in 2007 only 27% saw it as ethical.
So even reading publicly available content is questionable—or at least debatable—under some circumstances. There was a related discussion in the HR/recruiting blogosphere last summer over the limits on using information from social media in hiring. The emerging consensus seemed to be that companies should be careful about how much information they collect, but that job candidates should be equally careful with what they leave for employers to find.

In talking with a social media analysis vendor today, I was reminded that the Society of Competitive Intelligence Professionals (SCIP) has a code of ethics, which can come into play when companies use data mining for competitive intelligence. But that brief code provides no direct guidance on the limits on intelligence gathering from open sources. A CIPR-style note would be useful.

It seems appropriate that some information really shouldn't be collected, even if it is readily available online. Because listening to social media works in multiple functional roles, we're going to see different standards—or at least different standards keepers—for those groups. Marketing and PR have some ideas. HR is thinking about it. Is CI next? Who else needs to update their standards for the new tools?

Update: Here's a legal view from the US.

About Nathan Gilliatt

  • Voracious learner and explorer. Analyst tracking technologies and markets in intelligence, analytics and social media. Advisor to buyers, sellers and investors. Writing my next book.
  • Principal, Social Target