Time to Change: Disruptive Innovation Panel at Xerox PARC

I had the opportunity to participate in a panel recently called “Time to Change – Culture and Brand Disruption Leading to Innovation.” It was hosted by Xerox PARC as part of their PARC Forum series, and featured Todd Wilms of SAP (@toddmwilms), Michela Stribling of IBM (@mstribling) and Bryan Kramer of PureMatter (@bryankramer).

I probably shouldn’t admit this, but every time I hear the phrase “Time to Change,” it reminds me of a famous Brady Bunch episode, in which Peter’s voice begins to change, just as the family is about to record the classic “Sunshine Day.”

Poor Peter; he’s so embarrassed. But Peter’s disruption sparks Greg’s idea, which is to write the now-classic “Time to Change,” which celebrates Peter’s challenge and turns it into an opportunity.

So, drawing from both the lofty inspiration from the PARC panel and the more earth-bound lessons of The Brady Bunch, here are five quick points on disruptive innovation:

  1. Disruption is supposed to be painful. It challenges people, processes and established plans. But that disruption is a signal–like pain in the body–that we need to attend to the underlying issues.
  2. It takes time. For example, we expect, less than a decade into social business, that it should be broadly accepted, mature and scalable. But it takes years–decades sometimes–for us to truly understand the characteristics of new technologies and media types; if you doubt this, take a look at this article published last year in MIT Technology Review.
  3. It requires rigor and discipline. Just because something is new doesn’t mean it requires less rigor to adopt. In many cases, it requires more clarity, as my colleague Charlene Li so clearly stated in Open Leadership. That means bounded experiments with hypotheses, and clear business plans with anticipated returns and outcomes.
  4. Learning is and should be a desired outcome. The reason trends are disruptive is that we don’t yet know how to unlock their value. How else can we discover that without experimenting and learning from the results? We need to act less like managers and more like scientists.
  5. Stupid ideas can be brilliant, or inform other, better ones. The US Patent Office is full of failed ideas that became celebrated innovations or provided unexpected insight. A few years ago, Netflix tried to split its business, a wildly unpopular decision. Now they’re creating award-winning programming.

Here’s the video:

For more on the panel, see Michelle Killebrew’s excellent recap in Click Z.

For more on what’s going to disrupt us in 2014, see Charlene’s post on trends to watch.

As always, I welcome your thoughts.


Posted in Altimeter, Innovation, Social media | 3 Comments

2014: The Year of Data Disruption

Linguist Geoff Nunberg’s annual “Word of the Year” posts offer an instructive peek into the American psyche. In 2012, he chose “Big Data”; in 2013, his pick was (no, not “twerk”) “selfie.” Nunberg makes his selections based on dominant news stories, or on words that he believes tell us something important about the culture at a particular point in time. What appeals to me about the 2012 and 2013 choices is that they illustrate the increasing tension between our fascination with data and our profound unease at its implications.

This plays out from pop culture to organizational culture, from The Economist to TMZ. In 2014, I’ll be looking at the increasing tension in several areas, as technology continues to overtax our ability to understand data (sentiment, video and image analysis), assimilate it (filter failure), act on it (business disruption) and define rules and ethics around it (security and privacy). Here’s what I’ll be thinking about throughout the year:

1. Data Diversity Requires Diversity of Expertise

The biggest “Big Data” challenge will continue to be the sheer variety of data types. Large brands want to know when their products or logos are used on the social web. Sentiment analysis, image recognition in both still and moving images, as well as text-to-speech and speech-to-text will continue to confound technologists, until and unless they more aggressively include linguists, social scientists, even neuroscientists, in their R&D processes. That isn’t to say that will solve everything, but as we bring technology and human communication closer together, it stands to reason that we need a far more multidisciplinary approach to understanding signals.
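
To make the confounding concrete, here is a deliberately naive, lexicon-based sentiment scorer (a toy Python sketch; the word lists and example posts are invented for illustration, not any vendor’s method). It handles the easy case, but scores sarcasm exactly backwards–the kind of failure that makes linguists and social scientists relevant to R&D:

```python
# A deliberately naive lexicon-based sentiment scorer. The toy word
# lists below are invented; real systems struggle with sarcasm,
# negation and context precisely because keyword matching ignores them.
POSITIVE = {"love", "great", "awesome", "fantastic"}
NEGATIVE = {"hate", "terrible", "awful", "broken"}

def naive_sentiment(text):
    """Return +1 (positive), -1 (negative) or 0 by counting keywords."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return (score > 0) - (score < 0)

print(naive_sentiment("I love this phone, the camera is great"))        # prints 1
# Sarcasm reads as positive -- exactly backwards:
print(naive_sentiment("Oh great, my phone is broken again. Awesome."))  # prints 1
```

The second post is clearly a complaint, yet two “positive” keywords outvote one “negative” one. No amount of extra compute fixes that; understanding the signal does.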

2. Clean Data is Happy Data 

With multiple data types comes increased demand for consistent interpretive standards, particularly as the need to view disparate data sets in tandem increases. We’ve seen the challenges of this with text-based social data but have not even scratched the surface for other data types, or the impact when they are viewed in conjunction with other data sets. Consistent sourcing, transparent methodology and interpretive standards will become a must-have for 2014. It may not be sexy, but it’s mission-critical.

3. Machine Learning is Table Stakes

The ability to deliver ever-more massive and heterogeneous data streams from devices, enterprise and social apps and other sources–often in real time–will place increasing pressure on organizations. Rather than continuing to segregate analysts, hand-code posts and manually interpret these data sets, organizations will need to treat machine learning as an expectation rather than an exotic and costly addition to data analysis tools. We’re not talking Scarlett Johansson in “Her” (sorry, folks), but rather the ability to infuse learning into data processing technologies to reduce filter failure, improve relevance and move to higher-order analysis–at scale.
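
As a sketch of what “infusing learning” can mean at its simplest, here is a tiny multinomial Naive Bayes classifier in pure Python that routes posts to categories instead of hand-coding them. The training posts and labels are invented; a production system would use a real ML library and far more data, but the mechanics are the same:

```python
import math
from collections import Counter, defaultdict

class TinyNaiveBayes:
    """Minimal multinomial Naive Bayes: label a seed set of posts by
    hand once, then let the model route the rest of the stream."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()
        self.vocab = set()

    def train(self, labeled_posts):
        for text, label in labeled_posts:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_logp = None, float("-inf")
        for label in self.label_counts:
            logp = math.log(self.label_counts[label] / total_docs)
            n_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Laplace smoothing so unseen words don't zero out a label.
                count = self.word_counts[label][word] + 1
                logp += math.log(count / (n_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

posts = [
    ("my order never arrived please help", "service"),
    ("refund still missing help please", "service"),
    ("love the new feature great update", "praise"),
    ("great product love it", "praise"),
]
clf = TinyNaiveBayes()
clf.train(posts)
print(clf.classify("please help with my refund"))  # prints service
```

Even this toy version makes the economics obvious: the marginal cost of classifying the millionth post is effectively zero, where hand-coding scales linearly with headcount.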

4. Data is the New Disruption

As data makes its way around increasingly permeable organizations, we’ll see waves of disruption follow in its wake. Corporate initiatives can spark quite a bit of controversy over “who owns it” and “who funds it,” and data is so elemental to organizational culture and operations that these questions will predominate. The next wave of questions–“who gets to see it, interpret it and administer it?”–will only increase the need for direct, timely and clear agreements and governance as these data streams become business critical.

5. Contextual Privacy: the Useful/Creepy Conundrum

There has been too much of an inclination to treat privacy as a one-size-fits-all proposition, but what we are learning is that the complexity of data gathering and data sharing makes privacy a highly situational concept. Thinkers like danah boyd deeply understand the contextual nature of privacy, and how one small adjustment can erode–or even build–trust. I’ll be focusing on this in 2014, with an emphasis on helping organizations and technology developers deliver relevant experiences without undermining the social contract between individual and organization.

This is one in a series of posts on Altimeter Group’s 2014 research focus. For more from my colleagues on what they’re planning for the year, please click here.


An Industry Association for Social Data: The Big Boulder Initiative

A few weeks ago, I had the opportunity to participate in a working session of The Big Boulder Initiative, an industry association founded to promote understanding and development of the emerging social data market.

It’s been an eventful week in the industry: Topsy was acquired by Apple earlier this week, and DataSift raised an impressive $42 million in its Series C round of funding. With increasing momentum comes increasing complexity, and The Big Boulder Initiative has been convened to identify, prioritize and begin to address the most pressing technology, business and consumer concerns affecting the future of the social data industry.

Here’s the video summarizing the event:

The Big Boulder Initiative from Gnip on Vimeo.

The issues we discussed were:

  • Privacy, Trust & Regulation
  • ROI & Value
  • Data Access
  • Data Standardization
  • Cost of Data
  • Data Quality & Validity

Here is a summary of what was discussed and agreed to at the meetings in Seattle, New York, Washington D.C. and San Francisco.

I’m honored to announce that I’ve been elected to the board, along with some of the most knowledgeable and thoughtful folks in the industry.

In the interest of transparency, I should state that while Gnip convened this group and Chris Moody, CEO of Gnip, is interim chairman, the first order of business for the board will be to elect a new chairperson from among the members and actively recruit more members–explicitly including direct competitors–to join the initiative. [Update from Chris: his role is currently interim but he intends to throw his hat in the ring for full-time chair.  The decision will be made by the board as a whole.]

The goal is for this to be an association that serves the industry rather than individual companies or agendas.

The first board meeting will occur in  January; in the meantime, if you have questions, would like to participate or would like to be included in future communications, please contact Chris Moody at Gnip.

I’m honored to be working with such an impressive group of people and look forward to rolling up my sleeves in 2014. More to come!


Social Data Market Momentum: It’s Not About the Firehose

In the past year, social data has continued to wend its way into organizations of all types, from large enterprises to small businesses to media and entertainment and the public sector. We’ve seen use cases extend well past marketing into product and service quality, entertainment programming, customer service, fraud detection and a host of other areas.

Yet the idea of social data as an asset that requires real enterprise rigor–quality control, curation and integration with other data sources–is still nascent.

This week, Apple purchased Topsy, one of Twitter’s certified partners and a company that both resells and analyzes Twitter data. The acquisition had many scratching their heads initially, but a quick review of Apple’s acquisitions this year includes, according to AppleInsider, at least two companies with complementary technologies: AlgoTrim, a Swedish data compression company, and Matcha.tv, a second-screen startup. The combination of data compression, social data analysis and predictive capability suggests intriguing potential applications in personalized recommendation, whether in iTunes, radio, TV or some other medium not yet revealed.

While Apple’s acquisition arguably takes Topsy out of the social data reseller business, the $42 million in Series C funding raised by DataSift today demonstrates that the business of social data is gaining serious momentum. But this market, as it’s evolving, is not just a game of “Capture the Firehose”; it’s about taking this enormously complex, rich and challenging data set and turning it into insight that real people in real organizations can act on. It’s not about the firehose; it’s not even about the water. It’s about the fires the water can put out, and the things it can cause to grow.

This small collection of companies, which now effectively includes DataSift, Gnip and NTT Data in Japan, is forming the embryo of a market that will, for the first time, enable organizations to incorporate the customer’s voice–the raw, the spontaneous, the immediate–as a legitimate input into organizational decision-making. This is not a simple proposition: it requires tremendous expertise in big data processing and an ecosystem to promote growth and experimentation and leadership, among many other things. And, as Gnip has clearly understood, it requires educating the market as to the challenges and opportunities of social data.

All of these companies, in their different ways, have played a critical role in getting us to the starting line for social data. Now that 2013 is coming to a close, and 2014 is about to be upon us, I predict the following:

  • More demand among organizations for “enterprise ready” social data streams
  • Experimentation with new use cases for social data
  • Collaboration between IT and marketing as social data becomes a more valued enterprise asset
  • Less emphasis on the “social” aspect of social data; after all, it is simply the most authentic, vivid and vast collection of the voices of customers, partners, consumers, investors and the community at large
  • Acceptance of social data as a valued enterprise asset
  • A greater emphasis on social data ethics, compliance and best practices

Congrats to Topsy and DataSift on their news this week. More to come.


From Shopping Carts to Poisoned Names, Every Data Point Tells a Story

Every so often, I’d like to profile someone who’s doing interesting things with data. Meet Hilary Parker of Etsy (yes, that’s her in the photo).

While at Strata & Hadoop World last week, I had the chance to attend Ignite, a pecha-kucha-like event in which speakers present one idea, on twenty slides, in five minutes (no pressure). One of my favorite talks was by Hilary Parker, a data analyst at Etsy with a Ph.D. in biostatistics, who spends her days trying to understand how people use Etsy, guiding experiments, consulting with development teams and generating new hypotheses for further investigation.

One example of the types of questions Parker tries to answer is whether new features are performing as expected (do they increase conversions?) or whether they are causing other, unanticipated outcomes. The goal, essentially, is to get at the root of the user’s behavior: how she’s interacting with the website, and whether that’s different from what the team expected. It requires an open mind and a lot of curiosity, perseverance and attention to detail, not to mention some serious statistical modeling skills.

For example, one of the metrics that ecommerce companies like to measure is average shopping cart value, compared to average order value. If your shopping cart value (let’s say $250) consistently exceeds your actual order value (let’s say $75), that means that items are being added to carts but are being removed before checkout. Why would that be?

One possible reason, Parker posits, could be that people are adding items to the cart to bookmark them for later viewing. Another one (and one that I am personally guilty of) is inadvertently adding the same item multiple times. Either way, the end result should be a user interface change; perhaps to add a way to bookmark items or, in my case, alert me that I am about to spend the equivalent of the national debt on a standing army of home appliances.
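
The cart-versus-order arithmetic, and the duplicate-item hypothesis, are easy to sketch. Here is a minimal Python example over hypothetical session data (the item names and prices are invented for illustration):

```python
# Hypothetical session logs: items added to the cart vs. items bought.
sessions = [
    {"cart": [("lamp", 120), ("rug", 80), ("vase", 50)],
     "order": [("vase", 50)]},
    {"cart": [("mixer", 200), ("mug", 30), ("mug", 30)],  # "mug" added twice
     "order": [("mixer", 200)]},
]

def total(items):
    return sum(price for _, price in items)

avg_cart = sum(total(s["cart"]) for s in sessions) / len(sessions)
avg_order = sum(total(s["order"]) for s in sessions) / len(sessions)
abandoned = avg_cart - avg_order  # value left behind at checkout

# Flag sessions where the same item appears more than once -- the
# accidental-duplicate hypothesis.
def has_duplicates(items):
    names = [name for name, _ in items]
    return len(names) != len(set(names))

dup_count = sum(1 for s in sessions if has_duplicates(s["cart"]))
print(avg_cart, avg_order, abandoned, dup_count)  # prints 255.0 125.0 130.0 1
```

Run against real session logs, those three numbers–cart value, order value and duplicate rate–are enough to choose between the bookmarking feature and the duplicate-item alert.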

Hilary’s talk at Strata, intriguingly entitled “Hilary: The Most Poisoned Baby Name in US History,” documents her investigation into the popularity (or extreme lack thereof) of her given name. As a seven-year-old in 1992, she suddenly found herself being teased by other children, who called her “Hillary Clinton.” Later, in college, she Googled her name and came across a blog post that said that Hilary was the most poisoned baby name, meaning that it had been severely undermined by the unpopularity of the then First Lady.

So she got curious. Earlier this year, Parker decided to perform her own analysis using data from the Social Security Administration, which initially revealed that Hilary was, in fact, only the 6th most poisoned baby name. So I asked Hilary what made her suspicious that this wasn’t telling the whole story. Her answer: “the names, for one. It was a somewhat peculiar list.”

To wit: numbers one through five were, in order, Farrah, Dewey, Catina, Deneen and Khadija. So she decided to graph the data to see what was going on. Once she could visualize the data, she says, “I saw a crazy pattern. I started Googling the names and seeing why they were popular.” You can probably guess why and when Farrah became popular. Khadija was a no-brainer for me, as I clearly remember Queen Latifah in that role on the sitcom “Living Single” from 1993-1998. The rest you’ll have to read for yourself on Hilary’s blog.

But, says Parker, “‘Hilary’…was clearly different than these flash-in-the-pan names. The name was growing in popularity (albeit not monotonically) for years.” So she decided to re-run the analysis using only names that were in the top 1000 for more than 20 years, and updated the graph accordingly. Here’s what she found:


So, says Parker, “I can confidently say that, defining ‘poisoning’ as the relative loss of popularity in a single year and controlling for fad names, ‘Hilary’ is absolutely the most poisoned woman’s name in recorded history in the US.”
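
Parker’s two-step method–filter out fad names, then rank by the largest single-year relative loss–can be sketched in a few lines of Python. The name counts below are invented toy data, not the SSA data she actually used, and the 20-year filter here is a simplified stand-in for her top-1000-for-20-years criterion:

```python
# Toy name counts by year (invented). "Fadina" has the biggest raw drop,
# but only the long-lived name should survive the fad filter.
counts = {
    "Fadina":  {1976: 90, 1977: 5000, 1978: 400},
    "Hilaria": {y: 800 + 10 * (y - 1960) for y in range(1960, 1993)},
}
counts["Hilaria"][1993] = 150  # sharp single-year collapse

def worst_single_year_drop(series):
    """Largest relative year-over-year decline, as a fraction of the prior year."""
    years = sorted(series)
    drops = [
        (series[prev] - series[cur]) / series[prev]
        for prev, cur in zip(years, years[1:])
        if series[cur] < series[prev]
    ]
    return max(drops, default=0.0)

def is_fad(series, min_years=20):
    # Simplified proxy for Parker's filter: in the data for 20+ years.
    return len(series) < min_years

ranked = sorted(
    (name for name, series in counts.items() if not is_fad(series)),
    key=lambda name: worst_single_year_drop(counts[name]),
    reverse=True,
)
print(ranked[0])  # prints Hilaria
```

“Fadina” actually has the larger one-year collapse (92 percent), which is exactly why the fad filter matters: without it, flash-in-the-pan names dominate the ranking, as they did in Parker’s first pass.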

I love this experiment because it shows the value of following hunches. It also shows the beauty of visualization for large data sets. As Parker says, “statistics is as much an art as it is a science.”

You can read her full analysis here.

Her slides from Ignite are here.

You can ask her about the roller derby photo yourself.

Find her at hilaryparker.com.


Social Data Intelligence: Survey Says

Back in July, when we published Social Data Intelligence, we were curious to discover how organizations would rank themselves using the criteria in our maturity map. How many companies are in the “ad-hoc” stage? How many consider themselves to be “formalized”? Who’s integrating social data with enterprise data? And has anyone reached the nirvana of the “holistic” category?

Several of my colleagues just completed Altimeter Group’s Digital Strategy Survey for Q3 2013, and there are some interesting findings. The one I’d like to discuss today concerns social data maturity, because even as a self-reported finding it gives us some insight into organizations’ progress and aspirations when it comes to social data.

As a refresher, here’s the maturity map from Social Data Intelligence:


No Judgment

I do want to emphasize that this is a “no judgment” maturity model: as I’ve said many times, the path to social data maturity is complex and rife with organizational and technical challenges. Each of these stages has value, from organizational learning in the first, to rigor about business outcomes in the second, to a more organization-wide view in the third, to scale in the fourth. They all have value and they all contribute–even if that contribution is hard-won–to organizational transformation around data.

Survey Says…

So, now that that’s out of the way, how did our survey respondents stack up?


No big surprises here: the majority of companies we surveyed fall into the “ad-hoc” category, with 29 percent in “formalized,” 11 percent in “integrated,” and 5 percent in “holistic.” To be honest, I want to drill into the self-reporting at the holistic stage, simply because the tools to facilitate scale (the key criterion) are still quite nascent. But that’s less important than the fact that, yes, we’re mostly learning how to do this and how to operationalize it–from a business, process and technical standpoint.

When I look at the chart above, I see two things coming up fast:

  1. A wall of blue water. Call it what you will–blue water, green field–but I’m speaking to companies daily (Ekho and Informatica most recently) that are tackling this integration challenge in different ways, seeking to facilitate the integration of social and other enterprise data and, most salient to business people, to take a lot of the manual labor and interpretive squish out of the process. From a market perspective, expect more middleware players to articulate how they can become force multipliers in the social and big data universe.
  2. Social data makes strange bedfellows. I’ve worked in marketing organizations and I’ve worked in IT organizations, and I can tell you this: these communication challenges are nothing new. But now more than ever, IT and marketing need to find a common language to instill technical rigor into business planning, and business context into technology planning. IMO, there is no other option as social data, and other big data types, take up residence in enterprise organizations. We’ve heard a lot about the “consumerization of IT,” but it works both ways: technology is driving business strategy too, and it has to, because of the complexity of the problems that need to be solved. There is no magic dashboard.

So this is why big data is so–that word again–disruptive. It really is changing organizational processes and decision-making and culture. The challenge, with apologies to Jimmie Dale Gilmore, is to decide whether you’re just the wave, or you’re the water.

Thanks to Jess Groopman, Christine Tran and the Altimeter team for fielding the research for the Digital Buyer Survey.


The Emerging Social Data Ecosystem

It’s Social Data Week, and I spent Monday at DataSift’s San Francisco conference. Like Big Boulder (which is produced by Gnip and is now entering its third year), Social Data Week is focused on the emerging dialogue around social data, its stakeholders, challenges, opportunities, use cases, best practices and, most critically, its emerging ecosystem.

To some degree, these recent conversations around social data remind me of food. (Stay with me; I have a point.) It’s hard to throw a rock in San Francisco these days without hitting a restaurant whose menu gives as much attention to its sources (Dirty Girl tomatoes, Star Route arugula, Point Reyes blue cheese) as it does to its preparation. And it’s in response to customer demand; today, many of us want to know where our food comes from, what’s in it, and, as importantly, what isn’t.

For business, the provenance of social data is becoming critically important because social media has proliferated across the enterprise.

Consider this: social networks (Facebook, Twitter, Tumblr, LinkedIn, Pinterest, etc.) collectively generate billions of interactions every day. The same goes for social software platforms such as Lithium and Jive that cater to specific customer or community groups, and for enterprise collaboration platforms such as Chatter, Yammer and Socialcast. The data generated can be a post, a tweet, a share, a like, a comment. Some of it is structured, some is not. It’s an enormous data set, and it’s being created almost entirely outside the organization’s walls–and control.

Then there are social applications such as listening (Salesforce/Radian6, NetBase, Sysomos), social media management (Spredfast, Hootsuite, Sprinklr) and publishing platforms (Salesforce/Buddy Media, OfferPop, Wildfire/Google), which use that data to provide specific capabilities. And–as we saw in Social Data Intelligence–enterprise applications such as CRM, business intelligence and market research are endeavoring to integrate this social data to deliver better customer experiences and make better-informed decisions.

Here’s a very simple (and by no means exhaustive) representation of how the nascent ecosystem around social data is shaping up:

Screen Shot 2013-09-18 at 11.53.38 AM

Now that social data is becoming business-critical (hundreds of case studies by Altimeter and others illustrate this point), it must become enterprise-class.

This means that business people who rely on social data need to take a page from the sustainability movement and–for the health of their organization–make the effort to understand where their data comes from, and what that means for the quality of the downstream decisions it will inevitably inform.

Social Data Sources

This is a very basic summary of the various sources of social data. Note that here I’m talking only about the sources versus the tools that use social data, such as listening, publishing, engagement or analytics platforms.

Source: Directly from the social network, via public API

API stands for “Application Programming Interface.” In the simplest terms, an API is a set of rules that govern the way software programs communicate with each other. In the social data world, this is important because it standardizes the way applications access data from social networks. For example, Twitter has a public API that enables developers to access approximately one percent of the Twitter “firehose” (every single tweet, delivered at or near real time).

Facebook recently opened its public API to a small set of initial partners, but it is not widely available. Pinterest does not yet have a public API.

Considerations:

  • Not all social networks provide public API access, and the amount and type of data available varies among social networks.
  • The complexity of managing multiple data sources makes scalability a challenge.
  • Limited data access may distort findings. For example, because the public Twitter API typically includes 1% of the full firehose of data, niche or B2B brands may not be able to detect sufficient conversation volume to make informed decisions.
  • Rules around API use change, which can make it challenging for developers to build and maintain apps that use that data.

Source: Directly from the social network, via full firehose access

Full firehose access directly from a social network delivers every single social post and action created on that platform. Not every social network offers full firehose access, and those that do admit it is rare and costly. For example, Twitter provides full firehose access to a limited group of partners, but cautions developers that it is hard to come by and quite expensive. For privacy reasons, the same would not be possible with Facebook (at least for data that is private or falls under the category of personally identifiable information, aka PII), so the samples are by nature quite different.

Source: Via a social data platform or provider

Key players: DataSift, Gnip, Topsy Labs. These companies resell access to data from social networks. They support different sets of social networks, are built on different technologies and provide varying types and levels of data access. The most important distinction from public API and firehose access, however, is that they provide a level of consistency, as well as value-added services such as filtering, URL expansion and access to historical data.

Considerations:

  • Data quality and consistency
  • Single source of social data and standard formats, which reduce complexity
  • Breadth (social networks supported) and depth (public API versus firehose) of data sources. For example, Topsy is Twitter-only.
  • Type of data access provided: percentage-based, keyword-based, or both
  • Availability of historical as well as real-time data to enable time-based comparisons
  • Availability and fit of value-added and professional services
  • Ease of purchasing and working with the vendor versus with individual social networks
  • Cost and pricing model

Source: Data/screen scraping

A technique used to extract data from websites. Prone to error, not to mention potential ethical and legal repercussions. See this useful article in ReadWriteWeb and this one in the New York Law Journal for a fuller explanation of the tradeoffs and potential ramifications of data scraping.
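
The sampling caveat above–that a roughly 1% feed can starve niche brands of signal–is easy to quantify by treating a uniform sample as a binomial draw. A back-of-the-envelope sketch in Python (the daily mention volumes are invented for illustration):

```python
import math

def sample_stats(total_mentions, sample_rate=0.01):
    """Expected mention count in a uniform sample, plus relative noise
    (standard deviation / mean) of the binomial sampling process."""
    expected = total_mentions * sample_rate
    std = math.sqrt(total_mentions * sample_rate * (1 - sample_rate))
    return expected, std / expected

# A mass-market brand vs. a niche B2B brand (volumes invented).
for brand, mentions in [("mass-market", 200_000), ("niche B2B", 500)]:
    expected, rel_noise = sample_stats(mentions)
    print(f"{brand}: ~{expected:.0f} sampled mentions, "
          f"about {rel_noise:.0%} relative noise")
```

At 200,000 daily mentions, a 1% sample still yields about 2,000 data points with only a couple percent of sampling noise; at 500 mentions, the expected sample is roughly five tweets with nearly 45% relative noise–far too thin to support confident, volume-based decisions.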

The ecosystem around social data is just starting to take shape, and I’m happy to see it developing alongside a substantive industry conversation–with application developers, social networks, social data providers, data scientists, end users, academics, analysts and others–about this critical business asset.

I’ll continue to think and write about the evolving social data landscape, so please feel free to comment, disagree, or add any additional perspective you have. I’ll also link to substantive discussions below, as I usually do.
