From Shopping Carts to Poisoned Names, Every Data Point Tells a Story

Every so often, I like to profile someone who’s doing interesting things with data. Meet Hilary Parker of Etsy (yes, that’s her in the photo).

While at Strata & Hadoop World last week, I had the chance to attend Ignite, a pecha-kucha-like event in which speakers present one idea, on twenty slides, in five minutes (no pressure). One of my favorites was by Hilary Parker, a data analyst at Etsy and Ph.D. in biostatistics who spends her days trying to understand how people use Etsy, guiding experiments, consulting with development teams and generating new hypotheses for further investigation.

One type of question Parker tries to answer is whether new features are performing as expected (do they increase conversions?) or whether they are causing other, unanticipated outcomes. The goal, essentially, is to get at the root of the user’s behavior: how she’s interacting with the website, and whether that’s different from what the team expected. It requires an open mind and a lot of curiosity, perseverance and attention to detail, not to mention some serious statistical modeling skills.

For example, one of the metrics that ecommerce companies like to measure is average shopping cart value, compared to average order value. If your shopping cart value (let’s say $250) consistently exceeds your actual order value (let’s say $75), that means that items are being added to carts but are being removed before checkout. Why would that be?

One possible reason, Parker posits, could be that people are adding items to the cart to bookmark them for later viewing. Another (and one that I am personally guilty of) is inadvertently adding the same item multiple times. Either way, the end result should be a user interface change: perhaps a way to bookmark items or, in my case, an alert that I am about to spend the equivalent of the national debt on a standing army of home appliances.
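For the numerically inclined, here is a minimal sketch of the cart-versus-order comparison, using the illustrative $250 and $75 figures from above. The function name `cart_abandonment_gap` is my own invention for this post, not anything Etsy uses.

```python
def cart_abandonment_gap(avg_cart_value, avg_order_value):
    """Return the fraction of average cart value that never reaches checkout.

    Hypothetical helper: the name and the choice of a simple ratio are
    assumptions made for illustration only.
    """
    if avg_cart_value <= 0:
        raise ValueError("average cart value must be positive")
    return (avg_cart_value - avg_order_value) / avg_cart_value


# Illustrative figures from the example above, not real Etsy data.
gap = cart_abandonment_gap(250.0, 75.0)
print(f"{gap:.0%} of cart value is dropped before checkout")
```

A large gap by itself doesn't say why items are being removed; it only flags that there is something worth investigating.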

Hilary’s talk at Strata, intriguingly titled “Hilary: The Most Poisoned Baby Name in US History,” documents her investigation into the popularity (or extreme lack thereof) of her given name. As a seven-year-old in 1992, she suddenly found herself being teased by other children, who called her “Hillary Clinton.” Later, in college, she Googled her name and came across a blog post that said Hilary was the most poisoned baby name, meaning that it had been severely undermined by the unpopularity of the then First Lady.

So she got curious. Earlier this year, Parker decided to perform her own analysis using data from the Social Security Administration, which initially revealed that Hilary was, in fact, only the 6th most poisoned baby name. So I asked Hilary what made her suspicious that this wasn’t telling the whole story. Her answer: “the names, for one. It was a somewhat peculiar list.”

To wit: numbers one through five were, in order, Farrah, Dewey, Catina, Deneen and Khadija. So she decided to graph the data to see what was going on. Once she could visualize the data, she says, “I saw a crazy pattern. I started Googling the names and seeing why they were popular.” You can probably guess why and when Farrah became popular. Khadija was a no-brainer for me, as I clearly remember Queen Latifah in that role on the sitcom “Living Single” from 1993-1998. The rest you’ll have to read for yourself on Hilary’s blog.

But, says Parker, “‘Hilary’…was clearly different than these flash-in-the-pan names. The name was growing in popularity (albeit not monotonically) for years.” So she decided to re-run the analysis using only names that were in the top 1000 for more than 20 years, and updated the graph accordingly. Here’s what she found:

[Graph: Parker’s updated chart, restricted to names that were in the top 1000 for more than 20 years]

So, says Parker, “I can confidently say that, defining ‘poisoning’ as the relative loss of popularity in a single year and controlling for fad names, ‘Hilary’ is absolutely the most poisoned woman’s name in recorded history in the US.”
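To make that definition concrete, here is a rough sketch of how such a ranking might be computed. It is not Parker’s code (her own analysis is on her blog), and I am assuming that “popularity” means the share of babies given a name in a year. The function names, the data layout and the toy numbers are all made up for illustration; none of them come from the Social Security Administration data.

```python
def biggest_one_year_drop(shares):
    """Return (relative_loss, year) for the largest single-year drop.

    `shares` maps year -> fraction of babies given the name (an assumed
    definition of popularity). Only consecutive years are compared.
    """
    best = (0.0, None)
    years = sorted(shares)
    for prev, curr in zip(years, years[1:]):
        if curr - prev != 1 or shares[prev] <= 0:
            continue  # skip gaps in the record and undefined ratios
        loss = (shares[prev] - shares[curr]) / shares[prev]
        if loss > best[0]:
            best = (loss, curr)
    return best


def most_poisoned(names, min_years=20):
    """Rank names by their largest one-year relative loss, worst first.

    Names in the top 1000 for `min_years` or fewer are treated as fads
    and skipped, mirroring the "more than 20 years" filter in the post.
    """
    ranked = []
    for name, record in names.items():
        if record["years_in_top_1000"] <= min_years:
            continue
        loss, year = biggest_one_year_drop(record["shares"])
        ranked.append((loss, name, year))
    return sorted(ranked, reverse=True)


# Toy data, entirely made up: not real baby-name figures.
TOY_NAMES = {
    "Steady": {"years_in_top_1000": 40,
               "shares": {2000: 0.0010, 2001: 0.0009, 2002: 0.00085}},
    "Dropper": {"years_in_top_1000": 30,
                "shares": {2000: 0.0020, 2001: 0.0018, 2002: 0.0006}},
    "Fad": {"years_in_top_1000": 5,
            "shares": {2000: 0.0030, 2001: 0.0003}},
}

for loss, name, year in most_poisoned(TOY_NAMES):
    print(f"{name}: lost {loss:.0%} of its popularity in {year}")
```

In the toy data, “Fad” has the steepest drop of all, but it is filtered out because it spent too few years in the top 1000, which is the same reason the flash-in-the-pan names fall off Parker’s revised list.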

I love this experiment because it shows the value of following hunches. It also shows the beauty of visualization for large data sets.  As Parker says, “statistics is as much an art as it is a science.”

You can read her full analysis here.

Her slides from Ignite are here.

You can ask her about the roller derby photo yourself.

Find her at hilaryparker.com.

About susanetlinger

Industry Analyst at Altimeter Group
This entry was posted in Analytics, Data Science, Research, Uncategorized.
