Big data is a big buzzword and, as an emerging technology, sits at the peak of the 2013 Gartner Hype Cycle for Emerging Technologies.
We’re happy to announce Elastic Path has been recognized in Gartner’s “Hype Cycle for Digital Marketing, 2014” report for the third year in a row, named in the “Digital Commerce” and “Commerce Experiences” categories, along with our partner, SapientNitro.
Despite the projections that big data drives more ROI than other marketing investments, recent research by Gartner reports the average return per dollar is only 55 cents, thanks in part to immature technology, lack of skilled data scientists and poorly defined business use cases.
In Insights 2014: Connecting Technology and Story in an Always-On World, SapientNitro points to poor communication between data scientists and marketers, and to poor execution and management of data science efforts, as contributing factors, and recommends best practices for communicating with and managing the data science team.
This post is adapted from the article Delivering Big Data’s Potential: A marketer’s primer by Bryan Smith, PhD, Director Marketing Strategy & Analysis, SapientNitro. The article is one of several within the Insights 2014 report, available for free download.
Key terms in data science
It’s helpful for marketers to familiarize themselves with a few terms from the “data scientist’s glossary:”
Machine learning
The process of teaching computers to “learn” interesting patterns in data, and to adapt automatically when the data change over time.
With machine learning, algorithms are trained for a time in order to build models, which are used to direct automated marketing and operational decisions, adapting as more data is gathered once “live.” This learning can be supervised or unsupervised.
Supervised learning
Supervised learning algorithms take independent variable inputs, such as user clickstream data, and dependent variable outputs, such as purchase data, and create a model of how they relate to each other. Once a model is deemed “good,” it can be rolled out to new users as a predictive tool.
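To make the idea concrete, here is a minimal sketch of supervised learning using logistic regression, one common technique among many. The session data, the two features (pages viewed and cart adds) and the function names are all hypothetical, chosen only for illustration; a real project would typically use an established library and far more data.

```python
import math

# Hypothetical training data: (pages_viewed, cart_adds) -> purchased (1) or not (0).
SESSIONS = [
    ((1, 0), 0), ((2, 0), 0), ((3, 0), 0), ((2, 1), 0),
    ((6, 2), 1), ((8, 1), 1), ((7, 3), 1), ((9, 2), 1),
]


def sigmoid(z):
    """Squash any number into the range 0..1 so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=2000, learning_rate=0.05):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in rows:
            predicted = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = predicted - label
            # Nudge each weight against the direction of the error.
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def purchase_probability(model, features):
    """Apply the trained model to a new visitor's clickstream features."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


model = train(SESSIONS)
busy_visitor = purchase_probability(model, (8, 2))
idle_visitor = purchase_probability(model, (1, 0))
print(f"busy visitor: {busy_visitor:.2f}, idle visitor: {idle_visitor:.2f}")
```

The inputs (clickstream features) and the known outputs (purchases) are both supplied during training, which is what makes this supervised. The resulting model can then score visitors it has never seen.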
Unsupervised learning
Sometimes called “data mining,” unsupervised learning looks for patterns and relationships within the data itself, without concern for outputs. Instead of predicting an outcome, it is useful for segmentation and customer insight.
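As an illustration, the sketch below groups customers into segments with k-means clustering, one widely used unsupervised technique. The customer figures, the choice of two features and the function name are assumptions made for the example, and the starting centroids are simply the first k rows so the result is repeatable.

```python
# Hypothetical customers: (orders_per_year, average_order_value).
CUSTOMERS = [(2, 20), (12, 150), (1, 25), (3, 18), (10, 140), (11, 160)]


def kmeans(points, k, iterations=20):
    """Group points into k clusters; returns (assignments, centroids)."""
    # Deterministic start: use the first k points as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        for i, point in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
            )
        # Step 2: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


segments, centers = kmeans(CUSTOMERS, k=2)
print(segments)  # one segment label per customer
print(centers)   # the "typical" customer in each segment
```

No purchase outcome or other label is supplied here; the algorithm only looks at how similar customers are to one another. In practice the features would usually be scaled first so that a large-valued column such as order value does not dominate the distance calculation.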
Machine learning use cases
Here’s the fun part…what is machine learning actually used for?
Supervised learning use cases
- How likely is a visitor to convert, based on demographic, geographic, social or other segmentation characteristics?
- What behaviors best predict customer churn? Which indicate a customer is likely to renew or upgrade?
- Who are your most valuable customers over time? Can you identify new users who are likely to become high-lifetime-value customers?
Unsupervised learning use cases
- Identify customer segments based on common characteristics
- Recognize products that a given segment has an affinity for, and suggest them to the right users using collaborative filtering
- Apply personalization to customer segments
- Use clickstream data to dynamically optimize the site experience for a given user, taking into account his or her identified segment(s)
- Use “basket analysis” (frequent items mining) to identify items or categories most likely to appeal to a given customer in cart, post-checkout or in remarketing / email
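The collaborative filtering item in the list above can be sketched as follows. This is a minimal item-based variant using cosine similarity over purchase histories; the users, products and function names are hypothetical, and production recommenders are considerably more involved.

```python
import math

# Hypothetical purchase history: user -> set of products bought.
PURCHASES = {
    "ana": {"tent", "stove", "lantern"},
    "ben": {"tent", "stove"},
    "cho": {"tent", "lantern"},
    "dev": {"yoga mat", "water bottle"},
    "eli": {"yoga mat", "water bottle", "stove"},
}


def item_similarity(purchases):
    """Cosine similarity between every pair of items, based on shared buyers."""
    buyers = {}
    for user, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(user)
    similarity = {}
    for a in buyers:
        for b in buyers:
            if a != b:
                shared = len(buyers[a] & buyers[b])
                similarity[(a, b)] = shared / math.sqrt(len(buyers[a]) * len(buyers[b]))
    return similarity


def recommend(purchases, user, top_n=2):
    """Suggest items the user does not own that resemble items they do own."""
    owned = purchases[user]
    scores = {}
    for (a, b), score in item_similarity(purchases).items():
        if a in owned and b not in owned and score > 0:
            scores[b] = scores.get(b, 0.0) + score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


print(recommend(PURCHASES, "ben"))
print(recommend(PURCHASES, "dev"))
```

The recommendation for a given user comes entirely from what similar shoppers bought, with no product attributes involved, which is the core idea behind collaborative filtering.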
Best practices for closing the marketer / data scientist divide
Education, including familiarity with jargon, is the first step towards marketers understanding data scientists. Clear communication is also required. Just as a creative team needs a creative brief, data scientists need clear and concrete questions from marketing to direct their efforts efficiently. Ambiguity only drags out the process and risks unsatisfying results.
SapientNitro’s essential marketing questions to ask your data team:
1. What attributes define my most valuable customers?
2. What customer segments am I most likely to lose in the next 3 months?
3. What features of my product resonate with various segments of my user base?
4. How are my marketing campaigns being received in social media?
5. Are there identifiable differences among various groups who like or dislike my products?
6. Why was the click-through rate of my most recent campaign so high/low?
7. Can I move stale inventory by identifying cross-promotion opportunities from existing user behaviors?
8. Are there any readily-identifiable common attributes of my most successful salespeople?
Experimentation is also part of data science, and a single question may take several iterations to answer. Marketers must understand this and not conclude that the data scientists are failing. Overly tight deadlines imposed on a data science team can cut short their ability to properly test and tune their algorithms.
Best practices for managing the data science team
1. Ask specific questions
SapientNitro observes marketing teams spending too little time identifying and prioritizing questions for the data science team, especially when it comes to business objectives.
Take a recommendations engine, for example. Without context around the marketing mix, profit margins and supply chain, the data science team may build a data-driven system that works against the business.
Feed the data team questions like “how can we measure cross-category promotions activity across geographies?” or “what categories occur most often together in users’ carts?”
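A specific question like the last one maps directly onto a small, well-scoped analysis. The sketch below counts how often pairs of categories appear in the same cart; the cart data and function names are hypothetical, and a real analysis would read carts from your order system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical carts, each reduced to the set of categories it contains.
CARTS = [
    {"shoes", "socks", "hats"},
    {"shoes", "socks"},
    {"shoes", "belts"},
    {"socks", "hats"},
    {"shoes", "socks", "belts"},
]


def category_pairs(carts):
    """Count how many carts contain each pair of categories."""
    counts = Counter()
    for cart in carts:
        # Sorting makes ("shoes", "socks") and ("socks", "shoes") the same key.
        for pair in combinations(sorted(cart), 2):
            counts[pair] += 1
    return counts


def top_pairs(carts, n=3):
    """Return the n most frequent pairs as (pair, count, share of all carts)."""
    ranked = sorted(category_pairs(carts).items(), key=lambda kv: (-kv[1], kv[0]))
    return [(pair, count, count / len(carts)) for pair, count in ranked[:n]]


for pair, count, share in top_pairs(CARTS):
    print(pair, count, f"{share:.0%}")
```

Because the question is concrete, the output is easy for both teams to check against expectations, and it can feed straight into cross-category promotion decisions.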
2. Create a sense of urgency
Despite the caution against rigid deadlines, activities shouldn’t be left too open-ended. A backlog of well-defined and time-limited experiments, in which expected results are understood by both marketing and data science teams, helps keep the project on track. SapientNitro advises that experiments run 4-6 weeks, rarely exceeding 8 weeks.
Be prepared for failure. Expect it. The goal is to learn, acknowledge failures, and reprioritize the backlog. This approach borrows in many ways from Agile software development.
3. Define the owners
For both sides of the project, determine who is the ultimate owner, who owns provisioning and access, who will use the results of the analysis, who is the end consumer of the output, etc.
4. Keep control of information
While it’s common to outsource the collection and storage of your data to third-party vendors, consider the long-term impact of allowing others access to your raw data streams.
If you must outsource, ensure data science and governance inform how this is done, and how data is accessed internally and externally.
Image credit: CC by Kevin Krejci