10 Things that Make a Data Business Valuable


This article examines the characteristics that make a data business valuable. It will be useful for entrepreneurs refining the strategy of an early stage data business, and those seeking to best position their data business for a successful funding round.

At Cue Ball, we are specialists at investing in data businesses. Over the years, we have invested in many data and analytics businesses including StyleSight (fashion), Knovel (engineering), Lex Machina (legal), SmartZip (real estate), and Redline (financial services). We have learned much from the entrepreneurs running these businesses.

In addition, we have the benefit of counting Dick Harrington as one of our investment partners. Over his 11 years as CEO of Thomson, he quadrupled cash flow, tripled the market value of the company, and created the world’s largest information media company. It is not an exaggeration to say that Dick is the world’s foremost expert on information businesses. I would like to thank him for his insights on this piece.

For this article, I have also co-opted another expert in the field, Matthew Burkley. Matthew is the CEO of Genscape, one of the most innovative data companies in the market. Matthew is brilliant and I am grateful for his insights on this topic as well.

What follows is a list of salient features we look for when evaluating an early stage data business. I will skip over generic items that make any business attractive (e.g. differentiation, a large addressable market, a high quality management team, and the like) and will focus on characteristics specific to data businesses.

  1. More Than Nice-to-Have

There are nice-to-have data sets and then there are the must-haves.

Characteristics of must-have data include:

  • Informs a high-value decision
  • Adds significant value to that decision (not incremental)
  • That value can be measured
  • It is critical to doing your job, i.e. it allows you to do something you can’t otherwise do
  • It is integrated into your workflow
  • Frequency of the need is high
  • If you had to cut costs, you would cut something else

In the nice-to-have category, newspapers are a great example. Although I sometimes feel like I must read the Wall Street Journal (my favorite paper), I could do my job just as well without it. I could read the New York Times business and technology sections instead, or I could even get the news from a variety of online sources for free. Another example in the nice-to-have category is university textbooks. These days, many university students never bother buying the recommended textbook, or merely buy an old edition at a discount.

Every industry, on the other hand, has examples of critical data vendors. In investment management, Bloomberg, FactSet, and Redline are prime examples. These vendors help you get the most real-time, accurate securities pricing data that enable faster and more profitable trading. For medical doctors, UpToDate and Epocrates are good examples. When a physician is in the ER and needs to get accurate drug interaction information for a patient whose life is at stake, even if the information may be available elsewhere, they need a definitive data source they can trust.

We search for businesses that are closer to the must-have side of the spectrum. This is ultimately the most basic quality that makes a data business valuable.

  2. Proprietary Data

A high-quality information business is generally built on proprietary data.

The problem with businesses that merely aggregate data from other sources is that the underlying data must generally be purchased. This creates a relatively low ceiling on achievable gross margins.

Similarly, we often struggle with companies whose analytics product is built entirely on the Facebook, Twitter, or LinkedIn social graphs. The risk here is that these data hoses can be turned off.

How is proprietary data created? I enumerate several ways below with examples.

  • Collect observable physical data: Genscape (uses sensors to collect power line usage, among other things); Mapflow (flood mapping data)
  • Crowdsource data: Premise Data (crowdsourced economic data); Placemeter (crowdsourced traffic data from video); I/B/E/S (research analyst earnings estimates)
  • Give-to-get data: Argus Information (aggregates and cleans data provided by bank members); Compstak (lease comps)
  • Collect data exhaust: large marketplaces like Uber and even financial exchanges like CME and ICE collect valuable data exhaust from their user activity
  • Create content: It is possible to take public / commodity data, add content to it, and then re-sell. Westlaw, for example, takes publicly available databases of case law, adds content, footnotes, and tags to enable search.

Proprietary data becomes valuable when there is a competitive moat around the data. To the extent there are barriers preventing the recreation of the data set, such as intellectual property protections (on the data or processes used to create the data) or significant lead times, the value of the business is enhanced.

  3. You Can Start Small

Creating a viable data business from scratch, as a startup, is different and more challenging than doing so as part of a large data conglomerate. Often, the cost of creating a commercially viable data set can bankrupt a small company.

When evaluating a data business, we try to determine whether it can be viable when small. One way to test this is to see if there is a so-called big pyramid on the value of the data. Does the data have high enough value to a few initial customers that they are willing to write a large check? If so, the company will be better positioned to monetize the additional customer opportunities ahead of it.

  4. View and Do Data

Many data businesses are view-based. You see the data but don’t know what it means. We like businesses where the data you are presented with leads directly to an action. The data should be used within a business process and integrated into the user’s workflow.

One way to further embed your data into a customer’s workflow is a technique that Dick used at Thomson called “three minutes.”

“What were end users of a product or service doing three minutes before they used it and three minutes after? What were they doing for the next three minutes? We kept asking that until we got a view of the full day. We wanted Thomson products to be a part of as much of that day as possible.”

Using this technique can help you better understand what prompted a customer to access your data, as well as what action it leads to. This insight can be used to expand into logical adjacencies to make the data more integrated into your customer’s workflow and, as a consequence, stickier and more valuable.

Dick and my partner Tony Tjan have published an excellent article on this topic in the Harvard Business Review.

  5. Valuable Analytics Layer

Today, there is no such thing as a pure data company. The largest data companies are really software companies built on top of data. In general, end users don’t just want data anymore; they want software tools that provide the answer and help them act on the data. Content is key but is itself only a piece of a complete product.

A good data company needs to have enough software or analytics to add relevancy and structure to its data. Enough software is needed to provide a base level of how to search the data, manipulate it, and help the user make a more informed and valuable decision and ultimately do their job better.

In certain cases, a robust analytics layer can add significant value to a business with unstructured or even commodity data. When Dick was running Thomson, people said that Westlaw’s content would be threatened because much of the underlying court data was public. But because Thomson had added enough analytics to the offering (searchability, ability to add additional relevant information, ability to predict litigation success rates, etc.), the product was a leap ahead of existing competitors and new entrants.

  6. Take Advantage of an Information Vacuum

We love businesses that provide data in a new space where there is a relative data vacuum.

It is a challenge to launch a business in a niche where there is a large incumbent player. If you are a small, niche player going against a Bloomberg or a Thomson, it will be hard to find a competitive opening. Moreover, if Bloomberg or Thomson controls the market, they are probably your only strategic acquirers.

Additionally, when you go against an established product, it is hard to make customers change what they currently use and are used to. The new product has to be dramatically superior. This is hard to do.

When evaluating a business competing in an area with existing players, we have a rule of thumb: if the business cannot ultimately achieve a relative market share that is at least half that of the biggest player in the space (i.e. #1 should not have more than 2x our market share), then the new business is markedly less valuable.
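The rule of thumb above reduces to a single ratio. Here is a minimal sketch of that arithmetic in Python; the function names and the market share figures are hypothetical and chosen only for illustration, not taken from any real company.

```python
def relative_market_share(our_share: float, leader_share: float) -> float:
    """Our market share divided by the share of the largest player."""
    if leader_share <= 0:
        raise ValueError("leader_share must be positive")
    return our_share / leader_share


def passes_rule_of_thumb(our_share: float, leader_share: float) -> bool:
    """True when the leader has no more than 2x our market share."""
    return relative_market_share(our_share, leader_share) >= 0.5


# Hypothetical shares, in percent of the market (illustration only).
print(passes_rule_of_thumb(15, 40))  # False: the leader has more than 2x our share
print(passes_rule_of_thumb(20, 40))  # True: the leader has exactly 2x our share
```

The shares here are projections of where the business could ultimately land, not where it is today, so the inputs are themselves a judgment call.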

On the other hand, creating data in a new area that has not been data rich and where there are gaps in the marketplace is valuable.

  7. The Business is Horizontally Scalable

We believe that the ability of a data business to scale horizontally is more important than its ability to scale vertically.

What does horizontal scalability mean? It means that the processes used to create your original data set can be used to create additional data sets for adjacent end markets or geographies. When performing our due diligence, we test to see if a data company can export existing and ideally patentable processes to create multiple data sets across repeatable horizontals.

Most of the early stage entrepreneurs we meet want to attempt to scale vertically before scaling horizontally. Sometimes this is a strategic decision. For example, an entrepreneur may want to build the ultimate destination providing everything needed for their end market sub-segment (including adding a desktop portal, advanced analytics, and so on). In other cases, the business processes used to create the original data work at small scale but are by their nature not horizontally scalable.

The problem that sometimes arises when an early stage business pursues vertical scale is that costs are added to the business without always bringing more gross margin with them.

Horizontal scalability is a strategy that Matthew Burkley at Genscape has executed very well. Genscape started in the power market, then used its existing processes to expand into oil, then gas, and then water.

  8. The Data is Priced Intelligently

When a data business is revenue generating, there are a few things we look for in its pricing model to ascertain the value of the data.

  • Subscription pricing, not based on usage. We are generally skeptical when an early stage business charges based on usage. It is our experience that larger enterprises don’t want to pay “by the drink” because they want a fully budgeted item. Additionally, you should want your customers to use the product more and have a pricing structure that encourages this.
  • Charge a premium for an initial seat license and then ratchet down pricing on subsequent seats. This feature enhances the value of the business because if and when seats are rolled off, the less expensive seats roll off first. It also promotes deep penetration within the customer’s business.
  • The data is the product. The primary revenue from the product is generated from the subscription on the data rather than advertising or transaction revenue streams.
  • Cheaper does not always win.
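
To see why the ratcheted seat pricing described above protects revenue, here is a small sketch in Python. All prices and seat counts are hypothetical, and it assumes, as the bullet does, that a customer cutting seats drops the cheapest ones first.

```python
def seat_prices(first_seat: float, later_seat: float, seats: int) -> list[float]:
    """Annual price of each seat: a premium first seat, then cheaper seats."""
    if seats < 1:
        return []
    return [first_seat] + [later_seat] * (seats - 1)


def revenue_after_rolloff(prices: list[float], seats_cut: int) -> float:
    """Revenue that remains when the customer drops the cheapest seats first."""
    seats_kept = max(len(prices) - seats_cut, 0)
    return sum(sorted(prices, reverse=True)[:seats_kept])


# Hypothetical price lists; both total 40,000 per year for 5 seats.
tiered = seat_prices(first_seat=20_000, later_seat=5_000, seats=5)
flat = seat_prices(first_seat=8_000, later_seat=8_000, seats=5)

# The customer cuts 2 of its 5 seats.
print(revenue_after_rolloff(tiered, 2) / sum(tiered))  # 0.75 of revenue retained
print(revenue_after_rolloff(flat, 2) / sum(flat))      # 0.6 of revenue retained
```

With the same starting revenue and the same number of seats cut, the tiered structure keeps more of the account because the expensive first seat is the last to go.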

  9. Sell into Big Customers

We generally prefer data businesses that currently sell into (or have the ability to sell into) large customers. The reason for this is that we want the ability to pursue the deepest profit pools available, and larger customers have more money to spend on data products. If a data company has to pursue smaller customers early on, that sometimes indicates a less valuable data set.

Additionally, while every business has fringe customers, pricing to get a fringe customer does not scale and is not very profitable.

  10. High Renewal Rates

Must-have data businesses have high renewal rates. Nice-to-have data businesses have low renewal rates.

Ideally, we would look for data businesses in their second year of revenue generation and beyond to have 90-95% renewal rates.
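
As a rough sketch of how that threshold might be checked, the Python below computes a renewal rate by customer count. This is an assumption: the rate could equally be measured in subscription dollars, and the function names and figures are hypothetical.

```python
def renewal_rate(up_for_renewal: int, renewed: int) -> float:
    """Share of subscriptions up for renewal in a period that actually renewed."""
    if up_for_renewal <= 0:
        raise ValueError("up_for_renewal must be positive")
    if not 0 <= renewed <= up_for_renewal:
        raise ValueError("renewed must be between 0 and up_for_renewal")
    return renewed / up_for_renewal


def meets_target(rate: float, floor: float = 0.90) -> bool:
    """True when the rate reaches the low end of the 90-95% range."""
    return rate >= floor


# Hypothetical year: 40 subscriptions came up for renewal and 37 renewed.
rate = renewal_rate(up_for_renewal=40, renewed=37)
print(rate)                # 0.925
print(meets_target(rate))  # True
```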

Wrap Up

I am sure there are other important features that make data businesses valuable. I’d love to hear your thoughts.

Copyright 2015 Ali Rahimtula. All rights reserved.


3 thoughts on “10 Things that Make a Data Business Valuable”

  1. Great post! One additional feature that comes to mind is the network effect of data businesses. With each incremental client, the value for all other clients grows in proportion. To your point #2, this can create a strong barrier to entry and can increase the proprietary nature of the data set. For example, a financial data business may offer indices to compare the performance of a given client against the rest of the install base.

