Data or Algorithms?
Chris Harris
I come from the data mining and machine learning field, and a few friends of mine and I have been kicking around the question of whether data or analysis is the more valuable asset. Obviously the two are symbiotic: each needs the other to be valuable. If you can have both you’ll take both, but what if you had to choose? It is generally true that in most areas of business you have to specialize, which means prioritizing certain activities, skills, or knowledge over others. Should you choose to get better access to the data, or should you choose to get better at analyzing whatever data you have available to you? If you’re in an analytics-heavy field, is it inevitable that one or the other will win in the long run? If not, under what conditions might one be the better option?
I originally started thinking of this in terms of “Which would cost you more if you had to buy it?” Would it be cheaper to have the ability to analyze data well and have to “purchase” the data from someone else, or would it be cheaper to have the ability to obtain, store, and access the data well and have to purchase the analysis from someone else? Microeconomics says that value is all about scarcity, so this line of thinking led me to conclude that decent analysts will always be cheaper (on a for-hire basis) than access to all but the most plentiful data. Therefore, if the data is abundant it might make sense to specialize in its analysis, but under almost any other circumstances you should choose to be the one who has better access to the data. Point for data.
Algorithm design, analysis, whatever you want to call it, is basically a particular example of human ingenuity. At any moment in time, someone has come up with the “best” way to analyze some data for a particular purpose. However, innovative people all over the world are hard at work to put that person’s idea out to pasture, even before the idea has been tested. Cutting-edge analysis therefore has two properties relevant to this discussion. First, it can increase the value of the data by a quantum leap when a new theory emerges, which makes it a very valuable contribution for a period of time. Second, how long that period lasts is hard to predict. It could be the decisive contribution for a year, five years, a decade, or a few decades. The only thing we know for sure is that it will not be king forever. Still, whoever owns the best analysis in the world for a given situation is arguably bringing a unique contribution to the table, and if protected properly that analysis can be quite scarce. Point for analysis.
Brad Burnham at Union Square Ventures considered the problem in a much better way. He posits that data has increasing marginal utility. This is a very important characteristic: it says that every additional piece of data you get is more valuable than the previous one. Why? Because with proper analysis, you can tie that single new piece of information together with potentially all of your previous data. This observation about the network effect of data is very insightful. It says that knowing your location and knowing what you search for on the web are both valuable in their own right, but knowing them together is even better (think relevant & localized ads). Nice work, Brad. Another point for the data.
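To make the increasing-marginal-utility argument concrete, here is a minimal sketch in Python. It rests on an assumption that is mine, not Brad's: that the value of a data set is roughly proportional to the number of distinct pairs of items you can tie together. The function names are made up for illustration.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct pairs among n data items (n choose 2).

    Assumption for illustration only: each pair is one potential
    connection that analysis could exploit.
    """
    return n * (n - 1) // 2


def marginal_links(n: int) -> int:
    """New pairs created by adding the n-th item to the n - 1 already held."""
    return pairwise_links(n) - pairwise_links(n - 1)


if __name__ == "__main__":
    # Each new item can be paired with every item collected before it,
    # so the marginal contribution grows as the data set grows.
    for n in (2, 10, 100, 1000):
        print(n, pairwise_links(n), marginal_links(n))
```

Under that assumption the n-th item adds n - 1 new connections, so each item adds more than the one before it. Real data will not be this tidy, since many pairs carry no useful signal, but the shape of the curve is the point.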
What about analysis? Does it have any “network effects” or compounding effects to counter the effects we just saw from collecting more data? I think it might. The data people have a cost structure problem. The cost of getting & storing a lot of data can be quite high. The trend is that data from transaction or event monitoring is growing exponentially. The cost to store the data is also declining exponentially, which is good news for the data guys: it means they at least have a chance. However, the costs to access & transmit the data are not decreasing nearly as fast. Also, the cost to power (literally, in terms of electricity) the storage systems is becoming a problem, and is not decreasing nearly fast enough to be meaningful. Point for analysis.
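The race between exponentially growing data and exponentially falling storage prices can be sketched in a few lines of Python. Every number below is a hypothetical rate chosen for illustration, not a measurement, and the function name is made up.

```python
def storage_spend(years, volume0, unit_cost0, data_growth, cost_decline):
    """Yearly storage spend under two compounding trends.

    data_growth  -- fractional yearly growth in data volume (hypothetical)
    cost_decline -- fractional yearly drop in cost per unit stored (hypothetical)
    """
    spend = []
    for t in range(years + 1):
        volume = volume0 * (1 + data_growth) ** t
        unit_cost = unit_cost0 * (1 - cost_decline) ** t
        spend.append(volume * unit_cost)
    return spend


if __name__ == "__main__":
    # Data growing faster than prices fall: the total bill keeps rising.
    print(storage_spend(5, 100.0, 1.0, data_growth=0.6, cost_decline=0.3))
    # Prices falling fast enough to offset growth: the total bill shrinks.
    print(storage_spend(5, 100.0, 1.0, data_growth=0.4, cost_decline=0.3))
```

The yearly bill is multiplied by (1 + data_growth) * (1 - cost_decline). Whether the data people have a chance comes down to whether that product is above or below 1, and the same test applies to transmission and power costs, which are falling more slowly.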
The more I think about it, the more my gut instinct says that data is the only way to keep a lasting competitive advantage. However, if you want to make a quick strike against your competition, it seems that analysis may be the way to go. Perhaps the right strategy is to try to shift from one to the other? This could definitely use some more consideration. I’m going to be on the lookout for some good case studies. Maybe Google, eBay, and even the phone companies provide good examples of what to care about here.
Posted in Innovation, Technology




January 8th, 2008 at 1:55 am
I think it would make this discussion more concrete to have more examples of this data-vs-analysis.
How about stock market trading data? Simple data is widely available, so to make it scarcer, let’s suppose the data included which mutual fund/portfolio/individual made each stock trade. In this example, it would seem that the data is hard to come by, and very valuable. It would be somewhat easier to hire some “quants” to build a system to profit from it. So I think data wins here, so long as it’s hard to get.
Also, to be clear, we’re talking about data *streams* here, right, not just fixed data? For example, all that trading data for 2000-2007 on a DVD becomes less and less valuable over time, and eventually worthless. But I presume you’re talking about “data” as the ability to continuously get more fresh data.
If so, I’d say the value of the data stream is (like you said) set by its scarcity. So I don’t know if there’s a general “which is better” answer.
However, I think data access usually follows the same pattern: it is initially difficult to come by, and in the hands of a wealthy few. And then as time goes on, the “commoners” figure out how to duplicate or steal the data, or make it cheaper, more accessible, or simply irrelevant. I’m thinking of things like access to printing presses, encyclopedic data, DVD decryption keys.
So I would consider data (or data streams) to be a quickly-depreciating asset. If the data is worth more initially than the analysis, at some point it won’t be.
Algorithmic analysis, on the other hand, I would say has a relatively fixed (and cheap) cost. If no one else is doing it (perhaps you’re the only one with access to the data) then its value might be high. But to the extent that a competitor could easily do the same analysis, I’d say the true costs are actually quite low. I guess I’m assuming “analysis” is the process of applying smart people (and relatively cheap tools) to a set of data. I suppose this implies that I think you can always find smart people for cheap.
February 6th, 2008 at 8:14 pm
Honestly, this is similar to, or maybe the same as, a point you made, but from a different angle:
If you start collecting data but have no way to analyze it, and then a year later start to analyze, you have a lot of good stuff.
If you start working on analysis techniques, and a year later start gathering data, you have very little.