Data is the Intel Inside
January 31, 2008
“Data is the Intel Inside,” via O’Reilly Radar:
That least-understood principle from my original Web 2.0 manifesto, “Data is the Intel Inside,” is finally coming out of the closet. A post on the Google Operating System Blog entitled Google is Really About Large Amounts of Data notes that in an interview at the Web 2.0 Summit in October, Marissa Mayer, Google’s VP of Search Products and User Experience, “confessed that having access to large amounts of data is in many instances more important than creating great algorithms.”
In particular, Marissa admitted that the reason for offering free 411 service was to get phoneme data for speech recognition algorithms. You heard it first on Radar. What’s also interesting, though, was her note on why they want better speech recognition algorithms right now: to improve video search. There’s an interesting principle here, namely that the obvious applications for a technology (e.g. transcription or speech recognition interfaces) aren’t necessarily the ones that will have the biggest impact. This is a great reason why companies like Google are increasing their data collection of all kinds (and their basic research into algorithms for using that data). As the applications become apparent, the data will be valuable in new ways, and the company with the most data wins.
More thoughts from O’Reilly Radar on opening up systems and networks to amass users, and hence data:
…open systems don’t mean the end of competitive advantage, but instead simply move the competition to new ground.
For the current generation of Internet applications, sometimes referred to as “Web 2.0,” the data collected from users is the true source of competitive advantage. And the first movers, the companies that understand and apply this insight, have services that get better fast enough that their competition never catches up.
The power of a social network like MySpace or Facebook isn’t in its software or its control over which applications get on its platform. It is in the critical mass of participating users. Ditto for eBay, Skype or YouTube. Even less obvious cases like Amazon, where user annotation makes for the best product catalog in the world, and Google, whose search index and ad auction are both driven by user participation, show the power that comes from harnessing the collective activity of everyone who uses the service.
Cellular carriers need to embrace this insight. Winner-take-all profits can be achieved by opening up their networks and then harnessing community contributions (including the contributions of software developers) to improve — or invent — new services.