Having just returned from a brief but wonderful vacation, I decided to join Josh in resolving to blog more this year. Happily, four recent news items provided instant fodder for commentary...but as I began working on four blog posts, I realized that I was saying the same basic thing in all four, which in and of itself was pretty interesting. Tying these articles together is like playing "Six Degrees of Kevin Bacon"...
The four stories:
- Josh McHugh's excellent piece in Wired on data ownership and screenscraping
- News that Rhapsody has shut down Yotta for using "non-public" APIs instead of the approved, "public" ones
- The discussion of Twitter's business model, or lack thereof.
- and, of course, the fallout from Scoble being banned from Facebook for violating their terms of service by screenscraping email addresses (though, as of this writing, Facebook has decided to allow him back in if he promises not to do any more screenscraping).
Most web applications are built on a three-tier architecture:
- At the bottom is the Data Layer, which is a repository of (or a means of accessing) all the data necessary to run the app, whether it has been generated by the app, entered by the user, or pulled from a third-party source.
- One step above is the Logic Layer, where the app performs calculations and otherwise "processes" the data to provide a useful service, aggregate the data in an interesting way, or perform other related tasks.
- Finally, there is the Presentation Layer, where the computer-readable results of the logic layer's work are rendered into human-readable (and browser-presentable) form, complete with navigation, images, widgets, ads, and whatever else constitutes the "user experience".
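To make the three layers concrete, here is a minimal Python sketch. Everything in it is hypothetical and purely illustrative (the `Listing` record, the `DataLayer` and `LogicLayer` classes, and the two render functions are my own inventions, not any real service's code). The point it tries to show is that the same data and logic layers can sit behind more than one presentation layer.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """Hypothetical record type, used only for illustration."""
    title: str
    price: float


class DataLayer:
    """Holds (or fetches) the raw records the app needs."""

    def __init__(self, listings):
        self._listings = list(listings)

    def all_listings(self):
        return list(self._listings)


class LogicLayer:
    """Processes the raw data into something useful."""

    def __init__(self, data):
        self._data = data

    def cheapest(self, n):
        # Sort by price and keep the n lowest-priced listings.
        return sorted(self._data.all_listings(), key=lambda item: item.price)[:n]


def render_html(listings):
    """The service's own presentation layer (assumed to be a web page)."""
    items = "".join(f"<li>{item.title}: ${item.price:.2f}</li>" for item in listings)
    return f"<ul>{items}</ul>"


def render_text(listings):
    """A third party's alternative presentation layer over the same logic."""
    return "\n".join(f"{item.title} - ${item.price:.2f}" for item in listings)


data = DataLayer([Listing("Lamp", 12.5), Listing("Desk", 80.0), Listing("Mug", 4.0)])
logic = LogicLayer(data)
top_two = logic.cheapest(2)

print(render_html(top_two))
print(render_text(top_two))
```

Swapping `render_html` for `render_text` changes nothing underneath, which is roughly what happens when a third-party UI sits on top of someone else's service.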
Now the business model/screenscraping/data ownership question that any provider of a web service must answer (lower case W, meaning any service provided over the web, as opposed to the upper case Web Service, which has particular meaning to those of us working in the "Web Services Space") is: "Which layer represents my core competency, the 'secret sauce' that I do particularly well and that will drive the value I am creating with my business?"
It is this "value layer" that needs to be protected from being co-opted by others who would try to go around it...but which also represents the biggest opportunity for rapidly increasing reach by, well, allowing others to co-opt it, but in a good way. Sounds contradictory, but it isn't. Let people provide other layers (presentation, data, or logic) to complement your value layer, but fight vigorously any attempt to circumvent your value layer for their own purposes.
Leaving aside the services that have no value at all, most services have the data layer, the logic layer, or some combination of the two as their value layer.
Twitter (business model notwithstanding) has, through their decision to allow any number of third party apps and services to offer UIs that compete with their own web interface, explicitly acknowledged that they fit into this category, and whatever business model they develop will almost certainly involve building monetization into these layers. Twitter broadly distributes their value layer through others' presentation layers.
eBay, which has published the statistic that over half of their listings originate through their API (and therefore without any use of their presentation layer), has a similar orientation.
So do the financial institutions that allow companies like Mint to screen-scrape users' data for inclusion in Mint's vastly superior user interface.
Rhapsody blew it on this one big time, because clearly they have a compelling and defensible business model that people are paying real money for - just like eBay - and Yotta was providing a front-end that some people felt was superior. Rhapsody still made every cent on the subscription, but rather than embracing a service that got their "value layer" broader distribution, they shut it down. Clearly the "public API" does not provide adequate functionality for a service like Yotta; Rhapsody should have looked at how to make the "private" API into a public one, and seen how many other alternative UIs sprang up from their passionate community.
Very few services can legitimately claim that their presentation layer is their "value layer". Of course, most everyone wishes that the truth were otherwise, because so many new applications pin their revenue hopes on advertising, links, and other monetization schemes that can only work if the user visits their presentation layer. But wishes don't count - only actual value.
As for Scoble, I totally agree with Michael's take on the whole thing. Facebook is in the rare but lovely place of having two "value layers", one of which is the presentation layer, and the other of which is the data layer. Since Scoble went after both of them with an "ask forgiveness, not permission" strategy, the result was inevitable. On the other hand, Facebook has been explicit that they care little about their logic layer, and as long as you don't try to abuse the other two, you have pretty much free rein to distribute your logic layer through their presentation layer, and leverage their data layer (also known as their user base).
This is one of the early shots in what is likely to be a big debate over APIs vs. screenscrapers. Services like Kapow and Dapper are hugely convenient for personal projects, but any large-scale (i.e., significant enough to be commercially relevant) use of scrapers will be easily blocked.
As McHugh says, a formal API relationship benefits both sides:
Some large Web companies don't relish the unregulated dispersal of their data and would love to find a way to monitor and control the information they dole out. That's why many of them have begun encouraging developers to access their data through sets of application protocol interfaces, or APIs. If scraping is similar to raiding someone's kitchen, using an API is like ordering food at a restaurant. Rather than create their own bots, developers use a piece of code provided by the data source. Then, all information requests are funneled through the API, which can tell who is tapping the data and can set parameters on how much of it can be accessed. The advantage for an outside developer is that with a formal relationship, a data source is less likely to suddenly turn off the taps.
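The "restaurant" arrangement McHugh describes, where every request goes through an API that knows who is asking and caps how much they can take, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `ApiGateway` class, the `QuotaExceeded` exception, the key name `partner-app`, and the per-key record quota are all hypothetical and not taken from any real provider's API.

```python
class QuotaExceeded(Exception):
    """Raised when a caller asks for more records than its quota allows."""


class ApiGateway:
    """Hypothetical front door to a data layer: identifies callers, caps usage."""

    def __init__(self, records, quotas):
        self._records = list(records)
        self._quotas = dict(quotas)                # api_key -> max records allowed
        self._served = {key: 0 for key in quotas}  # api_key -> records served so far

    def fetch(self, api_key, count):
        # The API can tell who is tapping the data...
        if api_key not in self._quotas:
            raise PermissionError("unknown API key")
        # ...and can limit how much of it each caller may access.
        remaining = self._quotas[api_key] - self._served[api_key]
        if count > remaining:
            raise QuotaExceeded(f"{api_key} has {remaining} records left")
        self._served[api_key] += count
        return self._records[:count]

    def usage(self):
        """Per-key totals, so the provider can monitor who took what."""
        return dict(self._served)


gateway = ApiGateway(records=["a", "b", "c", "d"], quotas={"partner-app": 3})
first = gateway.fetch("partner-app", 2)
print(first, gateway.usage())
```

An anonymous scraper has no key and gets turned away, while a registered developer gets predictable access up to an agreed limit. That is the trade both sides are making.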