Beware of Google Bearing Gifts

For the first few days, everyone was focused on Google Base's implications for the e-commerce industry (auctions + classifieds). But there is something much bigger at work here, and everyone has started to noodle on the bigger implications around data, ownership, business models and access.

I alluded to these issues in my last post on VeriSign's acquisition of the ping server weblogs.com, where I argued that the current crawl model for information indexing is simply not sustainable in the long run.

Information on the web is generated at lightning speed, which makes a search engine's index almost obsolete the moment it finishes crawling a site. For example, if a search engine crawls John Battelle's Searchblog only once a day (a privilege reserved for the chosen few), the index is out of date the moment John puts up another post (man, the guy is prolific). As a result, crawl-based indexing will go the way of the dinosaur if the rate at which web content turns over continues to increase. (Remember how old the Google image index was last year? I think something like 6 months old! Imagine if a blog search engine had a 6-month-old index.) That is why blog search engines use the ping/subscribe/crawl architecture, which is a lot more efficient and produces fresher indexes.
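To put a rough number on the freshness gap, here is a minimal Python sketch. Everything in it is my own assumption for illustration: the posting times, the 24-hour crawl interval and the 0.1-hour ping latency are made up, and real search engines are far more complicated than this.

```python
import math

def crawl_delay(post_time, crawl_interval):
    """Hours a post waits for the next scheduled crawl.

    Assumes crawls happen at hours 0, crawl_interval, 2 * crawl_interval, ...
    """
    next_crawl = math.ceil(post_time / crawl_interval) * crawl_interval
    return next_crawl - post_time

def average_delay(post_times, crawl_interval):
    """Average wait (in hours) before posts show up in a crawled index."""
    delays = [crawl_delay(t, crawl_interval) for t in post_times]
    return sum(delays) / len(delays)

# Hypothetical posting times, in hours after the first crawl
post_times = [1, 5, 9, 13, 30]

# Assumed delay between a ping and the post being indexed
PING_LATENCY_HOURS = 0.1

crawl_avg = average_delay(post_times, crawl_interval=24)
print("average delay with a daily crawl:", crawl_avg, "hours")
print("average delay with ping-driven indexing:", PING_LATENCY_HOURS, "hours")
```

With these made-up numbers a post waits about 17 hours on average for the daily crawler, while a ping-driven index only lags by whatever the ping latency is.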

Google Base is an attempt to take that one step further. Couched in terms of faster index inclusion and traffic generation, Google wants to take your data directly into its hosting infrastructure. Screw pinging, just update your data directly on Google Base! If that doesn't sound scary, perhaps we need to remember the uproar we had over Microsoft Passport and Plaxo's initial hosted data model. Just because Google claims to do no evil doesn't mean it can replicate MSN's product model and get away with it. In a world where data is becoming more federated and open, Google is taking a page from the book of a bygone era.

I have faith that a federated, open, ping-based model is the better one for the future evolution of search engines. Yes, the Google Base model is technically superior, but I'm not too sure we all want to live in a world built solely on top of SkyNet Google. We've done that for the last 20 years with Microsoft; I don't want to swap one master for another just when I see an inkling of change.

Men Versus The Machine

Pete Cashmore, Paul Montgomery, and Mitch Ratcliff are having a really good conversation regarding the collective intelligence of humanity versus algorithms. A few random thoughts . . . as I don't have too much time for coherence.

1. It reminds me of the debate in the financial industry on quant hedge funds (program-based trading) versus investing in broad indexes such as the S&P 500 (i.e. algorithms vs. collective intelligence). (BTW the third option, investing in human-managed funds, would be considered the web 1.0 model . . . stupid, slow, dumb . . . I'm exaggerating but you get my point.) The problem with algorithms (in both finance and editorial content) is that they are based on an a priori model of the world which assumes that certain unknown variables (not in the model) remain constant. So as long as major shocks to the system do not appear (such as a major earnings disappointment, corporate fraud, etc.), algorithms are much, much faster than humans at detecting market inefficiencies, forming a view on market movements, and capturing value by exploiting those inefficiencies. These models will make money slowly but consistently until something goes wrong and the fund could potentially lose everything (read Long-Term Capital). As such, most quant hedge funds build into the system ways for humans to intervene (such as turning off the trading program during earnings week or the triple witching hour) as well as ways to add value to the model. The co-dependent implementation is not perfect but it beats losing your shirt. Perhaps that's where we are headed on the web in the not too distant future.

2. This is another key difference between Yahoo and Google. Many people have pointed this out before. Essentially, Yahoo's historic background in directories and Google's in algorithms have pitted the two companies in two major camps in their vision of the future. Yahoo has fully embraced the whole web 2.0 peer economy model. MyWeb is a historic attempt at PeerRank, while many of its other products like Yahoo Local rely on the contributions of users. Google, on the other hand, is still taking an aggregated approach to web 2.0. Essentially, it is gathering up reviews and content contributed by users of OTHER websites and presenting them to its users. Google believes that it has a superior parsing and ranking algorithm to recognize valuable content and create semantics around unstructured data. Look at the relaunch of Froogle and its vendor and product reviews . . . all crawled from other content providers. Both models have their merits . . . Google has a much easier time building critical mass and generating value for its OWN users in the early phase of the product lifecycle . . . but eventually, it is at the whim of content providers (see craigslist and oodle). Yahoo, on the other hand, will have a much harder time generating network effects, but once it does, its position in the industry is much more defensible because it owns much of the content on its own site. I wonder who will win out in the future . . .

Google's Vertical Search Strategy

Jeff Nolan has already called me out for my Google obsession but I can't really help it given that it's a part of my job (search, finding, etc etc) to track Google and everyone in the industry as closely as I can :) Last week's launch of local shopping on Froogle sheds some light on Google's vertical search strategy. It is actually quite brilliant really . . . and quite obvious too now that I think of it . . .

Google's biggest asset is no longer its technology, it's the traffic that it can generate. Thus, instead of trying to compete with vertical search engines head on with technology by doing the things that vertical search engines do best (meta tags, application layers etc . . . see the comment section of my previous post . . . here), Google is going to throw its weight around, as well as go back to its roots by remaining an aggregator of aggregators (or a search engine of search engines).

Froogle's local shopping product is built on top of other vertical search players/aggregators - Getauto, Stepup, ShopLocal (interestingly owned by the Tribune Co. . . . hmm, syndicated classifieds by Google imminent?), and others. Why bother with trying to create semantics around millions of data sources when other little startups are already doing it for you? Now, Google only needs to leverage Google Base's folksonomy-based semantic engine/database to aggregate from a few sources. These players could always block Google, but given that they need the traffic, they can't say no (like Craigslist did to Oodle) and will instead feed their content to Google through Google Base. And as long as each of the verticals remains fragmented, Google does not face a serious threat of re-mediation (think what Google did to Yahoo or MSFT did to IBM). I wonder which verticals Google will move into (leveraging Google Base) and how other vertical search/application players will respond . . . fight the power or fall in line?

2^N versus N^2

I've been trying to digest Umair's latest presentation on edge competencies, especially because Umair's distinction between Metcalfe's & Reed's laws (+ his own Hague law) is an interesting framework for segmenting as well as evaluating business models driven by network effects. I've spent the better part of my career at companies that tried to leverage network effects, so it's an important field of study that interests me highly. Furthermore, the framework has also garnered a lot of buzz, as it should, around the web from the likes of Fred Wilson (1,2) and the Stalwart blog.
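For anyone who wants to see how differently the two laws in the title scale, here is a small Python sketch. It uses the common textbook forms, which are my assumption and not necessarily the exact formulas in Umair's presentation: Metcalfe's law counts the possible pairwise connections among n members, and Reed's law counts the possible subgroups of two or more members.

```python
def metcalfe_value(n):
    """Possible pairwise connections among n members; grows like n^2."""
    return n * (n - 1) // 2

def reed_value(n):
    """Possible subgroups of two or more members; grows like 2^n.

    2**n counts every subset, so we subtract the n single-member
    subsets and the one empty subset.
    """
    return 2 ** n - n - 1

for n in (5, 10, 20):
    print(n, metcalfe_value(n), reed_value(n))
```

At 10 members the gap is 45 pairs versus 1,013 groups; at 20 it is 190 pairs versus over a million groups, which is why the question of which law a business follows matters so much.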

The first thing I learned is that Umair has to be at least 3x smarter than me :) , but more importantly, network effects can really be segmented into 3 parts:

1. The re-usability of the trading unit. For example, since content is produced once but consumed multiple times (zero or close to zero marginal cost per sale!), its increasing-returns economies of scale are much higher than those of companies which use network effects for trading non re-usable or non re-consumable units such as physical goods. (But of course physical goods are usually valued more per unit, so content-centric businesses need to reach higher scale to achieve the same network value.)

2. Peer production symmetry. The flatter the distinction between producer and consumer (or client and server), the more value is created. In eBay's case, ideally, you want all your buyers to be sellers as well, thus increasing participation, transactions per user, as well as loyalty. Bill Burnham called it input-output asymmetry more than a year ago on his blog.

3. Network efficiency and size. This is probably the best understood part of the network effects business model. How big is the network? How many users? How easy is it to participate in the network (acquisition + activation)? And finally, how efficient, fast, and easy is it to trade on the platform?

(BTW, I hope I'm paraphrasing Umair correctly)

One thing I do question, however, is whether Google, as Umair asserts, has exponential (Reed's) network economies of scale. Obviously the regression graph of revenue on users makes it seem so. But on a technical level, I question whether the number of Google users is an apples-to-apples comparison with other companies' user numbers. The # of adwords/adsense accounts is only a small % of the real user number of Google. Searchers do not need accounts and thus are not counted. Furthermore, gmail, gtalk, orkut, etc. users are only a subset of ad clickers (consuming participants). The unique visitors / month metric is also a vastly deflated # compared with the total registered user base used for other companies. I'm not sure which one Umair used (or Google used). As a result, because keyword prices increased significantly over the last year while the user # held steady (there are only so many buyers of keywords, they are just buying more), it seems as if Google's revenue/user is growing exponentially.

Using the framework I outlined, I also question whether Google is truly a Reed's law business.

1. The major trading unit for Google is information (which arguably is re-usable), but the most monetizable segment is really purchasing information. Ad buyers are trying to sell something, and thus only search traffic that has commercial intent is really monetized by Google (i.e. clicked on). As a result, Google can only monetize its traffic as fast as its ad buyers are selling and its ad clickers are consuming physical goods or services. Thus Google's scale economies are limited by the offline inefficiency of production and consumption (like everyone else).
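Here is the denominator problem as a tiny Python sketch. The numbers are entirely made up by me for illustration (they are not Google's figures), and the assumption that the real audience grows in step with revenue is mine as well.

```python
def revenue_per_user(revenue, users):
    """Revenue per user for each period, given parallel lists."""
    return [r / u for r, u in zip(revenue, users)]

# Hypothetical figures for three periods (NOT real Google data)
revenue = [100, 200, 400]             # revenue doubles each period
advertiser_accounts = [10, 10, 10]    # narrow count of ad buyers, holding steady
total_audience = [1000, 2000, 4000]   # assumed real audience, searchers included

print(revenue_per_user(revenue, advertiser_accounts))
print(revenue_per_user(revenue, total_audience))
```

Divide by the flat advertiser count and revenue per user doubles every period; divide by the wider audience and, with these made-up numbers, it does not move at all. Same revenue, very different story about network economies.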

2. There is significant asymmetry between ad buyers and ad clickers as most searchers do not buy ads from Google.

3. The last part regarding Google's network size and efficiency is where Google really kicks ass. It is obviously a popular/great service with a lot of traffic. Furthermore, its dynamic pricing mechanism is the enviable model of value pricing. So on this part, Google gets a full score.

I do believe Google has as good an opportunity as anyone to become a Reed's law business as it expands beyond search to other business lines that are more amenable to achieving Reed's economies (Gtalk, etc). But for now, it's simply enjoying a hypergrowth phase like many previous internet darlings. Of course, I could be very wrong, as I have to admit I did not fully comprehend the intricacies of the framework as presented by Umair.

The Scene 2.0

Blog networks are hot . . . ever since weblogsinc got bought by AOL, the idea of blog networks has spread beyond the professionals into the casual bloggers segment. Sure, there had always been BoingBoing, Gawker, Corante, even the9; but today everyone and their mom is either starting their own blog network, joining one, or being invited to one. (I won't name names here)

This was certainly inevitable ever since mainstream media started publishing top xxx blog lists. A few people, like Fred Wilson, realized what was happening and asked to be taken off those lists. Others, however, saw an opportunity to get famous or make some money. Many on those lists are leveraging their notoriety as anchor blogs and starting their own networks. Those left off are joining together, hoping to gain critical mass in numbers. (BTW Google's PageRank is one of the implicit drivers of the trend).

The emergence of a blog hierarchy, be it single destinations or networks, is not a good thing. It feeds the ego of bloggers and destroys the democratic nature and voice of the blogosphere. Perhaps I'm being naive, as there may never have been such a thing (see blog mobs). But I don't see a good ending for the scene in general.

A long long time ago, as a teenager, I was involved (peripherally) in another scene. An underground PC hacking counter culture sometimes called the scene or elite. The scene first started out as a bunch of kids cracking and distributing games with the copy protection removed. The community (the better word for it) at the time was pretty haphazard (like the blogosphere was 12 months ago) and certainly amateurish. There were groups of hackers with their BBSs but no one dominated the scene. That all changed when The Humble Guys arrived. No longer a community of teenagers looking for a free copy of Leisure Suit Larry, these guys were adults (almost :) ) who had grown up with the scene and taken an elitist and professional attitude to what was before a hobby and passion. There was money to be made in the ecosystem of cracking and distributing illegal software. All of which fed off the ego of the community to be cooler and better than others. The software remained free, but you could now pay to join a group, pay to be a distribution BBS for a crack group, pay to run your BBS on the latest version of Vision or LSD, etc, etc.

Pretty soon, seeing the power and money making ability of The Humble Guys, imitators popped up like Razor 1911, iNC, Fairlight, USA, and others. A hierarchy was quickly established and the scene bifurcated significantly. You were either a consumer or part of the management. More specifically, people were either involved in the publication of software (analogous to blog creators) or they just downloaded it (readers). There was no middle ground. You were either elite or lame.

There was almost no point starting your own BBS (think blogs) because there was no support system for you to get started unless you belonged to a major group or had a network of BBSs to drive you traffic (dialers). But of course, you couldn't join one unless you ran your own BBS and built up a good reputation and user base. The catch-22 eventually drove the downsizing of the community. The resulting apathy, the rise of the internet, the movement away from copy protection, the Secret Service crackdown, and p2p file trading all helped reduce what was once a very vibrant community.

Of course the PC hacking scene is not even a close analogy to the blogosphere today . . . but there are lessons to be learned. Once any community has a huge peer asymmetry between producers and consumers, its network value decreases and a vicious rather than virtuous cycle emerges, driving down the incentive to join such a community. I hope this is not the beginning of the end. I once got apathetic and left the scene 1.0 (plus chasing skirts became more fun :) ); I hope this is not the case for the scene 2.0.

I always wondered what happened to those guys, and where they are today (those that did not get arrested!). Did Paul Allen become Fabulous Furlough after leaving MSFT? (not likely :) ) Did any of them become hugely successful entrepreneurs? Are they running around the valley today knee deep in the tech industry? The blogosphere even? Better yet, are there any BBSs left? Would love to fire up my modem and Procomm, dial around and re-live the wild wild west again. I could even load up TheDraw and pull out some ANSI art skills I've hidden away for over 10 years . . .