Archive for September, 2005

Annotating Wikipedia

Saturday, September 3rd, 2005

The Semantic MediaWiki proposal looks really promising.

Anyone who knows how to edit articles should find the syntax simple and usable:

Berlin is the capital of [[is capital of::Federal Republic of Germany|Germany]].

Berlin has about [[Population:=3.390.444|3.4 Mio]] inhabitants.

All that fantastic data, unlocked. (I’ve been meaning to write a post on why explicit metadata is democratic.) Wikipedia database dump downloads will skyrocket.
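To make “unlocked” a bit more concrete, here is a minimal sketch of pulling the annotations out of the two example sentences above. It assumes only the two forms quoted in this post (`::` for relations, `:=` for attributes); the regular expression and the `extract_annotations` name are my own illustration, not part of the Semantic MediaWiki proposal, which may well parse things differently.

```python
import re

# Hypothetical pattern covering only the two forms quoted in the post:
#   [[relation::target|label]]   and   [[Attribute:=value|label]]
ANNOTATION = re.compile(
    r"\[\[(?P<name>[^\[\]|:]+?)(?P<sep>::|:=)(?P<value>[^\[\]|]+)"
    r"(?:\|(?P<label>[^\[\]]*))?\]\]"
)


def extract_annotations(wikitext):
    """Return (kind, name, value, label) tuples for each annotation found."""
    results = []
    for m in ANNOTATION.finditer(wikitext):
        # "::" marks a relation to another article, ":=" an attribute value.
        kind = "relation" if m.group("sep") == "::" else "attribute"
        results.append(
            (kind, m.group("name").strip(), m.group("value").strip(), m.group("label"))
        )
    return results


sample = (
    "Berlin is the capital of "
    "[[is capital of::Federal Republic of Germany|Germany]].\n"
    "Berlin has about [[Population:=3.390.444|3.4 Mio]] inhabitants."
)

for row in extract_annotations(sample):
    print(row)
```

Ordinary links such as `[[Berlin]]` carry no `::` or `:=` and are skipped, so running this over a database dump would yield only the explicitly annotated statements.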

There are interesting proposals under Wikidata as well (though big forms make me uneasy), but those mostly seem more applicable to new data-centric projects, while the Semantic MediaWiki proposal looks just right for the encyclopedia. Gordon Mohr’s Flexible Fields for MediaWiki proposal could probably serve both roles.

Once people get hooked on access to a semantic encyclopedia, perhaps they’ll want similar access to the entire web.

Via Danny Ayers.

Blog search stinks

Friday, September 2nd, 2005

A couple weeks ago Jason Kottke posted a complaint about Technorati. Its search results are slow, non-comprehensive, of mediocre relevance, and can’t even manage one nine of reliability. Technorati’s competitors all have the second problem and have or will likely have the others as they grow.

Kevin Burton would prefer blog search to aim lower:

I’d rather have a Technorati that was fast and always worked even if that meant only indexing 1M blogs. Even 500k blogs as long as they are the top 500k blogs.

Sounds like a reasonable tradeoff, but it’s completely unacceptable. What if Google had decided to index only 100M web pages in order to stay fast and reliable? Google would no longer exist. (Also pretend you read something about the of the blogosphere here.)

Only one of thirty trackbacks to Kottke’s post states the obvious:

When I first encountered RSS search engines a few years ago while at Yahoo! I wondered how they could survive. The difficult part of RSS search isn’t the RSS, it’s the search. Search is hard. For Google or Yahoo!, adding RSS to search is trivial. It’s just another data source. And yes, setting up a ping server is different from crawling links, but not any harder and once you get the content, it’s indexed in basically the same fashion. But for Technorati, adding world class relevance, freshness, comprehensiveness and scalability to RSS is an almost insurmountable effort.

(Possibly two, but this one is mostly in Chinese. Google’s beta Chinese-English translation says in part “very many people anticipates Google/Yahoo can provide the even better function.”)

I hope Technorati and its competitors do well, but I expect one or more of Google, Yahoo!, or Microsoft to introduce a superior blog search service, and eventually blog search to become an anachronism, subsumed by web search (though I want every site and page to have a feed, so web search should become a bit more like blog search). I want to comprehensively track a webversation starting at any URL, and that requires something that can pass for a comprehensive web index.

Here’s a graph from Alexa showing the “reach” of Technorati and (clearly less popular) competitors:

For comparison Alexa says that Google is used by (only?) a little over one in five browsers a day (over 200,000 per million):

Abominable person theory

Thursday, September 1st, 2005

I am not a fan of the great man theory of history, but I’ll give some credence to what I’m going to call abominable person theory, as explained wonderfully at Mahalanobis:

[I]nfluential mistakes create something neither anticipated nor inevitable, while right ideas are somewhat inevitable. Thus good ideas are not so dependent on “great men” because there are lots of smart people and they eventually find the truth (witness the simultaneous discovery of things like evolution by Wallace and Darwin, calculus by Newton and Leibniz, or marginal analysis in economics by Menger, Jevons, and Walras). Bad ideas, in contrast, are infinite in number, and require a special magnetism and impenetrable self-assurance by their champions in order to become influential. Freud is a perfect example, a charlatan who befuddled two generations via his implacable self-esteem. Marx was similar, and Ayn Rand was cut from the same cloth but fortunately her radical ideas against empiricism never had as deleteriously wide an impact as Marx or Freud.

The pièce de résistance:

So for an individual to have great impact, it is probably in some wrong-headed idea about something not obviously falsifiable.

(Not just idea people; nearly anyone remembered as “the Great” was an abominable person.)

That’s most of the post, but read it again, it’ll be fun: The Most Influential Individuals are Generally Bad.