Archive for the ‘Wikipedia’ Category

Wikileaks flows

Saturday, January 26th, 2008

A year ago I mentioned Wikileaks, with some skepticism:

Wikileaks, currently vapor, may be a joke. If Wikileaks is not a joke and if it successfully exposes a large number of secrets, I’d find it hilarious to see this happening on a public website and without financial incentives. P2P, digital cash, information markets, and crypto anarchy? Nope, just a wiki and a community.

With each new item I read about Wikileaks, usually via Slashdot, my skepticism wanes and hilarity waxes. Bully for Wikileaks, the Wikileaks community, dissidents and transparency worldwide.

Read the and Wikileaks:About on Wikileaks, available securely and via many front domains.

Of course Wikileaks is blocked in China, which gives them some cred in my opinion (but note the measurement described in that post doesn’t seem to work anymore — from within the U.S. it appears google.com and google.cn now give identical results).

In one recent item cited on Slashdot, a copyright claim is being used to attempt to censor Wikileaks. How unsurprising.

Blog search putrefying

Saturday, December 22nd, 2007

I’ve complained before here that blog search stinks and isn’t getting better. Now I know why — in addition to blog search being a difficult and expensive service to run — there isn’t much demand. The blog search focused sites I mentioned in the “stinks” post each seem to have gained no traction since then, excepting Technorati, which itself is constantly rumored to be troubled.

A TechCrunch post on traffic at various Google properties finally gave me a clue and an inclination to look at my past posts on blog search. Click through to see a graph showing that Google Blog Search barely registers.

To end on a positive note, perhaps blog search is a good use case for , as it isn’t economic for a centralized entity to do well. This reminds me, whatever happened to various ?

Only tangentially related to blog search, I really like Chris F. Masse’s post on blogs vs. newspapers, in which Wikipedia sits at the top of the ecosystem:

So the real winner is Wikipedia — a news and knowledge aggregator… using anonymous volunteers. But Wikipedia is only an information aggregator… it feeds on both media and blogs to gather the facts. Wikipedia is the common denominator of knowledge —not the primary source of reporting. Just like prediction markets feed on polls and other advanced indicators.

Steps toward better software and content

Saturday, December 1st, 2007

The Wikimedia Foundation board has passed a resolution that is a step toward Wikipedia migrating to the Creative Commons Attribution-ShareAlike license. I have an uninteresting interest in this due to working at Creative Commons (I do not represent them on this blog), but as someone who wants to see free knowledge “win” and achieve revolutionary impact, I declare this an important step forward. The current fragmentation of the universe of free content along the lines of legally incompatible but similar in spirit licenses delays and endangers the point at which that universe reaches critical mass — when any given project decides to use a copyleft license merely because then being able to include content from the free copyleft universe makes that decision make sense. This has worked fairly well in the software world with the GPL as the copyleft license.

Copyleft was and is a great hack, and useful in many cases. But practically it is a major barrier to collaboration in some contexts and politically it is still based on censorship. So I’m always extremely pleased by any expansion of the public domain. There could hardly be a more welcome expansion than Daniel J. Bernstein’s release of his code (most notably qmail) into the public domain. Most of the practical benefit (including his code in free software distributions) could have been achieved by release under any free software license, including the GPL. But politically, check out this two minute video of Bernstein pointing out some of the problems of copyright and announcing that his code is in the public domain.

Bernstein (usually referred to as ‘djb’) also recently doubled the reward for finding a security hole in qmail to US$1,000. I highly recommend his Some thoughts on security after ten years of qmail 1.0, also available as something approximating slides (also see an interesting discussion of the paper on cap-talk).

Wikimedia advertising (soft) drive

Tuesday, October 23rd, 2007

Wikipedia (actually the Wikimedia Foundation) started another fundraiser yesterday. I’ll just reference what I’ve said in the past:

I am convinced by comments on the above posts and conversations since that it will take a huge shift in Wikipedia community opinion for advertising to have a chance. The time for direct argument in relevant venues is distant. If you agree with me that advertising on Wikipedia will allow the foundation to greatly speed the fulfillment of its commitment, you can make your support known without rancor:

1) When you donate, leave a comment that says “I support advertising on Wikipedia.”

2) On your Wikipedia user page (mine), add the following code, with obvious meaning (|{{PAGENAME}} may not be obvious–it’s a hack to make your name sort correctly in the relevant category listings):

[[Category:Wikipedians for optional advertisements|{{PAGENAME}}]]
[[Category:Wikipedians who think that the Wikimedia Foundation should use advertising|{{PAGENAME}}]]
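To illustrate why the sort key matters, here is a minimal Python sketch. It is not MediaWiki's actual code: it assumes a category listing that files a page under its full title (namespace included) unless an explicit sort key is supplied, and the user names are made up.

```python
# Hypothetical sketch of category sorting; not MediaWiki's real implementation.

def sort_key(full_title, explicit_key=None):
    """Return the key a category listing would file this page under."""
    # Assumption: with no explicit key, the full title (namespace included) is used.
    return explicit_key if explicit_key is not None else full_title


def pagename(full_title):
    """Mimic {{PAGENAME}}: the title with any namespace prefix removed."""
    return full_title.split(":", 1)[-1]


# Made-up user pages, purely for illustration.
pages = ["User:Zed", "User:Alice", "User:Mallory"]

# Without the hack, every user page is filed under "U".
without_hack = sorted({sort_key(p)[0] for p in pages})

# With |{{PAGENAME}}, each page is filed under the first letter of the user name.
with_hack = sorted({sort_key(p, pagename(p))[0] for p in pages})

print(without_hack)
print(with_hack)
```

Under that assumption the first listing collapses into a single "U" heading, while the second spreads users across the alphabet.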

Fortuitously Mozilla posted their 2006 financial statements today:

Mozilla’s revenues (including both Mozilla Foundation and Mozilla Corporation) for 2006 were $66,840,850, up approximately 26% from 2005 revenue of $52,906,602. As in 2005 the vast majority of this revenue is associated with the search functionality in Mozilla Firefox, and the majority of that is from Google. The Firefox userbase and search revenue have both increased from 2005. Search revenue increased at a lesser rate than Firefox usage growth as the rate of payment declines with volume.
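A quick check of the growth figure in the statement, as a short Python sketch. The two revenue numbers are taken from the quote above; the variable names are mine.

```python
# Revenue figures as quoted in Mozilla's statement.
revenue_2005 = 52_906_602
revenue_2006 = 66_840_850

# Year-over-year growth as a fraction of the 2005 figure.
growth = (revenue_2006 - revenue_2005) / revenue_2005

print(f"{growth:.1%}")  # prints 26.3%
```

That agrees with the "approximately 26%" in the statement.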

Congratulations to Mozilla. The Open Web‘s prospects would look far worse if Mozilla did not have the wisdom to exploit this revenue source. Now, what about the prospects for Free Knowledge?

Addendum 20071123: The Wikimedia Fundraiser Blog is running Why Wikipedia Does Not Run Ads, a post linked to in the fundraising ad now running on Wikipedia.

Ridiculous simplicity

Monday, May 21st, 2007

wikiclock is so ridiculous I’m not surprised it took so long for someone to invent it. But it is a thing of sublime beauty. Reminds me of some of the projects at last weekend’s .

pageoftext.com, which hosts wikiclock, is only ridiculous in its simplicity. Why didn’t I think of that?

Both projects via Evan Prodromou reporting on RoCoCo. I’m sad that I couldn’t make it to Montreal but glad to hear it’s coming to the SF Bay Area next year.

SXSW: Semantic Web 2.0 and Scientific Publishing

Saturday, March 10th, 2007

Web 2.0 and Semantic Web: The Impact on Scientific Publishing, probably the densest panel I attended today (and again expertly moderated by Science Commons’ John Wilbanks), covered , new business models for scientific publishers, and how web technologies can help with these and data problems, but kept coming back to how officious Semantic Web technologies and controlled ontologies (which are not the same at all, but are often lumped together) and microformats and tagging (also distinct) complement each other (all four of ‘em!), even within a single application. I agree.

Nearly on point, this comment elsewhere by Denny Vrandecic of the Semantic MediaWiki project:

You are supposed to change the vocabulary in a wiki-way, just as well as the data itself. Need a new relation? Invent it. Figured out it’s wrong? Rename. Want a new category of things? Make it.

Via Danny Ayers, originally posted to O’Reilly Radar, which doesn’t offer permalinks for comments. This just needs a catchy name. Web 2.0 ontology engineering? Fonktology?

SXSW: Commercialization of Wikis

Saturday, March 10th, 2007

Evan Prodromou gave an excellent presentation on Commercialization of Wikis: Open Community That Pays the Bills. Check out his slides.

A few points:

  • Other stuff will be recognized as having wiki nature, e.g., .
  • Four categories of wiki businesses: service provider (Wikispaces, Wetpaint, PBWiki), content hosting (wikiHow, Wikitravel, Wikia), consulting (SocialText), content development (WikiBiz). My comment: at first blush Wikia would seem to be a service provider, but they are also deeply involved in content creation and community management.
  • Down with crowdsourcing and the notion that wiki contributors are suckers or sharecroppers. Better to think of wikis (and wiki businesses) as platforms for knowledge. Contributors use your wiki to help each other, not to give you free content. My comment: I’m not so down on crowdsourcing. Yes, it is MBA language, but the examples usually involve compensating contributors. Crowdsourcing shouldn’t be conflated with sharecropping, nor confused with community purpose.
  • For wikis, purpose is more important than friends or ego (cf. social networking and blogs).

Seven rules for commercial wikis:

  1. Have a noble purpose — e.g., shared knowledge (use a free license), help a community.
  2. Demonstrate value — most interesting example is “carry the torch”; wiki communities can be transient, an entity that keeps focus helps.
  3. Be transparent.
  4. Extract value where you provide value — most obviously, advertising for hosting.
  5. Be personally involved.
  6. Run with the right crowd — e.g., open source and open content, or you will be suspected of being a crowdsourcer.

It appears that Prodromou’s Wikitravel lives by these rules and has succeeded.

Update 20070317: Prodromou has a roundup of blog responses to his presentation. It was great indeed catching up with him.

“Querying Wikipedia like a Database”

Tuesday, January 23rd, 2007

I’ve mentioned Semantic MediaWiki several times as having the potential to tremendously increase the value of Wikipedia by unlocking (in the sense of making queryable) all of the data in the encyclopedia.

dbpedia.org has taken a different approach to “Querying Wikipedia like a Database” (their excellent tagline): extract datasets from Wikipedia, presumably with a manual mapping of relevant categories and data populating infoboxes to triples (described in What have Innsbruck and Leipzig in common? Extracting Semantics from Wiki Content).

I suspect Wikipedia’s implementation of Semantic MediaWiki would only help dbpedia.org, but the latter is already impressive, requiring no changes at Wikipedia. In addition to making some of the data in Wikipedia queryable they’re exposing non-Wikipedia datasets.
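To make the infobox-to-triples idea concrete, here is a minimal Python sketch. It is not dbpedia.org's extraction code: the function name, the sample infobox, and its field names are all assumptions for illustration, and real infoboxes are far messier (nested templates, links, units).

```python
import re


def infobox_to_triples(subject, wikitext):
    """Turn '| key = value' infobox lines into (subject, predicate, object) triples."""
    triples = []
    for line in wikitext.splitlines():
        # Match lines of the form "| key = value"; anything else is ignored.
        match = re.match(r"^\s*\|\s*([^=|]+?)\s*=\s*(.*?)\s*$", line)
        if match and match.group(2):  # skip fields left empty
            triples.append((subject, match.group(1), match.group(2)))
    return triples


# Made-up infobox for illustration; the field names are assumptions.
sample = """{{Infobox city
| name = Innsbruck
| country = Austria
| population =
}}"""

triples = infobox_to_triples("Innsbruck", sample)
for triple in triples:
    print(triple)
```

Each non-empty field becomes one triple, which is the shape of data a query language over Wikipedia would need.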

The Semantic Web is so here, now. Doubters repent! ;-) Like I said before:

Once people get hooked on access to a semantic encyclopedia, perhaps they’ll want similar access to the entire web.

Wikipedia and Linking 2.0

Monday, January 22nd, 2007

Tim Bray has reasons for linking to a Wikipedia article about an organization rather than the organization’s site:

[A] lot of institutional sites are pathetic self-serving fluff served up in anodyne marketing-speak with horrible URIs that are apt to vanish.

Linking to the Wikipedia instead is tempting, and I’ve succumbed a lot recently. In fact, that’s what I did for the Canada Line. After all, the train is still under construction and there’s no real reason to expect today’s links to last; on top of which, the Line’s own site is mostly about selling the project to the residents and businesses who (like me) are getting disrupted by it, and the taxpayers who (like me) are paying for it.

Wikipedia entries, on the other hand, are typically in stable locations, have a decent track record for outliving transient events, are pretty good at presenting the essential facts in a clear, no-nonsense way, and tend to be richly linked to relevant information, including whatever the “official” Web site might currently happen to be.

I wrote something similar about a year ago:

I consider a Wikipedia link more usable than a link to an organization home page. An organization article will link directly to an organization home page, if the latter exists. The reverse is almost never true (though doing so is a great idea). An organization article at Wikipedia is more likely to be objective, succinct, and informational than an organizational home page (not to mention there is no chance of encountering Flash, window resizing, or other annoying distractions — less charitably, attempts to control my browser — at Wikipedia). When I hear about something new these days, I nearly always check for a Wikipedia article before looking for an actual website. Finally, I have more confidence that the content of a Wikipedia article will be relevant to the content of my post many years from now.

Why not preferentially link to Wikipedia? Bray feels bad about not linking directly to original content and says Wikipedia could go off the rails, though later provides a reason to not worry about the latter:

I’d be willing to bet that if Wikipedia goes off the rails and some new online reference resource comes along to compete, there’ll be an automated mapping between Wikipedia links and the new thing; so the actual URIs may retain some value.

Indeed; and the first argument explains why linking to Wikipedia is superior to linking to an institution. But what about “original content”? If the content isn’t simply a home page (of an organization, person, or product significant enough to be in Wikipedia), Wikipedia doesn’t help. For example, I linked to Bray’s post “On Linking”; only providing a link to his Wikipedia article would have been unhelpful. The Wikipedia article link in this case is merely supplementary.

So what to do to help with broken and crappy links to items not described in Wikipedia? Bray suggests “multi-ended links”. I think he’s on the right track, but this is not something a web content creator should need to worry about — robust linking need not involve choosing several typed (e.g., official, reference, search) links. The content creator’s CMS and the user’s browser ought to be able to figure this stuff out; the content creator should just use the best link available, as always.

Last year I wrote:

I predict that in the foreseeable future your browser will be able to convert a Wikipedia article link into a home page link if that is your preference, aided by Semantic Mediawiki annotations or similar.

In the case of non-Wikipedia links (and those too), combatting linkrot and providing alternate and related (e.g., reference, reply, archival) links is an obvious feature add for social bookmarking services and can be made available to a CMS or browser via the usual web API/feed/scraping mechanisms.
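A rough Python sketch of what such a browser or CMS preference might look like. Everything here is hypothetical: the annotation table stands in for whatever a semantic wiki or bookmarking service would expose, and the URLs are placeholders.

```python
# Hypothetical annotations of the kind a semantic wiki might expose:
# Wikipedia article URL -> official home page. All URLs are placeholders.
HOMEPAGE_ANNOTATIONS = {
    "https://en.wikipedia.org/wiki/Example_Org": "https://www.example.org/",
}


def resolve_link(url, prefer_homepage, annotations=HOMEPAGE_ANNOTATIONS):
    """Return the link a reader's browser might follow, given the reader's preference."""
    if prefer_homepage:
        # Fall back to the author's original link when no annotation is known.
        return annotations.get(url, url)
    return url


article = "https://en.wikipedia.org/wiki/Example_Org"
print(resolve_link(article, prefer_homepage=True))
print(resolve_link(article, prefer_homepage=False))
```

The content creator still writes one link; the rewriting happens on the reader's side, which is the division of labor argued for above.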

Wiki search advertising

Tuesday, January 16th, 2007

Wikiseek has launched. It’s a reasonable idea, searching Wikipedia and sites Wikipedia links to (recalling search engines that have used to seed crawls). It’s much faster than Wikipedia’s built-in search, but doesn’t satisfy me, as its Wikipedia results are out of date and incomplete (indicators of the former include turning up deleted articles and finding nothing for ‘wikiseek’).

I find it interesting that Wikiseek’s footer says:

The majority of the revenue generated by Wikiseek advertising is donated to the Wikimedia Foundation.

That’s nice (apparently Searchme, Inc., intends to use Wikiseek to demonstrate its vertical search prowess), and it inspires a potential non-intrusive revenue model for Wikipedia that precisely copies Mozilla’s: sell inclusion in the search box/search page.

This wouldn’t be worth the hundreds of millions annually that tasteful text ads on articles could be (and the ability to fully fund* the Wikimedia Foundation’s mission), but it would surely obviate the need for begging to cover the costs of running Wikipedia.

* If politicians can use that vacuous phrase to indicate they “support education” I can use it in support of funding free knowledge projects.