Google whenever

For years I’ve heard speculation that Google is building a web archive. Now there are domain name purchases to fuel the speculation. The Internet Archive has been providing an invaluable service with the Wayback Machine and has set up mirrors in multiple jurisdictions, but recording the web is too important to rely on any single organization, no matter how good or robust. So I hope Google and others are maintaining web archives and will make them available to the public.

Via Tim Finin, who also notes an interesting paper about using article and user history to assign trust levels to Wikipedia article fragments and a Semantic Web archive.

Archives are important for establishing provenance in many situations, though one I’m particularly interested in is citing that a particular work was offered under a Creative Commons license at a particular time. This and other uses (e.g., citation in general, which is often of the form “http://example.com accessed 2005-03-10”, though who knows if a copy of the content as it existed on that date exists) would be enhanced if on-demand archiving were available. The Internet Archive does offer Archive-It.org, but this service is for institutional use and uses periodic crawls rather than immediate archiving of individual pages.

Update, 2 minutes later: I should read a bit more before posting: WebCite does exactly what I want. However, I hate that it uses opaque identifiers, and as such is nearly as evil as TinyURL.

Posted 2006-09-03, filed under Creative Commons, Semantic Web.

2 Responses

Mike Linksvayer » Wikipedia and Linking 2.0 says:

2007-01-22 at 13:09

[…] In the case of non-Wikipedia links (and those too), combatting linkrot and providing alternate and related (e.g., reference, reply, archival) links is an obvious feature add for social bookmarking services and can be made available to a CMS or browser via the usual web API/feed/scraping mechanisms. […]

Mike Linksvayer says:

2008-04-05 at 21:22

Noticed in an old comment on Jon Udell’s blog, WebCite does support non-opaque (but unnecessarily ugly) identifiers, e.g., http://www.webcitation.org/query?url=http%3A%2F%2Fgondwanaland.com%2Fmlog%2F2006%2F09%2F03%2Fgoogle-archive%2F&date=2008-04-05
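For anyone who wants to generate such links rather than hand-escape them, here is a minimal sketch in Python. It assumes only the URL shape shown above (a `url` parameter holding the percent-encoded page address and a `date` parameter in YYYY-MM-DD form); the function name `webcite_query_url` is my own invention for illustration, not part of any WebCite API, and the sketch does not check whether the service actually holds a snapshot for that date.

```python
from datetime import date
from urllib.parse import urlencode

# Base of the query form seen in the example link above.
WEBCITE_QUERY = "http://www.webcitation.org/query"


def webcite_query_url(page_url: str, as_of: date) -> str:
    """Build a non-opaque query link for page_url as of the given date.

    Hypothetical helper: it only reproduces the URL shape from the
    example above and makes no request to the service.
    """
    # urlencode percent-encodes ':' and '/' in the page address, so the
    # whole address travels as a single query parameter value.
    params = {"url": page_url, "date": as_of.isoformat()}
    return f"{WEBCITE_QUERY}?{urlencode(params)}"


if __name__ == "__main__":
    print(webcite_query_url(
        "http://gondwanaland.com/mlog/2006/09/03/google-archive/",
        date(2008, 4, 5),
    ))
```

The readable part is the point: the cited address and the access date both survive in the link, so a citation of the form “http://example.com accessed 2005-03-10” can be recovered from the link itself even if the archive goes away.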
