Archive for July, 2006

Open Data

Sunday, July 30th, 2006

Tim Bray has a very nice summary of open data:

I think any online service can call itself “Open” if it makes, and lives up to, this commitment: Any data that you give us, we’ll let you take away again, without withholding anything, or encoding it in a proprietary format, or claiming any intellectual-property rights whatsoever.

For extra credit, a service could also say: We acknowledge your interest in any value-added information we distill from what you give us, and will share it back with you to the extent we can do so while preserving the privacy of others.

So, do we need some sort of Open Service analogue of the Open Source Definition? It couldn’t hurt.

I don’t know if this goes far enough for “open services” — certainly not far enough for the service equivalent of free software. However, it might be nice if “open” meant something substantially different than “free” or “libre” for services, c.f. open source software and free software.

Tim also says:

I suspect that if we can get the basic idea across, then we’re in old-fashioned consumer-advocacy territory; and I suspect that it will only take a small number of painful experiences for consumers to understand the issue at a pretty deep level.

I have noticed, just in the past six months I think, lots of people with no obvious predisposition (e.g., proprietary software background or just regular users) suddenly “getting” the importance of open formats. Promising and pleasantly surprising.

Free software needs P2P

Friday, July 28th, 2006

Luis Villa on my constitutionally open services post:

It needs a catchier name, but his thinking is dead on- we almost definitely need a server/service-oriented list of freedoms which complement and extend the traditional FSF Four Freedoms and help us think more clearly about what services are and aren’t good to use.

I wasn’t attempting to invent a name, but Villa is right about my aim — I decided to not mention the four freedoms because I felt my thinking too muddled to dignified with such a mention.

Kragen Sitaker doesn’t bother with catchy names in his just posted draft essay The equivalent of free software for online services. I highly recommend reading the entire essay, which is as incisive as it is historically informed, but I’ve pulled out the problem:

So far, all this echoes the “open standards” and “open formats” discussion from the days when we had to take proprietary software for granted. In those days, we spent enormous amounts of effort trying to make sure our software kept our data in well-documented formats that were supported by other programs, and choosing proprietary software that conformed to well-documented interfaces (POSIX, SQL, SMTP, whatever) rather than the proprietary software that worked best for our purposes.

Ultimately, it was a losing game, because of the inherent conflict of interest between software author and software user.

And the solution:

I think there is only one solution: build these services as decentralized free-software peer-to-peer applications, pieces of which run on the computers of each user. As long as there’s a single point of failure in the system somewhere outside your control, its owner is in a position to deny service to you; such systems are not trustworthy in the way that free software is.

This is what has excited about decentralized systems long before P2P filesharing.

Luis Villa also briefly mentioned P2P in relation to the services platforms of Amazon, eBay, Google, Microsoft and Yahoo!:

What is free software’s answer to that? Obviously the ’spend billions on centralized servers’ approach won’t work for us; we likely need something P2P and/or semantic-web based.

Wes Felter commented on the control of pointers to data:

I care not just about my data, but the names (URLs) by which my data is known. The only URLs that I control are those that live under a domain name that I control (for some loose value of control as defined by ICANN).

I hesitated to include this point because I hesitate to recommend that most people host services under a domain name they control. What is the half-life of http://blog.john.smith.name vs. http://johnsmith.blogspot.com or js@john.smith.name vs. johnsmith@gmail.com? Wouldn’t it suck to be John Smith if everything in his life pointed at john.smith.name and the domain was hijacked? I think Wes and I discussed exactly this outside CodeCon earlier this year. Certainly it is preferable for a service to allow hosting under one’s own domain (as Blogger and several others do), but I wish I felt a little more certain of the long-term survivability of my own [domain] names.

This post could be titled “freedom needs P2P” but for the heck of it I wanted to mirror “free culture needs free software.”

Pig assembler

Friday, July 21st, 2006

The story of The Pig and the Box touches on many near and dear themes:

  • The children’s fable is about DRM and digital copying, without mentioning either.
  • The author is raising money through Fundable, pledging to release the work under a more liberal license if $2000 is raised.
  • The author was dissuaded from using the sampling licnese (a very narrow peeve of mine, please ignore).
  • I don’t know if the author intended, but anyone inclined to science fiction or nanotech will see a cartoon .
  • The last page of the story is Hansonian.

Read it.

This was dugg and Boing Boing’d though I’m slow and only noticed on Crosbie Fitch‘s low-volume blog. None of the many commentators noted the sf/nano/upload angle as far as I can tell.

Constitutionally open services

Thursday, July 6th, 2006

Luis Villa provokes, in a good way:

Someone who I respect a lot told me at GUADEC ‘open source is doomed’. He believed that the small-ish apps we tend to do pretty well will migrate to the web, increasing the capital costs of delivering good software and giving next-gen proprietary companies like Google even greater advantages than current-gen proprietary companies like MS.

Furthermore:

Seeing so many of us using proprietary software for some of our most treasured possessions (our pictures, in flickr) has bugged me deeply this week.

These things have long bugged me, too.

I think Villa has even understated the advantage of web applications — no mention of security — and overstated the advantage of desktop applications, which amounts to low latency, high bandwidth data transfer — let’s see, , including video editing, is the hottest thing on the web. Low quality video, but still. The two things client applications still excel at are very high bandwidth, very low latency data input and output, such as rendering web pages as pixels. :)

There are many things that can be done to make client development and deployment easier, more secure, more web-like and client applications more collaboration-enabled. Fortunately they’ve all been tried before (e.g., , , , others of varying relevance), so there’s much to learn from, yet the field is wide open. Somehow it seems I’d be remiss to not mention , so there it is. Web applications on the client are also a possibility, though typical only address ease of development and not manageability at all.

The ascendancy of web applications does not make the desktop unimportant any more than GUIs made filesystems unimportant. Another layer has been added to the stack, but I am still very happy to see any move of lower layers in the direction of freedom.

My ideal application would be available locally and over the network (usually that means on the web), but I’ll prefer the latter if I have to choose, and I can’t think of many applications that don’t require this choice (fortunately is one of them, or close enough).

So what can be done to make the web application dominated future open source in spirit, for lack of a better term?

First, web applications should be super easy to manage (install, upgrade, customize, secure, backup) so that running your own is a real option. Applications like and have made large strides, especially in the installation department, but still require a lot of work and knowledge to run effectively.

There are some applications that centralizaton makes tractable or at least easier and better, e.g., web scale search, social aggregation — which basically come down to high bandwidth, low latency data transfer. Various P2P technologies (much to learn from, field wide open) can help somewhat, but the pull of centralization is very strong.

In cases were one accepts a centralized web application, should one demand that application be somehow constitutionally open? Some possible criteria:

  • All source code for the running service should be published under an open source license and developer source control available for public viewing.
  • All private data available for on-demand export in standard formats.
  • All collaboratively created data available under an open license (e.g., one from Creative Commons), again in standard formats.
  • In some cases, I am not sure how rare, the final mission of the organization running the service should be to provide the service rather than to make a financial profit, i.e., beholden to users and volunteers, not investors and employees. Maybe. Would I be less sanguine about the long term prospects of Wikipedia if it were for-profit? I don’t know of evidence for or against this feeling.

Consider all of this ignorant speculation. Yes, I’m just angling for more freedom lunches.