Archive for February, 2005

he is HE

Thursday, February 17th, 2005

My grandfather died this morning at 99.5 years. He is now one with THE LORD ALMIGHTY in perfect nonexistence!

CodeCon Sunday

Tuesday, February 15th, 2005

I say CodeCon was 3/4 (one abstention) on Sunday.

Wheat. An environment (including a language) for developing web applications. Objects are arranged in a tree with some filesystem-like semantics. Every object has a URL (not necessarily in a public portion of the tree). Wheat’s web object publishing model and templating seem clearly reminiscent of Zope. In response to the first of several mostly redundant questions regarding Wheat and Zope, Mark Lentczner said that he used Zope a few years ago and was discouraged by the need to use external scripts and the lack of model-view separation in templates (I suspect Mark used DTML — Wheat’s TinyTemplates reminded me of DTML’s replacement, Zope Page Templates, currently my favorite and implemented in several languages). I’m not sure Wheat is an environment I’d like to develop in, but I suspect the world might learn something from pure implementations of URL-object identity (not just mapping) and a web domain-specific language/environment (I understand that Wheat has no non-web interface). Much of the talk used these slides.
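
A minimal Python sketch of the URL-object identity idea, assuming nothing about Wheat beyond what is described above (objects live in a tree, and an object’s position in the tree is its URL); the class and tree here are invented for illustration:

    # Hypothetical sketch, not Wheat (which is its own language/environment):
    # every object lives in a tree, and its position in the tree is its URL.
    class Node:
        def __init__(self, **children):
            self.children = dict(children)

        def resolve(self, path):
            """Walk a URL path like '/blog/entries' down the object tree."""
            obj = self
            for part in filter(None, path.split("/")):
                obj = obj.children[part]  # a missing key would be a 404
            return obj

    root = Node(blog=Node(entries=Node()), about=Node())
    print(root.resolve("/blog/entries"))  # the URL names the object itself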

Incoherence. I find it hard to believe that nobody has done exactly this audio visualization method before (x = left/right, y = frequency, point intensity and size = volume), but as an audio-ignoramus I’ll take the Incoherence team’s word. I second Wes Felter’s take: “I learned more about stereo during that talk than in the rest of my life.”
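
A rough Python sketch of the mapping as described (x = left/right balance, y = frequency, intensity = volume), not the Incoherence implementation itself; the frame size and test signal are made up:

    # Rough sketch of the visualization mapping described above, not the
    # Incoherence code: for each frequency bin, x = left/right balance,
    # y = frequency, intensity/size = volume.
    import numpy as np

    def frame_to_points(left, right, rate):
        L = np.abs(np.fft.rfft(left))
        R = np.abs(np.fft.rfft(right))
        freqs = np.fft.rfftfreq(len(left), d=1.0 / rate)
        volume = L + R
        # balance in [-1, 1]: -1 = hard left, +1 = hard right
        balance = (R - L) / np.maximum(volume, 1e-12)
        return list(zip(balance, freqs, volume))

    # one 1024-sample frame of a 440 Hz tone, louder in the left channel:
    t = np.arange(1024) / 44100.0
    points = frame_to_points(np.sin(2 * np.pi * 440 * t),
                             0.5 * np.sin(2 * np.pi * 440 * t), 44100)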

i-Brokers. This is where XNS landed and where it might go. However, the presentation barely mentioned technology and left far more questions than answers. There was talk of Zooko’s Triangle (“Names: Decentralized, Secure, Human-Memorizable: Choose Two”). 2idi and idcommons seem to have chosen the last two, temporarily. It isn’t clear to me why they brought it up, as i-names will be semi-decentralized (like DNS). In theory i-names provide privacy (you provide only your i-name to an i-name enabled site, always logging in via your i-broker, and access to your data is provided through your i-broker — never enter your password or credit card anywhere else — you set the policies for who can access your data) and persistence (keep an i-name for life, and i-names may be transparently aliased or gatewayed should you obtain others). These benefits, if they exist in the future, are subtler than the claims. Having sites access your data via a broker rather than via you typing it in does little to protect your privacy by itself. You make a decision in both cases whether you want a site to have your credit card number. Once the site has your credit card… Possibly over the long term if lots of people and sites adopt i-names sites will collect or keep less personal information. Users, via their i-brokers, may be on more equal terms with sites, as i-broker access will presumably be governed by some you-have-no-rights-at-all terms of service. Some sites may decide (for new applications) they don’t want to have to worry about the security of customer information and access the same via customers’ i-names. However, once a user has provided their i-broker with lots of personal information, it becomes easy for sites to ask for it all. Persistence is also behavioral. Domain names and URLs can last a long time; good ones don’t change. Similarly an i-name will go away if the owner stops paying for it. Can the i-name ecology be structured so that i-names tend to be longer lived than domain names or URLs? Probably, but that’s a different story. In the short term 2idi is attempting to get adoption in the convention registration market. Good luck, but I wish Fen and Victor had spent their time talking about XRI resolution or other code behind the 2idi broker.

SciTools. A collection of free-to-use web applications for genetic design and analysis. Integrated DNA Technologies, the company that offers SciTools, makes its money selling (physical) synthesized nucleic acids. I was a cold, tired bio-ignoramus, so have little clue whether this is novel. (Ted Leung seems to think so and also has interesting things to say about the other presentations.)

OzymanDNS. DNS can route and move data, and it is deployed everywhere and rarely filtered, so with a little cleverness we can tunnel arbitrary streams over DNS. Dan Kaminsky is clearly the crowd-pleaser, not least for his showmanship and the audacity of his hacks (streaming anime over DNS this time). More than a few in the crowd wanted to put DNS hacks to work, e.g., on aspects of supposed syndication problems. PPT slides of an older version of the talk.
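
A small Python sketch of the underlying trick (not OzymanDNS itself): arbitrary bytes can be packed into DNS-safe query names, with answers coming back in TXT records from a cooperating nameserver; the tunnel domain here is a placeholder:

    # Not OzymanDNS, just the trick it relies on: data is smuggled upstream by
    # packing it into query names under a domain whose nameserver you control
    # (answers can come back in TXT records). 'tunnel.example.com' is made up.
    import base64

    MAX_LABEL = 63  # DNS limits each label to 63 bytes

    def encode_chunk(data, seq, domain="tunnel.example.com"):
        b32 = base64.b32encode(data).decode().rstrip("=").lower()
        labels = [b32[i:i + MAX_LABEL] for i in range(0, len(b32), MAX_LABEL)]
        return ".".join([str(seq)] + labels + [domain])

    def decode_chunk(name, domain="tunnel.example.com"):
        body = name[:-len("." + domain)].split(".", 1)[1]  # drop seq and domain
        b32 = body.replace(".", "").upper()
        b32 += "=" * (-len(b32) % 8)  # restore the stripped padding
        return base64.b32decode(b32)

    name = encode_chunk(b"hello over DNS", seq=0)
    assert decode_chunk(name) == b"hello over DNS"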

Yesterday.

CodeCon Saturday

Sunday, February 13th, 2005

CodeCon is 5/5 today.

The Ultra Gleeper. A personal web page recommendation system. Promise of collaborative filtering unfulfilled, in dark ages since Firefly was acquired and shut down in the mid-90s. Presenter believes we’re about to experience a renaissance in recommendation systems, citing Audioscrobbler recommendations (I would link to mine, but personal recommendations seem to have disappeared since last time I looked; my audioscrobbler page) as a useful example (I have found no automated music recommendation system useful) and blogs as a use case for recommendations (I have far too much very high quality manually discovered reading material, including blogs, to desire automated recommendations for more, and I don’t see collaborative filtering as a useful means of prioritizing my lists). The Ultra Gleeper crawls pages you link to and pages that link to you (via Technorati CosmosQuery and the Google API), treating links as positive ratings, and presents suggested pages to rate in a web interface. Uses a number of tricks to avoid showing obvious recommendations (does not recommend pages that are too popular) and pages you’ve already seen (including those linked to in feeds you subscribe to). Some problems faced by typical recommendation systems (new users get crummy recommendations until they enter lots of data, early adopters get doubly crummy recommendations due to lack of existing data to correlate with) are obviated by bootstrapping from data in your posts and subscriptions. I suppose if lots of people ran something like the Gleeper, robot traffic would increase and more people would complain about syndication bandwidth-like problems (I’m skeptical about this being a major problem). I don’t see lots of people running Gleepers, as automated recommendation systems are still fairly useless and will remain so for a long time. Interesting software and presentation nonetheless.
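
A toy Python sketch of the filtering as I understood it (links count as positive ratings; too-popular and already-seen pages are excluded); the threshold and data shapes are invented, not the Ultra Gleeper’s:

    # Toy sketch of the filtering described above, not the Ultra Gleeper's code.
    # The popularity cutoff and the (url, votes, popularity) shape are invented.
    POPULARITY_CUTOFF = 1000  # skip pages "everyone" already links to

    def recommend(candidates, already_seen, max_results=10):
        """candidates: iterable of (url, votes_from_links, global_popularity)."""
        scored = [
            (votes, url)
            for url, votes, popularity in candidates
            if url not in already_seen and popularity < POPULARITY_CUTOFF
        ]
        scored.sort(reverse=True)
        return [url for votes, url in scored[:max_results]]

    print(recommend(
        [("http://example.com/a", 3, 12), ("http://example.com/b", 9, 50000)],
        already_seen={"http://example.com/c"},
    ))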

H2O. Primarily a discussion system tuned to facilitate professor-assigned discussions. Posts may be embargoed, and the professor may assign course participants specific messages or other participants to respond to. Discussions may include participants from multiple courses, e.g., to facilitate an MIT engineering-Harvard law exchange. Anyone may register at H2O and create their own group, acting as professor for the created group. Some of the constraints that may be imposed by H2O are often raised in mailing list meta discussions following flame wars, in particular posting delays. I dislike web forums but may have to try H2O out. Another aspect of H2O is syllabus management and sharing, which is interesting largely because syllabi are typically well hidden. Professors in the same school of the same university may not be aware of what their colleagues are teaching.

Jakarta Feedparser. Kevin Burton gave a good overview of syndication and related standards and the many challenges of dealing with feeds in the wild, which are broken in every conceivable way. Claims SAX (event) based Jakarta FeedParser is an order of magnitude faster than DOM (tree) based parsers. Nothing new to me, but very useful code.
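
Jakarta FeedParser is Java; the following is just a Python illustration of the same event-driven (SAX) idea, pulling item titles out of a feed as parse events stream by rather than building a DOM tree:

    # Illustration of SAX (event-driven) parsing, not the Jakarta FeedParser API:
    # titles are collected as events stream by; no DOM tree is ever built.
    import xml.sax

    class ItemTitleHandler(xml.sax.ContentHandler):
        def __init__(self):
            super().__init__()
            self.titles, self._in_item, self._in_title, self._buf = [], False, False, []

        def startElement(self, name, attrs):
            if name == "item":
                self._in_item = True
            elif name == "title" and self._in_item:
                self._in_title, self._buf = True, []

        def characters(self, content):
            if self._in_title:
                self._buf.append(content)

        def endElement(self, name):
            if name == "title" and self._in_title:
                self.titles.append("".join(self._buf))
                self._in_title = False
            elif name == "item":
                self._in_item = False

    handler = ItemTitleHandler()
    xml.sax.parseString(
        b"<rss><channel><item><title>Hello</title></item></channel></rss>", handler)
    print(handler.titles)  # ['Hello']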

MAPPR. Uses Flickr tags and GNS to divine the geographic location of photos. REST web services modeled on Flickr’s own. Flash front end, which you could spend many hours playing with.

Photospace. Personal image annotation and search service, with a focus on geolocation. Functionality available as a library; a web front end is provided. Photospace publishes RDF which may be consumed by RDFMapper.

Note above two personal web applications that crawl or use services of other sites (The Ultra Gleeper is the stronger example of this). I bet we’ll see many more of increasing sophistication enabled by ready and easily deployable software infrastructure like Jakarta FeedParser, Lucene, SQLite and many others. A personal social networking application is an obvious candidate. Add in user hosted or controlled authentication (e.g., LID, perhaps idcommons) …

Yesterday.

CodeCon Friday

Saturday, February 12th, 2005

CodeCon requires presenters to be active developers of the projects presented and projects must have demonstrably running code. There’s an emphasis on open source and decentralization. This generally makes for interesting presentations. Today was 4/5.

Aura. Case study in how not to give a CodeCon presentation. Talk for a long time about motivations for and very high-level problems of reputation systems, which all attendees are surely familiar with. Give almost no specifics about Aura, apparently a peer-to-peer reputation system, including nothing on what differentiates it from other work nor on how or why I’d use it in my own code. Demo stumbles due to display problems, fails due to ill-prepared data. One mostly irrelevant bit about Aura’s implementation: it uses SQLite, an embedded, zero-configuration, endian-neutral SQL database that many projects have started to use recently and tons more will in the near future. I’m certain that SQLite is in my future.
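
For the curious, the whole zero-configuration appeal fits in a few lines of Python, which bundles an sqlite3 module; the table here is an invented example, not anything from Aura:

    # SQLite in a nutshell: one file, no server process, plain SQL.
    # (sqlite3 ships in Python's standard library; the schema is made up.)
    import sqlite3

    conn = sqlite3.connect("ratings.db")  # or ":memory:" for a throwaway database
    conn.execute("CREATE TABLE IF NOT EXISTS ratings (subject TEXT, score REAL)")
    conn.execute("INSERT INTO ratings VALUES (?, ?)", ("example.org", 0.9))
    conn.commit()
    print(conn.execute("SELECT * FROM ratings").fetchall())
    conn.close()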

ArX. Very useful presentation on Walter Landry’s ArX, which began as a fork of the GNU Arch distributed revision control system (both are pronounced ‘arc’). Lists the good, bad and ugly of active open source, distributed revision control systems (I agree that any system that does not have those attributes is strictly non-interesting), including GNU Arch/tla, ArX, monotone (also uses SQLite), Darcs, svk, and Codeville. I’ve tried tla a few times but have gotten hung up on what seems to me like unnecessary complexity and strange conventions. I’d pretty much settled on using Darcs going forward, but now I’m a little concerned by its reordering of patches in order to solve merge conflicts, which apparently can be very slow and may make the repository’s view of its state at a point in time inaccurate. Not sure whether this is pragmatic, evil, or both, nor am I sure I understand it. See also Zooko’s notes (Darcs row, decentralization column).

Apache CA. A certification authority motivated by the needs of the Apache Software Foundation, which has around 900 developers with commit access working on around 100 projects. Program managers can add committers, but a small admin team needs to create shell accounts and add entries to various text files, creating a bottleneck. Solution: all services (most importantly source control — migrate to Subversion) eventually use SSL and check for permission based on group membership noted in personal certificates and managed via email by program managers. Sounds like a long-term project. The “Open CA” feature is an interesting extension — it allows anyone who can sign an email with GPG to create groups in the form of user@example.com/groupname. Not sure what the ASF motivation is for Open CA, but I’m sure interesting applications can be built on it.

Off-the-Record Messaging. Messaging using the PGP model (sign with the sender’s private key, encrypt with the recipient’s public key) can be attacked: the “bad guys” can intercept and store your messages. In the future they can break into your computer, obtain your private key, decrypt your messages and prove that you are the author. Very briefly, OTR obtains “perfect forward secrecy” through the use of short-lived encryption keys and “refutable authentication” using shared MAC keys — compromise of your long-term keys doesn’t allow your messages to be decrypted, and it can’t be proved that you wrote your messages. A toolkit for forging transcripts is even provided to enhance deniability. Details here. This presentation seems to match the one given at CodeCon. They have a GAIM plugin, which I’m now running, and a standalone proxy for other AIM clients. Cool stuff.
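
A tiny Python sketch of why a shared MAC key yields refutable authentication (this is the idea, not the OTR protocol itself, which derives such keys from short-lived key exchanges):

    # Sketch of refutable authentication with a shared MAC key, not OTR itself.
    # Both parties hold the same key, so a valid tag convinces the recipient but
    # proves nothing to a third party -- either side could have forged it,
    # unlike a public-key signature that only the sender could produce.
    import hashlib, hmac, os

    mac_key = os.urandom(32)  # in OTR, derived from a short-lived key exchange

    def tag(message):
        return hmac.new(mac_key, message, hashlib.sha256).digest()

    msg = b"meet at noon"
    t = tag(msg)                             # Alice authenticates the message...
    assert hmac.compare_digest(t, tag(msg))  # ...Bob verifies, but could equally
                                             # well have computed t himself.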

RPOW. Reusable Proofs of Work is a system for sequential reuse of hashcash mediated by a server, written by the great signal-to-noise enhancer Hal Finney. RPOW has many potential uses — apparently initially motivated by a desire to implement “P2Poker” with interesting “chips” and currently being experimented with in a modified BitTorrent client in which downloaders can pay for priority with RPOW tokens, possibly encouraging people to leave clients running after completing a download (serving as seeds in BT lingo) in order to earn tokens which may be spent on future downloads. As the BTRP page notes, people could acquire RPOWs out of band and not contribute more upload bandwidth, or even contribute less. The net effect is hard to predict. If buying download priority with RPOWs proves useful, I expect non-BT filesharing clients, which have far less reason to cooperate, would benefit more than BT clients. Perhaps the most interesting thing about the RPOW system is the great effort it makes to ensure that there can be no cheating, in particular by the server operator. The RPOW server will zero all data if it is physically tampered with, it is possible for anyone to verify the code it is running, and that code can verify that the database on its untrusted host has not been tampered with, using a Merkle hash tree (the secure board only has two megabytes of memory). The RPOW server may be the world’s first transparent server, which could facilitate a world of distributed, cooperating RPOW servers. Presentation slides.
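
A small Python sketch of the Merkle-tree idea mentioned above (my illustration, not RPOW’s code): a device with only a little trusted memory keeps just the root hash, and any tampering with the externally stored records changes that root:

    # Sketch of database verification with a Merkle hash tree, not RPOW's code.
    import hashlib

    def h(data):
        return hashlib.sha256(data).digest()

    def merkle_root(leaves):
        level = [h(leaf) for leaf in leaves]
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])  # duplicate the last node on odd levels
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

    records = [b"token-1", b"token-2", b"token-3", b"token-4"]
    trusted_root = merkle_root(records)          # all the secure board must store
    assert merkle_root(records) == trusted_root  # untrusted copy checks out
    records[2] = b"forged token"
    assert merkle_root(records) != trusted_root  # tampering changes the root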

Saturday.

Decision Markets, Quantum Computers, Blogs, Longevity

Wednesday, February 9th, 2005

Ken Kittlitz posted an interesting report on the recent DIMACS Information Markets Workshop to fx-discuss. I hadn’t seen the approaches to clearing regulatory hurdles spelled out so clearly before. Unsurprisingly, researchers found that the effect of manipulators on markets is not pernicious. The other theory and experimental results sound interesting; I will have to read more. If I used bookmarks I’d bookmark David Pennock. My previous comment expressing disappointment in NewsFutures may have been off base. Ken’s summary indicates that HedgeStreet will soon offer longer-term contracts. Several open source platforms are in the works, including possibly opening the code for fx (hooray, the UI could really use some work). Also:

Several participants indicated that the accuracy of these markets may not be their prime attraction to organizations. Rather, the fact that they help make people aware of differences of opinion, and forge a community that can discuss such differences, may be their strongest feature.

Another attendee was reminded of early quantum computing workshops:

Quantum computing back then had some promising research (factoring algorithms for example) and no one was sure whether it would lead to whole new computing paradigm or just disappear into the ether. Information markets are also a new technology with some promising research (mostly analytic and experimental) and no one knows whether it will revolutionize the way everyone does prediction, information aggregation and decision making or just slowly disappear.

Quantum computing needs to deal merely with the laws of physics but information markets need to deal with the laws of the United States of America.

I’m not so sure. If quantum computers break current cryptographic systems, shouldn’t they be banned as circumvention devices, if not cyber-terrorist weapons? Just kidding, I hope.

Unrelated to the DIMACS conference, Art Hutchinson has a nice post noting the commonalities between blogs and prediction markets:

I’m seeing that both blogs and prediction markets – born of the democratizing power of the web – have demonstrated their power to circumvent traditional information hierarchies to the detriment of established organizations and individuals.

Finally, Peter McCluskey asks what a Futarchy should maximize and proposes life expectancy. I’ve assumed a futarchy welfare function would be, or become, very complex, but longevity sounds good and simple to me.

Thought experiment: what would futarchies with different simple welfare functions look like (e.g., maximize GDP, minimize Gini coefficient) after n years of divergence?

One “constitutional” means to prevent a simple welfare function from growing complex through “politics” — require any change in the welfare function to be “approved” by the markets in terms of the current welfare function — investors have to bet that welfare in terms of the current function will be improved if it is replaced with a different function. I haven’t thought this through; it may make little sense. Even if not a hurdle that changes must clear, such a market might give futarchy citizens interesting information about the magnitude of a proposed welfare function change.
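
A toy Python sketch of the rule just proposed, with all numbers invented: a change is adopted only if conditional markets expect welfare, as measured by the current function, to be higher under the change than under the status quo:

    # Toy sketch of the proposed "constitutional" rule; all numbers are invented.
    def approve_change(welfare_if_changed, welfare_if_kept, margin=0.0):
        """Both estimates are conditional-market prices for welfare as
        measured by the *current* welfare function."""
        return welfare_if_changed > welfare_if_kept + margin

    # markets estimate current-function welfare at 52.1 if the new function is
    # adopted and 51.8 if it is kept:
    print(approve_change(52.1, 51.8))  # True -> the change clears the market test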

Addendum 20050210: Chris Masse has added links to Ken Kittlitz’s report, and Art Hutchinson has two posts remarking on Ken’s report. Unrelated to idea futures, but read anyway: Art’s experience with the WalMart music store’s Digital Restrictions Management.

Shallow thinking about filesharing

Monday, February 7th, 2005

Tyler Cowen “cannot accept the radical anti-copyright position” and so proffers apologia for the radical intellectual protectionist position. (NB no anti-copyright position is being argued in MGM v. Grokster.) Regarding Cowen’s three arguments:

1. In ten years’ time, what will happen to the DVD and pay-for-view trades? BitTorrent allows people to download movies very quickly.

BitTorrent downloads tend to be faster than those on typical file sharing networks but are still very slow. Netflix is a far superior option unless you place a very low value on your time (in addition to waiting many hours in the case of BitTorrent, or weeks in the case of eDonkey, for a download to complete, you also need to spend time finding active torrents or hash links and dealing with low quality, mislabeled and overdubbed copies, which often means starting over, even after you’ve learned how to deal with all of these. I pity the computer semi-literate who just wants to snag some “free” movies).

Note that DVDs already account for more than half of Hollywood domestic revenue. Furthermore the process will be eased when TVs and computers can “talk” to each other more readily. Yes, I am familiar with Koleman Strumpf’s excellent work showing that illegal file-sharing has not hurt music sales. But a song download can be a loss leader for an entire CD or a concert tour. Downloading an entire movie does not prompt a person to spend money in comparable fashion.

Radical protectionists made similar arguments about the VCR, as have those in countless businesses faced with new technology. In the case of the VCR, entrepreneurs figured out how to use the new technology to make billions. Similarly, it should be up to entrepreneurs to figure out how to thrive in the environment of ubiquitous networking, rather than up to lawmakers to ensure existing businesses survive technological change.

2. Perhaps we can make file-sharing services identify (and block) illegally traded files. After all, the listeners can find the illegal files and verify they have what they wanted. Grokster, sooner or later, will be able to do the same. Yes, fully decentralized and “foreign rogue” systems may proliferate, and any identification system will be imperfect. But this is one way to heed legitimate copyright suits without passing the notorious “Induce Act.”

Fully decentralized filesharing systems have proliferated. LimeWire is #2 at download.com and several other decentralized filesharing clients make the top 50 downloads list.

The imperfections of an identification and blocking system will include invasion of privacy and censorship.

3. I question the almost universal disdain for the “Micky Mouse” copyright extension act. OK, lengthening the copyright extension does not provide much in the way of favorable incentives. Who innovates with the expectation of reaping copyright revenues seventy-five years from now? But this is a corporate rather than an individual issue. Furthermore economic research indicates that current cash flow is a very good predictor of investment. So the revenue in fact stimulates additional investment in creative outputs. If I had my finger on the button, I still would have pushed “no” on the Mickey Mouse extension, if only because of the rule of law. Privileges of this kind should not be extended repeatedly due to special interest pressures. But we are fooling ourselves if we deny that the extension will benefit artistic output, at least in the United States.

The paper Cowen links to above (Cash Flow and Outcomes: How the Availability of Cash Impacts the Likelihood of Investing Wisely) is hardly encouraging regarding the efficacy of additional investments correlated with increased cash flow.

Eric Rescorla points out that subsidizing organizations that happen to hold copyright to work created 70 years ago is hardly the best way to subsidize new content creation, should one wish to do that.

Mass Destruction of Software Patents

Thursday, February 3rd, 2005

Is there something in the ether? Two people “near” me declared software patents potential “Weapons of Mass Destruction” yesterday and today, apparently having been struck by the idea independently: Patents as WMD’s from Mitch Kapor (Creative Commons is housed in his office space) and On Software Patents and WMDs from Ben Adida (who represents Creative Commons at the W3C).

Kapor and Adida have different scenarios in mind. Very roughly North Korea and Al Qaeda respectively.

See also Wikipedia on the software patent debate.