Post Wikipedia

Content layer infrastructure

Saturday, July 25th, 2009

Last Sunday I appeared (mp4 download) on a tech interview program called Press: Here. It went ok. Most of the questions were softball and somewhat repetitive. Lots more could have been said about any of them, but I think I did a pretty good job of hitting a major point on each and not meandering. However, one thing I said (emphasized below) sounds like pure bs:

this has been done in the open source software world for a couple decades now and now that people are more concerned about the content layer that’s really part of the infrastructure having a way to clear those permissions without the lawyer-to-lawyer conversation happen every single time is necessary

I could’ve omitted the bolded words above and retained the respect of any viewer with a brain. What the heck did I mean? I was referring to an argument, primarily made by Joi Ito over the last year or so, using a stylized version of the layers of a protocol stack. David Weinberger’s live-blogging of Ito provides a good summary:

Way back when, it was difficult to connect computers. Then we got Ethernet, then TCP/IP, and then HTTP (the Web). These new layers allow participation without permission. The cost of sending information and the cost of innovation have gone down (because the cost of failure has gone down). Now we’re getting another layer: Creative Commons. “By standardizing and simplifying the legal layer … I think we will lower the costs and create another explosion of innovation.”

Protocol geeks may object, but I think it’s a fairly compelling argument, at least for explaining why what Creative Commons does is “big”. The problems of not having a top layer (I called it “content”, the slide photographed above says “knowledge” — what it calls “content” is usually called “application”, and the note above says “legal”, referring to one required mechanism for opening up permissions around content, knowledge, or whatever one wishes to call it) in which a commons can be taken for granted (i.e., like infrastructure) are evident, for example in the failure by lawsuit of most interesting online music services, or the inaccessibility of much of the scientific literature to most humans and machines (e.g., for data mining), as are powerful hints as to what is possible where it exists, for example the vast ecology enabled by Wikipedia’s openness such as DBpedia.

I didn’t make that argument on-screen. Probably a good thing, given the previous paragraph’s tortured language. I shall practice. Critique welcome.

Press: Here is broadcast from its SF bay area home station (NBC) and I’ve heard is syndicated to many other stations. However, its website says nothing about how to view the program on TV, even on its home station. I even had a hard time finding any TV schedule on the NBC Bay Area website — a tiny link in the footer takes one to subpages for the station with lame schedule information syndicated from TV Guide. I found this near total disconnect between TV and the web very odd, but then again, I don’t really care where the weird segment of the population that watches TV obtains schedule information. Press: Here ought to release its programs under a liberal CC license as soon as the show airs. Its own website gets very little traffic, many of the interviews would be relevant for uploading to Wikimedia Commons, and the ones that got used in Wikipedia would drive significant traffic back to the program website.

Conjectured impact of Wikipedia license interoperability?

Sunday, May 31st, 2009

Wikipedians voted overwhelmingly against kryptonite — for using Creative Commons Attribution-ShareAlike (CC BY-SA) as the main content license for Wikipedias and their sibling projects, permitting these to incorporate work offered under CC BY-SA, the main non-software copyleft license used outside of Wikipedia, and other CC BY-SA licensed projects to incorporate content from Wikipedia. The addition of CC BY-SA to Wikimedia sites should happen in late June and there is an outreach effort to encourage non-Wikimedia wikis under the Free Documentation License (FDL; usually chosen for Wikipedia compatibility) to also migrate to CC BY-SA by August 1.

This change clearly ought, over time, to increase the proportion of content licensed under free-as-in-freedom copyleft licenses. More content licensed under a single copyleft license or a set of interoperable ones increases the reasons to cooperate with that regime — to offer new work under the dominant copyleft license (in the non-software case, now unambiguously CC BY-SA) in order to have access to content under that regime — and decreases the reasons to avoid copylefted work, one of which is the impossibility of incorporating works under multiple and incompatible copyleft licenses (when relying on the permissions of those licenses, modulo fair use). Put another way, the unified mass and thus gravitational pull of the copylefted content body is about to increase substantially.

Sounds good — but what can we expect from the actual impact of making legally interoperable the mass of Free Culture and its exemplar, Wikipedia? How can we gauge that impact, short of access to a universe where Wikipedians reject CC BY-SA? A few ideas:

(1) Wikimedia projects will be dual licensed after the addition of CC BY-SA — content will continue to be available under the FDL, until CC BY-SA content is mixed in, at which point the article or other work in question is only available under CC BY-SA. One measure of the licensing change’s direct impact on Wikimedia projects would be the number and proportion of CC BY-SA-only articles over time, assuming an effort to keep track.

I suspect it will take a long time (years?) for a non-negligible proportion of Wikipedia articles to be CC BY-SA-only, i.e., to have directly incorporated external CC BY-SA content. However, although most direct, this is probably the least significant impact of the change, and my suspicion could be upset if other impacts (below) turn out to be large, creating lots of CC BY-SA content useful for incorporating into Wikipedia articles.

(2) Content from Wikipedias and other Wikimedia projects could be incorporated in non-Wikimedia projects more. The difficulty here is measurement, but given academic interest in Wikipedia and the web generally, it wouldn’t be surprising to see the requisite data sets (historical and ongoing) and expertise brought together to analyze the use of Wikimedia project content elsewhere over time. Note that a larger than expected (there’s the rub) increase in such use could be the result of CC BY-SA being more straightforward for users than the FDL (indeed, a major reason for the change) as much or more than the result of license interoperability.

(3) New and existing projects could adopt or switch to CC BY-SA when they otherwise wouldn’t have in order to gain compatibility with Wikimedia projects. One sure indication of this would involve major projects using a CC license with a “noncommercial” term switching to CC BY-SA and giving interoperability with Wikipedia as the reason for the switch. Another indicator would simply be an increase in the use of CC BY-SA (and even more permissive instruments such as CC BY and CC0, to the extent the motivation is primarily to create content that can be used in Wikipedia rather than to use content from Wikipedia) relative to more restrictive (and non-interoperable with Wikipedia) licenses.

(4) Apart from needing to be compatible with Wikipedia because one desires to incorporate its content, one might want to be compatible with Wikipedia because it is “cool” to be so. I don’t know that this has occurred on a significant scale to this date, so if it begins to, one possible factor in such a development would be the change to CC BY-SA. How could this be? As cool as Wikipedia compatibility sounds, having to adopt a hard-to-understand license intended for software documentation (the FDL) makes attaining this coolness seem infeasible. Consideration of the FDL just hasn’t been on the radar of many outside of the spaces of documentation, encyclopedias, and perhaps educational materials, while consideration and oftentimes use of CC licenses is active in many segments. However, in most of these segments the more restrictive CC licenses (i.e., those prohibiting commercial use or adaptation) are most popular. So if we see an upsurge in the use of CC BY-SA for popular culture works (music, film) the beginning of which coincides with the Wikimedia licensing change, it may not be unreasonable to guess that the latter caused the former.

(5) The weight of Wikipedia and relative accessibility of CC BY-SA could further consensus that the freedoms demanded by Wikimedia projects are some combination of “good”, “correct”, “moral”, and “necessary” — if some of these can be distinguished from “cool”. In the long term, this could be indicated by the sidelining of terms for content that do not qualify as free and open, as they have been for software, where and similar obvious competitors for important free software niches are strategically irrelevant.

Obviously 3, 4, and 5 overlap somewhat.

(6) I conjecture that making more cultural production more wiki-like (or to gain WikiNature) is probably the biggest determinant of the success of Free Culture. More interplay between Wikipedia, both the most significant free culture project and the most significant wiki, and the rest of the free culture and open content universe can only further this trend — though I have no idea how to measure the possible impact of the licensing change here, and wouldn’t want to ascribe too much weight to it.

(7) Last, the attention of the Wikipedia community ought to have a positive impact on the quality of future versions of Creative Commons licenses (there shouldn’t be another version until 2011 or so, and hopefully there won’t be another version after that for much longer). Presumably Wikipedians also would have had a positive impact on future versions of the FDL, but arguably less so given the Free Software Foundation’s (excellent) focus on software freedom.

Will any of the above play out in a significant way? How much will it be reasonable to attribute to the license change? Will researchers bother to find out? Here’s to hoping!

Prior to the Wikipedia community vote on adopting CC BY-SA it crossed my mind to set up several play money prediction market contracts concerning the above outcomes conditioned on Wikipedia adopting CC BY-SA by August 1, 2009, for which I did set up a contract. It is just as well that I didn’t — or rather if I had, I would have had to heavily promote all of the contracts in order to stimulate any play trading — the basic adoption contract at this point hasn’t budged from 56% since the vote results were announced, which means nobody is paying attention to the contract on Hubdub.

Wikipedians against kryptonite

Monday, April 13th, 2009



As mentioned previously, incompatible widely used copyleft licenses are kryptonite to the efficacy of copyleft. If you’ve made 25 or more edits* to a Wikimedia project, you can vote to liberate Wikipedia from this kryptonite. Vote now; instructions and much more background are on the Creative Commons blog.


Original poster by Brianna Laugher / CC BY

* My favorite interview question for any position at Creative Commons goes something like “tell me about your experiences with editing Wikipedia” which serves the dual purposes of testing whether the candidate knows how to use a computer (you’d be surprised) and has any practical clue about the types of collaboration Creative Commons’ work facilitates.

CC6+

Wednesday, December 17th, 2008

December 16 marked six years since the release of the first Creative Commons licenses. Most of the celebrations around the world have already taken place or are going on right now, though San Francisco’s is on December 18. (For CC history before 2002-12-16, see video of a panel recorded a few days ago featuring two of CC’s founding board members and first executive director or read the book Viral Spiral, available early next year, though my favorite is this email.)

I’ve worked for CC since April, 2003, though as I say in the header of this blog, I don’t represent any organization here. However, I will use this space to ask for your support of my and others’ work at CC. We’re nearing the end of our fourth annual fall public fundraising campaign and about halfway to our goal of raising US$500,000. We really need your support — past campaigns have closed out with large corporate contributions, though one has to be less optimistic about those given the financial meltdown and widespread cutbacks. Over the longer term we need to steadily decrease reliance on large grants from visionary foundations, which still contribute the majority of our funding.

Sadly I have nothing to satisfy a futarchist donor, but take my sticking around as a small indicator that investing in Creative Commons is a highly leveraged way to create a good future. A few concrete examples follow.

RDFa became a W3C Recommendation on October 14, the culmination of a 4+ year effort to integrate the Semantic Web and the Web that everyone uses. There were several important contributors, but I’m certain that it would have taken much longer (possibly never) or produced a much less useful result without CC’s leadership (our motivation was first to describe CC-licensed works on the web, but we’re also now using RDFa as infrastructure for building decoupled web applications and as part of a strategy to make all scientific research available and queryable as a giant database). For a pop version (barely mentioning any specific technology) of why making the web semantic is significant, watch Kevin Kelly on the next 5,000 days of the web.

Wikipedia seems to be on a path to migrating to using the CC BY-SA license, clearing up a major legal interoperability problem resulting from Wikipedia starting before CC launched, when there was no really appropriate license for the project. The GNU FDL, which is now Wikipedia’s (and most other Wikimedia Foundation Projects’) primary license, and CC BY-SA are both copyleft licenses (altered works must be published under the same copyleft license, except when not restricted by copyright), and incompatible widely used copyleft licenses are kryptonite to the efficacy of copyleft. If this migration happens, it will increase the impact of Wikipedia, Creative Commons, free culture, and the larger movement for free-as-in-freedom on the world and on each other, all for the good. While this has basically been a six year effort on the part of CC, FSF, and the Wikimedia Foundation, there’s a good chance that without CC, a worse (fragmented, at least) copyleft landscape for creative works would result. Perhaps not so coincidentally, I like to point out that since CC launched, there has been negative in the creative works space, the opposite of the case in the software world.

Retroactive copyright extension cripples the public domain, but there are relatively unexplored options for increasing the effective size of the public domain — instruments to increase certainty and findability of works in the public domain, to enable works not in the public domain to be effectively as close as possible, and to keep facts in the public domain. CC is pursuing all three projects, worldwide. I don’t think any other organization is placed to tackle all of these thorny problems comprehensively. The public domain is not only tremendously important for culture and science, but the only aesthetically pleasing concept in the realm of intellectual protectionism (because it isn’t) — sorry, copyleft and other public licensing concepts are just necessary hacks. (I already said I’m giving my opinion here, right?)

CC is doing much more, but the above are a few examples where it is fairly easy to see its delta. CC’s Science Commons and ccLearn divisions provide several more.

I would see CC as a wild success if all it ever accomplished was to provide a counterexample to be used by those who fight against efforts to cripple digital technologies in the interest of protecting ice delivery jobs, because such crippling harms science and education (against these massive drivers of human improvement, it’s hard to care about marginal cultural production at all), but I think we’re on the way to accomplishing much more, which is rather amazing.

More abstractly, I think the role of creating “commons” (what CC does and free/open source software are examples) in nudging the future in a good direction (both discouraging bad outcomes and encouraging good ones) is horribly underappreciated. There are a bunch of angles to explore this from, a few of which I’ve sketched.

While CC has some pretty compelling and visible accomplishments, my guess is that most of the direct benefits of its projects (legal, technical, and otherwise) may be thought of in terms of lowering transaction costs. My guess is those benefits are huge, but almost never perceived. So it would be smart and good to engage in a visible transaction — contribute to CC’s annual fundraising campaign.

25 years of GNU

Tuesday, September 2nd, 2008

The GNU project turns 25 on September 27. Not much to add beyond what I wrote on the Creative Commons blog. Watch the Freedom Fry video.

I do have some meta commentary…

The video, featuring British humorist Stephen Fry, is very British. That is, Americans might wonder if there is any humor in it at all. I’m fine with that.

It’s great that the video is posted in Ogg Theora format and works seamlessly in my browser via Cortado, and download links are provided. However, HTML to copy & paste for direct inclusion in a blog post or other web page should also be provided, as is typical for sharing video. I haven’t tried making such yet, though I should and might.

Finally, there’s a hidden jab at some in the free software movement in my CC blog post:

One of the movements and projects directly inspired by GNU is Creative Commons. We’re still learning from the free software movement. On a practical level, all servers run by Creative Commons are powered by GNU/Linux and all of the software we develop is free software.

So please join us in wishing the GNU project a happy 25th birthday by spreading a happy birthday video from comedian Stephen Fry. The video, Freedom Fry, is released under a CC Attribution-NoDerivatives license.

Emphasis added. The free culture/open content world lags the free software/open source world in many respects, one of those being an understanding of what freedoms are necessary. Some from the free software world have pushed Creative Commons to recognize that in many cases culture requires freedoms equivalent to those expected for free software/open source (that’s the first bolded link above), while some in the free software world (not necessarily the exact same people, but at least people associated with the same organizations) publish documents and videos under terms that do not grant those same freedoms (that’s the second bolded link above).

The Free Software Foundation has published documents under terms roughly equivalent to CC BY-ND, probably since before CC existed. Currently the footer of fsf.org says:

Verbatim copying and distribution of this entire article are permitted worldwide, without royalty, in any medium, provided this notice is preserved.

Does the FSF really want to reserve the right to use copyright to censor people who might publish derived versions of their texts? They probably are concerned that someone will alter their message so as to be misleading. Perhaps there was some rationale for this pre-web and pre-CC, but now there is not:

  • People can easily see canonical versions by going to fsf.org. (DNS should also obsolete much of trademark, but that’s for another post.)
  • CC licenses that permit derivatives include the following (see 3(b), 4(a), 4(b), and 4(c) for the actual language):
    • Licensor can specify a link to provide for attribution
    • Derivative works must state how they are altered
    • Licensor can demand that credit be removed from the derivative
    • Unfortunately, in some jurisdictions licensor could press “moral rights” to censor a derivative considered derogatory

So one can pre-clear the right to make adaptations and retain some legal mechanisms to club creators of adaptations (ordered from best practice to distasteful, according to me).

The Software Freedom Law Center does worse, publishing its website (also, see the SFLC post on 25 years of GNU) under CC BY-NC-ND. Do they really want to prohibit commercial use? SFLC (a super excellent organization, as is the FSF!) is dedicated to software freedom, but still it seems silly for them to publish non-software works under terms antithetical to the spirit of free software.

On a brighter note, the FSF is publishing promotional images for Freedom Fry under a free as in free software as applied to cultural works license (CC BY-SA), one of which has already been taken under those terms for use on Stephen Fry’s Wikipedia article. Ah, the power of free cultural works. :)

Do wish GNU a happy 25th birthday — watch and spread the video!

No index.php

Tuesday, May 20th, 2008

On a mailing list I’m on, someone just pointed to no-www.org. It’s been a while since I’ve run across that site (or, before it existed, Slashdot commenters condemning use of TCWWW — The Cursed WWW), but I strongly agree — www. in a domain name is pointless.

Even worse is index.php in the path. You’ve taken the time to publish a website, now take a few minutes to make its URLs less ugly. I’m not going to bother setting up no-index-php.org, but someone should. However, in the spirit of no-www.org, here are a couple resources for removing index.php from popular software installations:

Please remove index.php from your URLs, or signal that you have no taste, no technical abilities, or both.
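For a sense of what the cleanup amounts to, here is a minimal Python sketch of the URL mapping only. It is not taken from any particular software's instructions. The function name `strip_index_php` is hypothetical, and it assumes `index.php` appears as a whole path segment. A real site would pair this with rewrite rules in the web server and a redirect from the old URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_index_php(url: str) -> str:
    """Return url with any 'index.php' path segment removed.

    Hypothetical helper for illustration only; it computes the clean
    URL an old-style URL should redirect to, nothing more.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    cleaned = []
    for i, seg in enumerate(segments):
        if seg == "index.php":
            # A trailing index.php becomes a trailing slash;
            # one in the middle of the path is simply dropped.
            if i == len(segments) - 1:
                cleaned.append("")
            continue
        cleaned.append(seg)
    path = "/".join(cleaned) or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(strip_index_php("http://example.org/index.php/2008/05/20/no-index-php/"))
print(strip_index_php("http://example.org/index.php?p=42"))
```

Only exact `index.php` segments are touched, so a path such as `/myindex.php` is left alone, and the query string and fragment pass through unchanged.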

Thanks!

Uberfact

Monday, February 18th, 2008

There are a number of fun things about a sketch of Uberfact: the ultimate social verifier. The first is that the post could be written without mentioning . The second is that the proposed project is a nice would-be example of political desires sublimated entirely into creating useful and voluntary tools. Third, Mencius Moldbug is a fun writer.

Something like Uberfact should absolutely be built, though I’m far from certain it would hit a sweet spot. It may be too decentralized or too centralized or both. All points from enhancing Wikipedia to the Semantic Web (with Uberfact somewhere between) are complementary and well worth pursuing, particularly if that pursuit displaces malinvestment in politics.

Relatedly, but no time to explain why:

Wikileaks flows

Saturday, January 26th, 2008

A year ago I mentioned Wikileaks, with some skepticism:

Wikileaks, currently vapor, may be a joke. If Wikileaks is not a joke and if it successfully exposes a large number of secrets, I’d find it hilarious to see this happening on a public website and without financial incentives. P2P, digital cash, information markets, and crypto anarchy? Nope, just a wiki and a community.

With each new item I read about Wikileaks, usually via Slashdot, my skepticism wanes and hilarity waxes. Bully for Wikileaks, the Wikileaks community, dissidents and transparency worldwide.

Read the and Wikileaks:About on Wikileaks, available securely and via many front domains.

Of course Wikileaks is blocked in China, which gives them some cred in my opinion (but note the measurement described in that post doesn’t seem to work anymore — from within the U.S. it appears google.com and google.cn now give identical results).

In one recent item cited on Slashdot, a copyright claim is being used to attempt to censor Wikileaks. How unsurprising.

Blog search putrefying

Saturday, December 22nd, 2007

I’ve complained before here that blog search stinks and isn’t getting better. Now I know why — in addition to blog search being a difficult and expensive service to run — there isn’t much demand. The blog search focused sites I mentioned in the “stinks” post each seem to have gained no traction since then, excepting Technorati, which itself is constantly rumored to be troubled.

A TechCrunch post on traffic at various Google properties finally gave me a clue and an inclination to look at my past posts on blog search. Click through to see a graph showing that Google Blog Search barely registers.

To end on a positive note, perhaps blog search is a good use case for , as it isn’t economic for a centralized entity to do well. This reminds me, whatever happened to various ?

Only tangentially related to blog search, I really like Chris F. Masse’s post on blogs vs. newspapers, in which Wikipedia sits at the top of the ecosystem:

So the real winner is Wikipedia — a news and knowledge aggregator… using anonymous volunteers. But Wikipedia is only an information aggregator… it feeds on both media and blogs to gather the facts. Wikipedia is the common denominator of knowledge —not the primary source of reporting. Just like prediction markets feed on polls and other advanced indicators.

Steps toward better software and content

Saturday, December 1st, 2007

The Wikimedia Foundation board has passed a resolution that is a step toward Wikipedia migrating to the Creative Commons Attribution-ShareAlike license. I have an uninteresting interest in this due to working at Creative Commons (I do not represent them on this blog), but as someone who wants to see free knowledge “win” and achieve revolutionary impact, I declare this an important step forward. The current fragmentation of the universe of free content along the lines of legally incompatible but similar in spirit licenses delays and endangers the point at which that universe reaches critical mass — when any given project decides to use a copyleft license merely because being able to include content from the free copyleft universe is reason enough. This has worked fairly well in the software world with the GPL as the copyleft license.

Copyleft was and is a great hack, and useful in many cases. But practically it is a major barrier to collaboration in some contexts and politically it is still based on censorship. So I’m always extremely pleased by any expansion of the public domain. There could hardly be a more welcome expansion than Daniel J. Bernstein’s release of his code (most notably qmail) into the public domain. Most of the practical benefit (including his code in free software distributions) could have been achieved by release under any free software license, including the GPL. But politically, check out this two minute video of Bernstein pointing out some of the problems of copyright and announcing that his code is in the public domain.

Bernstein (usually referred to as ‘djb’) also recently doubled the reward for finding a security hole in qmail to US$1,000. I highly recommend his Some thoughts on security after ten years of qmail 1.0, also available as something approximating slides (also see an interesting discussion of the paper on cap-talk).