Post Public Domain

RDFa initial context & one dc:

Tuesday, February 4th, 2014

One of the nice things to come out of RDFa 1.1 is its initial context, a list of vocabularies with prefixes which may be used without having to define them locally. In other words, just write, e.g., property="dc:title" without having to first write prefix="dc:".

In addition to making RDFa a lot less painful to use, the list is a good starting place for figuring out what vocabularies to use (if you must), perhaps even for non-RDFa applications — the list is machine-readable of course; I was reminded to write this post when giving feedback on a friend’s proposal to use prefix:property headers in a CSV file for a custom application, and by a recent announcement of the addition of three new predefined prefixes.
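As a rough illustration of the CSV idea, here is a minimal Python sketch that expands prefix:property column headers into full property URIs. The prefix table is a small hand-copied subset of the RDFa 1.1 initial context (an assumption; a real application might load the machine-readable list instead), and the function names and sample data are hypothetical.

```python
import csv
import io

# Assumption: a tiny hand-copied subset of the RDFa 1.1 initial context.
PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_header(header, prefixes=PREFIXES):
    """Expand a 'prefix:property' column name into a full property URI."""
    prefix, sep, local = header.partition(":")
    if not sep or not local:
        raise ValueError(f"not a prefix:property header: {header!r}")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + local


def read_rows(text, prefixes=PREFIXES):
    """Yield one dict per CSV row, keyed by expanded property URI."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {expand_header(name, prefixes): value for name, value in row.items()}


# Hypothetical sample data, for illustration only.
SAMPLE = "dc:title,dc:date\nExample Title,2014-02-04\n"
rows = list(read_rows(SAMPLE))
```

Rejecting unknown prefixes, rather than passing them through, is a design choice here: it keeps a typo in a header from silently becoming a made-up property.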

Survey data such as Linked Open Vocabularies can also help figure out what to use. Unfortunately LOV and the RDFa 1.1 initial context don’t agree 100% on prefix naming, and neither provides much in the way of guidance. I think there’s room for a highly opinionated and regularly updated guide to what vocabularies to use. I’m no expert, it probably already exists — please inform me!


The first thing I’d put in such an opinionated guide is to start one’s vocabulary search with Dublin Core. Trivial, right? But there is an under-documented subtlety which I find myself pointing out when a friend runs something like the aforementioned by me — DC means DC Terms. While it’s obvious that DC Terms is a superset of DC Elements, it’s harder to find evidence that using the former is best practice for new applications, and that the latter is not still the canonical vocabulary to start with. What I’ve gathered on this follows. I realize that the URIs for individual properties and classes, the prefixes used to abbreviate those URIs, and the documents which define (in English and RDF) properties and classes are distinct but interdependent. Prefixes are surely the most trivial and uninteresting, but for most people I imagine they’re important signals and documentation, thus I go on about them…

Namespace Policy for the Dublin Core Metadata Initiative (DCMI) (emphasis added):

The DCMI namespace URI for the collection of legacy properties that make up the Dublin Core Metadata Element Set, Version 1.1 [DCMES] is:

Dublin Core Metadata Element Set, Version 1.1 (emphasis added):

Since 1998, when these fifteen elements entered into a standardization track, notions of best practice in the Semantic Web have evolved to include the assignment of formal domains and ranges in addition to definitions in natural language. Domains and ranges specify what kind of described resources and value resources are associated with a given property. Domains and ranges express the meanings implicit in natural-language definitions in an explicit form that is usable for the automatic processing of logical inferences. When a given property is encountered, an inferencing application may use information about the domains and ranges assigned to a property in order to make inferences about the resources described thereby.

Since January 2008, therefore, DCMI includes formal domains and ranges in the definitions of its properties. So as not to affect the conformance of existing implementations of “simple Dublin Core” in RDF, domains and ranges have not been specified for the fifteen properties of the dc: namespace. Rather, fifteen new properties with “names” identical to those of the Dublin Core Metadata Element Set Version 1.1 have been created in the dcterms: namespace. These fifteen new properties have been defined as subproperties of the corresponding properties of DCMES Version 1.1 and assigned domains and ranges as specified in the more comprehensive document “DCMI Metadata Terms” [DCTERMS].

Implementers may freely choose to use these fifteen properties either in their legacy dc: variant or in the dcterms: variant, depending on application requirements. The RDF schemas of the DCMI namespaces describe the subproperty relation of dcterms:creator to dc:creator for use by Semantic Web-aware applications. Over time, however, implementers are encouraged to use the semantically more precise dcterms: properties, as they more fully follow emerging notions of best practice for machine-processable metadata.
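To make the quoted passage concrete, here is a minimal Python sketch of the two inferences it describes: the subproperty relation of dcterms:creator to dc:creator, and a range-based type inference. It is a toy, not a real RDFS reasoner. The schema tables are hand-written from the DCMI definitions (dcterms:creator has range dcterms:Agent), and the example.org resources are hypothetical.

```python
DCT = "http://purl.org/dc/terms/"
DCE = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hand-written schema facts (assumption: only the one property needed here).
SUBPROPERTY_OF = {DCT + "creator": DCE + "creator"}
RANGE = {DCT + "creator": DCT + "Agent"}


def infer(triples):
    """Return the input triples plus those entailed by the two tables above.

    One pass is enough for this toy schema because the subproperty
    chain is a single step long.
    """
    inferred = set(triples)
    for s, p, o in triples:
        if p in SUBPROPERTY_OF:
            # Subproperty rule: the same statement holds for the superproperty.
            inferred.add((s, SUBPROPERTY_OF[p], o))
        if p in RANGE:
            # Range rule: the value is an instance of the range class.
            inferred.add((o, RDF_TYPE, RANGE[p]))
    return inferred


# Hypothetical resources, for illustration only.
POST = "http://example.org/post"
ALICE = "http://example.org/people/alice"

with_terms = infer([(POST, DCT + "creator", ALICE)])
with_elements = infer([(POST, DCE + "creator", "Alice")])
```

Under these assumptions the dcterms: statement yields two extra triples (the legacy dc: statement and a type for the value), while the dc: statement yields nothing further, which is the sense in which the newer properties are "semantically more precise".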

The first two paragraphs explain why a new vocabulary was minted (so that the more precise definitions of properties already in DC Elements do not change the behavior of existing implementations; had only new terms and classes been added, maybe they could have been added to the DC Elements vocabulary, but maybe this is ahistoric, as many of the additional “qualified” DC Terms existed since 2000). The third paragraph explains that DC Terms should be used for new applications. Unfortunately the text informally (the prefixes aren’t used anywhere) notes the prefixes dc: and dcterms:, which I’ve found is not helpful in getting people to focus only on DC Terms.

Expressing Dublin Core metadata using the Resource Description Framework also notes the dc: and dcterms: prefixes for use in the document’s examples (which don’t ever actually use dc:).

Some of these documents have been updated slightly, but I believe their current versions are little changed from about 2008, a year after the proposal of the DC Terms refinements.

How to use DCMI Metadata as linked data uses the dc: and dcterms: prefixes and is clear about the ranges of properties of each: there is no incorrect usage of a dc: property, because it has no defined range or domain, while a dcterms: property with a non-literal range must be given a non-literal value. Perhaps this makes DC Terms seem scarier and partially explains the persistence of DC Elements. More likely, I’d guess few know about the difference, and lots of DC Terms properties with non-literal ranges are used with literals in the wild (I might be guilty on occasion).
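The literal versus non-literal distinction can be sketched as a small check in Python. This is an illustrative lint, not anything DCMI provides: the Literal and IRI classes and the EXPECTS table are hypothetical, and the table lists only two DC Terms properties (dcterms:creator with a non-literal range, dcterms:title with a literal range).

```python
from dataclasses import dataclass

DCT = "http://purl.org/dc/terms/"
DCE = "http://purl.org/dc/elements/1.1/"


@dataclass(frozen=True)
class Literal:
    """A plain string value."""
    value: str


@dataclass(frozen=True)
class IRI:
    """A reference to another resource."""
    value: str


# Assumption: a hand-picked pair of DC Terms properties and the kind of
# value their ranges call for. DC Elements properties are absent because
# they have no formal range.
EXPECTS = {
    DCT + "creator": "non-literal",
    DCT + "title": "literal",
}


def check(prop, value):
    """Return 'ok', or a message when the value kind contradicts the range."""
    expected = EXPECTS.get(prop)
    if expected is None:
        return "ok"  # no declared range, so nothing can be wrong
    actual = "literal" if isinstance(value, Literal) else "non-literal"
    if actual == expected:
        return "ok"
    return f"expected {expected}, got {actual}"


legacy = check(DCE + "creator", Literal("Alice"))
sloppy = check(DCT + "creator", Literal("Alice"))
careful = check(DCT + "creator", IRI("http://example.org/people/alice"))
```

A check like this only reports mismatches. Whether to reject such data or accept it anyway is a separate policy decision.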

FAQ/DC and DCTERMS Namespaces:

It is not incorrect to continue using dc:subject and dc:title — a lot of Semantic Web data still does — and since the range of those properties is unspecified, it is not actually incorrect to use (for example) dc:subject with a literal value or dc:title with a non-literal value. However, good Semantic Web practice is to use properties consistently in accordance with formal ranges, so implementers are encouraged to use the more precisely defined dcterms: properties.
Update, December 2011: It is worth noting that the initiative is taking a pragmatic approach towards the formal ranges of their properties:

We also expect that often, where we expect a property value of type Person, Place, Organization or some other subClassOf Thing, we will get a text string. In the spirit of “some data is better than none”, we will accept this markup and do the best we can.

What constitutes “best practice” in this area is bound to evolve with implementation experience over time.

There you have people supplying literals for properties expecting non-literals. RDF mappings do not formally condone this pragmatic approach, otherwise you’d see the likes of (addition in bold):

schema:creator a rdf:Property;
    rdfs:label "Creator"@en;
    rdfs:comment "The creator/author of this CreativeWork or UserComments. This is the same as the Author property for CreativeWork."@en;
    rdfs:domain [ a owl:Class; owl:unionOf (schema:UserComments schema:CreativeWork) ];
    rdfs:range [ a owl:Class; owl:unionOf (schema:Organization schema:Person xsd:string) ];
    rdfs:isDefinedBy ;
    rdfs:isDefinedBy ;

Also from 2011, a discussion of what prefixes to use in the RDFa initial context. Decision (Ivan Herman):

For the records: after having also discussed on yesterday’s telecom, I have made the changes on the profile files yesterday evening. The prefix set in the profile for is set to ‘dc’.

Read the expert input of Dan Brickley, Mikael Nilsson, and Thomas Baker. The initial context defines both dc: and dcterms: as prefixes for DC Terms, relegating DC Elements to dc11::

dc Dublin Core Metadata Terms DCMI Metadata Terms
dcterms Dublin Core Metadata Terms DCMI Metadata Terms
dc11 Dublin Core Metadata Element Set, Version 1.1 Dublin Core Metadata Element Set, Version 1.1
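A short Python sketch of what these prefix assignments mean in practice: two names written with different prefixes refer to the same property only if they expand to the same URI. The RDFa table below copies the three rows above, with namespace URIs filled in from the DCMI namespaces. The LOV table reflects the dcterms: and dce: naming that LOV uses. The function names are hypothetical.

```python
DC_TERMS_NS = "http://purl.org/dc/terms/"
DC_ELEMENTS_NS = "http://purl.org/dc/elements/1.1/"

# The three Dublin Core rows of the RDFa 1.1 initial context.
RDFA_INITIAL_CONTEXT = {
    "dc": DC_TERMS_NS,
    "dcterms": DC_TERMS_NS,
    "dc11": DC_ELEMENTS_NS,
}

# LOV's prefixes for the same two vocabularies.
LOV = {
    "dcterms": DC_TERMS_NS,
    "dce": DC_ELEMENTS_NS,
}


def expand(curie, context):
    """Expand 'prefix:local' to a full URI using the given prefix table."""
    prefix, _, local = curie.partition(":")
    return context[prefix] + local


def same_property(curie_a, context_a, curie_b, context_b):
    """True when both names expand to the same property URI."""
    return expand(curie_a, context_a) == expand(curie_b, context_b)


aliases = same_property("dc:title", RDFA_INITIAL_CONTEXT,
                        "dcterms:title", RDFA_INITIAL_CONTEXT)
legacy_differs = same_property("dc:title", RDFA_INITIAL_CONTEXT,
                               "dc11:title", RDFA_INITIAL_CONTEXT)
cross_list = same_property("dc11:title", RDFA_INITIAL_CONTEXT,
                           "dce:title", LOV)
```

Comparing expanded URIs rather than prefixes is the point of the sketch: prefixes are local shorthand, so the same prefix can mean different things in different lists.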

I found the above discussion on LOV’s entries for DC Terms and DC Elements, which use dcterms: and dce: prefixes respectively:

(2013-03-07) Bernard Vatant: Prefix restored to dcterms

(2013-06-17) Bernard Vatant: Although “dc” is often used as the prefix for this vocabulary, it’s also sometimes used for DC terms, so we preferred to use the less ambiguous “dce” and “dcterms” in LOV. See usage at,,, and more discussion at

I think the discussion instead supports using dc: and dc11: (because that’s what the RDFa initial context uses). LOV doesn’t have a public source repository or issue tracker currently, but I understand it eventually will.

Now I have this grab-bag blog post to send to friends who propose using DC Elements. Please correct me if I’m wrong, and especially if a more concise (on this topic) and credible document exists, so I can send that instead; perhaps something like an opinionated guide to metadata mentioned way above.

Another topic such a guide might cover, perhaps as a coda, would be what to do if you really need to develop a new vocabulary. One thing is you really need to ask for help. The W3C now provides some infrastructure for doing this. Or, some qualified dissent from a hugely entertaining blogger called Brinxmat.

Some readers of my blog who have bizarrely read through this post, or skipped to the end, might enjoy Brinxmat’s Attribution licences for data and why you shouldn’t use them (another future issue report for LOV, which uses CC-BY?); I wrote a couple posts in the same blogversation; also a relevant upgrade exhortation.

Public domain wins copyright week!

Sunday, January 19th, 2014

EFF coordinated a six-day copyright week, with suggested readings and actions in support of six principles, below with readings + actions count:

  • Transparency: 10 + 1 = 11
  • Building and Defending a Robust Public Domain: 16 + 0 = 16
  • Open Access: 9 + 2 = 11
  • You Bought it, You Own It: 8 + 3 = 11
  • Fair Use Rights: 14 + 1 = 15
  • Getting Copyright Right: 7 + 1 = 8

I couldn’t help but notice that the public domain “wins” by the metric of total readings + actions, perhaps indicative of relative enthusiasm and evaluation of importance by the communities EFF reaches. Good.

The apparent “loser” is getting copyright right, which I’ll also take undue satisfaction in: it’s an impoverished objective, relative to expanding and protecting intellectual freedom. Alternatively, public domain maximalism (second alternative, corresponding to the runner-up: fair use maximalism) is getting copyright right. But I acknowledge advocating “getting copyright right” (and the entire exercise of copyright week) is a fine thing to do given constraints, and its “loss” is likely due to being a more difficult writing assignment, and falling on the last day.

The latent “loser” though is the role of commons initiatives in changing the knowledge economy, thus the range of policies which can be imagined, and the resources available to support various policies. Some initiatives are mentioned, but almost exclusively as victims of costs imposed by bad policy. Daniel Mietchen’s Wikimedia and Open Access might be the reading closest to what I’d like to see a whole day dedicated to (on the seventh day of copyright week, commoners made their own freedom). Though starting with copyright-imposed costs to the project, Mietchen proceeds to describe collaboration among Wikimedians and the Open Access movement, and ends with (implied) competition:

wider exposure of Open Access materials through Wikimedia platforms may perhaps serve as an incentive for researchers to reconsider whether putting their articles behind access and reuse barriers is an appropriate approach to publishing them.

Related, because it is the domain of the most robust commons initiatives, it is too bad software was not the primary topic of several copyright week readings and actions. But even ignoring the seventh day angle, it is incredibly short-sighted to treat software as a separate category, whether for purposes of study or policy (e.g., copyright). All of the traditional subjects of copyright are now largely made with and mediated by software, but that’s just the beginning. Soon enough, they’ll all be software, or be obsolete. (In hindsight I should have noticed copyright week approaching, and urged various free/open source software initiatives to participate, and explain their policy relevance and potency.)

Back to cheering, I highly recommend at least skimming a few of the readings in each category, linked on the EFF copyright week page. Unless you follow knowledge policy writ large really closely, you’re almost certain to learn something new about policy battles that will play a large role in shaping the future of society.

To make up for the lack of copyright week “actions” recommended for building and defending a robust public domain: sign the public domain manifesto, upgrade your work to the public domain, and enjoy and share the greatest public domain film to date.

Happy GNU Year & Public Domain Day

Wednesday, January 1st, 2014
happy gnu year and public domain day

Any previous combinations? Reminded of GNU year greetings by Laurel Russwurm and Public Domain Day by the Public Domain Review and Center for the Study of the Public Domain.

My previous Public Domain Day posts:

Echoing the 2011 entry, I recently urged all to upgrade to CC0 (a public domain dedication and license). Also, January 1 is a good date to reiterate:

Unless stated otherwise, everything by me, Mike Linksvayer, published anywhere, is hereby placed in the public domain.

Join me. More importantly, unless you’re prodigious, demand that at the very least all government material go directly into the public domain.

The bottom part of the image is from The Gnoo (1804) by Samuel Daniell (1775-1811). The top is from an illustration (1883) by Louis-Maurice Boutet de Monvel (1851-1913). Latter selected because it is newly unambiguously in the public domain worldwide, including Mexico, which has life + 100 years of restriction. It would not be shocking to see this term ratchet worldwide in the next years.

Bonus links:

CC11x11, before, 0, &freebassel

Monday, December 16th, 2013

Gimped CC cake 10 / BY / Kristina Alexanderson
(I wrote 90% of this post a year ago; currently unaware of any actual CC 11 cakes or celebrations.)

Today is the 11th anniversary of the launch of the first version of the first 11 Creative Commons licenses. Depending how one counts, there are now as few as 0, though 6 is probably the conventional answer (only current international versions of ones that were among the original 11), or as many as 608 (all versions, jurisdiction ports, retired licenses, and public domain instruments).

If 2002-12-16 is a significant marker, I’d like to take a look at what preceded it, very nearby: other public copyright licenses, public domain dedications, and ad hoc sharing statements. Eventually I hope to take a more in-depth look at all of these, and more so I hope others do research around them.

Prior to the 1980s, such statements are very scattered. Has anyone pieced together commonalities and differences of pro-info-sharing statements through history? Examples…

In 868 the Diamond Sutra included:

Reverently [caused to be] made for universal free distribution by Wang Jie on behalf of his two parents on the 13th of the 4th moon of the 9th year of Xiantong.

1869 Recent Discussions on the Abolition of Patents for Inventions, setting a standard that modern books on advocating reform (inclusive of abolition) fail to meet:

No rights are reserved

1910 the English translation of Gandhi’s Indian Home Rule was printed with the words No Rights Reserved on the title page.

1967 the copyright notice of All Watched Over by Machines of Loving Grace included:

Permission is granted to reprint any of these poems in magazines, books and newspapers if they are given away free.

1976 Tiny BASIC for Intel 8080 included:


1978 In the Making included:

“Alternative publications may reproduce freely provided acknowledgement is made.”

I believe many statements along such lines were published, especially in the last century, but again, as far as I know, nobody has ever thoroughly investigated. I’m very interested, in part because I have a hunch what might be characterized as “information commons” have been malgoverned for the entirety of human history. Why did pro-sharing statements, in the form of public copyright licenses, only become regularized, widespread, and thought by some as creating and protecting commons, in the 1980s, starting with software?

The easy answer is that software had just become clearly restricted by copyright, and programmers have a more immediately compelling need to collaborate across organizational boundaries in a way that implicates copyright restrictions than do others. Still, one may question just how different paths would need to have been for explicit pro-sharing practices to have developed in other domains first, even pre-computer, and how the norms of such practices might have differed. I’ve speculated, very briefly, that it’s plausible the order could’ve been different, and essentially software freedom norms are a “sweet spot” that would’ve been arrived at anyway. Much more could be said about that, and also about whether and how the explicit pro-sharing practices I’ve recognized as such in this post have crowded out or complemented other pro-sharing practices.

In any case, in the 5 years prior to the launch of the first 11 Creative Commons licenses, there was a proliferation of interest in public copyright licenses for various forms of non-software works (including hardware designs, which took longer to capture much interest, and I won’t cover here). An incomplete list of such licenses released 1998-2002:

Anti-Copyright License, Comic Book Public License, Design Science License, Distributed Encyclopedia General Public License, EFF Open Audio License, Electrohippie Collective’s Ethical Open Documentation License, Ethymonics Free Music License, Free Art License, Free Media License, Free Music Public License, GNU Free Documentation License, No Type License, OpenBits License, Open Content License, Open Directory License, the Open Music licenses, Open Publication License, Open Source Music License, Public Library of Science Open Access License, QING Public License, and Phy-d’eau — License of Intention for Liberty in Expression and Creativity.

Many of these licenses are non-free/open, and nearly all are incompatible with all the rest. These problems preceded Creative Commons. Whether in the past 10 years Creative Commons has on net made these problems better or worse (or merely not better fast enough) is hard to say. One curiosity about these pre-CC licenses is that the only ones remaining in any kind of significant use (Free Art License and Free Documentation License) are free/open, copyleft licenses.

Near certainty of large adoption of public licenses and public domain dedications outside software also preceded CC. The effect one can be most certain of attributing to CC is of killing adoption of the few of these licenses that had any plausibility, and of the development of further non-CC licenses, for a while. Whether a dominant central license steward was net positive is hard to say. It’s easy to see some marketing benefits, and some innovation costs, and vice versa.

Some public licenses created for software, mostly the GNU GPL, and BSD licenses, were used for some non-software works before the explosion of non-software public licenses (of which CC was part). An open question is whether this explosion was a good thing at all, or rather a failure on the part of free software license pioneers to occupy a broader space, and create a broader-based, less fragmented movement for intellectual freedom…the part facilitated by public licenses that is.

It’s also possible that free software started with the wrong arrangement in the form of public licenses, and others, including what became CC, ought have tried something different, for example clubs/pools, or skipping voluntary methods altogether. (Many people have focused on one or more of direct action, litigation, and public policy. I tend to think there’s far too little appreciation and collaboration across these methods and voluntary construction, resulting in a further fragmented, scared, and weak movement.)

I didn’t publish a year ago because I’d intended to add sections on the “CC era” of the past 10, now 11 years, and the future. My recent extended quasi-review of CC 4.0 licenses will have to suffice. Now…

Celebrate CC’s 11th birthday:

Upgrade to CC0

Free Bassel

[Semi]Commons Coordinations & Copyright Choices 4.0

Monday, December 9th, 2013

CC0 is superior to any of the Creative Commons (CC) 4.0 licenses, because CC0 represents a superior policy (public domain). But if you’re unable or unwilling to upgrade to CC0, the CC 4.0 licenses are a great improvement over the 3.0 licenses. The people who did the work, led by Diane Peters (who also led CC0), many CC affiliates (several of whom were also crucial in making CC0 a success), and Sarah Pearson and Kat Walsh, deserve much praise. Bravo!

Below read my idiosyncratic take on issues addressed and not addressed in the 4.0 licenses. If that sounds insufferable, but you want to know about details of the 4.0 licenses, skip to the excellent version 4 and license versions pages on the CC wiki. I don’t bother linking to sections of those pages pertinent to issues below, but if you want detailed background beyond my idiosyncratic take on each issue, it can be found there.

Any criticism I have of the 4.0 licenses concerns policy choices and is not a criticism of the work done or people involved, other than myself. I fully understand that the feasible choices were and are highly constrained by previous choices and conditions, including previous versions of the CC licenses, CC’s organizational history, users of CC licenses, and the overall states of knowledge commons and info regulation and CC’s various positions within these. I always want CC and other “open” organizations to take as pro-commons of a stance as possible, and generally judge what is possible to be further than that of the conventional wisdom of people who pay any attention to this scene. Sometimes I advocated for more substantial policy changes in the 4.0 licenses, though just as often I deemed such advocacy futile. At this point I should explain that I worked for CC until just after the 4.0 licenses process started, and have consulted a bit on 4.0 licenses issues since then as a “fellow”. Not many people were in a better position to influence the 4.0 licenses, so any criticisms I have are due to my failure to convince, or perhaps incorrect decision to not try in some cases. As I’ve always noted on this blog, I don’t represent any organization here.


Pro-commons? As opposed to what? The title of the CC blog post announcing the formal beginning of work on the new licenses:

Copyright Experts Discuss CC License Version 4.0 at the Global Summit

My personal blog post:

Commons experts to develop version 4.0 of the CC licenses

The expertise that CC and similar organizations ought to bring to the world is commons coordination. There are many copyright experts in the world, and understanding public copyright licenses, and drafting more, are no great intellectual challenges. The copyright expertise needed to do so ought be purely instrumental, serving the purpose of commons coordination. Or so I think.

Throughout CC’s existence, it has presented itself, and been perceived as, to varying extents, an organization which provides tools for copyright holders to exercise their copyrights, and an organization which provides tools for building a commons. (What it does beyond providing tools adds another dimension, not unrelated to “copyright choice” vs. “commons coordination”; there’s some discussion of these issues in a video included in my personal post above.)

I won’t explain in this post, but I think the trend through most of CC’s history has been very slow movement in the “commons coordination” direction, and the explicit objectives of the 4.0 versioning process fit that crawl.

“Commons coordination” does not directly imply the usual free/open vs. proprietary/closed dichotomy. I think it does mostly fall out that way, in small part due to “license interoperability” practicalities, but probably mostly because I think the ideal universal copyregulation policy corresponds to the non-discriminatory commons that “free/open” terms and communities carve out on a small scale, including the pro-sharing policy that copyleft prototypes, and excluding any role for knowledge enclosure, monopoly, property, etc. But it is certainly possible, indeed usual, to advocate for a mixed regime (I enjoy the relatively new term “semicommons”, but if you wish to see it everywhere, try every non-demagogic call for “balance”), in which case [semi]commons tools reserving substantial exclusivity (e.g., “commercial use”) make perfect sense for [semi]commons coordination.

Continuing to ignore the usual [non-]open dichotomy, I think there still are a number of broad criteria for would-be stewards of any new commons coordinating license (and make no mistake, a new version of a license is a new license; CC introduced 6 new licenses with 4.0) to consider carefully, and which inform my commentary below:

  • Differentiation: does the new license implement some policy not currently available in existing licenses, or at least offer a great improvement in implementation (not to provide excuses for new licenses, but the legal text is just one part of implementation; also consider branding/positioning, understandability, and stewardship) of policy already available?
  • Permissions: does the new license grant all permissions needed to realize its policy objective?
  • Regulation: how does the license’s policy objective model regulation that ought be adopted at a wider scale, e.g., how does it align with usual “user rights” and “copyright reform” proposals?
  • Interoperability: is the new license maximally compatible with existing licenses, given the constraints of its policy objectives, and indeed, to the expense of its immediate policy objectives, given that incompatibility, non-interoperability, and proliferation must fragment and diminish the value of commons?
  • Cross-domain impact: how does the license impact license interoperability and knowledge sharing across fields/domains/communities (e.g., software, data, hardware, “content”, research, government, education, culture…)? Does it further silo existing domains, a tragedy given the paucity of knowledge about governing commons in the world, or facilitate sharing and collaboration across domains?

Several of these are merely a matter of good product design and targeting, and would also apply to an organization that really had a primary goal of offering copyright holders additional choices the organization deems are under-provided. I suspect there is plenty of room for innovation in “copyright choice” tools, but I won’t say more in this post, as such have little to do with commons, and whatever CC’s history of copyright choice rhetoric and offering a gaggle of choices, creating such tools is distant from its immediate expertise (other than just knowing lots about copyright) and light years from much of its extended community.

Why bother?

Apart from amusing myself and a few others, why this writeup? The CC 4.0 licenses won’t change, and hopefully there won’t be CC 4.1 or 4.5 or 5.0 licenses for many years. Longevity was an explicit goal for 4.0 (cf. 1.0: 17 months; 2.0: 12 months; 2.5: 20 months; 3.0: 81 months). Still, some of the issues covered here may be interesting to people choosing to use one of the CC 4.0 licenses, and people creating other licenses. Although nobody wants more licenses, often called license proliferation, as an end in itself, many more licenses is the long term trend, of which the entire history of CC is just a part. Further, more licenses can be a good, to the extent they are significantly different from and better than, and as compatible as possible with, existing licenses.

To be totally clear: many new licenses will be created and used over the next 10 years, intended for various domains. I would hope, some for all domains. Proliferators, take heed!

Development tools

A 4.0 wiki page and a bunch of pages under that were used to lay out objectives, issues and options for resolution, and link to drafts. Public discussion was on the cc-licenses list, with tangential debate pushed to cc-community. Drafts and changes from previous drafts were published as redlined word processor files. This all seems to have worked fairly well. I’d prefer drafts as plain text files in a git repository, and an issue tracker, in addition to a mailing list. But that’s a substantially different workflow, and word processor documents with track changes and inline comments do have advantages, not limited to lawyers being familiar with those tools.

100% wiki would also work, with different tradeoffs. In the future additional tools around source repositories, or wikis, or wikis in source repositories, will finally displace word processor documents, but the tools aren’t there yet. Or in the bad future, all licenses will be drafted in word processors in the cloud.

(If it seems that I’m leaving a lot out, e.g., methodology for gathering requirements and feedback, in-person and teleconferences, etc., I merely have nothing remotely interesting to say, and used “tools” rather than “process” to narrow scope intentionally.)


The 4.0 licenses were drafted to be jurisdiction neutral, and there will be official, equivalent, verbatim language translations of the licenses (the same as CC0, though I don’t think any translations have been made final yet). Legal “porting” to individual jurisdictions is not completely ruled out, but I hope there will be none. This is a wholly positive outcome, and probably the most impactful change for CC itself (already playing out over the past few years, e.g., in terms of scope and composition of CC affiliates), though it is of small direct consequence to most users.

Now, will other license drafters and would-be drafters follow CC’s lead and stop with the vanity jurisdiction license proliferation already?


At least the EU, Mexico, Russia, and South Korea have created “database rights” (there have been attempts in other jurisdictions), copyright-like mechanisms for entities that assemble databases to persecute others who would extract or copy substantial portions of said databases. Stupid policies that should be abolished, copyright-like indeed.

Except for CC0 and some minor and inconsistent exceptions (certain within-EU jurisdiction “port” versions), CC licenses prior to 4.0 have not “covered” database rights. This means, modulo any implied license which may or may not be interpreted as existing, that a prior-to-4.0 (e.g., CC-BY-3.0) licensee using a database subject to database restrictions (when this occurs is a complicated question) would have permission granted by the licensor around copyright restrictions, but not around database restrictions. This is a pretty big fail, considering that the first job of a public license is to grant adequate permissions. Actual responses to this problem:

  • Tell all database publishers to use CC0. I like this, because everyone should just use CC0. But, it is an inadequate response, as many will continue to use less permissive terms, often in the form of inadequate or incompatible licenses.
  • Only waive or license database restrictions in “ports” of licenses to jurisdictions in which database restrictions exist. This is wholly inadequate, as in the CC scheme, porting involves tailoring the legal language of a license to a jurisdiction, but there’s no guarantee a licensor or licensee in such jurisdictions will be releasing or using databases under one of these ports, and in fact that’s often not the case.
  • Have all licenses waive database restrictions. This sounds attractive, but is mostly confusing — it’s very hard to discern when only database and not copyright restrictions apply, such that a licensee could ignore a license’s conditions — and like “tell database publishers to use CC0” would just lead many to use different licenses that do purport to conditionally license database rights.
  • Have all licenses grant permissions around database restrictions, under whatever conditions are present in the license, just like copyright.

I think the last is the right approach, and it’s the one taken with the CC 4.0 licenses, as well as by other licenses which would not exist but for CC 3.0 licenses not taking this approach. I’m even more pleased with their generality, because other copyright-like restrictions are to be expected (emphasis added):

Copyright and Similar Rights means copyright and/or similar rights closely related to copyright including, without limitation, performance, broadcast, sound recording, and Sui Generis Database Rights, without regard to how the rights are labeled or categorized. For purposes of this Public License, the rights specified in Section 2(b)(1)-(2) are not Copyright and Similar Rights.

The exclusions of 2(b)(1)-(2) are a mixed bag; see moral and personality rights, and patents below.

CC0 also includes a definition with some generality:

Copyright and Related Rights include, but are not limited to, the following:

  1. the right to reproduce, adapt, distribute, perform,
    display, communicate, and translate a Work;
  2. moral rights retained by the original author(s) and/or
    performer(s);
  3. publicity and privacy rights pertaining to a person’s
    image or likeness depicted in a Work;
  4. rights protecting against unfair competition in regards
    to a Work, subject to the limitations in paragraph 4(a),
  5. rights protecting the extraction, dissemination, use and
    reuse of data in a Work;
  6. database rights (such as those arising under Directive
    96/9/EC of the European Parliament and of the Council of 11
    March 1996 on the legal protection of databases, and under
    any national implementation thereof, including any amended
    or successor version of such directive); and
  7. other similar, equivalent or corresponding rights
    throughout the world based on applicable law or treaty, and
    any national implementations thereof.

As does GPLv3:

“Copyright” also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.

Do CC0 and CC 4.0 licenses cover semiconductor mask restrictions (best not to use for this purpose anyway, see patents)? Does GPLv3 cover database restrictions? I’d hope the answer is yes in each case, and if the answer is no or ambiguous, future licenses further improve on the generality of restrictions around which permissions are granted.

There is one risk in licensing everything possible, and culturally it seems, specifically in licensing database rights: the impression that licenses which do so ‘create obligations’ related to those rights. I find it odd to think of a conditional permission as the creation of an obligation, when the user’s situation without said permission is unambiguously worse, i.e., no permission. Further, this impression is a problem for non-maximally-permissive licenses around copyright, not only database or other copyright-like rights.

In my opinion the best a public license can do is to grant permissions (conditionally, if not a maximally permissive license) around restrictions with as much generality as possible, and expressly state that a license is not needed (and therefore conditions do not apply) if a user can ignore underlying restrictions for some other reason. Can the approach of CC version 4.0 licenses to the latter be improved?

For the avoidance of doubt, where Exceptions and Limitations apply to Your use, this Public License does not apply, and You do not need to comply with its terms and conditions.

These are all trivialities for license nerds. For publishers and users of databases: Data is free. Free the data!

Moral and personality rights

CC 4.0 licenses address them well:

Moral rights, such as the right of integrity, are not licensed under this Public License, nor are publicity, privacy, and/or other similar personality rights; however, to the extent possible, the Licensor waives and/or agrees not to assert any such rights held by the Licensor to the limited extent necessary to allow You to exercise the Licensed Rights, but not otherwise.

To understand just how well, CC 3.0 licenses say:

Except as otherwise agreed in writing by the Licensor or as may be otherwise permitted by applicable law, if You Reproduce, Distribute or Publicly Perform the Work either by itself or as part of any Adaptations or Collections, You must not distort, mutilate, modify or take other derogatory action in relation to the Work which would be prejudicial to the Original Author’s honor or reputation. Licensor agrees that in those jurisdictions (e.g. Japan), in which any exercise of the right granted in Section 3(b) of this License (the right to make Adaptations) would be deemed to be a distortion, mutilation, modification or other derogatory action prejudicial to the Original Author’s honor and reputation, the Licensor will waive or not assert, as appropriate, this Section, to the fullest extent permitted by the applicable national law, to enable You to reasonably exercise Your right under Section 3(b) of this License (right to make Adaptations) but not otherwise.

Patents and trademark

Prior versions were silent; CC 4.0 licenses state:

Patent and trademark rights are not licensed under this Public License.

Perhaps some potential licensor will be reassured, but I consider this unnecessary and slightly harmful, replicating the main deficiency of CC0. The explicit exclusion makes it harder to see an implied license. This is especially troublesome when CC licenses are used in fields in which patents can serve as a barrier. Software is one, for which CC has long disrecommended use of CC licenses largely because software is already well-covered by licenses with which CC licenses are mostly incompatible; the explicit patent exclusion in the CC 4.0 licenses makes them even less suitable. Hardware design is another such field, but one with fragmented licensing, including use of CC licenses. CC should now explicitly disrecommend using CC licenses for hardware designs and declare CC-BY-SA-4.0 one-way compatible with GPLv3+ so that projects using one of the CC-BY-SA licenses for hardware designs have a clear path to a more appropriate license.

Patents of course can be licensed separately, and as I pointed out before regarding CC0, there could be curious arrangements for projects using such licenses with patent exclusions, such as only accepting contributions from Defensive Patent License users. But the better route for “open hardware” projects and the like to take advantage of this complementarity is to do both, that is use a copyright and related rights license that includes a patent peace clause, and join the DPL club.


CC 4.0 licenses:

The Licensor waives and/or agrees not to assert any right or authority to forbid You from making technical modifications necessary to exercise the Licensed Rights, including technical modifications necessary to circumvent Effective Technological Measures.

This is a nice addition, which had been previously suggested for CC 3.0 licenses and rejected — the concept copied from GPLv3 drafts at the time. I would have preferred to also remove the limited DRM prohibition in the CC licenses.


The CC 4.0 licenses slightly streamline and clarify the substance of the attribution requirement, all to the good. The most important bit, itself only a slight streamlining and clarification of similar in previous versions:

You may satisfy the conditions in Section 3(a)(1) in any reasonable manner based on the medium, means, and context in which You Share the Licensed Material. For example, it may be reasonable to satisfy the conditions by providing a URI or hyperlink to a resource that includes the required information.

This pulls use in the wild from near-zero to-the-letter compliance to fairly high compliance.

I’m not fond of the requirement to remove attribution information if requested by the licensor, especially accurate information. I don’t know whether a licensor has ever made such a request, but that makes the clause only pointless rather than harmful. Not quite though, as it does make for a talking point.


NonCommercial means not primarily intended for or directed towards commercial advantage or private monetary compensation. For purposes of this Public License, the exchange of the Licensed Material for other material subject to Copyright and Similar Rights by digital file-sharing or similar means is NonCommercial provided there is no payment of monetary compensation in connection with the exchange.

Not intended to be a substantive change, but I’ll take it. I’d have preferred a probably more significantly narrowed definition and a re-branding so as to increase the range of and differentiation among the licenses that CC stewards. But at the beginning of the 4.0 licenses process, I expected no progress, so am not disappointed. Branding and other positioning changes could come post-launch, if anyone is so inclined.

I think the biggest failure of the range of licenses with an NC term (and there are many preceding CC) is not confusion and pollution of commons, very roughly the complaints of people who would like NC to have a more predictable meaning and those who think NC offers inadequate permissions, respectively, but lack of valuable use. Licenses with the NC term are certainly used for hundreds of millions of photos and web pages, and some (hundreds of?) thousands of songs, videos, and books, but few where either the licensor or the public gains significant value above what would have been achieved if the licensor had simply offered gratis access (i.e., put stuff on the web, which is incredibly valuable even with no permissions granted). As far as I know, NC licenses haven’t played a significant role in enabling (again, relative to gratis access) any disruptive product or policy, and their use by widely recognized artists and brands is negligible (cf. CC-BY-SA, which Wikipedia and other mass collaboration projects rely on to exist, and CC-BY and CC0, which are part of disruptive policy mandates).

CC is understandably somewhat stuck between free/open norms, which make licenses with the NC term an embarrassment, and their numerically large but low-value uses. A license steward or would-be steward that really believed a semicommons license regime could do much more would try to break out of this rut by doing a complete rethink of the product (or that part of the product line), probably resulting in something much more different from the current NC implementation than the mere definitional narrowing and rebranding that I started out preferring. This could be related to my commentary on innovation in “copyright choice” tools above; whether the two are really the same thing would be a subject for inquiry.


If there were licenses that should not have been brought to version 4.0, at least not under the CC brand, it would have been CC-BY-NC-ND and CC-BY-ND.

Instead, an express permission to make derivatives so long as they are not shared was added. This change makes so-called text/content/data mining of any work under any of the CC version 4.0 licenses unambiguously permitted, and makes ND stick out a tiny bit less as an aberration from the CC license suite modeling some moderate copyright reform baseline.

There are some costs to this approach: surprise that a “no derivatives” license permits derivatives, slight reduction in scope and differentiation among licenses that CC stewards, giving credence to ND licenses as acceptable for scholarship, and abetting the impression that text/content/data mining requires permission at all. The last is most worrisome, but (as with similar worries around licensing databases) can be turned into a positive to the extent CC and everyone knowledgeable emphasizes that you ought not and probably don’t need a license; we’re just making sure you have the freedoms around CC licensed works that you ought to have anyway, in case the info regulation regime gets even worse — but please, mine away.


ShareAlike is the most improved of the named (BY/NC/ND/SA) elements in CC 4.0 licenses, and the work is not done yet. But first, I wish it had been improved even more, by making more uses unambiguously “trigger” the SA provision. This has been done once, starting in 2.0:

For the avoidance of doubt, where the Work is a musical composition or sound recording, the synchronization of the Work in timed-relation with a moving image (“synching”) will be considered a Derivative Work for the purpose of this License.

The obvious next expansion would have been use of images (still or moving) in contextual relation to other material, e.g., illustrations used in a text. Without this expansion, CC-BY-SA and CC-BY-NC-SA are essentially identical to CC-BY and CC-BY-NC respectively for the vast majority of actual “reuse” instances. Such an expansion would have substantially increased the range of and differentiation among licenses that CC stewards. The main problem with such an expansion (apart from specifying it exactly) would be increasing the cost of incompatibility, where texts and images use different licenses. This problem would be mitigated by increasing compatibility among copyleft licenses (below), or could be eliminated by broadening the SA licensing requirement for uses triggered by expansion, e.g., any terms granting at least equivalent permissions, such that a CC-BY-SA illustration could still be used in a text licensed under CC-BY or CC0. Such an expansion did not make the cut, but I think together with aforementioned broadening of licensing requirements, such a modulation (neither strictly “stronger” nor “weaker”) would make for an interesting and heretofore unimplemented approach to copyleft, in some future license.

Apart from a subtle improvement that brings SA closer to a full “or later versions” license, and reflects usual practice and understanding (incidentally, “no sublicensing” in non-SA licenses remains pointless, is not to be found in most non-CC permissive licenses, and should not be replicated), the big improvements in CC 4.0 licenses with the SA element are the addition of the potential for one-way compatibility to CC-BY-SA, adding the same compatibility mechanism to CC-BY-NC-SA, and discussions with stewards of potentially compatible licenses which make the realization of compatibility more likely. (I would have included a variation on the more complex but in my view elegant and politically advisable mechanism introduced in MPL 2.0, which allows for continued use under the donor compatible license as long as possible. Nobody demanded such, so not adding the complexity was perhaps a good thing.)

I hope that in 2014 CC-BY-SA-4.0 will be declared bilaterally compatible with the Free Art License 1.3, or if a new FAL version is required, it is being worked on, with achieving bilateral compatibility as a hard requirement, and more importantly, that CC-BY-SA-4.0 is declared one-way compatible (as a donor) with GPLv3+. An immediate step toward those ends will be finalizing an additional statement of intent regarding the stewardship of licenses with the ShareAlike element.

Though I’ll be surprised if any license appears as a candidate for compatibility with CC-BY-NC-SA-4.0, adding the mechanism to that license is a good thing: as a matter of general license stewardship, reducing the barriers to someone else creating a better NC license (see above), and keeping “porting” completely outside the 4.0 license texts (hopefully there will be no porting, but if there is any, compatibility with the international versions in licenses with the SA element would be exclusively via the compatibility mechanism used for any potentially compatible license).


All license clauses have id attributes, allowing direct linking to a particular clause. These direct links are used for references within the licenses. These are big usability improvements.
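To illustrate how clause-level id attributes turn into direct links, here is a minimal sketch. The HTML fragment, the id values, and the base URL are all made up for the example; they are not the identifiers used in the actual license texts.

```python
from html.parser import HTMLParser


class ClauseIdCollector(HTMLParser):
    """Collect the value of every id attribute seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


# Hypothetical fragment; the real license HTML and its ids will differ.
SAMPLE = """
<ol>
  <li id="example-clause-1">First clause.</li>
  <li id="example-clause-2">Second clause, see
      <a href="#example-clause-1">clause 1</a>.</li>
</ol>
"""

# Placeholder base URL, not a real license location.
BASE = "https://example.org/license"

collector = ClauseIdCollector()
collector.feed(SAMPLE)

# A direct link to a clause is the document URL plus "#" plus the clause id.
links = [f"{BASE}#{clause_id}" for clause_id in collector.ids]
for link in links:
    print(link)
```

The same fragment identifiers work for references inside the document (the `href="#example-clause-1"` in the sample) and for links from outside it.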

I would have liked to see an expansive “tech” (including to some extent design) effort synchronized with the 4.0 licenses, from the practical (e.g., a canonical format for license texts, from which HTML, plain text, and others are generated; that may be HTML, but the current license HTML is inadequate for the task) to the impractical (except for increasing CC’s reputation, e.g., investigating whether any semantic annotation and structure, preferably building on existing research, would be useful, in theory, for the license texts, and possibly even a practical aid to translation), to testing further upgrades to the ‘legal user interface’ constituted by the license texts and “deed” summaries (e.g., combining these), to just bringing various CC tooling and documentation up to date with RDFa 1.1 Lite. But, some of these things could be done post-launch if anyone is so inclined, and my understanding is that CC has only a single technology person on staff, dedicated to creating other products, and most importantly, the ability to directly link to any license clause probably has more practical benefits than anything on my wishlist.


One of the best things about the CC 4.0 licenses is their increased understandability. This is corroborated by crude automated readability metrics below, but I suspect these do not adequately characterize the improvement, for they include three paragraphs of explanatory text not present in previous versions, probably don’t fully reflect the improvement of splitting hairball paragraphs into lists, and have no mechanism for accounting for how the improved usability of linking to individual clauses contributes to understandability.

CC-BY-NC-SA (the license with the most stuff in it, usually used as a drafting template for others) from version 1.0 through 4.0, including 4.0 drafts (lower numbers indicate better readability, except in the case of Flesch; Chars/(Flesch>=1) is my gross metric for how painful it is to read a document; see license automated readability metrics for an explanation):

SHA1 License Characters Kincaid ARI Coleman-Liau Fog Lix SMOG Flesch Chars/(Flesch>=1)
39b2ef67be9e5b4e743e5269a31ad1691515eede CC-BY-NC-SA-1.0 10228 13.3 16.3 14.2 17.0 59.7 14.2 48.4 211
5800ac2d32e35ace035cdcae693423cd9ff5bb6f CC-BY-NC-SA-2.0 11927 13.3 16.2 14.7 17.1 60.0 14.4 47.0 253
e5f44c2df6b1391d1ddb6efb2db6f90670e4ae67 CC-BY-NC-SA-2.5 12013 13.1 16.0 14.6 16.9 59.6 14.2 47.7 251
a63b7e81e7b9e30df5d253aed1d2991af47992df CC-BY-NC-SA-3.0 17134 16.4 19.7 14.2 20.6 67.0 16.3 38.8 441
8b36c30ed0510d9ca9c69a2ef826b9fd52992474 by-nc-sa-4.0d1 12465 13.0 15.0 14.9 16.3 57.4 14.0 43.9 283
4a87c7af5cde7729e2e456ee0e8958f8632e3005 by-nc-sa-4.0d2 11583 13.1 14.8 14.2 16.8 56.2 14.4 44.7 259
bb6f239f7b39343d62440bff00de24da2b3d256f by-nc-sa-4.0d3 14422 14.1 15.8 15.1 18.2 61.0 15.4 38.6 373
cf5629ae38a745f4f9eca429f7b26af2e71eb109 by-nc-sa-4.0d4 14635 13.8 15.6 15.5 17.8 60.2 15.2 38.6 379
a5e1b9829fd287cbe255df71eb9a5aad7fb19dbc by-nc-sa-4.0d4v2 14808 14.0 15.8 15.5 18.0 60.6 15.2 38.1 388
887f9a5da675cf681421eab3ac6d61f82cf34971 CC-BY-NC-SA-4.0 14577 13.1 14.7 15.7 17.1 58.6 14.7 40.1 363
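
The last column can be reproduced from the Characters and Flesch columns. The sketch below rests on two assumptions of mine: that "Flesch>=1" means the Flesch score is clamped to a minimum of 1 before dividing, and that the result is truncated to an integer. Both are consistent with the rows checked here, but the exact procedure is not stated in the table.

```python
import math


def gross_pain(characters: int, flesch: float) -> int:
    """Characters divided by Flesch reading ease, with Flesch clamped to >= 1.

    The clamp and the truncation are assumptions inferred from the table.
    """
    return math.floor(characters / max(flesch, 1.0))


# (characters, Flesch, published Chars/(Flesch>=1)) copied from the table above.
rows = {
    "CC-BY-NC-SA-1.0": (10228, 48.4, 211),
    "CC-BY-NC-SA-3.0": (17134, 38.8, 441),
    "CC-BY-NC-SA-4.0": (14577, 40.1, 363),
}

for name, (characters, flesch, published) in rows.items():
    print(name, gross_pain(characters, flesch), published)
```

Clamping keeps the division well behaved for a text hard enough to score below 1 (Flesch can go to zero or negative), at the cost of treating all such texts alike.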

Versions 1.0 through 4.0 of each of the six CC licenses brought to version 4.0, and CC0:

SHA1 License Characters Kincaid ARI Coleman-Liau Fog Lix SMOG Flesch Chars/(Flesch>=1)
74286ae0dfea38c489437bf659b209737945145c CC0-1.0 5116 16.2 19.5 15.0 19.5 66.3 15.6 36.8 139
c766cc6d5e63277e46a3d83c6254e3528082587b CC-BY-1.0 8867 12.6 15.5 14.1 16.4 57.8 13.8 51.3 172
bf23729bec8ffd0de4d319fb33395c595c5c762b CC-BY-2.0 9781 12.1 14.9 14.3 16.1 56.7 13.7 51.9 188
024bb6d37d0a17624cf532bd14fbd42e15c5a963 CC-BY-2.5 9867 11.9 14.7 14.2 15.8 56.3 13.6 52.6 187
20dc61b94cfe1f4ba5814b340095b4c3fa23e801 CC-BY-3.0 14956 16.1 19.4 14.1 20.4 66.1 16.2 40.0 373
00b29551deee9ced874ffb9d29379b92f1487045 CC-BY-4.0 13003 13.0 14.5 15.4 16.9 57.9 14.6 41.1 316
e0c4b13ec5f9b5702d2e8b88d98b803e07d65cf8 CC-BY-NC-1.0 9313 13.2 16.2 14.3 17.0 59.3 14.1 49.3 188
970421995789d2e8189bb12071ab838a3fcf2a1a CC-BY-NC-2.0 10635 13.1 16.1 14.6 17.2 59.5 14.4 48.1 221
08773bb9bc13959c6f00fd49fcc081d69bda2744 CC-BY-NC-2.5 10721 12.9 15.8 14.5 16.9 59.0 14.2 48.9 219
9639556280637272ace081949f2a95f9153c0461 CC-BY-NC-3.0 15732 16.5 19.9 14.1 20.8 67.2 16.4 38.7 406
afcbb9791897e1e2f949d9d56ba64164746e0828 CC-BY-NC-4.0 13520 13.2 14.8 15.6 17.2 58.6 14.8 39.8 339
9ab2a3818e6ccefbc6ffdd48df7ecaec25e32e41 CC-BY-NC-ND-1.0 8729 12.7 15.8 14.4 16.4 58.6 13.8 51.0 171
966c97357e3b529e9c8bb8166fbb871c5bc31211 CC-BY-NC-ND-2.0 10074 13.0 16.1 14.7 17.0 59.7 14.3 48.8 206
c659a0e3a5ee8eba94aec903abdef85af353f11f CC-BY-NC-ND-2.5 10176 12.8 15.9 14.6 16.8 59.2 14.2 49.3 206
ad4d3e6d1fb6f89bbd28a44e263a89430b575dfa CC-BY-NC-ND-3.0 14356 16.3 19.7 14.1 20.5 66.8 16.2 39.7 361
68960bdf512ff5219909f932b8a81fdb255b4642 CC-BY-NC-ND-4.0 13350 13.3 14.8 15.7 17.2 58.4 14.8 39.4 338
39b2ef67be9e5b4e743e5269a31ad1691515eede CC-BY-NC-SA-1.0 10228 13.3 16.3 14.2 17.0 59.7 14.2 48.4 211
5800ac2d32e35ace035cdcae693423cd9ff5bb6f CC-BY-NC-SA-2.0 11927 13.3 16.2 14.7 17.1 60.0 14.4 47.0 253
e5f44c2df6b1391d1ddb6efb2db6f90670e4ae67 CC-BY-NC-SA-2.5 12013 13.1 16.0 14.6 16.9 59.6 14.2 47.7 251
a63b7e81e7b9e30df5d253aed1d2991af47992df CC-BY-NC-SA-3.0 17134 16.4 19.7 14.2 20.6 67.0 16.3 38.8 441
887f9a5da675cf681421eab3ac6d61f82cf34971 CC-BY-NC-SA-4.0 14577 13.1 14.7 15.7 17.1 58.6 14.7 40.1 363
e4851120f7e75e55b82a2c007ed98ffc962f5fa9 CC-BY-ND-1.0 8280 12.3 15.5 14.3 16.1 57.9 13.6 52.4 158
f1aa9011714f0f91005b4c9eb839bdb2b4760bad CC-BY-ND-2.0 9228 11.9 14.9 14.5 15.8 56.9 13.5 52.7 175
5f665a8d7ac1b8fbf6b9af6fa5d53cecb05a1bd3 CC-BY-ND-2.5 9330 11.8 14.7 14.4 15.6 56.5 13.4 53.2 175
3fb39a1e46419e83c99e4c9b6731268cbd1591cd CC-BY-ND-3.0 13591 15.8 19.2 14.1 20.0 65.6 15.9 41.2 329
ac747a640273815cf3a431be0afe4ec5620493e3 CC-BY-ND-4.0 12830 13.0 14.4 15.4 16.9 57.6 14.6 40.7 315
dda55573a1a3a80d294b1bb9e1eeb3a6c722968c CC-BY-SA-1.0 9779 13.1 16.1 14.2 16.8 59.1 14.0 49.5 197
9cceb80d865e52462983a441904ef037cf3a4576 CC-BY-SA-2.0 11044 12.5 15.3 14.4 16.2 57.9 13.8 50.2 220
662ca9fce7fed61439fcbc27ca0d6db0885718d9 CC-BY-SA-2.5 11130 12.3 15.0 14.4 16.0 57.5 13.6 50.9 218
4a5bb64814336fb26a9e5d36f22896ce4d66f5e0 CC-BY-SA-3.0 17013 16.4 19.8 14.1 20.5 67.2 16.2 38.9 437
8632363dcc2c9fc44f582b14274259b3a35744b2 CC-BY-SA-4.0 14041 12.9 14.4 15.4 16.8 57.8 14.5 41.4 339

It’s a good sign for the automated readability metrics that, from 3.0 to 4.0, CC-BY-SA is most improved (the relevant clause was a hairball paragraph; CC-BY-NC-SA should have improved less, as it gained the compatibility mechanism) and CC-BY-ND is least improved (it gained express permission for private adaptations).
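
As a rough check of that ranking, the sketch below compares only the Chars/(Flesch>=1) column for versions 3.0 and 4.0, with values copied from the table above. Using this single column is my choice for the illustration; the other metrics could rank the licenses differently.

```python
# (3.0 value, 4.0 value) of Chars/(Flesch>=1), copied from the table above.
gross = {
    "CC-BY": (373, 316),
    "CC-BY-NC": (406, 339),
    "CC-BY-NC-ND": (361, 338),
    "CC-BY-NC-SA": (441, 363),
    "CC-BY-ND": (329, 315),
    "CC-BY-SA": (437, 339),
}

# A larger drop means a larger improvement (lower is better for this metric).
drop = {name: old - new for name, (old, new) in gross.items()}
ranked = sorted(drop, key=drop.get, reverse=True)

for name in ranked:
    print(f"{name}: {drop[name]}")
```

On this column alone, CC-BY-SA shows the largest drop and CC-BY-ND the smallest.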


I leave a list of recommendations (many already mingled in or implied by above) to a future post. But really, just use CC0.

Upgrade to CC-BY(-(NC(-(ND|SA))?|ND|SA))?-4\.0

Monday, November 25th, 2013

Today Creative Commons released version 4.0 of six* of its licenses, with many improvements over version 3.0, after more than two years of work. I’ll write more about those details later. But you should skip right past 4.0 and upgrade to CC’s premier legal product, CC0. This is the case whether you’re looking to adopt a CC license for the first time, or to upgrade from version 1.0, 2.0, 2.1, 2.5, or 3.0.

Let’s review the named conditions present in some or all of the CC 4.0 licenses, and why unconditional CC0 is better.

Don’t forget unmitigated © in the basement.

Attribution (BY). Do not take part in the debasement of attribution, and more broadly, provenance, already useful to readers, communities of practice, and publishers, by making them seem mere objects of copyright license compliance. If attribution is useful, it will be provided. If not, robots will find out. Rarely does anyone comply with the exact legal requirements of the attribution term anyway, and as a licensor, you probably won’t provide the information needed by licensees to easily comply. Plus, the corresponding icon looks like a men’s bathroom sign.

NonCommercial (NC). Sounds nice, but nobody knows what it means. Perhaps this goes some way to explaining why NC licensed works are often used by for-profit entities, including with advertising, while NC licensed works are verboten for many community and non-profit projects, most prominently Wikipedia and other Wikimedia projects. (Because commercial entities know there is very low risk of being sued for non-compliance, and can manage risk, while community projects tend to draw and follow bright lines. Perhaps community projects ought to be able to manage risk, and that they can’t is a demonstration of their relative lack of institutional sophistication…but that’s another topic!)

NoDerivatives (ND). This term has no business being in the “Creative Commons” license suite, but sadly still is. If you don’t want to contribute to a creative commons, don’t. If you’d like to, but think copyright (through withholding permission to share adaptations, i.e., the ND term) will prevent people from misrepresenting you, you’re wrong, committing an act of hate toward free speech, and undermining the potential of voluntary license practice to align with and support an obvious baseline objective for copyright reform: noncommercial sharing and remix should always be legal.

ShareAlike (SA). Also sounds nice, and I am a frequent apologist and sometime advocate for the underlying idea, copyleft. But SA is a weak implementation of copyleft. It isn’t “triggered” by the most common use of CC-licensed material (contextual illustration, not full remix), and it has no regulatory condition not present in non-SA CC licenses (cf GPL, which requires sharing source for a work, and is usable for any work; if you care about copyleft, tell CC to finish making CC-BY-SA one-way compatible with GPL). And the SA implementation retains the costs of copyleft: blank stares of incomprehension, even from people who have worked in the “open” world for over a decade, and occasionally intense fear and dislike (the balance is a bit different in the software world, but this is my direct experience among non-software putatively open organizations and people); also, compatibility problems. It’s time to take the unsolicited advice often given to incumbents and others fearful of the internet: ‘obscurity is a greater threat than piracy’ — and apply it: ‘obscurity is a greater threat than proprietarization.’

Upgrade to CC0!

CC0 isn’t perfect, but it is by far the best tool provided by CC. I have zero insight into the future of the CC organization, but I hope it gives ample priority to the public domain, post-4.0 launch.

*CC-BY(-(NC(-(ND|SA))?|ND|SA))?-4\.0 is a regular expression matching all six licenses released today.
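
A quick way to check that claim is to run the pattern against the six identifiers plus a few that should not match. The candidate list is mine, chosen for illustration.

```python
import re

# The pattern from the post title; fullmatch() requires the whole string to match.
PATTERN = re.compile(r"CC-BY(-(NC(-(ND|SA))?|ND|SA))?-4\.0")

candidates = [
    # The six licenses released at version 4.0.
    "CC-BY-4.0",
    "CC-BY-SA-4.0",
    "CC-BY-ND-4.0",
    "CC-BY-NC-4.0",
    "CC-BY-NC-SA-4.0",
    "CC-BY-NC-ND-4.0",
    # Identifiers that should be rejected.
    "CC0-1.0",
    "CC-BY-SA-3.0",
    "CC-BY-ND-SA-4.0",
]

matched = [name for name in candidates if PATTERN.fullmatch(name)]
print(matched)
```

The escaped dot in `4\.0` matters: unescaped, it would also accept strings such as `CC-BY-4x0`.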

Hierarchy of mechanisms for limiting copyright and copyright-like barriers to use of Public Sector Information, or More or Less Universal Government License(s)

Sunday, November 24th, 2013

This sketch is in part motivated by a massive proliferation of copyright and copyright-like licenses for government/public sector information, e.g., sub- and sub-sub-national jurisdiction licenses and sector- and jurisdiction-specific licenses intended to combat license proliferation within a sector within a jurisdiction. Also by longstanding concern about coordination among entities working to limit barriers to use of PSI and knowledge commons governance generally.

Everything following concerns PSI only relative to copyright and copyright-like barriers. There are other pertinent regulations and considerations to follow when publishing or using PSI (e.g., privacy and fraud; as these are pertinent even without copyright, it is silly and unnecessarily complicating to include them in copyright licenses) and other important ways to make PSI more useful technically and politically (e.g., open formats, focusing on PSI that facilitates accountability rather than openwashing).

Eliminate copyright and copyright-like restrictions

No longer barriers to use of PSI, because no longer barriers to use of information. May be modulated down to any general copyright or copyright-like barrier reduction, where the barrier is pertinent to use of PSI. Examples: eliminate sui generis database restrictions where they exist, increase threshold of originality required for information to be subject to copyright restriction, expand exceptions and limitations to copyright restrictions, expand affirmative user rights.

Eliminate copyright and copyright-like restrictions for PSI

For example, works produced by employees of the U.S. federal government are not subject to copyright restrictions in the U.S. Narrower exclusions from copyright restrictions (e.g., of laws, court rulings) are fairly common worldwide. These could be generalized to eliminate copyright and copyright-like restrictions for PSI worldwide, and expanded to include PSI produced by contractors or other non-government but publicly funded entities. PSI could be expanded to include any information produced with public funding, e.g., research and culture funded by public grants.

“Standard” international licenses for PSI

Public copyright licenses not specifically intended for only PSI are often used for PSI, and could be more. CC0 is by far the best such license, but other Creative Commons (CC) and Open Data Commons (ODC) licenses are frequently used. Depending on the extent to which the licenses used leave copyright and copyright-like restrictions in place (e.g., CC0: none; CC-BY-NC-ND, lots, thus considered non-open) and how they are applied (from legislative mandate for all PSI to one-off use for individual reports and datasets at discretion of agency), could have effect similar to eliminating copyright and copyright-like restrictions for PSI, or almost zero effect.

Universal Government License

Governments at various levels have chosen to make up their own licenses rather than use a standard international license. Some of the better reasons for doing so will be eliminated by the forthcoming version 4.0 of 6 of the CC licenses (though again, CC0 has been the best choice, since 2009, and will remain so). But some of the less good reasons (uncharitable characterization: vanity) can’t be addressed by a standard international license, and furthermore seem to be driving the proliferation of sub-sub-national licenses, down to licenses specific to an individual town.

Ideally this extreme license proliferation trend would terminate with mass implementation of one of the above options, though this seems unlikely in the short term. Maybe yet another standard license would help! The idea of an “open government license” which various governments would have a direct role in creating and stewarding has been casually discussed in the past, particularly several years ago when the current proliferation was just beginning, the CC 4.0 effort had not begun, and CC and ODC were not on the same page. Nobody is particularly incented to make this unwieldy project happen, but nor is it an impossibility — due to the relatively small world of NGOs (such as CC and the Open Knowledge Foundation, of which ODC is a project) and government people who really care and know about public licenses, and the possibility their collective exhaustion and exasperation over license details, incompatibility, and proliferation could reach a tipping point into collective action. There’s a lot to start from, including the research that went into CC-BY-4.0, and the OGL UK 2.0, which is a pretty good open license.

But why think small? How many other problems could be addressed simultaneously?

  • Defend the traditional meaning of ‘open government’ by calling the license something else, e.g., Universal/Uniform/Unified Government License.
  • Rallying point for public sector worldwide to commit more firmly and broadly to limiting copyright and copyright-like barriers to use of PSI, more rapidly establishing global norm, and leading to mandates. The one thing to be said for massive PSI license proliferation could be increased commitment from proliferating jurisdictions to use their custom licenses (I know of no data on this). A successful UGL would swamp any increased local commitment due to local vanity licenses through much higher level expectation and mandate.
  • Make the license work well for software (including being approved by the Open Source Initiative), as:
    • Generically “open” licenses are inevitably used for software, whether the steward apparently intends this (OGL UK 2.0) or does not (CC).
    • The best modern permissive license for software (Apache 2.0) is relatively long and unreadable for what it does, and has a discomfiting name (not nearly as bad as certain pro sports organizations, but still); it ought be superseded.
  • Ensure the license works for other domains, e.g., open hardware, which don’t really require domain-specific licenses, are headed down the path of proliferation and incompatibility, and that governments have obvious efficiency, regulatory, security, and welfare interests in.
  • Foster broader “open innovation community” engagement with government and public policy and vice versa, and more knowledge transfer across OIC domains, on legal instruments at the least.
  • Uniform Public License may be a better name than UGL in some respects (whatever the name, it ought be usable by the public sector, and the general public), but Government may be best overall, a tip of the hat to both the vision within governments that would be necessary to make the license succeed, and to the nature of copyright and copyright-like barriers as government regulatory regimes.

National jurisdiction licenses for PSI

A more likely mechanism for license proliferation deceleration and harm reduction in the near term is for governments within a national jurisdiction to use a single license, and follow various license stewardship and use best practices. Leigh Dodds recently blogged about the problem and highlighted this mechanism in a post titled The Proliferation of Open Government Licences.

Sub-national jurisdiction licenses for PSI

Each province/state and sub-jurisdiction thereof, down to towns and local districts, could use its own vanity license. This appears to be the trend in Canada. It would be possible to push further in this direction with multiple vanity licenses per jurisdiction, e.g., various licenses for various kinds of data, reports, and other materials.

Licenses for each PSI dataset or other work

Each and every government dataset or other publication could come with its own bespoke license. Though these licenses would grant permissions around some copyright and copyright-like restrictions, I suspect their net effect would be to heighten copyright and copyright-like restrictions as a barrier to both the use and publication of PSI, on an increased cost basis alone. This extreme highlights one of the downsides of copyright licenses, even unambiguously open ones — implementing, understanding, and using them can be seen as significant cost centers, creating an additional excuse for not opening materials, and encouraging the small number of people who really understand the mechanisms to be jealous and wary of any other reform.


Included for completeness.

Privatization of PSI copyright

Until now, I’ve assumed that copyright and copyright-like restrictions are barriers to use of PSI. But maybe there aren’t enough restrictions, or they aren’t allocated to the right entities, such that maximum value is realized from use of PSI. Control of copyright and copyright-like restrictions in PSI could be auctioned off to entities with the highest ability to extract rents from PSI users. These businesses could be government-owned, with various public-private partnerships in between. This would increase the direct contribution of PSI to GDP, incent the creation and publication of more PSI, ensure PSI is maintained and marketed, reaching citizens that can afford, or rather need, it, and provide a solid business model for Government 2.0, academia, cultural heritage, and all other publicly funded and publicly interested sectors, which would otherwise fail to produce an optimal level of PSI and related materials and innovations.

Do not let any of the above trick you into paying more attention to possible copyright and copyright-like barriers and licenses than actually doing stuff, especially with PSI, especially with “data”, doubly with “government data”.

I agree with Denny Vrandečić’s paradoxical sounding but correct directive:

Data is free. Free the data!

I tried to communicate the same in a chapter of the Data Journalism Handbook, but lacked the slogan.

Data is free. Free the data!

And what is not data? ☻

Addendum: Entirely by coincidence (in response to a European Commission consultation on PSI, which I had already forgotten about), today posts by Timothy Vollmer for the Communia Association and Creative Commons call out the license proliferation problem and endorse public domain as the default for PSI.

Economics and the Commons Conference [knowledge stream] report

Wednesday, October 30th, 2013

Economics and the Common(s): From Seed Form to Core Paradigm. A report on an international conference on the future of the commons (pdf) by David Bollier. Section on the knowledge stream (which I coordinated; pre-conference post) copied below, followed by an addendum with thanks and vague promises. First, video of the stream keynote (slides) by Carolina Botero (introduced by me; copy).

III. “Treating Knowledge, Culture and Science as Commons”

Science, and recently, free software, are paradigmatic knowledge commons; copyright and patent paradigmatic enclosures. But our vision may be constrained by the power of paradigmatic examples. Re-conceptualization may help us understand what might be achieved by moving most provisioning of knowledge to the commons; help us critically evaluate our commoning; and help us understand that all commons are knowledge commons. Let us consider, what if:

  • Copyright and patent are not the first knowledge enclosures, but only “modern” enforcement of inequalities in what may be known and communicated?
  • Copyright and patent reform and licensing are merely small parts of a universe of knowledge commoning, including transparency, privacy, collaboration, all of science and culture and social knowledge?
  • Our strategy puts commons values first, and views narrow incentives with skepticism?
  • We articulate the value of knowledge commons – qualitative, quantitative, ethical, practical, other – such that knowledge commons can be embraced and challenged in mainstream discourse?

These were the general questions that the Knowledge, Culture and Science Stream addressed.

Knowledge Stream Keynote Summary

Carolina Botero Cabrera, a free culture activist, consultant and lawyer from Colombia, delivered a plenary keynote for the Knowledge Stream entitled, “What If Fear Changes Sides?” As an author and lecturer on free access, free culture and authors’ rights, Botero focused on the role of information and knowledge in creating unequal power relationships, and how knowledge and cultural commons can rectify such problems.

“If we assume that information is power and acknowledge the power of knowledge, we can start by saying that controlling information and knowledge means power. Why does this matter?” she asked. “Because the control of information and knowledge can change sides. The power relationship can be changed.”

One of the primary motives of contemporary enclosures of information and knowledge, said Botero, is to instill fear in people – fear of violating copyright law, fear of the penalties for doing so. This inhibits natural tendencies to share and re-use information. So the challenge facing us is to imagine if fear could change sides. Can we imagine a switch in power relationships over the control of knowledge – how we produce, distribute and use knowledge? Botero said we should focus on the question: “How can we switch the tendency of knowledge regulation away from enclosure, so that commons can become the rule and not the exception?”

“There are still many ways to produce things, to gain knowledge,” said Botero, who noted that those who use the word “commons” [in the context of knowledge production] are lucky because it helps name these non-market forms of sharing knowledge. “In Colombia, we don’t even have that word,” she said.

To illustrate how customary knowledge has been enclosed in Colombia, Botero told the story of parteras, midwives, who have been shunted aside by doctors, mostly men, who then asserted control over women’s bodies and childbirth, and marginalized the parteras and their rich knowledge of childbirth. This knowledge is especially important to those communities in remote areas of Colombia that do not have access to doctors. There is currently a huge movement of parteras in Colombia who are fighting for the recognition of their knowledge and for the legal right to act as midwives.

Botero also told about how copyright laws have made it illegal to reproduce sheet music for songs written in 18th and 19th century Colombia. In those times, people simply shared the music among each other; there was no market for it. But with the rise of the music industry in the 20th century, especially in the North, it is either impossible or unaffordable to get this sheet music because most of it is copyrighted. So most written music in Colombia consists of illegally photocopied versions. Market logic has criminalized the music that was once natural and freely flowing in Colombian culture. Botero noted that this has increased inequality and diminished public culture.

She showed a global map illustrating which nations received royalties and fees from copyrights and patents in 2002; the United States receives more than half of all global revenues, while Latin America, Africa, India and other countries of the South receive virtually nothing. These are the “power relationships” that Botero was pointing to.

Botero warned, “We have trouble imagining how to provision and govern resources, even knowledge, without exclusivity and control.” Part of the problem is the difficulty of measuring commons values. Economists are not interested, she said, which makes it difficult to go to politicians and persuade them why libraries matter.

Another barrier is our reliance on individual incentives as a core value in the system for regulating knowledge, Botero said. “Legal systems of ‘intellectual property’ place individual financial incentives at the center for knowledge regulation, which marginalizes commons values.” Our challenge is to find ways to switch from market logics by showing that there are other logics.

One reason that it is difficult to displace market logics is that we are reluctant or unable to “introduce the commons discourse from the front door instead of through the back door,” said Botero. She confessed that she herself has this problem because most public debate on this topic “is based on the premise that knowledge requires enclosure.” It is difficult to displace this premise by talking about the commons. But it is becoming increasingly necessary to do so as new policy regimes, such as the Trans-Pacific Partnership (TPP) agreement, seek to intensify enclosures. The TPP, for example, seeks to raise minimum levels of copyright restriction, extend the terms of copyrights, and increase the prison terms for copyright violations.

One way to reframe debate, suggested Botero, is to see the commons “not as the absence of exclusivity, but the presence of non-exclusivity. This is a slight but important difference,” she said, “that helps us see the plenitude of non-exclusivity” – an idea developed by Séverine Dusollier, professor and director of the Revue Droit des Technologies de l’Information (RDTI, France). This shift “helps us to shift the discussion from the problems with the individual property and market-driven perspective, to a framework and society that – as a norm – wants its institutions to be generative of sharing, cooperation and equality.”

Ultimately, what is needed are more “efficient and effective ways to protect the ethic and practice of sharing,” or as she put it, “better commoning.” Reforming “intellectual property” is only one small part of the universe of knowledge commoning, Botero stressed. It also includes movements for “transparency, privacy, collaboration, and potentially all of science and culture.”

“When and how did we accept that the autonomy of all is subservient to control of knowledge by the few?” asked Botero. “Most important, can we stop this? Can we change it? Is the current tragedy our lack of knowledge of the commons?” Rediscovering the commons is an important challenge to be faced “if fear is going to change sides.”

An Account of the Knowledge, Culture and Science Stream’s Deliberations

There were no presentations in the Knowledge Stream breakout sessions, but rather a series of brief provocations. These were intended to spur a lively discussion and to go beyond the usual debates heard at free and open software/free culture/open science conferences. A primary goal of the breakout discussions was to consider what it means to regard knowledge as a commons, rather than as a “carve-out” exception from a private property regime. The group was also asked to consider how shared knowledge is crucial to all commoning activity. Notes from the Knowledge Stream breakout sessions were compiled through a participatory titanpad, from which this account is adapted.

The Knowledge Stream focused on two overarching themes, each taking advantage of the unique context of the conference:

  1. Why should commoners of all fields care about knowledge commons?
  2. If we consider knowledge first as commons, can we be more visionary, more inclusive, more effective in commoning software, science, culture, seeds … and much more?

The idea of the breakout session was to contextualize knowledge as a commons, first and foremost: knowledge as a subset of the larger paradigm of commons and commoning, as something far more than domain-specific categories such as software, scientific publication and educational materials.

An overarching premise of the Knowledge Stream was the point made by Silke Helfrich in her keynote, that all commons are knowledge commons and all commons are material commons. Saving seeds in the Svalbard Seedbank is of no use if we forget how to cultivate them, for example, and various digital commons are ultimately grounded in the material reality of computers, electricity infrastructures and the food that computer users need to eat.

There is a “knowledge commons” at the center of each commons. This means that interest in a “knowledge commons” isn’t confined to those people who only care about software, scientific publication, and so on. It also means that we should refrain from classifying commons into categories such as “natural resources” and “digital,” and begin to make the process of commoning itself the focal point.

Of course, one must immediately acknowledge that digital resources do differ in fundamental ways from finite natural resources, and therefore the commons management strategies will differ. Knowledge commons can make cheap or virtually free copies of intangible information and creative works, and this knowledge production is often distributed at very small scales. For cultural commons, noted Philippe Aigrain, a French analyst of knowledge governance and CEO of Sopinspace, a maker of free software for collaboration and participatory democracy, “the key challenge is that average attention becomes scarcer in a world of abundant production.” This means that more attention must be paid to “mediating functions” – curating – and “revising our cultural expectations about ‘audiences’.”

It is helpful to see the historical roots of Internet-enabled knowledge commons, said Hilary Wainwright, the editor behind the UK political magazine Red Pepper and a researcher at the Transnational Institute. The Internet escalated the practice of sharing knowledge that began with the feminist movement’s recognition of a “plurality of sources.” It also facilitated the socialization of knowledge as a kind of collective action.

That these roots are not widely appreciated points to the limited vision of many knowledge commons, which tend to rely on a “deeply individualistic ethical ontology,” said Talha Syed, a professor of law at the University of California, Berkeley. This worldview usually leads commoners to focus on coercion – enclosures of knowledge commons – as the problem, he said. But “markets are problematic even if there is no monopoly,” he noted, because “we need to express both threats and positive aspirations in a substantive way. Freedom is more than people not coercing us.”

Shun-Ling Chen, a Taiwanese professor of law at the University of Arizona, noted that even free, mass-collaboration projects such as Wikipedia tend to fall back on western, individualistic conceptions of authorship and authority. This obscures the significance of traditional knowledge and history from the perspective of indigenous peoples, where less knowledge is recorded by “reliable sources.”

As the Stream recorded in its notes, knowledge commons are not just about individual freedoms, but about “marginalized people and social justice.” “The case for knowledge commons as necessary for social justice is an undeveloped theme,” the group concluded. But commons of traditional knowledge may require different sorts of legal strategies than those that are used to protect the collective knowledge embodied in free software or open access journals. The latter are both based on copyright law and its premises of individual rights, whereas traditional knowledge is not recognized as the sum of individual creations, but as a collective inheritance and resource.

This discussion raised the question whether provisioning knowledge through commons can produce different sorts of “products” than those produced by corporate enclosures, or whether they will simply create similar products with less inequality. Big budget movies and pharmaceuticals are often posited as impossibilities for commons provision (wrongly, by the way). But should these industries be seen as the ‘commanding heights’ of culture and medicine, or would a commons-based society create different commanding heights?

One hint at an answer comes from seeing informality as a kind of knowledge commons. “Constructed commons” that rely upon copyright licenses (the GPL for software, Creative Commons licenses for other content) and upon policy reforms, are generally seen as the most significant, reputable knowledge commons. But just as many medieval commons relied upon informal community cooperation such as “beating the bounds” to defend themselves, so many contemporary knowledge commons are powerful because they are based on informal social practice and even illegality.

Alan Toner of Ireland noted that commoners who resist enclosures often “start from a position of illegality” (a point made by Ugo Mattei in his keynote talk). It may be better to frankly acknowledge this reality, he said. After all, remix culture would be impossible without civil disobedience to various copyright laws that prohibit copying, sharing and re-use – even if free culture people sometimes have a problem with such disrespectful or illegal resistance. “Piracy” is often a precursor to new social standards and even new legal rules. “What is legal is contingent,” said Toner, because practices we spread now set traditions and norms for the future. We therefore must be conscious about the traditions we are creating. “The law is gray, so we must push new practices and organizations need to take greater risks,” eschewing the impulse to be “respectable” in order to become a “guiding star.”

Felix Stalder, a professor of digital culture at Zurich University of the Arts, agreed that civil disobedience and piracy are often precisely what is needed to create a “new normal,” which is what existing law is explicitly designed to prevent. “Piracy is building a de facto commons,” he added, “even if it is unaware of this fact. It is a laboratory of the new that can enrich our understanding of the commons.”

One way to secure the commons for the future, said Philippe Aigrain of Sopinspace, is to look at the specific challenges facing the commons rather than idealizing them or over-relying on existing precedents. As the Stream discussion notes concluded, “Given a new knowledge commons problem X, someone will state that we need a ‘copyleft for X.’ But is copyleft really effective at promoting and protecting the commons of software? What if we were to re-conceptualize copyleft as a prototype for effective, pro-commons regulation, rather than a hack on enclosure?”

Mike Linksvayer, the former chief technology officer of Creative Commons and the coordinator of the Knowledge Stream, noted that copyleft should be considered as “one way to force sharing of information, i.e., of ensuring that knowledge is in the commons. But there may be more effective and more appropriate regulatory mechanisms that could be used and demanded to protect the commons.”

One provocative speculation was that there is a greater threat to the commons than enclosure – and that is obscurity. Perhaps new forms of promotion are needed to protect the commons from irrelevance. It may also be that excluding knowledge that doesn’t really contribute to a commons is a good way to protect a commons. For example, projects like Wikipedia and Debian mandate that only free knowledge and software be used within their spaces.


Thanks to everyone who participated in the knowledge stream. All who prepared and delivered deep and critical provocations in the very brief time allotted:
Bodó Balázs
Shun-Ling Chen
Rick Falkvinge
Marco Fioretti
Charlotte Hess
Gaëlle Krikorian
Glyn Moody
Mayo Fuster Morrell
Prabir Purkayastha
Felix Stalder
Talha Syed
Wouter Tebbens
Alan Toner
Chris Watkins

Also thanks to Mayo Fuster Morrell and Petros for helping coordinate during the stream, and though neither could attend, Tal Niv and Leonhard Dobusch for helpful conversations about the stream and its goals. I enjoyed working with and learned much from the other stream coordinators: Saki Bailey (nature), Heike Löschmann (labor & care), Ludwig Schuster (money), and especially Miguel Said Vieira (infrastructure; early collaboration kept both infrastructure and knowledge streams relatively focused); and stream keynote speaker Carolina Botero; and conference organizers/Commons Strategy Group members: David Bollier, Michel Bauwens, and Silke Helfrich (watch their post-conference interview).

See the conference wiki for much more documentation on each of the streams, the overall conference, and related resources.

If a much more academic and apolitical approach is of interest, note the International Association for the Study of the Commons held its 2013 conference about 10 days after ECC. I believe there was not much overlap among attendees, one exception being Charlotte Hess (who also chaired a session on Governance of the Knowledge and Information Commons at the IASC conference).

ECC only strengthened my feeling (but, of course I designed the knowledge stream to confirm my biases…) that a much more bold, deep, inclusive (domains and methods of commoning, including informality, and populations), critical (including self-critical; a theme broached by several of the people thanked above), and competitive (product: displacing enclosure; policy: putting equality & freedom first) knowledge commons movement, or vanguard of those movements, is needed. Or as Carolina Botero put it in the stream keynote: bring the commons in through the front door. I promise to contribute to this project.

ECC also made me reflect much more on commons and commoning as a “core paradigm” for understanding and participating in the arrangements studied by social scientists. My thoughts are half baked at best, but that will not stop me from making pronouncements, time willing.


Sunday, October 6th, 2013

How the NFL Fleeces Taxpayers by Gregg Easterbrook is a fine article, adding to the not nearly large enough pile of articles criticizing the U.S. professional sports civic extortion racket. With a bonus explicit connection with copy regulation. I’ll quote just the directly relevant paragraphs:

Too often, NFL owners can, in fact, get away with anything. In financial terms, the most important way they do so is by creating game images in publicly funded stadiums, broadcasting the images over public airwaves, and then keeping all the money they receive as a result. Football fans know the warning intoned during each NFL contest: that use of the game’s images “without the NFL’s consent” is prohibited. Under copyright law, entertainment created in publicly funded stadiums is private property.

When, for example, Fox broadcasts a Tampa Bay Buccaneers game from Raymond James Stadium, built entirely at the public’s expense, it has purchased the right to do so from the NFL. In a typical arrangement, taxpayers provide most or all of the funds to build an NFL stadium. The team pays the local stadium authority a modest rent, retaining the exclusive right to license images on game days. The team then sells the right to air the games. Finally, the NFL asserts a copyright over what is broadcast. No federal or state law prevents images generated in facilities built at public expense from being privatized in this manner.

Baseball, basketball, ice hockey, and other sports also benefit from this same process. But the fact that others take advantage of the public too is no justification. The NFL’s sweetheart deal is by far the most valuable: This year, CBS, DirecTV, ESPN, Fox, NBC, and Verizon will pay the NFL about $4 billion for the rights to broadcast its games. Next year, that figure will rise to more than $6 billion. Because football is so popular, its broadcast fees would be high no matter how the financial details were structured. The fact that game images created in places built and operated at public expense can be privatized by the NFL inflates the amounts kept by NFL owners, executives, coaches, and players, while driving up the cable fees paid by people who may not even care to watch the games.

Easterbrook’s idea for reform also involves copy regulation (emphasis added):

The NFL’s nonprofit status should be revoked. And lawmakers—ideally in Congress, to level the national playing field, as it were—should require that television images created in publicly funded sports facilities cannot be privatized. The devil would be in the details of any such action. But Congress regulates health care, airspace, and other far-more-complex aspects of contemporary life; it can crack the whip on the NFL.

If football images created in places funded by taxpayers became public domain, the league would respond by paying the true cost of future stadiums—while negotiating to repay construction subsidies already received. To do otherwise would mean the loss of billions in television-rights fees. Pro football would remain just as exciting and popular, but would no longer take advantage of average people.

This idea would have many loopholes (team owners are excellent at extracting public subsidies even for “privately financed” stadiums), but would be a step forward. It is good to see the principle of public funding means public domain applied in new domains (it is as yet a mostly unrealized, but accepted by many activists, goal for domains such as public sector information, cultural heritage, and academic publication).

While on the topic, another mostly good recent article is Death of a sports town: What does a city lose when its pro teams leave? Oakland just might find out. Two caveats. A questionable story about a kid who sees a football player turned police officer as a role model. Any reliance on such a coincidence for role models shows just how badly Oakland and many other cities are policed — residents should be demanding performance and compliance from police such that most officers are obvious role models for youth. The article also repeats the specious claim that “pro sports are the city’s plumb line, cutting across class and race and elevation.”

While on that claim, Doug Whitfield republished my article (original), with commentary on top:

I’m going to try something new today. Over at his blog, Mike Linksvayer dedicates his posts to the public domain. That means I don’t have to give attribution to his work, but obviously I’m doing so. I think he’s wrong that art brings all classes and cultures together. How many “red necks” or “thugs” do you see at the opera? How many women wearing Prada do you see enjoying the finer arts of graffiti or break-dancing? I also think he’s wrong about groceries. There are plenty of people that can’t afford to shop at Whole Foods (or choose not to because of their anti-union policies).

But that’s not the point. The point is that we as sports-enthusiasts need to highlight amateur athletics and player-owned and supporter-owned clubs to combat these stereotypes about athletics. Not all athletics are bad.

It is worth thinking about how sports can destroy communities and relationships though, even if you don’t think it’s happening in your life or even if the positives outweigh the negatives. Either way, please enjoy what is probably a different view than your own.

Whitfield is wrong about art and groceries. Yes, various forms and genres have fans concentrated with various demographics. But there are also huge and increasing crossovers, especially when it comes to popular art. It’s acceptable and unsurprising for anyone to be a fan of anything. With regard to groceries, I know plenty of wealthy people who shop at Wal-Mart (or locally, Grocery Outlet) and plenty of poor people who shop at Whole Foods (or Berkeley Bowl), and even more who shop at all. Note the trend in both culture and shopping is exactly the opposite of stadium attendance — increased mixing vs increased stratification.

Whitfield is right about the point. Athletics is good. How can arrangements which do not destroy communities and increase inequality compete with the extortion racket?

Whitfield also republished a shorter article on pro sports civic extortion (original) of mine, and on another of his blogs, a post on the federated social web (original). I appreciate the experiment, to which the latter is a tiny bit relevant, mentioning that blog technology (and culture) failed to compete with “social” silos, or failed to form the basis of the “social web”, depending on whether your glass is 90% empty or 10% full. One of the things blogs generally failed to compete on is “sharing” links, sometimes with brief commentary. One can do that with a blog of course, and people do, but it isn’t central to blogging.

Public copyright license readability metrics

Sunday, September 22nd, 2013

Promised boring topic blog post in form of README snapshot.

The README with tables removed has a Flesch Reading Ease score of 48.5, slightly worse than the average license text. I did not try to write intelligibly, though I should. The topic may have subconsciously restrained parenthetical discursiveness.

Automated readability metrics for public copyright licenses. Give style a list of plain license texts, generate HTML table containing metrics.

In Debian, style is available in the diction package.

License texts are referenced from the SPDX licenses list. Other license curiosities are included in licenses-other.

sh licenses-spdx/*.txt licenses-other/*.txt
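
A minimal sketch of the parsing half of that pipeline, in Python rather than shell. It assumes style prints its readability grades as "Name: value" lines; the sample below is hand-written in that shape using the AAL numbers from the output table, not captured from a real run, and parse_style_grades is a hypothetical helper name. Check the format against your installed version before relying on it.

```python
import re

# Hand-written sample in the shape `style` is ASSUMED to print.
# The numbers are the AAL row from the output table, not a captured run.
SAMPLE = """\
readability grades:
        Kincaid: 14.7
        ARI: 20.8
        Coleman-Liau: 16.1
        Flesch Index: 49.4/100
        Fog Index: 17.9
        Lix: 69.6
        SMOG-Grading: 13.7
"""

# One "Name: number" pair per line; anything after the number is ignored.
GRADE_LINE = re.compile(
    r"^\s*(Kincaid|ARI|Coleman-Liau|Flesch Index|Fog Index|Lix|SMOG-Grading):"
    r"\s*(-?\d+(?:\.\d+)?)",
    re.MULTILINE,
)


def parse_style_grades(text):
    """Return {metric name: float} for every grade line found in text."""
    return {name: float(value) for name, value in GRADE_LINE.findall(text)}


grades = parse_style_grades(SAMPLE)
print(grades["Flesch Index"])
```

Each parsed dictionary would become one table row; generating the HTML around it is left out here.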


Part of one of the goals of the Creative Commons (CC) licenses version 4.0 effort is to make the licenses "readily understood". One way to test that is with automated readability metrics, on which CC licenses version 3.0 score poorly (previous versions scored much better). I checked an early version 4.0 draft, and it scored much better, more or less back to version 2.5 scores, quite an accomplishment given that it is a more sophisticated license in many ways. I did not check again until the near-final 4th draft was published. Its score is not as good as early drafts, probably to be expected as details were settled, but still a big improvement over 3.0. I intended to blog the early 4.0 draft improvement at that time but didn’t get around to it.

In the meantime I’d peeked at the readability metrics for various free/open source software licenses, in part to see if copyleft-next scored better than comparable licenses (probably, though comparability is problematic). With the CC 4.0 licenses nearly final, I started a blog post about readability of various licenses, and ended up with this README and associated files.

See Caveats and Output below for readability metrics for about 228 licenses. There probably will not be any big surprises awaiting anyone familiar with the usual relatively popular licenses. A small selection of licenses not in the SPDX licenses list (including CC 4.0 drafts and copyleft-next versions) are at the end.


Drafters understandably try hardest to "get the legal details right". But if "licenses are the constitutions of software communities"12, even a little bit (I think a casual reading of that quote makes licenses far more central than they are, or implies impoverished communities, but will take its repetition as an indicator of licenses’ social importance), perhaps yet more effort ought be put into making licenses more understandable.

  • There is probably a large literature on readability and understandability of contracts, legislation, regulation, and other legal texts, which ought be digested for lessons for the public copyright licensing community. Apparently many jurisdictions have "plain language" requirements for contracts. Some U.S. states require insurance forms to have a minimum Flesch Reading Ease score. Is this an indicator that readability metrics are useless, or should free/libre/open/software/knowledge communities be embarrassed that they have failed to self-regulate to this level?
  • Cloze testing and subjective evaluation (both requiring humans) and natural language processing/machine learning based metrics are suggested by a readability tools site in addition to simple automated readability metrics. The site, by Michael Curtotti, is presumably discussed in his forthcoming paper The Right to Access Implies a Right to Know: An Open Online Research Platform for Assessing the Readability of Law. Could some of these tools be useful for evaluating licenses? Barriers would include lack of interest needed to pay for human testing, and a relatively small corpus of license texts. Hopefully the source code for this platform will be made available.
  • Attempts to increase readability and understandability outside of changing the words in a license text could be evaluated, including summaries, FAQs, choosers, and typography and other design elements around web publication of the license text itself.
  • There are many additional obscure licenses intended for "content", "data", "government", and "hardware designs" not included in the SPDX license list that could be analyzed.
  • Non-English license texts could be analyzed with language-appropriate metrics. In addition to the few CECILL licenses included in the SPDX licenses list, targets could include the many official language versions of EUPL versions, unofficial translations of GPL versions, License Art Libre, various public sector-focused licenses, and hundreds of CC license "jurisdiction ports".
  • To what extent is understanding of licenses social, gained via hearsay, not based on reading license texts at all? If social learning currently predominates, does this indicate that license readability and understandability are unimportant? Or that their lack constitutes an obscurantist barrier to participation by people not socially connected to existing communities, and increases other risks, such as non-compliance through ignorance, and being ignored by policymakers?
  • Would it be valuable to use readability metrics to test other texts important to free/libre/open communities, e.g., documentation, codes of conduct, contributor agreements?


Caveats

General, with respect to the metrics:

  • Metric explanations are available in the style man page. All are problematic.
  • Lower numbers indicate better readability for all metrics except Flesch.
  • None of the metrics incorporate text length, so correlations with character count ought indicate that longer texts tend to use more or less readable language. But 3 of the metrics positively correlate readability with longer texts, and 4 negatively, which might indicate no overall correlation (taking the numbers at face value, with no further validation).
  • Not sure why Coleman-Liau’s correlations with other metrics are much weaker than among others; at a glance the formula is measuring the same types of things.
  • Arbitrarily choosing to focus on Flesch, as it seems widely used, and because its more-is-better scale makes for an easier combination with text length, "Chars/(Flesch>=1)", to indicate how painful reading an entire license might be.
  • Flesch can be negative, so a minimum value of 1 is used for the pain calculation. This is arbitrary too.
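
The last two points above can be made concrete with a small sketch. It uses the standard published Flesch Reading Ease formula (which may differ in detail from how style counts sentences and syllables) and an assumed reading of "Chars/(Flesch>=1)" as characters divided by the Flesch score clamped to at least 1. The function names are mine, and truncating to an integer is inferred from rows of the output table, not stated anywhere in this README.

```python
import math


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula; higher means easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def pain(chars, flesch):
    """Assumed reading of Chars/(Flesch>=1): clamp Flesch to >= 1, then divide.

    Truncation to an integer is an inference from the published rows.
    """
    return math.floor(chars / max(flesch, 1))


# One 100-word sentence averaging two syllables per word already goes negative.
print(flesch_reading_ease(words=100, sentences=1, syllables=200) < 0)

# Rows from the output table: Fair (209 chars, Flesch 80.4), HPND (458, -25.8).
print(pain(209, 80.4))
print(pain(458, -25.8))
```

With the clamp, a negative Flesch score such as HPND's simply yields the character count itself as the pain value, which matches that license's row.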

The following tables are calculated in scratch.ods.

Readability metric correlations: nothing really surprising, no gross errors?

              Kincaid    ARI  Coleman-Liau    Fog    Lix   SMOG  Flesch  Chars/(Flesch>=1)
Characters       0.12  -0.10         -0.27   0.13  -0.15   0.25   -0.25               0.96
Kincaid                 0.89          0.04   0.99   0.81   0.90   -0.91               0.32
ARI                                   0.30   0.89   0.97   0.70   -0.67               0.10
Coleman-Liau                                 0.07   0.41   0.11   -0.09              -0.20
Fog                                                 0.82   0.93   -0.90               0.33
Lix                                                        0.63   -0.59               0.04
SMOG                                                              -0.95               0.43
Flesch                                                                               -0.43

Aggregate metrics: compare your favorite license to the masses and outliers.

           Characters  Kincaid    ARI  Coleman-Liau    Fog    Lix   SMOG  Flesch  Chars/(Flesch>=1)
average        8318.7     12.8   16.0          14.5   16.1   59.1   13.4    50.7                177
median         7321.5     12.6   15.4          14.4   15.9   57.9   13.2    50.4                160
stdev          6864.8      2.9    3.5           1.4    3.1    7.4    1.8    11.1                152
min               209      4.5    8.2          10.3    7.0   42.5    7.6   -25.8                  2
max             36285     37.0   45.7          18.0   40.3  116.8   24.9    83.3                806
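
The correlation and aggregate figures above come from a spreadsheet. A rough Python equivalent might look like the following sketch, run here on only five rows copied from the output table, so its numbers will not match the full-table figures. It assumes the correlations are Pearson coefficients, and whether the spreadsheet used sample or population standard deviation is not stated; pearson is a hand-rolled helper, not the spreadsheet's function.

```python
import math
import statistics

# (license, characters, Flesch) copied from five rows of the output table.
ROWS = [
    ("AAL", 2347, 49.4),
    ("AFL-1.1", 3827, 59.7),
    ("Apache-2.0", 8310, 33.6),
    ("GPL-3.0", 27588, 47.2),
    ("ISC", 663, 77.6),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


chars = [row[1] for row in ROWS]
flesch = [row[2] for row in ROWS]

print("average", statistics.fmean(chars))
print("median", statistics.median(chars))
# Sample standard deviation; the spreadsheet's variant is unknown.
print("stdev", statistics.stdev(chars))
print("corr(Characters, Flesch)", round(pearson(chars, flesch), 2))
```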

With respect to particular licenses:

  • The CECILL licenses, except 1.1, are in French. These readability metrics may not be tuned for French, though the results do not look weird.
  • The CC by-nc-sa-4.0-drafts are drafts. Every other license analyzed is "released".
  • GPL-[version]-with-[exception name]-exception are not complete licenses; each should be appended to the relevant GPL-[version]. However, analyzing each standalone (as provided by the SPDX licenses list) provides an idea of how readable each exception is.
  • LGPL-3.0[+] incorporates GPL-3.0 by reference, so it is not directly comparable to GPL-with-exceptions above, nor with other licenses.
  • Some licenses (most notably [A]GPL and FDL) have a preamble or addendum which explains the license’s purpose and how to use the license. This makes such a license longer, but arguably increases understandability in a way not captured by an automated readability metric.
  • The only license with a negative Flesch score is the Historic Permission Notice and Disclaimer (HPND), which is deprecated. It deserves the score, basically being a template with many optional and fill-in parts.
  • The longest and also most "painful" to read license, the Adaptive Public License (APL), is also basically a template with options and fill-in parts.
  • The shortest and also least "painful" to read license, the Fair License, might require too much imagination about what "usage" means to actually be easily understandable.


Output

SHA1 License Characters Kincaid ARI Coleman-Liau Fog Lix SMOG Flesch Chars/(Flesch>=1)
f53aa44a98a67f79d79bb061a39ac0694c017d88 AAL 2347 14.7 20.8 16.1 17.9 69.6 13.7 49.4 47
b26853ef3e258172c7bc9e7a69e9582d651c0269 AFL-1.1 3827 11.1 15.9 15.3 15.1 59.6 12.8 59.7 64
54f83bc9e70424af32e5a133c47e76698086369c AFL-1.2 4059 13.7 15.8 15.2 18.3 58.8 15.4 41.1 98
735e1f8b4613292d7d80e51e5a586e34ac852a74 AFL-2.0 7105 12.8 14.4 14.7 17.0 56.5 14.6 44.1 161
fedb7d79211a6e58a65b46985f47fa834b00ee6f AFL-2.1 7103 12.8 14.4 14.8 17.0 56.4 14.6 44.0 161
5b400f7a1518b5e43a913085fa338e3df1e9e241 AFL-3.0 8314 13.8 15.6 14.4 17.8 58.2 15.0 42.2 197
ecf6b4a3803b9706a0c38d30b0d07b0c624001ed AGPL-1.0 12578 19.0 23.4 12.5 21.9 71.9 14.8 38.0 331
c34c24e89e6c26506a4aa9535425afe6af4ab700 AGPL-3.0 27208 14.4 16.8 13.4 17.5 59.1 14.2 44.8 607
2b6ca3805481833fddead9c45f92fe4c81d4017d Aladdin 9270 13.6 16.9 13.2 17.0 60.1 13.5 51.6 179
295765ae399d1a9ced2bc4e1fb096e83e529cbfa ANTLR-PD 792 10.3 11.4 12.3 13.8 43.6 12.3 58.4 13
acc3577130a1e528970142d1e5180f554b7fdad9 Apache-1.0 2021 10.7 15.8 16.2 13.9 55.5 12.0 60.0 33
81d8a4169126e0af11b4d51449b6c420880c6d40 Apache-1.1 2017 11.0 16.5 17.9 14.0 60.0 12.3 55.7 36
8ffe2c5c25b85e52f42fcde68c2cf6a88b7abd69 Apache-2.0 8310 16.8 19.8 15.1 20.7 64.6 16.6 33.6 247
4f97e77af1aac9f8ef6500cd2a08915741c37f2c APL-1.0 36285 14.2 17.7 14.8 18.1 62.4 14.9 45.0 806
158031d76c5611507e81870b0a649461eb74be7f APSL-1.0 15302 12.5 15.2 13.7 15.7 55.9 13.1 51.9 294
e444feb210ce2096e565fb0613f98d04f2d97f91 APSL-1.1 15735 13.1 16.0 13.8 16.2 57.1 13.4 50.2 313
a19d874fcde9c037e40cd41916697ac5aac2e220 APSL-1.2 15603 13.1 16.1 13.9 16.2 57.6 13.4 50.0 312
b64068ced2da810cdadd07ac9053c192271e0a56 APSL-2.0 15945 12.4 15.4 14.0 15.4 56.0 12.8 52.2 305
c11ec559ebca765ba8f8d16634e288cdc75dff81 Artistic-1.0-cl8 3689 11.7 13.8 13.9 14.0 55.0 12.1 51.8 71
bcd8b4d1a1af706aaa1337811786a9dc6673c822 Artistic-1.0-Perl 4308 12.6 14.8 13.6 15.0 57.1 12.6 49.9 86
17c9069548d063de8fefb58b995be99c1d08bd45 Artistic-1.0 3421 11.6 13.7 14.0 13.8 54.8 12.0 52.2 65
8e42910d467b06d6af9a008678122dc61a245fcc Artistic-2.0 6949 13.1 16.1 14.5 15.4 60.7 12.8 48.3 143
d82c8eb2abc453fbd4a56aca46b22fe9fdad780d BitTorrent-1.0 19085 20.9 25.7 14.2 24.3 79.0 17.2 27.7 688
d183df8131a7114052fc3c3de647dca5fbdcb79a BitTorrent-1.1 22188 12.3 14.4 14.3 15.5 56.9 13.3 48.9 453
f45386af24b0d36976c96eac8baf5d205bed1570 BSD-2-Clause-FreeBSD 1240 11.5 18.5 16.5 15.2 66.3 12.1 62.9 19
a61e0646333b20301525695918aae3656344f611 BSD-2-Clause-NetBSD 1137 10.2 17.3 16.4 14.1 61.6 11.4 68.2 16
0fa6c43e2345f4768176f63ad24e469b832a40ac BSD-2-Clause 1046 12.3 20.3 16.5 16.1 68.0 11.9 63.8 16
cab0ab541f4f5f1ecf493b9259617df33dcbfa3d BSD-3-Clause-Clear 1372 11.5 18.1 15.9 14.9 64.8 11.8 63.3 21
54f1eeb17a7341ea0a0261a59bc5170b23137eb9 BSD-3-Clause 1200 12.5 20.0 16.3 16.0 68.2 12.0 61.5 19
f579ecea35ef059d706b32108097a960990b777d BSD-4-Clause 1325 11.9 18.0 17.0 15.5 65.1 12.8 57.0 23
837b0df8f4d995591d45c939cf567d6db8ba03d8 BSD-4-Clause-UC 1448 11.9 17.9 17.3 15.9 65.3 13.4 55.7 25
388fa291da4bd074a17d7b33334696eb71bf5ff8 BSL-1.0 1084 21.8 29.1 14.5 25.3 87.3 15.8 33.0 32
0302aaced8d1dbe1916fa0281c6a717069fda16f CATOSL-1.1 15220 15.6 18.9 15.3 19.3 65.0 15.7 38.1 399
74286ae0dfea38c489437bf659b209737945145c CC0-1.0 5116 16.2 19.5 15.0 19.5 66.3 15.6 36.8 139
c766cc6d5e63277e46a3d83c6254e3528082587b CC-BY-1.0 8867 12.6 15.5 14.1 16.4 57.8 13.8 51.3 172
bf23729bec8ffd0de4d319fb33395c595c5c762b CC-BY-2.0 9781 12.1 14.9 14.3 16.1 56.7 13.7 51.9 188
024bb6d37d0a17624cf532bd14fbd42e15c5a963 CC-BY-2.5 9867 11.9 14.7 14.2 15.8 56.3 13.6 52.6 187
20dc61b94cfe1f4ba5814b340095b4c3fa23e801 CC-BY-3.0 14956 16.1 19.4 14.1 20.4 66.1 16.2 40.0 373
e0c4b13ec5f9b5702d2e8b88d98b803e07d65cf8 CC-BY-NC-1.0 9313 13.2 16.2 14.3 17.0 59.3 14.1 49.3 188
970421995789d2e8189bb12071ab838a3fcf2a1a CC-BY-NC-2.0 10635 13.1 16.1 14.6 17.2 59.5 14.4 48.1 221
08773bb9bc13959c6f00fd49fcc081d69bda2744 CC-BY-NC-2.5 10721 12.9 15.8 14.5 16.9 59.0 14.2 48.9 219
9639556280637272ace081949f2a95f9153c0461 CC-BY-NC-3.0 15732 16.5 19.9 14.1 20.8 67.2 16.4 38.7 406
9ab2a3818e6ccefbc6ffdd48df7ecaec25e32e41 CC-BY-NC-ND-1.0 8729 12.7 15.8 14.4 16.4 58.6 13.8 51.0 171
966c97357e3b529e9c8bb8166fbb871c5bc31211 CC-BY-NC-ND-2.0 10074 13.0 16.1 14.7 17.0 59.7 14.3 48.8 206
c659a0e3a5ee8eba94aec903abdef85af353f11f CC-BY-NC-ND-2.5 10176 12.8 15.9 14.6 16.8 59.2 14.2 49.3 206
ad4d3e6d1fb6f89bbd28a44e263a89430b575dfa CC-BY-NC-ND-3.0 14356 16.3 19.7 14.1 20.5 66.8 16.2 39.7 361
39b2ef67be9e5b4e743e5269a31ad1691515eede CC-BY-NC-SA-1.0 10228 13.3 16.3 14.2 17.0 59.7 14.2 48.4 211
5800ac2d32e35ace035cdcae693423cd9ff5bb6f CC-BY-NC-SA-2.0 11927 13.3 16.2 14.7 17.1 60.0 14.4 47.0 253
e5f44c2df6b1391d1ddb6efb2db6f90670e4ae67 CC-BY-NC-SA-2.5 12013 13.1 16.0 14.6 16.9 59.6 14.2 47.7 251
a63b7e81e7b9e30df5d253aed1d2991af47992df CC-BY-NC-SA-3.0 17134 16.4 19.7 14.2 20.6 67.0 16.3 38.8 441
e4851120f7e75e55b82a2c007ed98ffc962f5fa9 CC-BY-ND-1.0 8280 12.3 15.5 14.3 16.1 57.9 13.6 52.4 158
f1aa9011714f0f91005b4c9eb839bdb2b4760bad CC-BY-ND-2.0 9228 11.9 14.9 14.5 15.8 56.9 13.5 52.7 175
5f665a8d7ac1b8fbf6b9af6fa5d53cecb05a1bd3 CC-BY-ND-2.5 9330 11.8 14.7 14.4 15.6 56.5 13.4 53.2 175
3fb39a1e46419e83c99e4c9b6731268cbd1591cd CC-BY-ND-3.0 13591 15.8 19.2 14.1 20.0 65.6 15.9 41.2 329
dda55573a1a3a80d294b1bb9e1eeb3a6c722968c CC-BY-SA-1.0 9779 13.1 16.1 14.2 16.8 59.1 14.0 49.5 197
9cceb80d865e52462983a441904ef037cf3a4576 CC-BY-SA-2.0 11044 12.5 15.3 14.4 16.2 57.9 13.8 50.2 220
662ca9fce7fed61439fcbc27ca0d6db0885718d9 CC-BY-SA-2.5 11130 12.3 15.0 14.4 16.0 57.5 13.6 50.9 218
4a5bb64814336fb26a9e5d36f22896ce4d66f5e0 CC-BY-SA-3.0 17013 16.4 19.8 14.1 20.5 67.2 16.2 38.9 437
238de92eb09c2e33e4e5fb438fe578fe5179276b CDDL-1.0 12605 11.6 13.9 14.9 14.7 55.1 12.9 50.4 250
8c7adc36e1b6f20e0cfa5fc40cefe6a427fb2cb6 CDDL-1.1 13407 12.0 14.4 15.0 15.1 56.0 13.2 49.5 270
46ebe8c487ec3e321842ed1325d98d757f965e14 CECILL-1.0 14796 11.9 12.3 11.1 15.5 51.1 13.3 53.4 277
052845a59dca83a104558addc1fdfb2cff82d328 CECILL-1.1 15874 12.0 14.1 14.3 15.4 54.3 13.4 49.9 318
c8ddd94454934cb1869ef96bddc93ff44039c591 CECILL-2.0 15163 13.2 14.0 11.1 17.0 55.0 14.1 49.9 303
04e73e027c1f47dbf743cb013480bbc974e3a8c3 CECILL-B 15337 13.4 14.2 11.3 17.1 55.5 14.2 49.2 311
1308e5090e66dcba2e594950dc4a8021551fa540 CECILL-C 15646 13.9 14.7 11.0 17.7 56.2 14.5 48.0 325
10ae2b5540f376c8cac9ccedc38ddc3435207efa ClArtistic 4511 12.5 14.8 14.0 14.8 57.3 12.6 49.6 90
cebccd48cf2bad04b29e863c564d8fd1c1f5ee15 CNRI-Python-GPL-Compatible 3172 13.0 17.8 15.6 16.2 62.1 13.1 52.7 60
18756dcb45d9598b5281368a7d35cd5e9a88306b CNRI-Python 2699 12.0 16.4 14.0 15.1 59.0 12.1 59.0 45
4bb47f04bcd1c7afb44ceb13c3bd2f62b9e0af6e Condor-1.1 4855 12.3 16.1 16.0 14.8 59.8 12.6 50.5 96
a4ece6afe1e4e92ba5985bba6f1ce76d2ee24dbb CPAL-1.0 22039 12.7 14.7 14.4 16.2 56.2 13.9 47.0 468
433089094810035bd296b27931ff68464676ed5b CPL-1.0 9273 14.8 18.1 16.6 18.3 63.6 15.3 37.7 245
251beebfa122c0c58abf32bb8224e1b9ebb6db59 CPOL-1.02 9216 10.7 13.1 13.1 14.0 49.9 12.1 59.5 154
a529e9bff1eb4f976a9bf1eb3ef8054e52967a91 CUA-OPL-1.0 18086 12.1 14.3 14.3 15.3 55.0 13.3 49.6 364
0a5785a9fe34a8f779ee79f8333ee766d5c0676e D-FSL-1.0 12123 11.4 14.0 17.2 13.9 49.2 12.5 45.1 268
04ed6736b16995b2bbd3fd7b4fb1cb6efa44b6a6 ECL-1.0 1949 15.2 18.9 15.6 19.4 67.7 15.8 40.2 48
2a3706dec618b5198ba177691bbf30d97becc7a8 ECL-2.0 8955 17.0 20.0 15.2 20.9 65.4 16.8 32.5 275
8d7c74721fac21d583f9bffafb5747ad6994f695 eCos-2.0 1148 11.0 12.7 11.2 14.0 49.4 11.7 60.7 18
7b8021b0d18d9fd4f5ac7bac3a5584c9fb4d5966 EFL-1.0 521 6.6 13.1 14.9 9.6 58.0 7.9 83.3 6
530270003ac19b54a548e13b08108c1abf166a09 EFL-2.0 630 6.3 11.5 14.4 8.7 56.6 7.6 81.0 7
a3ce248131ee7e9eca19460ddd1c7858350aed9e Entessa 1827 9.8 15.1 17.0 12.8 59.5 11.4 61.2 29
11fadbd49466127930da08f01fea6c803dc8c462 EPL-1.0 9028 14.6 17.9 16.6 18.1 63.4 15.2 38.5 234
5a46ff9626e703387228d6de1d695ce9d8d47931 ErlPL-1.1 11028 11.9 14.1 14.2 15.2 54.6 13.1 50.5 218
9098acfa2c2b7780da7d3644faa81df5a44a0536 EUDatagrid 2605 14.0 19.8 18.0 17.6 69.3 14.5 45.0 57
4654cbafc6474f59de1234a11eb6462a02aaffe6 EUPL-1.0 9821 12.6 13.8 13.5 16.4 54.1 14.1 46.5 211
2aeeba8e44c78afd0b9064a43a277158cb018227 EUPL-1.1 10047 12.6 13.8 13.7 16.4 54.3 14.1 46.3 216
5c18b40cb5bbd57b858a2e2827fc89d66e202894 Fair 209 4.5 8.2 15.0 7.0 42.5 7.7 80.4 2
bded6d45d800403709fa630a58c0f1d68e3365e7 Frameworx-1.0 7536 17.4 22.0 16.3 21.5 72.4 17.0 33.4 225
f3a7b9a82d2cecb0edcf57d7d8aa0e37f6fcde66 FTL 4580 10.2 12.4 13.8 13.2 53.1 11.8 57.9 79
579a1d52e08b7a09429df7f7651dd3ef747727c7 GFDL-1.1 14318 12.5 14.4 13.5 15.4 56.1 13.1 49.9 286
eb670eaf7269bf3cb8990a52b05618e5dbd963b6 GFDL-1.2 16124 12.3 14.2 13.6 15.4 55.7 13.2 50.1 321
394dbacde4c26a5f58c9823e25ef6e937eba75a3 GFDL-1.3 18147 12.6 14.6 13.5 15.8 56.1 13.4 49.3 368
21d887b87b4297c5b6eea0c300b77ee8b3f8337d GPL-1.0 9288 12.2 14.9 12.2 15.1 54.6 12.1 57.4 161
21d887b87b4297c5b6eea0c300b77ee8b3f8337d GPL-1.0+ 9288 12.2 14.9 12.2 15.1 54.6 12.1 57.4 161
0473f7b5cf37740d7170f29232a0bd088d0b16f0 GPL-2.0 13664 13.3 16.2 12.5 16.2 57.0 12.7 52.9 258
80c08b24ac7e98376c0c387a1890a283e9c5ffe0 GPL-2.0+ 13664 13.3 16.2 12.5 16.2 57.0 12.7 52.9 258
3a9262b9d066ce5d41feb871b1f786336a20628b GPL-2.0-with-autoconf-exception 1313 12.0 13.8 12.9 14.7 53.5 12.5 52.8 24
86084a36e75fb92c36082dd61ac812495446f6d8 GPL-2.0-with-bison-exception 561 18.3 20.7 12.7 21.0 64.5 15.6 32.7 17
be15f8fcc097be4db1cfef6f469108b4364db84e GPL-2.0-with-classpath-exception 785 13.2 14.9 12.7 16.7 55.4 14.0 48.4 16
3cddc56f4cb24809cabd4af80ce3d3bda97152f8 GPL-2.0-with-font-exception 492 12.5 13.3 10.3 16.9 51.2 14.0 55.2 8
22ece69587f935f0bf61df00cec4b2c4f73163f7 GPL-2.0-with-GCC-exception 283 24.7 30.3 15.1 29.0 87.3 20.3 14.2 19
d4ec7d0b46077b89870c66cb829457041cd03e8d GPL-3.0 27588 13.7 16.0 13.3 16.8 57.5 13.8 47.2 584
d4ec7d0b46077b89870c66cb829457041cd03e8d GPL-3.0+ 27588 13.7 16.0 13.3 16.8 57.5 13.8 47.2 584
a0ac5d9bc70d97d2f2068f87779aa9fe35368dd8 GPL-3.0-with-autoconf-exception 1460 10.3 12.0 14.1 13.5 51.5 12.2 55.5 26
3d7d507c4df41892664f128b810daf0131e0b817 GPL-3.0-with-GCC-exception 2716 13.4 15.1 15.2 16.3 58.5 14.0 40.6 66
36f65f578919826062bffcca2557317294953ad3 gSOAP-1.3b 15881 10.6 14.1 14.3 13.8 55.2 11.9 59.5 266
5f20c1e3037bc2aba7780b6e69a0a10004811a5a HPND 458 37.0 45.7 17.5 40.3 116.8 24.9 -25.8 458
80de1c399f7117b9c56a196048df93b8c41d8e4f IBM-pibs 623 12.4 14.5 13.6 15.4 53.8 13.1 50.8 12
f854fffe51b32ca7b7dfa93662444aec6cb96f49 IJG 3294 9.0 10.3 13.2 12.1 46.1 11.2 61.1 53
375a8db2e5dafcdccfd8b45da15b7e74d9332353 Imlib2 1632 13.9 18.6 14.4 17.1 64.5 13.2 52.2 31
df867beac70889c3e7c68ae5ff5f377ed50118b7 Intel 1508 8.4 13.7 15.5 12.1 53.4 10.7 71.0 21
a9772a55cbe512cb612ecbb89cd65a6b19ee6bf8 IPA 7214 13.9 17.2 14.7 17.6 66.6 14.6 46.2 156
9a1ce5b388ac0523419987d4a75006e099850126 IPL-1.0 8989 13.5 16.4 16.7 17.1 60.6 14.7 40.6 221
18f98daf7abe3959a3a7a0642ec6106f55c4a54d ISC 663 8.9 15.6 14.2 11.9 59.8 8.5 77.6 8
95d1a9940f507660b5cd6afb4f91363d07c59933 JSON 894 13.9 20.0 13.6 17.2 68.8 11.8 59.0 15
1fec28a7b0a64d83a5922274069b90804012ca6f LGPL-2.0 19661 13.1 15.5 12.4 16.0 56.6 12.8 52.2 376
1fec28a7b0a64d83a5922274069b90804012ca6f LGPL-2.0+ 19661 13.1 15.5 12.4 16.0 56.6 12.8 52.2 376
46dee26f31cce329fa13eacb74f8ac5e52723380 LGPL-2.1 20570 13.2 15.6 12.5 15.9 56.8 12.7 51.8 397
46dee26f31cce329fa13eacb74f8ac5e52723380 LGPL-2.1+ 20570 13.2 15.6 12.5 15.9 56.8 12.7 51.8 397
c902338383ea4324b02c8fa7fc6054bf100b2c06 LGPL-3.0 5887 13.1 14.9 12.7 15.7 56.8 13.0 49.5 118
c902338383ea4324b02c8fa7fc6054bf100b2c06 LGPL-3.0+ 5887 13.1 14.9 12.7 15.7 56.8 13.0 49.5 118
07041306373f72b71a8d4ecf55f268d47ab70404 Libpng 2854 9.8 12.2 14.2 13.2 49.3 11.9 59.1 48
0659190facd4ce63eb62a5cfcb64c85685b9b879 LPL-1.02 9329 13.5 16.5 16.8 17.2 60.7 14.7 40.3 231
c49cfb2721bfaa5a3aec5108c56582ba981f120f LPL-1.0 9358 13.6 16.5 16.6 17.2 61.1 14.7 40.4 231
2063306474e511b6f156a1d6be84002c05e78833 LPPL-1.0 6647 12.8 14.1 13.4 15.5 53.7 13.2 46.3 143
a97c5a459de398311c98f4ac9226fa8f8302eccc LPPL-1.1 10684 14.4 16.1 13.4 17.6 59.0 14.5 42.2 253
3be04fe75fae2be93de36fe7dfd6c1b9dbd282c9 LPPL-1.2 10749 14.5 16.2 13.4 17.6 59.1 14.5 42.0 255
2c40ed15717ba7ffcf1f93fa2d6f31b5d930237f LPPL-1.3a 13966 13.2 15.2 12.9 16.6 55.0 13.8 49.4 282
6820ac06cce2932140b6f0f3e7e4579021d81623 LPPL-1.3c 14405 13.2 15.2 12.8 16.7 55.1 13.9 49.3 292
025f38a752cdd6f4ffd5847c8eb6af0265f62b9a MirOS 1680 16.3 18.5 11.1 19.2 62.9 14.1 44.0 38
d25ad2c65dde58aeacd9ad6ef9faff476bcbf19e MIT 866 16.8 23.8 14.2 20.1 77.8 12.9 50.2 17
47cd2bb52728ba7fe49c302ce2fcf9f07a69031c Motosoto 16170 12.9 15.4 14.4 16.0 58.8 13.5 48.0 336
ae49332ade14c453c46e44548bdb35e5a6457890 MPL-1.0 13965 11.2 13.5 14.0 14.4 53.6 12.6 54.4 256
90c670fbb524656b8196aa2065bf3bb1d3e8ced5 MPL-1.1 18213 13.0 15.4 14.4 16.2 57.3 13.8 46.9 388
78fe0ed5d283fd1df26be9b4afe8a82124624180 MPL-2.0-no-copyleft-exception 11766 14.7 16.9 14.5 17.9 60.5 14.9 40.1 293
78fe0ed5d283fd1df26be9b4afe8a82124624180 MPL-2.0 11766 14.7 16.9 14.5 17.9 60.5 14.9 40.1 293
d0f1fedc0533327331d9c9d9c2a9ef4326306212 MS-PL 2104 12.3 13.8 15.1 14.8 54.5 13.1 44.7 47
c510da2adf4c570a4401348a0051b41484b6e151 MS-RL 2415 12.3 14.0 14.6 14.8 53.9 12.9 46.8 51
4f02ff8d7e1f78728913bcadb7735afd5aadac0b Multics 1528 14.6 17.7 18.0 19.3 66.3 16.2 33.1 46
6c88b19d555e1a1828ccfc82627d53efafebe5fe NASA-1.3 11113 9.6 12.4 16.2 13.1 53.3 12.0 55.6 199
583435eee5df415ff8d2af1d5db5e47b1bb10a0c Naumen 1577 9.8 15.0 15.5 13.3 55.8 11.5 65.1 24
1910aad3d5cef18f02d842b379bf1a37bcd9316a NBPL-1.0 3879 11.6 13.7 14.0 13.9 54.9 12.1 52.2 74
fb6606a9d8eddaa82be2e51e534ee0e466755265 NCSA 1363 14.4 19.7 15.8 18.2 66.8 14.5 48.0 28
203f45fe393cee8c20a2c6354d815d8f1ce8965a NGPL 3666 10.7 12.7 12.4 13.4 50.2 11.5 59.7 61
bcfa1513b1b3fe4c06d832e4e1a98b694b936b9e Nokia 16605 12.5 15.3 14.8 15.7 57.7 13.4 49.0 338
8932181884589205d2a999827cb7da821a0cad53 NOSL 18392 12.1 14.2 14.2 15.3 55.1 13.2 49.9 368
31e0d4850f7e5bf3f5f48f3302f75cc09f326680 NPL-1.0 16286 11.4 13.7 14.1 14.5 54.2 12.7 53.5 304
3019e515abc1d4c8e82a09fe574d01875ab05ab1 NPL-1.1 21636 13.0 15.5 14.3 16.1 57.9 13.6 47.4 456
db3b003197b39a73df2fab3c2505fbb99dd92e70 NPOSL-3.0 9502 13.7 15.5 14.6 17.6 58.9 14.9 41.9 226
d8864d8d46c52460ee2a2d05572de9cdefa121ce NTP 584 18.9 22.4 17.6 21.1 72.2 16.8 20.9 27
1a5cc6614a1ac2cde709527063adc051aaef5b59 OCLC-2.0 8679 10.8 13.7 14.9 14.1 54.4 12.5 54.8 158
5d5071c43c31f56f9a48750d7015c34883ebbc8a ODbL-1.0 19659 12.5 13.4 13.7 16.6 54.2 14.3 45.6 431
eefe55ea67004f87ce035c998f593f7014fd14a0 OFL-1.0 3018 13.2 17.3 14.0 16.1 60.8 12.6 53.5 56
8042b321424aa7ad59076ed53910f3ba8cfdc444 OFL-1.1 3227 12.6 16.3 14.0 15.4 58.9 12.5 54.0 59
47409e4b3d9391bc1717b55788d67308b3645161 OGTSL 3724 11.3 13.6 14.0 14.2 54.0 12.4 53.6 69
1809980eae5c6027f604dcc623479fdcf111dfdf OLDAP-1.1 3854 11.2 13.3 14.0 13.5 54.1 11.9 53.3 72
00d6a7524efe17277039661ea99957e4da93edce OLDAP-1.2 3858 11.3 13.3 14.0 13.6 54.2 11.9 53.0 72
04904f47ddad228383ac8e19316ae560f2cc9139 OLDAP-1.3 4187 11.8 13.9 14.1 14.1 55.5 12.2 51.4 81
64859bfbec0602a1d98de6823e667157aed72996 OLDAP-1.4 4252 11.7 13.8 14.1 14.0 55.3 12.1 51.8 82
cda18d60730759abf0ec90725d458b10e2b9e36c OLDAP-2.0.1 1604 9.7 15.9 17.1 13.3 62.8 11.6 65.0 24
39905c3bae9b696f91b3387252efee9661c46744 OLDAP-2.0 1607 9.7 15.9 17.1 13.3 62.8 11.6 65.1 24
3bf504932fc78f4c32786c65393c048b055cd88e OLDAP-2.1 1800 9.2 14.7 16.6 12.7 59.9 11.3 65.8 27
17608c92f8eca35786067eba1c42a28291ce96de OLDAP-2.2.1 1816 9.2 14.7 16.3 12.9 59.3 11.4 66.1 27
e19ccd84c7378a721a1689eee6e8d155d58fe52a OLDAP-2.2.2 1836 9.4 14.9 16.6 13.4 60.4 11.9 64.6 28
8d6770bb082eadd17d814495c2fcf8d24bd49b07 OLDAP-2.2 1804 9.1 14.6 16.3 12.7 59.1 11.3 66.5 27
5aaf8041616d468dcf2f8adc15d8ee8047e643d7 OLDAP-2.3 1834 9.4 14.9 16.5 13.4 60.4 11.9 64.6 28
70fc08e656d3e4ce80cc6118414fa379e5cff63c OLDAP-2.4 1726 9.8 14.9 16.5 13.9 60.4 12.3 62.5 27
3bbe28af43f6d59244ee3d60894e73718cca4656 OLDAP-2.5 1758 9.8 14.8 16.0 13.8 59.8 12.2 63.3 27
ebf1ba925ef83d6a1158ad0204a8e8707f30324b OLDAP-2.6 1712 10.2 15.4 16.2 14.3 61.2 12.5 62.3 27
18a7008a9e4f4ce6651236427cb165bbbb32dc96 OLDAP-2.7 1792 10.0 15.1 16.0 14.0 60.4 12.3 62.6 28
daee80ea2d9c79b8f36319e17f8eb72c58114879 OLDAP-2.8 1790 10.0 15.0 16.0 13.9 60.1 12.2 62.8 28
f63158eb54c637627c609bb57833cc27093840b0 OpenSSL 4175 10.7 15.6 15.7 14.0 60.7 12.0 60.5 69
d87f38eee3178884ac7922ee2bc7af85a5cb620f OPL-1.0 15610 12.0 14.2 14.2 15.2 54.8 13.1 50.4 309
9fbd6c8b270f7383bb99df0191ac919e9f1e0f50 OSL-1.0 7078 11.7 15.0 14.8 15.3 57.6 13.2 53.4 132
9e3b5bd8803fe6518c1df58c90a2f27279285383 OSL-2.0 7853 13.2 14.9 14.4 17.3 57.3 14.7 43.8 179
86b22069d64687af5e2e791dbf3cdfe212117a0a OSL-2.1 7851 13.0 14.7 14.5 17.2 56.9 14.7 44.1 178
ff5e10e2563c74f29a3aec4a23ff29f45c0c3f88 OSL-3.0 8307 13.7 15.6 14.4 17.6 58.2 14.9 42.4 195
2d059b9dbb826799e0616f01fe336defd7915cc9 PDDL-1.0 12455 11.7 13.0 13.6 15.6 49.4 13.6 50.3 247
1816b5611a38aa92e72c19872d0aeaba2de6e7b4 PHP-3.01 2280 8.6 12.5 13.6 11.5 49.9 10.1 70.4 32
6204ef6d604c3124f574ee2b0c05b5c2c2ce4d4e PHP-3.0 2271 8.6 12.4 13.6 11.5 49.7 10.1 70.6 32
ffd7ec6573f5a0046e4091c3ccb6c8562e620c76 PostgreSQL 978 16.9 26.7 16.9 20.3 84.3 12.5 52.6 18
8e52b19d8b7af858f8d65a795d600a0c29bcc488 Python-2.0 7483 12.8 18.1 14.9 15.9 63.3 12.4 56.4 132
cfe6913b8af9e08dc0476896e2825bd3c2e3bd60 QPL-1.0 3529 12.9 14.9 15.3 15.5 56.6 13.4 43.8 80
fe5002688af6e08772e9c5e87035a24cf7d65057 RHeCos-1.1 16857 12.0 14.8 13.7 15.3 55.9 12.9 54.3 310
b18f7b4788d54f876001eb744e6e9ef7b6a90229 RPL-1.1 27119 12.6 15.8 15.0 15.9 59.2 13.5 49.7 545
3ff0d1b2011ba1792e59c614696ceb0eafda65c9 RPL-1.5 25393 12.3 15.4 15.0 15.7 58.4 13.4 50.0 507
eed9143202e3cc9e40c8ea2a5522cab69ab875cb RPSL-1.0 23768 14.0 16.9 14.4 17.2 60.3 14.2 45.3 524
6142a28306a98801d75eda3b10be6e689c6c1b3d RSCPL 16273 11.6 14.0 14.2 14.9 53.1 13.0 52.9 307
4c00bf0a49428c5572a2651fcee55dda86d160de Ruby 1913 19.6 23.4 13.6 22.8 71.0 16.5 30.4 62
b349e32616df129a43f4915f15bc7363efbd5736 SAX-PD 1730 11.8 13.0 12.0 14.6 49.8 12.4 54.2 31
59b7423f8583f65c5ed6c5789e4bb50d1c9eb48d SGI-B-1.0 10452 10.3 12.9 14.7 13.8 53.6 12.3 56.6 184
3c865d4d01abb6e12e87a55ab4a6f32ec6f26e11 SGI-B-1.1 11272 10.9 13.5 14.6 14.2 55.0 12.5 54.9 205
482c60ab189d8a097558134505f00d37ad167089 SGI-B-2.0 1155 9.7 13.9 14.3 12.8 57.1 10.9 65.6 17
2d913cb49e14d68db24e248412cc346c486a134b SimPL-2.0 1982 10.8 12.9 11.3 14.3 47.2 11.9 62.5 31
00b6083c2e6d5f10eade11a8e7a81b019e777e6d SISSL-1.2 9715 10.9 12.9 13.9 14.2 51.9 12.6 54.4 178
58d422e2fb51a3d64d6773e65e7bec3d33334a13 SISSL 11086 11.3 13.9 13.8 14.8 54.2 12.8 55.4 200
a1ebd9c1e79d08c3d342a2e6b49f1fa9fdb59e9e Sleepycat 4034 11.6 17.8 16.0 15.2 63.4 12.3 61.3 65
29fcf0e1356a3625ad0e4ced9f282cc31e2fcb87 SMLNJ 923 20.8 25.3 15.4 24.3 76.3 18.0 23.6 39
d93e3210aeffc64d43ee7e47e108cf8a7d5f6d44 SPL-1.0 17878 11.9 14.0 14.3 15.2 54.4 13.2 50.0 357
088641c6a445bf1cb01a4b0b060a6d7731401b83 SugarCRM-1.1.3 19078 11.9 14.0 14.3 15.0 54.5 13.1 50.3 379
ff007ce11f3ff7964f1a5b04202c4e95b5c82c85 Unlicense 926 10.8 15.1 12.9 14.0 56.1 11.0 66.2 13
51fcabf2b2216f7a32d58e4bac53ef10cd7c4305 VSL-1.0 1675 10.7 16.5 17.3 13.2 60.5 11.3 59.5 28
321fc468cdbdd00cb1a3fef7e9f2b53c49cb95fd W3C 2182 13.1 17.2 15.2 16.3 62.4 13.4 50.1 43
9a81c35d5a8bd3c02e017d1f6a05af59c2e4fb6e Watcom-1.0 16562 13.1 16.3 13.9 16.3 57.8 13.4 50.6 327
ceffa174420c734f28329d45a2782584cb693192 WTFPL 341 9.3 12.5 11.0 13.2 53.0 10.7 73.2 4
aac0a17c95873a6983c546bf8a1b8d5574b4517d WXwindows 1862 13.1 16.2 12.8 15.3 58.7 11.7 53.5 34
f6cdf05df7acdde7587a632d418465e3547fe498 X11 1075 14.9 20.0 13.2 18.4 68.4 13.2 52.8 20
56f041b77f41fa7e2353b711ec2e116467a48015 XFree86-1.1 1913 14.4 19.9 15.2 17.7 66.3 13.6 51.0 37
7815253be63682b2ab88623d3eb2748c6a099a1d Xnet 1000 11.5 16.2 13.1 15.0 61.0 11.5 64.5 15
2ebb89d44df1431221b9050fe35c359c4e6a2e46 YPL-1.0 6882 10.5 13.2 14.1 13.7 52.3 12.1 58.0 118
fbc04592ee3c671408fcc39eae398613b26be187 YPL-1.1 6882 10.7 13.4 14.1 13.9 52.6 12.1 57.6 119
fc3ed4984f4376db6672f9c568bdf03af2eedcc4 Zimbra-1.3 7216 15.0 19.0 14.0 18.3 63.7 14.2 46.2 156
79918b0a3364753d0d509682df8cfb8082e252d9 Zlib 657 10.0 11.7 14.0 13.0 50.9 11.8 56.5 11
fe67fb158e684e2bbea78602f736c4254384c1ea ZPL-1.1 2339 11.1 15.3 16.9 14.9 58.1 13.1 53.0 44
83cb704b06d68aba270f52d8f79c77711a925949 ZPL-2.0 1811 10.2 14.8 15.1 13.8 56.0 11.9 62.8 28
f54d181214c88437a4105d6674209cf742e568be ZPL-2.1 1676 11.4 16.7 15.4 14.7 62.4 12.1 60.4 27
8b36c30ed0510d9ca9c69a2ef826b9fd52992474 by-nc-sa-4.0d1 12465 13.0 15.0 14.9 16.3 57.4 14.0 43.9 283
4a87c7af5cde7729e2e456ee0e8958f8632e3005 by-nc-sa-4.0d2 11583 13.1 14.8 14.2 16.8 56.2 14.4 44.7 259
bb6f239f7b39343d62440bff00de24da2b3d256f by-nc-sa-4.0d3 14422 14.1 15.8 15.1 18.2 61.0 15.4 38.6 373
cf5629ae38a745f4f9eca429f7b26af2e71eb109 by-nc-sa-4.0d4 14635 13.8 15.6 15.5 17.8 60.2 15.2 38.6 379
6accdf75f661b6c431ddc69c98509009c859cd28 copyleft-next-0.1.0 7050 11.6 13.2 14.9 14.7 53.5 13.0 47.8 147
bdd57168874256bf2ce405a0095044c72e0f7894 copyleft-next-0.1.1 6940 11.5 13.1 14.9 14.5 53.2 12.9 48.2 143
4f7cbaf808b3597ac01252a3ce8f79b750114208 copyleft-next-0.2.0 7311 14.2 16.5 14.6 17.5 59.3 14.6 41.7 175
2a8b5cf0173ac66861d9bc30cbc834674a7cb072 copyleft-next-0.2.1 7332 14.2 16.6 14.7 17.6 59.9 14.7 41.5 176
c5808e5f27b516498eda66cd03e8f073e224e1e6 copyleft-next-0.3.0 7653 13.6 15.9 14.6 16.9 58.3 14.3 43.7 175
adb5c8b02580ff23f959a9a4a36f9d53a24cef38 FAL-1.3 6641 9.8 11.4 12.6 12.8 47.1 11.4 61.0 108
98064c6b9d40c4e43206c5343daae933155bd63a OGL-UK-1.0 4577 15.2 17.9 15.4 19.8 62.5 16.3 37.2 123
c7539d2f3a5edb8fd71e4714db0aa36e87ece9e8 OGL-UK-2.0 4555 14.4 17.0 15.4 18.7 60.7 15.7 39.6 115
02fa56fef253718abfd8756f43b322f250a515f5 TAPR-OHL-1.0 10481 13.3 15.9 14.2 15.9 55.9 13.2 47.4 221
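
Several rows above share a SHA1 (for example GPL-1.0 and GPL-1.0+, or MPL-2.0 and MPL-2.0-no-copyleft-exception), which suggests those entries point at identical text files. A minimal sketch of spotting such duplicates follows, assuming the column is a plain SHA-1 over the file's bytes (how it was computed is not stated here). The byte strings are placeholders, not real license texts, and the helper names are hypothetical.

```python
import hashlib
from collections import defaultdict


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of raw bytes, the form shown in the SHA1 column."""
    return hashlib.sha1(data).hexdigest()


def group_identical(texts):
    """Map digest -> sorted names, keeping only digests shared by 2+ names."""
    groups = defaultdict(list)
    for name, data in texts.items():
        groups[sha1_hex(data)].append(name)
    return {d: sorted(names) for d, names in groups.items() if len(names) > 1}


# Placeholder contents standing in for files under licenses-spdx/.
texts = {
    "GPL-1.0": b"placeholder license text A\n",
    "GPL-1.0+": b"placeholder license text A\n",
    "Fair": b"placeholder license text B\n",
}

for digest, names in group_identical(texts).items():
    print(digest, names)
```

In practice the dictionary values would be the bytes read from each license text file.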


Original files in this project are disjunctively licensed under all licenses in the SPDX licenses list 1.19 (215) and those included in licenses-other (13). Take your pick of any or all.

License texts purport to be under various terms; see each individual license text.