Collaborative Futures 3

Day 3 of the Collaborative Futures book sprint and we’re close to 20,000 words. I added another chapter intended for the “future” section, current draft copied below. It is very much a scattershot survey based on my paying partial attention for several years. There’s nothing remotely new apart from recording a favorite quote from my colleague John Wilbanks that doesn’t seem to have been written down before.

Continuing a tradition, another observation about the sprint group and its discussions: an obsession with attribution. A current drafts says attribution is “not only socially acceptable and morally correct, it is also intelligent.” People love talking about this and glomming on all kinds of other issues including participation and identity. I’m counter-obsessed (which Michael Mandiberg pointed out means I’m still obsessed).

Attribution is only interesting to me insofar as it is a side effect (and thus low cost) and adds non-moralistic value. In the ideal case, it is automated, as in the revision histories of wiki articles and version control systems. In the more common case, adding attribution information is a service to the reader — nevermind the author being attributed.

I’m also interested in attribution (and similar) metadata that can easily be copied with a work, making its use closer to automated — Creative Commons provides such metadata if a user choosing a license provides attribution information and CC license deeds use that metadata to provide copy&pastable attribution HTML, hopefully starting a beneficient cycle.

Admittedly I’ve also said many times that I think attribution, or rather requiring (or merely providing in the case of public domain content) attribution by link specifically, is an undersold term of the Creative Commons licenses — links are the currency of the web, and this is an easy way to say “please use my work and link to me!”

Mushon Zer-Aviv continues his tradition for day 3 of a funny and observant post, but note that he conflates attribution and licensing, perhaps to make a point:

The people in the room have quite strong feelings about concepts of attribution. What is pretty obvious by now is that both those who elevate the importance of proper crediting to the success of collaboration and those who dismiss it all together are both quite equally obsessed about it. The attribution we chose for the book is CC-BY-SA oh and maybe GPL too… Not sure… Actually, I guess I am not the most attribution obsessed guy in the room.

Science 2.0

Science is a prototypical example of collaboration, from closely coupled collaboration within a lab to the very loosely coupled collaboration of the grant scientific enterprise over centuries. However, science has been slow to adopt modern tools and methods for collaboration. Efforts to adopt or translate new tools and methods have been broadly (and loosely) characterized as “Science 2.0” and “Open Science”, very roughly corresponding to “Web 2.0” and “Open Source”.

Open Access (OA) publishing is an effort to remove a major barrier to distributed collaboration in science — the high price of journal articles, effectively limiting access to researchers affiliated with wealthy institutions. Access to Knowledge (A2K) emphasizes the equality and social justice aspects of opening access to the scientific literature.

The OA movement has met with substantial and increasing success recently. The Directory of Open Access Journals (see lists 4583 journals as of 2010-01-20. The Public Library of Science’s top journals are in the first tier of publications in their fields. Traditional publishers are investing in OA, such as Springer’s acquisition of large OA publisher BioMed Central, or experimenting with OA, for example Nature Precedings.

In the longer term OA may lead to improving the methods of scientific collaboration, eg peer review, and allowing new forms of meta-collaboration. An early example of the former is PLoS ONE, a rethinking of the journal as an electronic publication without a limitation on the number of articles published and with the addition of user rating and commenting. An example of the latter would be machine analysis and indexing of journal articles, potentially allowing all scientific literature to be treated as a database, and therefore queryable — at least all OA literature. These more sophisticated applications of OA often require not just access, but permission to redistribute and manipulate, thus a rapid movement to publication under a Creative Commons license that permits any use with attribution — a practice followed by both PLoS and BioMed Central.

Scientists have also adopted web tools to enhance collaboration within a working group as well as to facilitate distributed collaboration. Wikis and blogs have been purposed as as open lab notebooks under the rubric of “Open Notebook Science”. Connotea is a tagging platform (they call it “reference management”) for scientists. These tools help “scale up” and direct the scientific conversation, as explained by Michael Nielsen:

You can think of blogs as a way of scaling up scientific conversation, so that conversations can become widely distributed in both time and space. Instead of just a few people listening as Terry Tao muses aloud in the hall or the seminar room about the Navier-Stokes equations, why not have a few thousand talented people listen in? Why not enable the most insightful to contribute their insights back?

Stepping back, what tools like blogs, open notebooks and their descendants enable is filtered access to new sources of information, and to new conversation. The net result is a restructuring of expert attention. This is important because expert attention is the ultimate scarce resource in scientific research, and the more efficiently it can be allocated, the faster science can progress.

Michael Nielsen, “Doing science online”,

OA and adoption of web tools are only the first steps toward utilizing digital networks for scientific collaboration. Science is increasingly computational and data-intensive: access to a completed journal article may not contribute much to allowing other researcher’s to build upon one’s work — that requires publication of all code and data used during the research used to produce the paper. Publishing the entire “resarch compendium” under apprpriate terms (eg usually public domain for data, a free software license for software, and a liberal Creative Commons license for articles and other content) and in open formats has recently been called “reproducible research” — in computational fields, the publication of such a compendium gives other researches all of the tools they need to build upon one’s work.

Standards are also very important for enabling scientific collaboration, and not just coarse standards like RSS. The Semantic Web and in particular ontologies have sometimes been ridiculed by consumer web developers, but they are necessary for science. How can one treat the world’s scientific literature as a database if it isn’t possible to identify, for example, a specific chemical or gene, and agree on a name for the chemical or gene in question that different programs can use interoperably? The biological sciences have taken a lead in implementation of semantic technologies, from ontology development and semantic databsases to inline web page annotation using RDFa.

Of course all of science, even most of science, isn’t digital. Collaboration may require sharing of physical materials. But just as online stores make shopping easier, digital tools can make sharing of scientific materials easier. One example is the development of standardized Materials Transfer Agreements accompanied by web-based applications and metadata, potentially a vast improvement over the current choice between ad hoc sharing and highly bureaucratized distribution channels.

Somewhere between open science and business (both as in for-profit business and business as usual) is “Open Innovation” which refers to a collection of tools and methods for enabling more collaboration, for example crowdsourcing of research expertise (a company called InnoCentive is a leader here), patent pools, end-user innovation (documented especially by Erik von Hippel in Democratizing Innovation), and wisdom of the crowds methods such as prediction markets.

Reputation is an important question for many forms of collaboration, but particularly in science, where careers are determined primarily by one narrow metric of reputation — publication. If the above phenomena are to reach their full potential, they will have to be aligned with scientific career incentives. This means new reputation systems that take into account, for example, re-use of published data and code, and the impact of granular online contributions, must be developed and adopted.

From the grand scientific enterprise to business enterprise modern collaboration tools hold great promise for increasing the rate of discovery, which sounds prosaic, but may be our best tool for solving our most vexing problems. John Wilbanks, Vice President for Science at Creative Commons often makes the point like this: “We don’t have any idea how to solve cancer, so all we can do is increase the rate of discovery so as to increase the probability we’ll make a breakthrough.”

Science 2.0 also holds great promise for allowing the public to access current science, and even in some cases collaborate with professional researchers. The effort to apply modern collaboration tools to science may even increase the rate of discovery of innovations in collaboration!

One Response

  1. […] Mike Linksvayer My opinions only. I do not represent any organization in this publication. « Collaborative Futures 3 […]

Leave a Reply