Last week I attended CODATA 2012 in Taipei, the biannual conference of the Committee on Data for Science and Technology. I struggled a bit with deciding to go — I am not a “data scientist” nor a scientist and while I know a fair amount about some of the technical and policy issues for data management, specific application to science has never been my expertise, all away from my current focus, and I’m skeptical of travel.
I finally went in order to see through a session on mass collaboration data projects and policies that I developed with Tyng-Ruey Chuang and Shun-Ling Chen. A mere rationalization as they didn’t really need my presence, but I enjoyed the conference and trip anyway.
My favorite moments from the panel:
- Mikel Maron said approximately “not only don’t silo your data, don’t silo your code” (see a corresponding bullet in his slides), a point woefully and consistently underestimated and ignored by “open” advocates.
- Chen’s eloquent polemic closing with approximately “mass collaboration challenges not only Ⓒ but distribution of power, authority, credibility”; I hope she publishes her talk content!
My slides from the panel (odp, pdf, slideshare) and from an open data workshop following the conference (odp, pdf, slideshare).
Tracey Lauriault summarized the mass collaboration panel (all of it, check out the parts I do not mention), including:
Mike Linksvayer, was provocative in stating that copyright makes us stupider and is stupid and that it should be abolished all together. I argued that for traditional knowledge where people are seriously marginalized and where TK is exploited, copyright might be the only way to protect themselves.
I’m pretty sure I only claimed that including copyright in one’s thinking about any topic, e.g., data policy, effectively makes one’s thinking about that topic more muddled and indeed stupid. I’ve posted about this before but consider a post enumerating the ways copyright makes people stupid individually and collectively forthcoming.
I didn’t say anything about abolishing copyright, but I’m happy for that conclusion to be drawn — I’d be even happier for the conclusion to be drawn that abolition is a moderate reform and boring (in no-brainer and non-interesting senses) among the possibilities for information and innovation policies — indeed, copyright has made society stupid about these broader issues. I sort of make these points in my future of copyright piece that Lauriault linked to, but will eventually address them directly.
Also, Traditional Knowledge, about which I’ve never posted unless you count my claim that malgovernance of the information commons is ancient, for example cult secrets (mentioned in first paragraph of previous link), though I didn’t have contemporary indigenous peoples in mind, and TK covers a wide range of issues. Indeed, my instinct is to divide these between issues where traditional communities are being excluded from their heritage (e.g., plant patents, institutionally-held data and items, perhaps copyrestricted cultural works building on traditional culture) and where they would like to have a collective right to exclude information from the global public domain.
The theme of CODATA 2012 was “Open Data and Information for a Changing Planet” and the closing plenary appropriately aimed to place the entire conference in that context, and question its impact and followup. That included the inevitable asking whether anyone would notice. At the beginning of the conference attendees were excitedly encouraged to tweet, and if I understood correctly, there were some conference staff to be dedicated to helping people tweet. As usual, I find this sort of exhortation and dedication of resources to social media scary. But what about journalists? How can we make the media care?
Fortunately for (future) CODATA and other science and data related events, there’s a great answer (usually there isn’t one), but one I didn’t hear mentioned at all outside of my own presentation: invite data journalists. They could learn a lot from other attendees, have a meta story about exactly the topic they’re passionate about, and an inside track on general interest data-driven stories developing from data-driven science in a variety of fields — for example the conference featured a number of sessions on disaster data. Usual CODATA science and policy attendees would probably also learn a lot about how to make their work interesting for data journalists, and thus be able to celebrate rather than whinge when talking about media. A start on that learning, and maybe ideas for people to invite might come from The Data Journalism Handbook (disclaimer: I contributed what I hope is the least relevant chapter in the whole book).
Someone asked how to move forward and David Carlson gave some conceptually simple and very good advice, paraphrased:
- Adopt an open access data publishing policy at the inception of a project.
- Invest in data staff — human resources are the limiting factor.
- Start publishing and doing small experiments with data very early in a project’s life.
Someone also asked about “citizen science”, to which Carlson also had a good answer (added to by Jane Hunter and perhaps others), in sum roughly:
- Community monitoring (data collection) may be a more accurate name for part of what people call citizen science;
- but the community should be involved in many more aspects of some projects, up to governance;
- don’t assume “citizen scientists” are non-scientists: often they’ll have scientific training, sometimes full-time scientists contributing to projects outside of work.
To bring this full circle (and very much aligned with the conference’s theme and Carlson’s first recommendation above) would have been consideration of scientist-as-citizen. Fortunately I had serendipitously titled my “open data workshop” presentation for the next day “Open data policy for scientists as citizens and for citizen science”.
Finally, “data citation” was another major topic of the conference, but semantic web/linked open data not explicitly mentioned much, as observed by someone in the plenary. I tend to agree, but may have missed the most relevant sessions, though they may have been my focus if I was actually working in the field. I did really enjoy happening to sit next to Curt Tilmes at a dinner, and catching up a bit on W3C Provenance (I’ve mentioned briefly before) of which he is a working group member.
I got to spend a little time outside the conference. I’d been to Taipei once before, but failed to notice its beautiful setting — surrounded and interspersed with steep and very green hills.
I visited National Palace Museum with Puneet Kishor. I know next to nothing about feng shui, but I was struck by what seemed to be an ultra-favorable setting (and made me think of feng shui, which I never have before in my life, without someone else bringing it up) taking advantage of some of the aforementioned hills. I think the more one knows about Chinese history the more one would get out of the museum, but for someone who loves maps, the map room alone is worth the visit.
It was also fun hanging out a bit with Christopher Adams and Sophie Chiang, catching up with Bob Chao and seeing the booming Mozilla Taiwan offices, and meeting Florence Ko, Lucien Lin, and Rock of Open Source Software Foundry and Emily from Creative Commons Taiwan.
Finally, thanks to Tyng-Ruey Chuang, one of the main CODATA 2012 local organizers, and instigator of our session and workshop. He is one of the people I most enjoyed working with while at Creative Commons (e.g., a panel from last year) and given some overlapping technology and policy interests, one of the people I could most see working with again.