Greatest month in history?

Yesterday, 11 years ago, today, 22 years and 4 months. Recently I came across an observation in slides by Glyn Moody on Open Access (related editorial):

25 August 1991 – Finnish student, Linus Torvalds, announced the start of Linux
23 August 1991 – World Wide Web released publicly
14 August 1991 – Launch of arXiv

Moody titled the slide with the above items “greatest week in history?”; arXiv is listed there as 19 August, which I think must be a transcription error. Still, perhaps the greatest month, by some assessment that grants something like the knowledge commons supreme importance; perhaps by future conventional wisdom. Those three are a nice mix of software, protocols, literature, data, and infrastructure.

The world’s tallest broadcast tower collapsed 8 August 1991 to make way for somewhat less centralized communications.

Linux and the Web make Wikipedia’s short list of August 1991 events, which is dominated by the beginning of the final phase of the dissolution of the Soviet Union. (I have an old post which is a tiny bit relevant to tying this all together, however unwarranted that may be.)

arXiv isn’t nearly as well known to the general public as Linux, which isn’t nearly as well known as the Web. In some ways arXiv is still ahead of its time. The future takes a long time to be distributed; Moody’s cover slide is titled “half a revolution”. Below I’ve excerpted a few particularly enjoyable paragraphs and footnotes from It was twenty years ago today… by arXiv founder Paul Ginsparg (who, Moody notes, knew of GNU via a brother). I’ve bolded a couple of phrases and added one link for additional entertainment value. The whole 9-page paper (PDF) is worth a quick read (I can’t help but notice and enjoy the complete absence of two words: “copyright” and “license”).

The exchange of completed manuscripts to personal contacts directly by email became more widespread, and ultimately led to distribution via larger email lists.13 The latter had the potential to correct a significant problem of unequal access in the existing paper-preprint distribution system. For purely practical reasons, authors at the time used to mail photocopies of their newly minted articles to only a small number of people. Those lower in the food chain relied on the beneficence of those on the A-list, and aspiring researchers at non-elite institutions were frequently out of the privileged loop entirely. This was a problematic situation, because, in principle, researchers prefer that their progress depends on working harder or on having some key insight, rather than on privileged access to essential materials.

By the spring of 1991, I had moved to the Los Alamos National Laboratory, and for the first time had my own computer on my desk, a 25 MHz NeXTstation with a 105 Mb hard drive and 16 Mb of RAM. I was thus fully cognizant of the available disk and CPU resources, both substantially larger than on a shared mainframe, where users were typically allocated as little as the equivalent of 0.5 Mb for personal use. At the Aspen Center for Physics, in Colorado, in late June 1991, a stray comment from a physicist, concerned about emailed articles overrunning his disk allocation while traveling, suggested to me the creation of a centralized automated repository and alerting system, which would send full texts only on demand. That solution would also democratize the exchange of information, leveling the aforementioned research playing field, both internally within institutions and globally for all with network access.

Thus was born xxx.lanl.gov,18 initially an automated email server (and within a few months also an FTP server), powered by a set of csh scripts.19 It was originally intended for about 100 submissions per year from a small subfield of high-energy particle physics, but rapidly grew in users and scope, receiving 400 submissions in its first half year. The submissions were initially planned to be deleted after three months, by which time the pre-existing paper distribution system would catch up, but by popular demand nothing was ever deleted. (Renamed in late 1998 to arXiv.org, it has accumulated roughly 700,000 total submissions [mid Aug 2011], currently receives 75,000 new submissions per year, and serves roughly one million full text downloads to about 400,000 distinct users per week.) The system quickly attracted the attention of existing physics publishers, and in rapid succession I received congenial visits from the editorial directors of both the American Physical Society (APS) and Institute of Physics Publishing (IOPP) to my little 10’x10’ office. It also had an immediate impact on physicists in less developed countries, who reported feeling finally in the loop, both for timely receipt of research ideas and for equitable reading of their own contributions. (Twenty years later, I still receive messages reporting that the system provides to them more assistance than any international organization.)

In the fall of 1992, a colleague at CERN emailed me: ‘Q: do you know the worldwide-web program?’ I did not, but quickly installed WorldWideWeb.app, serendipitously written by Tim Berners-Lee for the same NeXT computer that I was using, and with whom I began to exchange emails. Later that fall, I used it to help beta-test the first US Web server, set up by the library at the Stanford Linear Accelerator Center for use by the high-energy physics community.

Not everyone appreciated just how rapidly things were progressing. In early 1994, I happened to serve on a committee advising the APS about putting Physical Review Letters online. I suggested that a Web interface along the lines of the prototype might be a good way for the APS to disseminate its documents. A response came back from another committee member: “Installing and learning to use a WorldWideWeb browser is a complicated and difficult task — we can’t possibly expect this of the average physicist.”

13. The most significant of these was maintained by Joanne Cohn, then a postdoctoral associate at the IAS Princeton, who manually collected and redistributed preprints (originally in the subject area of matrix models of two dimensional surfaces) to what became a list of over a hundred interested researchers, largely younger postdocs and grad students. This manual methodology provided an important proof of concept for the broader automated and archival system that succeeded it, and her distribution list was among those used to seed the initial userbase.

18. The name xxx was derived from the heuristic I’d used in marking text in TeX files for later correction (i.e., awaiting a final search for all appearances of the string ‘xxx’, which wouldn’t otherwise appear, and for which I later learned the string ‘tk’ is employed by journalists, for similar reasons).

19. The csh scripts were translated to Perl starting in 1994, when NSF funding permitted actual employees.

