Deployment Matters

Most popular descriptions of why BitTorrent works so well are off the mark, including The BitTorrent Effect in the current Wired Magazine. Excerpts:

The problem with P2P file-sharing networks like Kazaa, he reasoned, is that uploading and downloading do not happen at equal speeds. Broadband providers allow their users to download at superfast rates, but let them upload only very slowly, creating a bottleneck: If two peers try to swap a compressed copy of Meet the Fokkers – say, 700 megs – the recipient will receive at a speedy 1.5 megs a second, but the sender will be uploading at maybe one-tenth of that rate.

Paradoxically, BitTorrent’s architecture means that the more popular the file is the faster it downloads – because more people are pitching in. Better yet, it’s a virtuous cycle. Users download and share at the same time; as soon as someone receives even a single piece of Fokkers, his computer immediately begins offering it to others. The more files you’re willing to share, the faster any individual torrent downloads to your computer. This prevents people from leeching, a classic P2P problem in which too many people download files and refuse to upload, creating a drain on the system. “Give and ye shall receive” became Cohen’s motto, which he printed on T-shirts and sold to supporters.

Sites like Kazaa and Morpheus are slow because they suffer from supply bottlenecks. Even if many users on the network have the same file, swapping is restricted to one uploader and downloader at a time.

Most home and many business broadband connections are asymmetric — the downstream pipe is much fatter than the upstream pipe. That’s a problem any net application that requires significant upstream bandwidth has to contend with. There is no protocol solution. A BitTorrent client can’t upload any faster than a Gnutella client.
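To put rough numbers on the asymmetry, here is a back-of-the-envelope sketch in Python. It assumes the excerpt's "1.5 megs a second" means 1.5 megabits per second and that "700 megs" means 700 megabytes. The function name is made up for illustration.

```python
def transfer_hours(size_megabytes, rate_megabits_per_s):
    """Hours needed to move a file over a link running at the given rate."""
    size_megabits = size_megabytes * 8  # 8 bits per byte
    return size_megabits / rate_megabits_per_s / 3600  # 3600 seconds per hour

# Figures from the Wired excerpt, with the units assumed above.
down_hours = transfer_hours(700, 1.5)   # if the recipient's downstream were the limit
up_hours = transfer_hours(700, 0.15)    # when a single sender's upstream is the limit
print(f"at downstream rate: {down_hours:.1f} h, at upstream rate: {up_hours:.1f} h")
```

Under those assumptions a single uploader needs more than ten hours to send a file the recipient could pull down in about one. No client, BitTorrent or otherwise, changes that per-peer ceiling.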

Kazaa, eDonkey and various Gnutella clients (e.g., LimeWire) have incorporated multisource/swarming downloads for three years, and the latter two also use partial file sharing (I’m not sure about Kazaa and PFS). These two key features — download from multiple peers, and begin uploading parts of a file before you’ve completed downloading — don’t set BitTorrent apart, though it may be slightly ahead in pushing the state of the art (I haven’t examined the protocols side by side).
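A minimal sketch of why partial file sharing matters, assuming a file split into numbered pieces and peers that each hold some subset. The piece numbering and the function name are illustrative and not taken from any of these protocols.

```python
def pieces_to_request(my_pieces, peer_pieces):
    """Pieces the other peer holds that we are still missing, in order."""
    return sorted(set(peer_pieces) - set(my_pieces))

# Two peers, both still mid-download of the same five-piece file.
alice = {0, 1, 2}
bob = {2, 3}

print(pieces_to_request(alice, bob))  # what Alice can fetch from Bob
print(pieces_to_request(bob, alice))  # what Bob can fetch from Alice
```

Neither peer has the whole file, yet each has something the other lacks, so both upstream links can be put to work right away.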

So why does BitTorrent work so well? Deployment.

Users of Gnutella and similar networks start a client and typically download and share files in one or more local directories. All files a user has collected are shared simultaneously. A client connects to random peers and accepts random queries and requests to download any files the client is sharing. (There are significant refinements, e.g., ultrapeers and supernodes.) Downloads will be spread across a huge number of files, and peers downloading the same file won’t necessarily know about each other (and thus won’t be able to upload to each other while downloading). Again, there are significant refinements: Gnutella peers maintain a list of other peers sharing a given file, known as an alternate location download mesh.

For BitTorrent such refinements are superfluous. A BitTorrent user finds a “torrent” for a file the user wants to download; a BitTorrent client is then launched and connects to a tracker specified by the torrent. All clients connecting to a tracker via a single torrent are by definition downloading the same file, and they know about each other: the ideal situation for swarming distribution. And here’s the key to BitTorrent’s success in spite of typically limited upload rates: because users are sharing only one or a few files (the ones they’re downloading), their precious upstream bandwidth is used to enhance a virtuous cycle in which everyone downloading the same file downloads faster.
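The tracker's role can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the real tracker protocol: the class name, the `announce` signature, and the string identifiers for torrents and peers are all hypothetical.

```python
from collections import defaultdict

class ToyTracker:
    """Keeps one set of peers (a swarm) per torrent."""

    def __init__(self):
        self._swarms = defaultdict(set)  # torrent id -> peers seen so far

    def announce(self, torrent_id, peer):
        """Register a peer and return the other peers already in that swarm."""
        swarm = self._swarms[torrent_id]
        others = sorted(swarm - {peer})
        swarm.add(peer)
        return others

tracker = ToyTracker()
tracker.announce("file-a", "peer1")
tracker.announce("file-a", "peer2")
print(tracker.announce("file-a", "peer3"))  # peers already sharing file-a
print(tracker.announce("file-b", "peer3"))  # a different torrent, a separate swarm
```

Every peer returned by an announce is, by construction, working on the same file as the caller. That is the property a random-query network has to approximate with extra machinery.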

This ideal situation also allows BitTorrent to utilize tit-for-tat (PDF) leech resistance — “Give and ye shall receive” above. A typical filesharing client can’t effectively use this strategy as it is unlikely to have multiple interactions with the same peer, let alone simultaneous mutually beneficial interactions.
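A simplified sketch of the tit-for-tat idea: upload to the peers that are currently uploading to you the fastest. The function name, the rate dictionary, and the slot count are assumptions for illustration. Real clients add further rules, such as periodically trying out a peer at random, which are omitted here.

```python
def choose_upload_targets(received_rate, slots=4):
    """Pick the peers to upload to, favoring those sending us data fastest.

    received_rate maps a peer name to the rate at which that peer is
    currently uploading to us. Ties are broken by peer name.
    """
    ranked = sorted(received_rate, key=lambda peer: (-received_rate[peer], peer))
    return ranked[:slots]

# Hypothetical rates (any consistent unit) observed from five swarm members.
rates = {"a": 50, "b": 0, "c": 120, "d": 10, "e": 75}
print(choose_upload_targets(rates, slots=3))
```

A ranking like this only works when the same peers keep showing up in `received_rate`, which is the repeated, simultaneous interaction a single-file swarm provides.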

There are technologies (possibly Distributed Hash Tables), tweaks (a client giving preference to uploads of files the client has recently downloaded has been proposed) and practices (encourage filesharing client users to initiate downloads from editorially controlled lists of files rather than via ad hoc peer searches) that can be implemented in typical filesharing clients to make the average user’s download experience better, perhaps someday approaching the average BitTorrent user’s experience when downloading a popular file.

There’s also lots of talk about decentralizing BitTorrent. See eXeem, supposedly to be released in a few weeks. It appears that eXeem will attempt to keep BitTorrent’s beneficial characteristics by limiting files shared to those a user has obtained or created a .torrent file for — perhaps similar to a hypothetical Gnutella client that only shared files for which it had alternate sources. I don’t have high hopes for the decentralized bits of eXeem, whatever they turn out to be. It may serve as a decent standard BitTorrent client, but there’s no need for another of those, and eXeem will supposedly be riddled with spyware.

4 Responses

  1. […] RPOW. Reusable Proofs of Work is a system for sequential reuse of hashcash mediated by a server written by the great signal-to-noise enhancer Hal Finney. RPOW has many potential uses — apparently initially motivated by a desire to implement “P2Poker” with interesting “chips” and currently being experimented with in a modified BitTorrent client in which downloaders can pay for priority with RPOW tokens, possibly encouraging people to leave clients running after completing a download (serving as seeds in BT lingo) in order to earn tokens which may be spent on future downloads. As the BTRP page notes, people could acquire RPOWs out of band, and not contribute more upload bandwidth, or even contribute less. The net effect is hard to predict. If buying download priority with RPOWs proves useful, I expect non-BT filesharing clients, which have far less reason to cooperate, would benefit more than BT clients. Perhaps the most interesting thing about the RPOW system is its great effort to ensure that there can be no cheating, in particular by the server operator. The RPOW server will zero all data if it is physically tampered with, it is possible for anyone to verify the code it is running, and that code can verify that its database in its untrusted host has not been tampered with, using a Merkle hash tree to verify (the secure board only has two megabytes of memory). The RPOW server may be the world’s first transparent server, which could facilitate a world of distributed, cooperating RPOW servers. Presentation slides. […]

  2. […] Localhost. A global directory shared with a modified version of the Azureus BitTorrent client. I tried about a month ago. Performance was somewhere between abysmal and nonexistent. BitTorrent is fantastic for large popular files. I’ll be surprised if localhost’s performance, which depends on transferring small XML files, ever reaches mediocrity. They’re definitely going away from BitTorrent’s strengths by uploading websites into the global directory as lots of small files (I gather). The idea of a global directory is interesting, though tags seem a more fruitful navigation method than localhost’s hierarchy. […]

  3. […] a little odd to include all those BitTorrent clients, given their very different nature. All but LimeWire, Ares, eMule, and BearShare are BT-only (their P2P download component — […]

  4. […] Deployment Matters uses an odd word to characterize what an article in part contrasting BitTorrent with other filesharing schemes gets wrong — only if read uncharitably. It would have been more clear to praise the article, and make the further point that seemingly global search is often a terrible discovery and trading mechanism. Web search has fooled us. Calibrate. […]
