Post P2P

LimeWire Filtering & Blog

Wednesday, March 29th, 2006

Just noticed that the current beta (4.11.0) includes optional copyright filtering. See the features history and brief descriptions for users and copyright owners:

In the Filtering System, copyright owners identify files that they don’t want shared and submit them for inclusion in a public list. LimeWire then consults this list and stops users from downloading the identified files, “filtering” them from the sharing process.

If you sign up for an account as a copyright owner you can submit files (with file name, file size, SHA1 hash, creator, collection, description) for filtering. Users can turn the filter on and off via a preference.
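As a sketch of how such a filter check might work on the client side (the function names and list format here are hypothetical; I haven’t seen LimeWire’s actual implementation):

```python
import hashlib
import os

def sha1_of_file(path):
    """Compute the SHA1 digest a filter list would key on."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def is_filtered(path, filter_list):
    """filter_list: a set of (sha1_hex, file_size) pairs submitted by owners."""
    return (sha1_of_file(path), os.path.getsize(path)) in filter_list
```

Matching on hash plus size rather than file name means renaming a file does not evade the filter, though any change to the bytes produces a new hash.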

LimeWire.org now features a blog with pretty random content. I notice that another PHP Base32 function (which makes a whole lot more sense than the one included in Bitcollider-PHP — I swear PHP’s bitwise operators weren’t giving correct results and worked around that, but was probably insane) is available with a hint that someone is building an “open source Gnutella Server in PHP5.”
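For the curious, the encoding in question is trivial in a language with a working Base32 library — no manual bitwise fiddling required. Here is a Python version producing the urn:sha1 form used by Bitzi and Gnutella:

```python
import base64
import hashlib

def sha1_urn(data):
    """Base32-encoded SHA1 as used in Gnutella/Bitzi urn:sha1 identifiers.
    20 bytes = 160 bits encodes to exactly 32 Base32 characters, no padding."""
    digest = hashlib.sha1(data).digest()
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")
```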

Remember that LimeWire is Open Source P2P and thus pretty trustworthy — and you can always fork.

Supply-side anti-censorship

Friday, February 17th, 2006

Brad Templeton explains why a censor should want an imperfect filter — it should be good enough to keep verboten information from most users, but easy enough to circumvent to tempt dissidents, so they can be tracked and, when desired, put away.

In the second half of the post, Templeton suggests some anti-censor techniques: ubiquitous and . Fortunately he says these are “far off” and “does not scale”, respectively. To say the least, I’d add.

Cyber-activists have long dreamed that strong encryption would thwart censorship. is an example of a project that uses this as its raison d’ĂȘtre. While I’m a huge fan of ubiquitous encryption and decentralization (please install , now!), these seem like terribly roundabout means of fighting censorship — the price of obtaining information, which includes the chance of being caught, is lowered. But someone has to seek out or have the information pushed to them in the first place. If information is only available via hidden channels, how many people will encounter it regardless of lower risk?

An alternative, perhaps less sexy because it involves no technology adoption, is supply-side anti-censorship: make verboten information ubiquitous. Anyone upset about google.cn should publish information the Communist Party wants censored (my example is pathetic, need to work on that). This is of course not mutually exclusive with continuing to carp and dream of techno-liberation.

I guess I’m calling for projects. Or one of those chain letters (e.g., “four things”) that plague the blogosphere.

content.exe is evil

Thursday, February 16th, 2006

I occasionally run into people who think users should download content (e.g., music or video) packaged in an executable file, usually for the purpose of wrapping the content with DRM where the content format does not directly support DRM (or the proponent’s particular DRM scheme). Never mind the general badness of Digital Restrictions Management — requiring users to run a new executable for each content file is evil.

Most importantly, every executable is a potential malware vector. There is no good excuse for exposing users to this risk. Even if your executable content contains no malware and your servers are absolutely impenetrable such that your content can never be replaced with malware, you are teaching users to download and run executables. Bad, bad, bad!

Another problem is that executables are usually platform-specific and buggy. Users have enough problems having the correct codec installed. Why take a chance that they might not run Windows (and the specific versions and configurations you have tested, sure to not exist in a decade or much less)?

I wouldn’t bother to mention this elementary topic at all, but very recently I ran into someone well intentioned who wants users to download content wrapped in a jar file, if I understand correctly for the purposes of ensuring users can obtain content metadata (most media players do a poor job of exposing content metadata and some file formats do a poor job of supporting embedded metadata, though hardly anyone cares — this is tilting at windmills) and so that content publishers can track use (this is highly questionable), all from a pretty cross-platform GUI. A jar file is an executable Java package, so the platform downside is different (Windows is not required, but a Java installation, of some range of versions and configurations, is), but it is still an executable that can do whatever it wants with the computer it is running on. Bad, bad, bad!

The proponent of this scheme said that it was ok, the jar file could be signed. This is no help at all. Anyone can create a certificate and sign jar files. Even if a creator did have to have their certificate signed by an established authority it would be of little help, as malware purveyors have plenty of resources that certificate authorities are happy to take. The downsides are many: users get a security prompt (“this content signed by…”) for content, which is annoying, misleading as described above and conditions the user to not pay attention when they install things that really do need to be executable, and a barrier is raised for small content producers.

If you really want to package arbitrary file formats with metadata, put everything in a zip file and include your UI in the zip as HTML. This is exactly what P2P vendor ‘s Packaged Media File format is. You could also make your program (which users download only once) look for specific files within the zip to build a content-specific (and safe) interface within your program. I believe this describes ‘s Kapsules, though I can’t find any technical information.
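A minimal sketch of the zip-plus-HTML idea follows. The layout is my invention for illustration, not the actual Packaged Media File spec:

```python
import zipfile

def package_media(zip_path, media_files, index_html):
    """Bundle media plus an HTML 'UI' in a plain zip: nothing is executable,
    so there is nothing to run and nothing to infect.
    media_files: mapping of file name -> bytes."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("index.html", index_html)  # human-readable metadata/UI
        for name, data in media_files.items():
            z.writestr("media/" + name, data)
```

The user opens index.html in any browser; a media player that understands the convention can instead read the same zip for the metadata.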

Better yet, put your content on the web, where users can find and view it (in the web design of your choice), you get reasonable statistics, and the don’t get fed. You can even push this to 81/19 by including minimal but accurate embedded metadata in your files if they support it — a name users can search for or a URL for your page related to the content.

Most of the pushers of executable content I encounter, when faced with security concerns, say it is an “interesting and hard problem.” No, it is a stupid and impossible problem. In contrast to the web, executable content is a 5/95/-1000 solution — that last number is a .

If you really want an interesting and hard problem, executable content security is the wrong level. Go work on platform security. We can now run sophisticated applications within a web browser with some degree of safety (due to Java applet and Flash sandboxes, JavaScript security). Similar could be pushed down to the desktop, so that executables by default have no more rights to tamper with your system than do web pages. is an aggressive approach to this problem. If that sounds too hard and not interesting enough (you really wanted to distribute “media”), go the web way as above — it is subsuming the desktop anyhow.

CodeCon Extra

Monday, February 13th, 2006

A few things I heard about at CodeCon outside the presentations.

Vesta was presented at CodeCon 2004, the only one I’ve missed. It is an integrated revision control and build system that guarantees build repeatability, in part by ensuring that every file used by the build is under revision control. I can barely keep my head around the few revision control and build systems I occasionally use, but I imagine that if I were starting (or saving) some large mission-critical project that found everyday tools inadequate it would be well worth considering Vesta. About its commercial equivalents, I’ve mostly heard second hand complaining.

Allmydata is where Zooko now works. The currently Windows-only service allows data backup to “grid storage”, presumably a as used by . Dedicate 10Gb of local storage to the service and you can back up 1Gb, free. Soon you’ll be able to pay for better ratios, including $30/month for 1Tb of space. I badly want this service. Please make it available, and for Linux! Distributed backup has of course been a dream P2P application forever. Last time I remember the idea getting attention was a Cringely column in 2004.

Some people were debating whether the Petname Tool does anything different from what specify and whether either would make substantially harder. The former is debated in comments on Bruce Schneier’s recent post on petnames, inconclusively AFAICT. The Petname Tool works well and simply for what it does (Firefox only), which is to allow a user to assign a name to an https site if it is using strong encryption. If the user visits the site again and it is using the same certificate, the user will see the assigned name in a green box. Any other site — including one that merely looks like the original (in content or URL), or even one with hijacked DNS that appears to be “secure” but uses a different certificate — will appear as “untrusted” in a yellow box. That’s great as far as it goes (see phollow the phlopping phish for a good description of the attack this would save a reasonable user from), though the naming seems the least important part — a checkbox to begin trusting a site would be nearly as good. I wonder though how many users have any idea that some pages are secure and others are not. The petname tool doesn’t do anything for non-https pages, so the user becomes inured to seeing it doing nothing, then does not see it. Perhaps it should be invisible when not on a secure site. Indicators like PageRank, Alexa rank (via the Google and Alexa toolbars) and similar, , and whether the visitor has previously visited the site in question would all help warn the user that any site may not be what they expect — nearly everyone, including me, confers a huge amount of trust on non-https sites, even if I never engage in a financial transaction on such a site. I imagine a four-part security indicator in a prominent place in the browser, with readings of site popularity (rank), danger as measured by the likes of SiteAdvisor, the user’s relationship with the site (petname) and whether the connection is strongly encrypted.
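The core of a petname tool is tiny — a user-controlled mapping from certificate fingerprints to names. A toy model (my sketch, not the actual extension’s code):

```python
class PetnameStore:
    """Toy model of a petname tool: the user binds a name to a site's
    certificate fingerprint; any other certificate is untrusted."""

    def __init__(self):
        self._names = {}  # certificate fingerprint -> user-assigned petname

    def assign(self, fingerprint, petname):
        self._names[fingerprint] = petname

    def check(self, fingerprint):
        # The assigned name ("green box") or None ("yellow box", untrusted)
        return self._names.get(fingerprint)
```

The security property comes from keying on the certificate rather than the URL or page content, which is exactly what a phishing site cannot forge.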

Someone claimed that three letter agencies want to mandate geolocation for every net access device. No doubt some agency types dream of this. Anyway, the person said we should be ready to fight this if it were to become a real push for such a law, because what would happen to anonymity? No doubt such a mandate should be fought tooth and nail, but preserving anonymity seems like exactly the wrong battle cry. How about privacy, or even mere freedom? On that note, someone briefly showed a tiny computer attached to and powered by what could only be called a solar flap. This could be slapped on the side of a bus and would connect to wifi networks whenever possible and route as much traffic as possible.

CodeCon Saturday

Sunday, February 12th, 2006

Delta. Arbitrarily large codebase triggers specific bug. Run delta, which attempts to provide you with only the code that triggers the bug (usually a page or so, no matter the size of the codebase) via a like algorithm (the evaluation function requires triggering the bug and considers code size). Sounds like a big productivity and quality booster where it can be used.
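The flavor of the reduction can be sketched in a few lines. This is a simplified ddmin-style greedy minimizer, not Delta’s actual implementation:

```python
def delta_minimize(lines, triggers_bug):
    """Greedy reduction in the spirit of delta: repeatedly try dropping
    chunks of the input; keep any reduction that still triggers the bug.
    triggers_bug is the evaluation function (e.g., compile and run)."""
    n = 2  # number of chunks to split into
    while len(lines) >= 2:
        chunk = max(1, len(lines) // n)
        reduced = False
        for i in range(0, len(lines), chunk):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and triggers_bug(candidate):
                lines = candidate
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if chunk == 1:
                break  # no single line can be removed: 1-minimal
            n = min(n * 2, len(lines))  # try finer-grained chunks
    return lines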

Djinni. Framework for approximation of problems, supposedly faster and easier to use than more academically oriented approximation frameworks. An improved simulated annealing algorithm is or will be in the mix, including an analog of “pressure” in . Super annoying presentation style. Thank you for letting us know that CodeCon is where the rubber meets the road.
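For reference, plain simulated annealing (without Djinni’s improvements) fits in a dozen lines; the function names here are my own:

```python
import math
import random

def anneal(initial, neighbor, cost, t0=1.0, cooling=0.995, steps=5000, seed=0):
    """Plain simulated annealing: accept worse moves with probability
    exp(-delta/T); temperature T decays geometrically. 'Pressure' variants
    adjust the acceptance rule further (not shown)."""
    rng = random.Random(seed)
    state, c, t = initial, cost(initial), t0
    for _ in range(steps):
        cand = neighbor(state, rng)
        delta = cost(cand) - c
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            state, c = cand, cost(cand)
        t *= cooling
    return state, c
```

Early on, high temperature lets the search escape local minima; as T drops, only improving moves are accepted and the search settles.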

iGlance. Instant Messaging with audio and video, consistent with the IM metaphor (recipient immediately hears and sees initiator) rather than telephone metaphor (recipient must pick up call). Very low bitrate video buddy lists. Screen and window sharing with single control and dual pointers so that remote user can effectively point over your shoulder. Impressive for what seems to be a one person spare time project. Uses OVPL and OVLPL licenses, very similar to GPL and LGPL, but apparently easier to handle contributor agreements, so project owner can move code between application and library layers. Why not just make the entire application ?

Overlay Anycast Service InfraStructure. Locality-aware server selection (to be) used by , easy to implement for your service. Network locality correlates highly with geographic locality due to the speed of light bound. Obvious, but the graph was neat. OpenDHT was also mentioned, another hosted service. OpenDHT clients can use OASIS to find a gateway. Super easy to play with a with around 200 nodes. Someone has built a fileshare using OpenDHT; see Octopod. As Wes Felter says, this stuff really needs to be moved to a non-research network.
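A crude client-side analog of locality-aware selection is to measure connect time to each candidate and take the fastest. OASIS is much smarter than this (it precomputes locality rather than probing on demand), but the sketch shows the idea:

```python
import socket
import time

def pick_nearest(hosts, port=80, timeout=2.0):
    """Return the candidate host with the lowest TCP connect time,
    skipping any that are unreachable. (Hypothetical helper, not OASIS.)"""
    best, best_rtt = None, float("inf")
    for host in hosts:
        try:
            start = time.monotonic()
            with socket.create_connection((host, port), timeout=timeout):
                rtt = time.monotonic() - start
        except OSError:
            continue  # unreachable gateway: ignore it
        if rtt < best_rtt:
            best, best_rtt = host, rtt
    return best, best_rtt
```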

Query By Example. Find and rank rows [dis]similar to others in SQL using extension for , which uses a for classification (last is not visible to user). Sounds great for data mining engagements.


CodeCon Friday

Saturday, February 11th, 2006

This year Gordon Mohr had the devious idea to do preemptive reviews of CodeCon presentations. I’ll probably link to his entries and have less to say here than last year.

Daylight Fraud Prevention. I missed most of this presentation but it seems they have a set of non-open source Apache modules each of which could make phishers and malware creators work slightly harder.

SiteAdvisor. Tests a website’s evilness by downloading and running software offered by the site and filling out forms requesting an email address on the site. If the virtual Windows machine running downloaded software becomes infected or the email address set up for the test is inundated with spam, the site is considered evil. This testing is mostly automated and expensive (many Windows licenses). Great idea, surprising it is new (to me). I wonder how accurate an evil reading one could obtain at much lower cost by calculating a “SpamRank” for sites based on links found in email classified as spam and links found on pages linked to in spam? (A paper has already taken the name SpamRank, though at a five second glance it looks to propose tweaks to make PageRank more spam-resistant rather than trying to measure evil.) Fortunately SiteAdvisor says that both bitzi.com and creativecommons.org are safe to use. SiteAdvisor’s data is available for use under the most restrictive Creative Commons license — Attribution-NonCommercial-NoDerivs 2.5.
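The back-of-envelope version of that “SpamRank” musing — score a domain by the fraction of its inbound mail links that come from messages classified as spam — might look like this (entirely my sketch, unrelated to the SpamRank paper):

```python
from collections import Counter
from urllib.parse import urlparse

def spam_rank(spam_urls, ham_urls):
    """Score each linked domain by spam-link fraction: 1.0 means it is
    linked only from spam, 0.0 only from legitimate mail."""
    spam = Counter(urlparse(u).hostname for u in spam_urls)
    ham = Counter(urlparse(u).hostname for u in ham_urls)
    domains = set(spam) | set(ham)
    return {d: spam[d] / (spam[d] + ham[d]) for d in domains}
```

A real system would also need to follow the second hop (pages linked to in spam) and weight by volume, but the classification signal is the same.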

VidTorrent/Peers. Streaming joke. Peers, described as a “toolkit for P2P programming with continuation passing style” I gather works syntactically as a Python code preprocessor, could be interesting. I wish they had compared Peers to other P2P toolkits, e.g., .

Localhost. A global directory shared with a modified version of the BitTorrent client. I tried it about a month ago. Performance was somewhere between abysmal and nonexistent. BitTorrent is fantastic for large popular files. I’ll be surprised if localhost’s performance, which depends on transferring small XML files, ever reaches mediocrity. They’re definitely going away from BitTorrent’s strengths by uploading websites into the global directory as lots of small files (I gather). The idea of a global directory is interesting, though tags seem a more fruitful navigation method than localhost’s hierarchy.

Truman. A “sandnet” for investigating suspected malware. Faux services (e.g., DNS, websites) can be scripted to elicit the suspected malware’s behavior, and more.

[Hot]link policy

Sunday, January 15th, 2006

I’m out of the loop. Until very recently (upon reading former Creative Commons intern Will Frank’s writeup of a brief hotlink war) I thought ‘hotlink’ was an anachronistic way to say ‘link’ used back when the mere fact that links led to a new document, perhaps on another server, was exciting. It turns out ‘hotlink’ is now vernacular for inline linking — displaying or playing an image, audio file, video, or other media from another website.

Lucas Gonze, who has lots of experience dealing with hotlink complaints due to running Webjay, has a new post on problems with complaint forms as a solution to hotlinks. One thing missing from the post is a distinction between two completely different sets of complainers who will have different sets of solutions beyond complaining.

One sort of complainer wants a link to a third party site to go away. I suspect the complainer usually really wants the content on the third party site to go away (typically claiming the third party site has no right to distribute the content in question). Removing a link to that content from a link site works as a partial solution by making the third party hosted content more obscure. A solution in this case is to tell the complainer that the link will go away when it no longer works — in effect, the linking site ignores complaints and it is the responsibility of the complainer to directly pursue the third party site via and other threats. This allows the linking site to completely automate the removal of links — those removed as a result of threatened or actual legal action look exactly the same as any other link gone bad and can be tested for and culled using the same methods. Presumably such a hands-off policy pisses off complainers to the point of their becoming more than a minor nuisance only rarely, at least on a Webjay-like site, though it must be an option for some.
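The automated culling could be as simple as periodically HEAD-requesting every link and dropping those that fail — a sketch, with a hypothetical function name:

```python
import urllib.request
import urllib.error

def cull_dead_links(urls, timeout=10):
    """Keep only links that still resolve. Links whose targets were taken
    down (for whatever reason) fall out automatically, indistinguishable
    from any other link gone bad."""
    alive = []
    for url in urls:
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                if resp.status < 400:
                    alive.append(url)
        except (urllib.error.URLError, OSError):
            pass  # dead or unreachable: cull it
    return alive
```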

Creative Commons has guidelines very similar to this policy concerning how to consider license information in files distributed off the web — don’t believe it unless a web page (which can be taken down) has matching license information concerning the file in question.

Another sort of complainer wants a link to content on their own site to go away, generally for one or two reasons. The first reason is that hotlinking uses bandwidth and other resources on the hotlinked site which the site owner may not be able to afford. The second reason, often coupled with the first, is that the site owner does not want their content to be available outside of the context of their own site (i.e., they want viewers to have to come to the source site to view the content).

With a bit of technical savvy the complainer who wants a link to their own site removed has several options for self help. Those merely concerned with cost could redirect requests without the relevant referrer (from their own site) or maybe cookie (e.g., for a logged in user) to the free or similar, which should drastically reduce originating site bandwidth, if hotlinks are actually generating many requests (if they are not, there is no problem).

A complainer who does not want their content appearing in third party sites can return a small “visit my site if you want to view this content” image, audio file, or video as appropriate in the absence of the desired referrer or cookie. Hotlinking sites become not an annoyance but free advertising. Many sites take this strategy already.
Presumably many publishers do not have any technical savvy, so some Webjay-like sites find it easier to honor their complaints than to ignore them.
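For a site owner with a little savvy, the referrer check described above is a few lines of middleware. Here is a hypothetical WSGI sketch (real deployments usually do this in the web server configuration instead):

```python
from urllib.parse import urlparse

PLACEHOLDER = b"Visit our site to view this content."

def hotlink_guard(app, own_hosts):
    """WSGI middleware: media requests whose Referer is not one of
    own_hosts get a small placeholder instead of the real content.
    (Names and the /media/ path convention are my inventions.)"""
    def guard(environ, start_response):
        referer_host = urlparse(environ.get("HTTP_REFERER", "")).hostname
        if environ["PATH_INFO"].startswith("/media/") and referer_host not in own_hosts:
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [PLACEHOLDER]
        return app(environ, start_response)
    return guard
```

Note that Referer headers are trivially forged and sometimes stripped by privacy software, so this deters casual hotlinking rather than determined abuse.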

There is a potential for technical means of saying “don’t link to me” that could be easily implemented by publishers and link sites with any technical savvy. One is to interpret robots.txt exclusions to mean “don’t link to me” as well as “don’t crawl and index me.” This has the nice effect that those stupid enough to not want to be linked to also become invisible to search engines.
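Python’s standard library can already evaluate this broader interpretation — a link site would simply consult the robots.txt rules before linking rather than before crawling (may_link and the “linksite” agent name are my inventions):

```python
from urllib import robotparser

def may_link(robots_txt, url, user_agent="linksite"):
    """Read robots.txt exclusions as 'don't link to me' in addition to
    'don't crawl and index me'."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)
```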

Another solution is to imitate rel=nofollow — perhaps rel=nolink, though the attribute would need to be available on img, object, and other elements in addition to a, or simply apply rel=nofollow to those additional elements a la the broader interpretation of robots.txt above.

I don’t care for rel=nolink as it might seem to give some legitimacy to brutally bogus link policies (without the benefit of search invisibility), but it is an obvious option.

The upshot of all this is that if a link site operator is not as polite as Lucas Gonze there are plenty of ways to ignore complainers. I suppose it largely comes down to customer service, where purely technical solutions may not work as well as social solutions. Community sites with forums have similar problems. Apparently Craig Newmark spends much of his time tending to customer service, which I suspect has contributed greatly to making Craigslist such a success. However, a key difference, I suspect, is that hotlink complainers are not “customers” of the linking site, while most people who complain about behavior on Craigslist are “customers” — participants in the Craigslist community.

CodeCon 2006 Program

Thursday, January 12th, 2006

The 2006 program has been announced and it looks fantastic. I highly recommend attending if you’re near San Francisco Feb 10-12 and any sort of computer geek. There’s an unofficial CodeCon wiki.

My impressions of last year’s CodeCon: Friday, Saturday, and Sunday.

Via Wes Felter

Lightnet!

Monday, January 9th, 2006

Congratulations to Lucas Gonze on the Webjay/Yahoo! merger. (Via Kevin Burton.)

Yahoo! made a very wise decision to be acquired by the light side rather than the dark side.

My favorite Gonze post: Totally fucking bored with Napster (more at CC).

Also have a listen to the best track on ccMixter (if you share my taste, probably not), also a Gonze creation.

I could gonze on, but enough of this!

Darkfox

Tuesday, December 27th, 2005

I hate to write about software that could be vaporware, but AllPeers (via Asa Dotzler) looks like a seriously interesting darknet/media sharing/BitTorrent/and more Firefox extension.

It’s sad, but simply sending a file between computers with no shared authority nor intermediary (e.g., web or ftp server) is still a hassle. IM transfers often fail in my experience, traditional filesharing programs are too heavyweight and are configured to connect to and share with any available host, and previous attempts at clients (e.g., ) were not production quality. Merely solving this problem would make AllPeers very cool.

Assuming AllPeers proves a useful mechanism for sharing media, perhaps it could also become a lightnet bridge, as a Firefox extension.

Do check out AllPeers CTO Matthew Gertner’s musings on the AllPeers blog. I don’t agree with everything he writes, but his is a very well informed and well written take on open source, open content, browser development and business models.

Songbird Media Player looks to be another compelling application built on the Mozilla platform (though run as a separate program rather than as a Firefox extension), to be released real soon now. 2006 should be another banner year for Firefox and Mozilla technology generally.

Lucas Gonze’s original lightnet post is now near the top of results for ‘lightnet’ on Google, Yahoo!, and MSN, and related followups fill up much of the next few dozen results, having displaced most of the new age and lighting sites that use the same term.