Post Peeves

content.exe is evil

Thursday, February 16th, 2006

I occasionally run into people who think users should download content (e.g., music or video) packaged in an executable file, usually for the purpose of wrapping the content with DRM where the content format does not directly support DRM (or the proponent’s particular DRM scheme). Never mind the general badness of Digital Restrictions Management; requiring users to run a new executable for each content file is evil.

Most importantly, every executable is a potential vector. There is no good excuse for exposing users to this risk. Even if your executable content contains no malware and your servers are absolutely impenetrable such that your content can never be replaced with malware, you are teaching users to download and run executables. Bad, bad, bad!

Another problem is that executables are usually platform-specific and buggy. Users have enough problems getting the correct codec installed. Why take a chance that they might not run Windows (and the specific versions and configurations you have tested, sure to not exist in a decade or much less)?

I wouldn’t bother to mention this elementary topic at all, but very recently I ran into someone well intentioned who wants users to download content wrapped in jar files, if I understand correctly for the purposes of ensuring users can obtain content metadata (most media players do a poor job of exposing content metadata and some file formats do a poor job of supporting embedded metadata, though hardly anyone cares; this is tilting at windmills) and so that content publishers can track use (this is highly questionable), all from a pretty cross platform GUI. A jar file is an executable Java package, so the platform downside is different (Windows is not required, but a Java installation, of some range of versions and configurations, is), but it is still an executable that can do whatever it wants with the computer it is running on. Bad, bad, bad!

The proponent of this scheme said that it was ok, the jar file could be signed. This is no help at all. Anyone can create a certificate and sign jar files. Even if a creator did have to have their certificate signed by an established authority it would be of little help, as malware purveyors have plenty of resources that certificate authorities are happy to take. The downsides are many: users get a security prompt (“this content signed by…”) for content, which is annoying, is misleading as described above, and conditions users to not pay attention when they install things that really do need to be executable; and a barrier is raised for small content producers.

If you really want to package arbitrary file formats with metadata, put everything in a zip file and include your UI in the zip as HTML. This is exactly what P2P vendor ‘s Packaged Media File format is. You could also make your program (which users download only once) look for specific files within the zip to build a content-specific (and safe) interface within your program. I believe this describes ‘s Kapsules, though I can’t find any technical information.
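The zip-plus-HTML idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `build_package`, the `index.html` file name, and the example URL are all made up here, and nothing below reflects the actual layout of any vendor's package format.

```python
import io
import zipfile
from html import escape


def build_package(media_name, media_bytes, title, page_url):
    """Return the bytes of a zip holding one media file plus an HTML page.

    The layout (index.html next to the media file) is an assumption made
    for this sketch, not a published format.
    """
    page = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n"
        f'<p><a href="{escape(media_name, quote=True)}">Play or save the file</a></p>\n'
        f'<p><a href="{escape(page_url, quote=True)}">More about this work</a></p>\n'
        "</body></html>\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Plain data only: nothing in the archive is executed on the user's machine.
        archive.writestr("index.html", page)
        archive.writestr(media_name, media_bytes)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_package("song.ogg", b"\x00\x01", "Example Song",
                         "https://example.org/song")
    with open("example-song.zip", "wb") as out:
        out.write(data)
```

The user opens the zip with whatever tool they already trust and views `index.html` in a browser, so the metadata and "UI" travel with the content without anyone running new code.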

Better yet, put your content on the web, where users can find and view it (in the web design of your choice), you get reasonable statistics, and the don’t get fed. You can even push this to 81/19 by including minimal but accurate embedded metadata in your files if they support it: a name users can search for or a URL for your page related to the content.

Most of the pushers of executable content I encounter, when faced with security concerns, say it is an “interesting and hard problem.” No, it is a stupid and impossible problem. In contrast to the web, executable content is a 5/95/-1000 solution; that last number is a .

If you really want an interesting and hard problem, executable content security is the wrong level. Go work on platform security. We can now run sophisticated applications within a web browser with some degree of safety (due to Java applet and Flash sandboxes, JavaScript security). Similar could be pushed down to the desktop, so that executables by default have no more rights to tamper with your system than do web pages. is an aggressive approach to this problem. If that sounds too hard and not interesting enough (you really wanted to distribute “media”), go the web way as above — it is subsuming the desktop anyhow.

CodeCon Friday

Saturday, February 11th, 2006

This year Gordon Mohr had the devious idea to do preemptive reviews of CodeCon presentations. I’ll probably link to his entries and have less to say here than last year.

Daylight Fraud Prevention. I missed most of this presentation but it seems they have a set of non-open source Apache modules each of which could make phishers and malware creators work slightly harder.

SiteAdvisor. Tests a website’s evilness by downloading and running software offered by the site and filling out forms requesting an email address on the site. If the virtual Windows machine running the downloaded software becomes infected, or the email address set up for the test is inundated with spam, the site is considered evil. This testing is mostly automated and expensive (many Windows licenses). Great idea, surprising it is new (to me). I wonder how accurate the evil readings would be that one could obtain at much lower cost by calculating a “SpamRank” for sites based on links found in email classified as spam and links found on pages linked to in spams? (A paper has already taken the name SpamRank, though at a five second glance it looks to propose tweaks to make PageRank more spam-resistant rather than trying to measure evil.) Fortunately SiteAdvisor says that both bitzi.com and creativecommons.org are safe to use. SiteAdvisor’s data is available for use under the most restrictive Creative Commons license, Attribution-NonCommercial-NoDerivs 2.5.

VidTorrent/Peers. Streaming joke. Peers, described as a “toolkit for P2P programming with continuation passing style” I gather works syntactically as a Python code preprocessor, could be interesting. I wish they had compared Peers to other P2P toolkits, e.g., .

Localhost. A global directory shared with a modified version of the BitTorrent client. I tried it about a month ago. Performance was somewhere between abysmal and nonexistent. BitTorrent is fantastic for large popular files. I’ll be surprised if localhost’s performance, which depends on transferring small XML files, ever reaches mediocrity. They’re definitely going away from BitTorrent’s strengths by uploading websites into the global directory as lots of small files (I gather). The idea of a global directory is interesting, though tags seem a more fruitful navigation method than localhost’s hierarchy.

Truman. A “sandnet” for investigating suspected malware in. Faux services (e.g., DNS, websites) can be scripted to elicit the suspected malware’s behavior, and more.

@:^#

Friday, February 10th, 2006

That’s the Net Prophet, a new four-character, blasphemous emoticon invented by Sandy Sandfort:

Please note the turban and matted beard. Net Prophet is suitable for e-mail, websites and graffiti. And I think it’s a lot better symbol for free speech than some stupid ribbon.

Not to mention better than flying the flag of a jurisdiction. The beauty of the Net Prophet is that it is not merely a symbol for free speech, it is free speech (where “free speech” is communication that someone wants to forcefully suppress).

Why “support” free speech when you can engage in it? There may be no other issue where direct action is so easy, so do it!

Muhammad with camel

Monday, February 6th, 2006

The first thing to note about the is their timidity.

The timidity of the selection turns out to have been pure genius (mine would have aimed for maximum depravity) as it highlights just how bizarre the reaction has been.

Many have expressed disappointment in the tepid support for free speech from many western governments. I am completely unsurprised. The U.S. government and its allies have taken on around as constituents. The government of Denmark has more freedom to do the Wright thing.

As I am on a very minor photo remix kick, here is my contribution to the universe of images of Mohammed:

muhammad licking camel asshole
licking a camel’s asshole under orders from .
Original photo by Saffanna licensed under cc-by-2.0.

I believe this image complies with putative , though some may claim they see him in the camel’s face. (Yes, this is a remix with zero diff.)

How do I know Muhammad and not Jesus is with the lucky camel? Because a camel couldn’t feel an imaginary person‘s licks.

What’s your Freedom/China Ratio?

Thursday, January 26th, 2006

Fred Stutzman points out that for the query site:ibiblio.org google.com estimates 7,640,000 hits while google.cn estimates 1,610,000, perhaps explained in part by support of freedom in Tibet.

That’s an impressive ratio of 4.75 pages findable in the relatively free world to 1 page findable in , call it a domain FCR of 4.75.

The domain FCR of a few sites I’m involved with:

bitzi.com: 635,000/210,000 = 3.02
creativecommons.org: 213,000/112,000 = 1.90
gondwanaland.com: 514/540 = 0.95

Five other sites of interest:

archive.org: 5,900,000/427,000 = 13.82
blogspot.com: 24,300,000/15,400,000 = 1.58
ibiblio.org: 5,260,000/1,270,000 = 4.14
typepad.com: 13,100,000/2,850,000 = 4.60
wikipedia.org: 156,000,000/17,000,000 = 9.18
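For anyone who wants to check the arithmetic or add their own domains, the ratios in the two lists above can be recomputed with a short Python sketch. The hit counts are copied from the lists; the function and variable names are made up for illustration.

```python
def fcr(free_hits, cn_hits):
    """Freedom/China Ratio: claimed google.com hits over claimed google.cn hits."""
    return round(free_hits / cn_hits, 2)


# (google.com estimate, google.cn estimate) for each site: query, as listed above.
hits = {
    "bitzi.com": (635_000, 210_000),
    "creativecommons.org": (213_000, 112_000),
    "gondwanaland.com": (514, 540),
    "archive.org": (5_900_000, 427_000),
    "blogspot.com": (24_300_000, 15_400_000),
    "ibiblio.org": (5_260_000, 1_270_000),
    "typepad.com": (13_100_000, 2_850_000),
    "wikipedia.org": (156_000_000, 17_000_000),
}

# Highest ratio first.
ranked = sorted(hits, key=lambda domain: fcr(*hits[domain]), reverse=True)

if __name__ == "__main__":
    for domain in ranked:
        print(f"{domain}: {fcr(*hits[domain]):.2f}")
```

Since the inputs are only search engine estimates, the second decimal place is decoration; the ordering is the interesting part.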

If you are cool your FCR will be very high. The third site above is my personal domain. I am obviously very uncool and so loved by the that they have twisted Google’s arm to make more of my blog posts available in China than are available elsewhere.

The is obviously the coolest site by far amongst those surveyed above, followed by . Very curious that apparently blocks a far higher percentage of pages at the blog service than of those at Google property .

It must be noted that the hit counts any web scale search engine claims are only estimates and these can vary considerably. Presumably Stutzman and I were hitting different Google servers, or perhaps his preferences are set slightly differently (I do have “safe search” off and accept results in any language, the obvious variables). However, the FCRs from our results for site:ibiblio.org roughly agree.

Here’s a feeble attempt to draw the ire of PRC censors and increase my FCR:

Bryan Caplan’s Museum of Communism
Human Rights in China
Tiananmen Square Massacre
Government of Tibet in Exile
Tibet Online
民主進步黨 (Taiwan )

Note that I don’t really care about which jurisdiction or jurisdictions , , the or elsewhere fall under. would be preferable to the current arrangement, if the former led to more freedom, which it plausibly could. I post some independence-oriented links simply because I know that questions of territorial control matter deeply to states and my goal here is to increase my FCR.

You should attempt to increase your FCR, too. No doubt you can find better links than I did. If enough people try, the Google.cn index will become less interesting, though by one global method of guesstimation, it is already seriously lacking. Add claimed hits for queries for html and -html to get a total index size.

google.com: 4,290,000,000 + 6,010,000,000 = 10,300,000,000
google.cn: 2,370,000,000 + 3,540,000,000 = 5,910,000,000

So the global FCR is 10,300,000,000/5,910,000,000 = 1.74

Although my domain FCR is lame, my name FCR is not bad (query for linksvayer) — 98,200/21,500 = 4.57.

Give me ∞ or give me the death of censorship!

(I eagerly await evidence that my methodology and assumptions are completely wrong.)

[Hot]link policy

Sunday, January 15th, 2006

I’m out of the loop. Until very recently (upon reading former Creative Commons intern Will Frank’s writeup of a brief hotlink war) I thought ‘hotlink’ was an anachronistic way to say ‘link’ used back when the mere fact that links led to a new document, perhaps on another server, was exciting. It turns out ‘hotlink’ is now vernacular for inline linking: displaying or playing an image, audio file, video, or other media from another website.

Lucas Gonze, who has lots of experience dealing with hotlink complaints due to running Webjay, has a new post on problems with complaint forms as a solution to hotlinks. One thing missing from the post is a distinction between two completely different sets of complainers who will have different sets of solutions beyond complaining.

One sort of complainer wants a link to a third party site to go away. I suspect the complainer usually really wants the content on the third party site to go away (typically claiming the third party site has no right to distribute the content in question). Removing a link to that content from a link site works as a partial solution by making the third party hosted content more obscure. A solution in this case is to tell the complainer that the link will go away when it no longer works. In effect, the linking site ignores complaints and it is the responsibility of the complainer to directly pursue the third party site via and other threats. This allows the linking site to completely automate the removal of links: those removed as a result of threatened or actual legal action look exactly the same as any other link gone bad and can be tested for and culled using the same methods. Presumably such a hands-off policy only pisses off complainers to the extent that they become more than a minor nuisance, at least on a Webjay-like site, though it must be an option for some.
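The "link goes away when it no longer works" policy amounts to a routine dead-link sweep. Here is a minimal Python sketch using only the standard library; the function names are hypothetical, and a real link site would presumably retry before culling and fall back to GET for servers that reject HEAD requests.

```python
import http.client
import urllib.error
import urllib.request


def link_is_alive(url, timeout=10):
    """Return True if a HEAD request for url succeeds.

    Any failure counts as dead, whatever the reason (takedown, typo,
    server gone), which is the point of the policy.
    """
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Covers HTTP 4xx/5xx, DNS failures, timeouts, and malformed URLs.
        return False


def cull_dead_links(urls, is_alive=link_is_alive):
    """Keep only the links that still work."""
    return [url for url in urls if is_alive(url)]
```

Passing a different `is_alive` function makes it possible to test the sweep without touching the network.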

Creative Commons has guidelines very similar to this policy concerning how to consider license information in files distributed off the web — don’t believe it unless a web page (which can be taken down) has matching license information concerning the file in question.

Another sort of complainer wants a link to content on their own site to go away, generally for one or two reasons. The first reason is that hotlinking uses bandwidth and other resources on the hotlinked site which the site owner may not be able to afford. The second reason, often coupled with the first, is that the site owner does not want their content to be available outside of the context of their own site (i.e., they want viewers to have to come to the source site to view the content).

With a bit of technical savvy the complainer who wants a link to their own site removed has several options for self help. Those merely concerned with cost could redirect requests without the relevant referrer (from their own site) or maybe cookie (e.g., for a logged in user) to the free or similar, which should drastically reduce originating site bandwidth, if hotlinks are actually generating many requests (if they are not there is no problem).

A complainer who does not want their content appearing in third party sites can return a small “visit my site if you want to view this content” image, audio file, or video as appropriate in the absence of the desired referrer or cookie. Hotlinking sites become not an annoyance, but free advertising. Many sites take this strategy already.

Presumably many publishers do not have any technical savvy, so some Webjay-like sites find it easier to honor their complaints than to ignore them.
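For publishers who do have the savvy, the referrer-or-cookie test described above is a small decision function. The Python sketch below is illustrative only: the function name, the `logged_in` flag standing in for a cookie check, and the two return labels are assumptions, and hooking it into a particular web server is left out.

```python
from urllib.parse import urlsplit


def choose_response(referer, own_hosts, logged_in=False):
    """Decide what to serve for a media request.

    Returns "original" when the request appears to come from the
    publisher's own pages (or a logged-in user), and "placeholder"
    otherwise, e.g. a small "visit my site" image or a redirect to a
    cheaper mirror.
    """
    if logged_in:
        return "original"
    # hostname is None for a missing or unparseable Referer header.
    host = urlsplit(referer or "").hostname
    if host is not None and host in own_hosts:
        return "original"
    return "placeholder"
```

Note that some browsers and proxies strip the Referer header, so a strict check like this one will also show the placeholder to some legitimate visitors.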

There is a potential for technical means of saying “don’t link to me” that could be easily implemented by publishers and link sites with any technical savvy. One is to interpret robots.txt exclusions to mean “don’t link to me” as well as “don’t crawl and index me.” This has the nice effect that those stupid enough to not want to be linked to also become invisible to search engines.
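A link site could apply that broader reading of exclusion rules with Python's standard `urllib.robotparser` module. This is a sketch under assumptions: the function name `may_link` and the user agent string are made up, and fetching the publisher's robots.txt file is left to the caller.

```python
from urllib.robotparser import RobotFileParser


def may_link(robots_txt, url, agent="example-link-site"):
    """Treat a crawl exclusion as a "don't link to me" request.

    robots_txt is the text of the publisher's robots.txt file; url is
    the media URL a user wants to add to the link site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /images/\n"
    print(may_link(rules, "http://example.org/images/cat.jpg"))
    print(may_link(rules, "http://example.org/about.html"))
```

The link site would run this check when a link is submitted and again during its periodic sweeps, so a publisher can opt out at any time by editing one file.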

Another solution is to imitate rel=nofollow, perhaps with rel=nolink, though the attribute would need to be available on img, object, and other elements in addition to a, or simply apply rel=nofollow to those additional elements a la the broader interpretation of robots.txt above.

I don’t care for rel=nolink as it might seem to give some legitimacy to brutally bogus link policies (without the benefit of search invisibility), but it is an obvious option.

The upshot of all this is that if a link site operator is not as polite as Lucas Gonze there are plenty of ways to ignore complainers. I suppose it largely comes down to customer service, where purely technical solutions may not work as well as social solutions. Community sites with forums have similar problems. Apparently Craig Newmark spends much of his time tending to customer service, which I suspect has contributed greatly to making Craigslist such a success. However, a key difference, I suspect, is that hotlink complainers are not “customers” of the linking site, while most people who complain about behavior on Craigslist are “customers,” participants in the Craigslist community.

Credit card numbers from π

Sunday, January 15th, 2006

credit card numbers from pi

I had to run an errand and was disappointed to find Andi had left the channel. I really wanted to help him in his quest for credit card numbers. They are all to be found in π. If Andi is any good he could’ve fleeced others searching for credit card numbers with that one.

Addendum: It’s an old joke. I probably heard it before and forgot.

Fraud of War in Iraq

Friday, January 13th, 2006

Cost of War in Iraq, a new paper from Linda Bilmes and Joseph Stiglitz, has already been discussed, at least superficially, on a large number of blogs. Comments at Marginal Revolution helpfully cite a number of related papers.

Bilmes and Stiglitz conservatively project the total economic costs for the U.S. jurisdiction at $1 to $2 trillion. Direct budgetary costs are projected to be $750 billion to $1.2 trillion. I have only skimmed the paper, which looks interesting enough, but contains nothing really new.

I’ve mentioned increasing cost projections several times, last year and before, directly in Trillion dollar fraud (August), $700 billion fraud (July) and A lie halfway fulfilled (January 2005).

I won’t bother to explain the fraud this time; read the past posts. Hint: it involves repeatability.

One thing I’m struck by, skimming comments contesting Bilmes and Stiglitz (the political ones, not the technical ones concerning whether borrowing costs should be included, though they overlap), is that after the fact, I think many people would claim that the invasion was justified, economically and otherwise, regardless of the final cost. $5 trillion? (NB, that is a hypothetical, not a prediction!) It was worth getting rid of Hussein and deterring would-be Husseins. $10 trillion? Just goes to show how nasty “our” opponents are. $100 trillion? Civilization must be destroyed to save civilization!

All the more reason to be cognizant of probable costs before going to war. There’s not really a need for prediction markets here. Just multiply proponents’ estimates by ten. However, people stupidly believe words that come out of politicians’ mouths. Prediction market estimates could, ironically, provide a countervailing authority.

A better way? See Wright, Scheer, Zakaria, Hardar, Tierney, and Pape.

Going overboard with Wikipedia tags

Thursday, January 12th, 2006

A frequent correspondent recently complained that my linking to Wikipedia articles about organizations rather than the home pages of organizations is detrimental to the usability of this site, probably spurred by my linking to a stub article about Webjay.

I do so for roughly two reasons. First, I consider a Wikipedia link more usable than a link to an organization home page. An organization article will link directly to an organization home page, if the latter exists. The reverse is almost never true (though doing so is a great idea). An organization article at Wikipedia is more likely to be objective, succinct, and informational than an organizational home page (not to mention there is no chance of encountering , window resizing, or other annoying distractions — less charitably, attempts to control my browser — at Wikipedia). When I hear about something new these days, I nearly always check for a Wikipedia article before looking for an actual website. Finally, I have more confidence that the content of a Wikipedia article will be relevant to the content of my post many years from now.

Webjay (link to webjay.org) is actually a good example of these usability issues. Perhaps I have an unusually strong preference for words, but I think its still very brief Wikipedia article should allow one to understand exactly what Webjay is in under a minute.1 If I were visiting the Webjay site for the first time, I’d need to click around awhile to figure the service out, and Webjay’s interface is very to the point, unlike many other sites. Years from now I’d expect webjay.org to be a yet another site, or since the Yahoo! acquisition, to redirect to some Yahoo! property, or the property of whatever entities own Yahoo! in the future. (Smart browser integration with the Internet Archive’s Wayback Machine could mitigate this problem.)

Anyway, I predict that in the foreseeable future your browser will be able to convert a Wikipedia article link into a home page link if that is your preference, aided by Semantic MediaWiki annotations or similar.

The second reason I link to Wikipedia preferentially2 is that Wikipedia article URLs conveniently serve as “tags”, as specified by the . If Technorati and its competitors happen to index this blog this month, it will show up in their tag-based searches, the names of the various Wikipedia articles I’ve linked to serving to name tags. I’ve never been enthusiastic about the overall utility of author applied tags, but I figure linking to Wikipedia is not as bad as linking to a tagreggator.
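How a tag-based search service might get a tag name out of such a link can be sketched in Python. This assumes the convention that the tag is the last path segment of the linked URL; the function name is hypothetical, and turning underscores into spaces is an extra assumption made here for Wikipedia-style titles, not part of any specification.

```python
from urllib.parse import unquote, urlsplit


def tag_from_url(url):
    """Derive a tag name from the last path segment of a URL."""
    path = urlsplit(url).path.rstrip("/")
    segment = path.rsplit("/", 1)[-1]
    # Decode percent-escapes, then make Wikipedia-style titles readable.
    return unquote(segment).replace("_", " ")


if __name__ == "__main__":
    print(tag_from_url("http://en.wikipedia.org/wiki/Tradesports"))
```

Because the article URL both names the tag and leads to a description of the thing tagged, the same link does double duty for readers and indexers.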

Also, Wikipedia serves as a tag disambiguator. Some tagging service is going to use Wikipedia data to disambiguate, cluster, merge, and otherwise enhance tags. I think this is pretty low hanging fruit — I’d work on it if I had concentration to spare.

Update: Chris Masse responds (see bottom of page). Approximate answer to his question: 14,000 links to www.tradesports.com, 17 links to en.wikipedia.org/wiki/Tradesports (guess where from). I’ll give Masse convention.

In the same post Masse claims that his own “following of Jakob Nielsen’s guidelines is responsible for the very high intergalactic popularity of my Internet presence.” How very humble of Masse to attribute the modest success of his site to mere guideline following rather than his own content and personality. Unfortunately I think there’s a missing counterfactual.

1 I would think that, having written most of the current Webjay article.

2 Actually my first link preference is for my past posts to this blog. I figure that if someone is bothering to read my ramblings, they may be interested in my past related ramblings — and I can use the memory aid.

Pro abortion

Tuesday, January 10th, 2006

Why would anyone, especially a self-styled economist say something as silly as the following?

In spite of the slander of pro-lifers, nobody is in favor of abortion. Abortion is horrible. Ask anybody who had one.

Clearly anybody who has had an abortion favored abortion over giving birth, just as anybody who has had a root canal favored enduring the operation over an eventual jaw infection and chronic pain. People don’t bother saying “nobody is in favor of root canals.” Of course few people look forward to a medical procedure, be it abortion, root canal, hernia repair, or far more unpleasant. An economist of all people should recognize the nullity of claiming nobody favors a choice that many people actually make, given real world constraints.

I favor abortion. Strongly. Kill the parasite! I favor even more strongly, but abortion is a good backup plan.