Blog Posts

Defeatist dreaming

Sunday, October 22nd, 2006

Jimmy Wales of Wikipedia says to dream a little:

Imagine there existed a budget of $100 million to purchase copyrights to be made available under a free license. What would you like to see purchased and released under a free license?

I was recently asked this question by someone who is potentially in a position to make this happen, and he wanted to know what we need, what we dream of, that we can’t accomplish on our own, or that we would expect to take a long time to accomplish on our own.

One shouldn’t look a gift horse in the mouth, and this could do a great deal of good, particularly if the conditions “can’t accomplish on our own…” are stringently adhered to.

However, this is a blog and I’m going to complain.

Don’t fork over money to the copyright industry! This is defeatist and exhibits static world thinking.

$100 million could fund a huge amount of new free content, free software, free infrastructure and supporting institutions, begetting more of the same.

But if I were a donor with $100 million to give I’d try really hard to quantify my goals and predict the most impactful spending toward those goals. I’ll just repeat a paragraph from last December 30, Outsourcing charity … to Wikipedia:

Wikipedia chief considers taking ads (via Boing Boing) says that at current traffic levels, Wikipedia could generate hundreds of millions of dollars a year by running ads. There are strong objections to running ads from the community, but that is a staggering number for a tiny nonprofit, an annual amount that would be surpassed only by the wealthiest foundations. It could fund a staggering Wikimedia Foundation bureaucracy, or it could fund additional free knowledge projects. Wikipedia founder Jimmy Wales has asked what will be free. Would an annual hundred million dollar budget increase the odds of those predictions? One way to find out before actually trying.
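For scale, here’s a back-of-envelope sketch in Python; every number in it is a placeholder I’m supplying for illustration, not a figure from the article, the old post, or Wikimedia.

```python
# Back-of-envelope ad revenue estimate. All inputs are hypothetical
# placeholders for illustration, not actual Wikipedia traffic or ad rates.
monthly_pageviews = 3_000_000_000  # assumed pageviews per month
ads_per_page = 1                   # assumed ad impressions per pageview
cpm = 5.00                         # assumed revenue per 1,000 impressions, in dollars

annual_revenue = monthly_pageviews * 12 * ads_per_page / 1000 * cpm
print(f"${annual_revenue:,.0f} per year")  # prints $180,000,000 per year
```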

Via Boing Boing via /.

Prediction market aggregator

Sunday, September 24th, 2006

Chris F. Masse points out Smartcrowd, a blog that gathers prices from several markets as the primary component of its commentary. I’d really like to see a service that only gathers prices for related contracts from several markets in an automated fashion, but Smartcrowd’s apparently manual index on GOP control of the U.S. House is a useful start.

Masse’s summary and comment on U.S. House control contracts are contradictory:

[real-money political prediction markets predict a GOP-controlled House while play-money political prediction markets predict a Dem-controlled House.]

So the crowds at Casual Observer and Newsfutures currently favour Democrats to win the House of Representatives, while the crowds at Tradesports and WSX suggest the Republicans will retain control of the House of Reps.

But WSX is a play-money market.

Aggregation should highlight a problem with play-money markets — play money is not fungible, so one can’t arbitrage between play-money markets, effectively reducing their size. I say should because there’s a pretty big discrepancy between Betfair and Tradesports real-money prices for U.S. House control. I’m guessing that with more active markets the price differences among real-money markets would shrink. There should be mountains of evidence one way or the other for sports bets. Anyone know?
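To make the fungibility point concrete, here is a minimal Python sketch with made-up prices: when two real-money exchanges quote the same binary contract (prices expressed as implied probabilities), buying the contract on the cheaper exchange and its complement on the other locks in a risk-free profit whenever the combined cost is under $1, ignoring fees and fill risk. With play money the same trade is impossible, since winnings on one exchange can’t fund positions on another.

```python
def arbitrage_profit(price_a: float, price_b: float) -> float:
    """Risk-free profit per $1 contract from buying YES on the cheaper
    exchange and NO (1 - price) on the other; 0.0 if no arbitrage exists.
    Prices are implied probabilities in [0, 1]; fees are ignored."""
    cost = min(price_a, price_b) + (1 - max(price_a, price_b))
    return max(0.0, 1.0 - cost)

# Hypothetical quotes for "GOP retains House control" (not actual market data):
tradesports, betfair = 0.55, 0.68
print(round(arbitrage_profit(tradesports, betfair), 2))  # 0.13 per contract
```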

By the way, Masse’s collective blog on prediction markets isn’t really launched yet but you may as well subscribe preemptively. Same for his insider blog which has a clever tagline (“the sidebar blog of prediction markets”).

Update 20060926: Masse points out Oddschecker, which does what I want for sports bets (hopefully they’ll expand) and a paper that has some evidence for lack of arbitrage opportunities between real money exchanges. See the comments for details.

When supply exceeds demand

Sunday, September 10th, 2006

Tim Lee has a wonderful take on Chris Anderson’s The Long Tail. The punchline, in my estimation:

When supply exceeds demand, as it seems to for both music and punditry, the equilibrium price is zero.

I think to be technically correct “at p=0” needs to be inserted before the first comma, but never mind, read the whole thing.
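In standard supply-and-demand notation, my quibble amounts to this (a sketch, not Lee’s own formulation), where S(p) and D(p) are quantity supplied and demanded at price p:

```latex
\[
  p^{*} =
  \begin{cases}
    p \ \text{such that}\ S(p) = D(p), & \text{if } S(0) \le D(0),\\
    0, & \text{if } S(0) > D(0),
  \end{cases}
\]
% i.e., the equilibrium price is zero exactly when supply exceeds demand at a
% price of zero (the corner solution), not merely "when supply exceeds demand."
```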

Friends don’t let friends click spam

Thursday, September 7th, 2006

Doc Searls unfortunately decided the other day that offering his blog under a relatively restrictive Creative Commons NonCommercial license instead of placing its contents in the public domain is chemo for splogs (spam blogs). I doubt that, strongly. Spam bloggers don’t care about copyright. They’ll take “all rights reserved” material, that which only limits commercial use, and stuff in the public domain equally. Often they combine tiny snippets from many sources, probably triggering copyright for none of them.

A couple of examples found while looking at people who had mentioned Searls’ post: all-rights-reserved material splogged; a commenter here says “My blog has been licensed with the CC BY-NC-SA 2.5 for a while now, and sploggers repost my content all the time.” A couple of anecdotes prove nothing, but I’d be surprised to find that sploggers are, for example, using CC-enabled search to find content they can legally re-splog. I hope someone tries to figure out what characteristics make blog content more likely to be used in splogs and whether licensing is one of them. I’d get some satisfaction from either answer.

Though Searls’ license change was motivated by a desire “to come up with new forms of treatment. Ones that don’t just come from Google and Yahoo. Ones that come from us”, I do think blog spam is primarily the search engines’ problem to solve. Search results that don’t contain splogs are more valuable to searchers than spam-ridden results. Sites that cannot be found through search effectively don’t exist. That’s almost all there is to it.

Google in particular may have mixed incentives (they want people to click on their syndicated ads wherever the ads appear), but others don’t (Technorati, Microsoft, Ask, etc. — Yahoo! wishes it had Google’s mixed incentives). At least once where spam content seriously impacted the quality of search results Google seems to have solved the problem — at some point in the last year or so I stopped seeing Wikipedia content reposted with ads (an entirely legal practice) in Google search results.

What can people outside the search engines do to fight blog and other spam? Don’t click on it. It seems crazy, but clickfraud aside, real live idiots clicking on and even buying stuff via spam is what keeps spammers in business. Your uncle is probably buying pills from a spammer right now. Educate him.

On a broader scale, why isn’t the Ad Council, or the blogger equivalent, running an educational campaign teaching people to avoid spam and malware? Some public figure should throw in “dag gammit, don’t click on spam” along with “don’t do drugs.” Ministers too.

Finally, if spam is so easy for (aware) humans to detect (I certainly have a second sense about it), why isn’t human-augmented computation being leveraged? Opportunities abound…

Just false

Sunday, August 27th, 2006

If intelligent design is pseudoscience, what does that make Christianity and other faith-based religions? Just pseudo.

I’ve been wanting to use that throwaway comment for a while, but not having any desire to discuss ID (I’m glad there are people out there dedicated to debunking obvious crackpot pseudoscientific claims, though for some reason the targets don’t particularly provoke in me amusement, interest, or outrage, unlike purveyors of economic-political-religious-social crackpot ideas — odd personality quirk I suppose), I haven’t until now, occasioned by the incredibly stupid decision of the Technology Liberation Front group blog (which I’ve cited several times) to add a Discovery Institute employee to its roster.

My ignorant comment on ID: Overall, a positive development. Theists feel compelled to justify their faith on scientific-sounding grounds and are eager to debate real scientists. Even if the theists’ reasons for wanting to debate are completely disingenuous, clearly they are on a slippery slope away from faith, which requires no evidence.

Update 20060901: Tim Lee’s TLF post addressing the Discovery Institute is a model of decency.

AOLternative history

Monday, August 7th, 2006

Tim Lee [1] (emphasis added):

The relentless march of open standards online continues, as AOL effectively abandons its paid, premium offerings in favor of a free, advertising-supported model.

I’m happy to see open standards win and happy to acknowledge good news — I am, for the most part, an optimist, so good news feels validating.

Tim Lee, closing his post (my emphasis again):

Fundamentally, centrally planned content and services couldn’t keep up with the dynamism of the decentralized Internet, where anyone could publish new content or launch a new service for very low cost.

But just how hard is it to imagine a world in which closed services like AOL remain competitive, or even dominant, leaving the open web to hobbyists and researchers?

One or two copyright-related alternative outcomes could have put open networks at a serious disadvantage.

First, it could have been decided that indexing the web (which requires making and storing copies of content) requires explicit permission. This would have stunted web search, which is critical for using the open web. Many sites would not have granted permission to index if explicit permission were required. Their lawyers would have advised them not to give away valuable intellectual property. A search engine would have had to negotiate deals with hundreds, then thousands (I doubt in this scenario there would ever be millions) of websites, constituting a huge barrier to entry. Google? Never happened. You’re stuck with .

Second, linking policies could have been held to legally constrain linking, or worse, linking could have been held to require explicit permission. ? Never mentioned in the context of the (stunted) web.

In the case of either or both of these alternative outcomes the advantage tilts toward closed systems that offer large collections of “exclusive” content and services, which was exactly the strategy pursued by AOL and similar services for years. Finding stuff amongst AOL’s exclusive library of millions of items might have been considered the best search experience available (in actual reality Google and its near peers index billions of web pages).

Some of the phenomena we observe on the web would have occurred anyway in stunted form, e.g., blogging and social networking — even now services like LiveJournal and MySpace feel like worlds unto themselves although they are not technically closed, and services like Facebook are closed. Journaling and networking on AOL would have been hot (but pale in comparison to the real blogosphere or even real closed systems, which face serious competition). It is hard to see how something like Wikipedia could have developed in the AOLternative reality.

Fortunately aggressive copyright was not allowed to kill the web. [2] As a result the march of open standards appears relentless. I’d prefer an even more relentless march, even if it means diminishing copyright (and patents).

1. I’m just using Tim Lee’s post as a jumping off point for an editorial I’ve been meaning to write, no criticism intended.

2. What is aggressive intellectual protectionism being allowed to kill or stunt? Online music is obvious.

Wordcamp and wiki mania

Monday, August 7th, 2006

In lieu of attending maybe the hottest conference ever, I did a bit of wiki twiddling this weekend. I submitted a tiny patch (well, that was almost two weeks ago — time flies), upgraded a private MediaWiki installation from 1.2.4 to 1.6.8 and a public installation from 1.5.6 to 1.6.8, and worked on a small private extension, adding to some documentation before running into a problem.

1.2.4->1.6.8 was tedious (basically four successive major version upgrades) but trouble-free, as that installation has almost no customization. The 1.5.6->1.6.8 upgrade, although only a single upgrade, took a little fiddling to make a custom skin and permissions account for small changes in MediaWiki code (example). I’m not complaining — clean upgrades are hard and the MediaWiki developers have done a great job of making them relatively painless.

Saturday I attended part of WordCamp, a one-day unconference for WordPress users. Up until the day before, the tentative schedule looked pretty interesting, but it seems lots of lusers signed up, so the final schedule didn’t have much meat for developers. Matt Mullenweg’s “State of the Word” and Q&A hit on clean upgrade of highly customized sites from several angles. Some ideas include better and better-documented plugin and skin APIs with more metadata and less coupling (e.g., widgets should help many common cases that previously required throwing junk in templates).

Beyond the purely practical, ease of customization and upgrade is important for openness.

Now listening to the Wikimania Wikipedia and the Semantic Web panel…

Pig assembler

Friday, July 21st, 2006

The story of The Pig and the Box touches on many near and dear themes:

  • The children’s fable is about DRM and digital copying, without mentioning either.
  • The author is raising money through Fundable, pledging to release the work under a more liberal license if $2000 is raised.
  • The author was dissuaded from using the sampling license (a very narrow peeve of mine, please ignore).
  • I don’t know if the author intended it, but anyone inclined to science fiction or nanotech will see a cartoon .
  • The last page of the story is Hansonian.

Read it.

This was dugg and Boing Boing’d, though I’m slow and only noticed on Crosbie Fitch’s low-volume blog. None of the many commentators noted the sf/nano/upload angle as far as I can tell.

Suppressing spam in search results

Wednesday, May 31st, 2006

Google’s post introducing nofollow was unfortunately titled “preventing comment spam,” leading some to call nofollow a complete failure as comment spam is “thicker than ever.”

If nofollow works, it does not prevent comment spam but rather keeps sites linked in comment spam out of search results. For what it’s worth, spammy search results seemed to me a growing problem perhaps a year ago. I haven’t reached a spammy site from a Google search result in a long time. If my experience of less spammy search results lately is not anomalous, has nofollow helped achieve this? Only Google and near peers are in a position to know for themselves, but I certainly wouldn’t write nofollow off as a failure, complete or otherwise.
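For concreteness, here is a minimal Python sketch of the kind of rewriting blog software applies to untrusted comment links so search engines discount them; the function and regex are my illustration, not any real engine’s code, and a real implementation would use an HTML parser.

```python
import re

def add_nofollow(comment_html: str) -> str:
    """Add rel="nofollow" to every <a> tag in untrusted comment HTML."""
    def rewrite(match):
        tag = match.group(0)
        if "rel=" in tag:
            return tag  # leave existing rel attributes alone
        return tag[:-1] + ' rel="nofollow">'
    return re.sub(r"<a\b[^>]*>", rewrite, comment_html)

html = 'Buy pills at <a href="http://spam.example">this site</a>!'
print(add_nofollow(html))
# Buy pills at <a href="http://spam.example" rel="nofollow">this site</a>!
```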

Google should publish its findings regarding whether nofollow has improved search results. If the answer is yes, web software creators and web publishers wouldn’t make the mistake of turning it off for untrusted links. If the answer is no it should be deprecated.

Addendum 20060601: The next time Google or similar asks publishers to do something whose results can only be evaluated by the asker, a commitment to publish an evaluation should accompany the ask.

I should have said Google web search above. As Chris Masse points out in a comment below, blog search still stinks and so far Google and Yahoo! blog search do not improve the state of the art, contrary to my expectations.

Media Trends

Thursday, May 11th, 2006

Bo Cowgill notes that more people search for ‘blog’ and variants than “new york times” and variants. I suspect a more relevant comparison is between ‘blog’ and ‘newspaper’. The former’s slow rise looks faster than the latter’s very slow decline.

TV and radio still rule.