Shortest day 2007

Took a lunch break and my camera to the Van Tuyll – Van Serooskerkenplein. There was snow-like frilly stuff on all the traditionally green things, er, plants.





(Some of the words below resist translation, so the insults are given in the original Dutch with a rough English gloss.)

For some reason I associate creative, child-friendly insults so strongly with Hergé’s Captain Haddock that when I came across such insults in a book from 1869, I did not even notice how unusual they were. Nowadays you can call anyone a biscuit tin or a waffle iron without causing any eyebrow-raising worth mentioning.

The book is Ontboezemingen by Gabriël (pseudonym of Carel van Nievelt). In a play in it, two friends playfully hit each other with their hats; one tries to deliver a “serious” monologue, and the other keeps interrupting him with insults: boekworm! (bookworm) … kinderkanibaal! (child cannibal) … hutspotverknoeier! (hotchpotch botcher) … mottige foliant! (moth-eaten folio) … vogelverschrikker! (scarecrow)

Distributed translation experiment, conclusions

A couple of lessons I learned from my distributed translation experiment:

1. Don’t worry about volunteers showing up. Initially nobody seemed to be interested in participating, but after a while somewhere between ten and twenty people turned up, which was more than enough for my purposes. I had advertised my experiment in four places: this blog, the Dutch forum at Distributed Proofreaders, a chatty general-purpose Usenet group, and a mailing list for (non-literary) professional translators. OK, so do worry, a lot. :) Thing is, if you’ve made something interesting, people will come and take a peek.

2. Don’t just dabble. I set up the site as minimally as possible using the very simple Usemod wiki. Usemod is great because it is so small; you can easily modify it if you have simple needs. Unfortunately, spammers found out about the site rather quickly and began hitting it heavily. If I had used better-developed software, such as MediaWiki, I could probably have turned on all kinds of anti-spam measures that were not available to me in Usemod and that would have been too much work to develop myself. Even then I could probably have switched to MediaWiki, but that seemed too much work to me for a simple experiment. In hindsight that would have allowed me to keep the experiment running, so it’s a pity I chose not to take that path.

3. Don’t underestimate your volunteers. I had assumed that the level of quality would be fairly high, but perhaps a little too consistent; and in order to remedy this I had planned to add a few bad translations myself (remember, the experiment was to measure differences in consistency). Not necessary, it turned out. The quality of submitted translations was both high and varied.

4. Let your volunteers find things out for themselves. I had planned a translation dictionary, but nobody used the pages I set up for that. There is no need to provide your volunteers with things you think they might need; only provide them with what they actually need.

Looking at other translation projects:

5. There is more than one way to skin a cat. My experiment was set up to find out what happens when different volunteers tackle one paragraph at a time. That idea was borrowed from Distributed Proofreaders, where volunteers work on one page at a time. My fear was that you cannot slowly build a literary translation when every translated paragraph ends up with a different style (Wikipedia syndrome). My hope was that you could solve this problem by having post-processors try to smooth out the differences.

Harry auf Deutsch worked this way; volunteers would each get assigned a small bunch of pages; then chapter managers would iron out the differences chapter-wide, and a book manager would do something similar for an entire book.

I have since seen another distributed translation project that takes a radically different approach. Although volunteers there are still free to tackle a work one paragraph at a time, in practice they work on much more, sometimes even on entire novels at a time. The difference is that they limit themselves in the quality levels they try to achieve. The first volunteer or set of volunteers uses software to generate a machine translation. The second volunteer for a work tries to produce a rough translation from the machine translation. The third tries to clean up that rough translation a bit.

9 Badass Bible Verses

Christians like to pretend that they are meek; they like to put themselves in a victim role and turn that other cheek (“and turn, two, three; turn, two, three…”). But if you want to convince others of the friendliness of your religion it helps to have a friendly holy book, and this is where Christians have a bit of a problem.

If the Bible had been written by King Leonidas and the rest of the Spartans from 300, it would probably read pretty much the same as it does now.

It turns out, the Bible is already chock full of ass kicking. Here are the verses that make us want to take to the streets and put some unbelievers to the sword.

(From: The 9 Most Badass Bible Verses.)

Now everyone who lived in Sodom and Gomorrah had crazy sex with everyone and just about everything: flora, fauna, fire, they had sex with rocks painted to look like God’s face, and most of them couldn’t even get off without eating filth. Kaka was very popular. Well, it was almost as popular as the grave-yard. It was a horrible place.

(From: Professor Brothers – Bible History #1.)

Of course, only true Christians can say meaningful things about the Bible.

Buffer states are just anvil states

“Buffer States are just anvil States.”

H.G. Wells in his essay “Holland’s Future”, in Current History, A Monthly Magazine: The European War, March 1915.

Log in to register

I feel like such a rube! The Discovery Channel is organizing a reality show “in which the contestants will build elaborate Rube Goldberg machines”, according to BoingBoing. Being of a sometimes curious nature, I decided to check out the site they linked to, but in order to view the entire application form I had to log in. Still being of a curious nature, I decided to look at the registration page. This is what I saw:


In case you don’t get the joke: in order to register, you need to log in first. In order to log in, you need to register first. And so on ad infinitum.

Perhaps they want you to make Escheresque Rube Goldberg machines? Or perhaps they want to test your inventiveness? (The Rube Goldberg site has a similar test: a “skip intro” link that doesn’t work. Or does it…?)

Dutch e-books from Project Gutenberg, DBNL and Project Laurens Jz Coster

About a month ago I promised I would blog a bit about the differences between the major Dutch projects for public domain e-books.

I’m talking about:

  • books
  • in electronic format
  • with the copyrights expired
  • in Dutch
  • available for free
  • over the internet
  • in a format that allows mix, rip and burn.

That’s a pretty narrow subset of all literature ever created, but it works for me, because I’m Dutch, I can read, I have an internet connection, and I don’t like others to dictate what I should and should not do with that which I download. Also I don’t mind reading off a screen as long as that screen is attached to a pocket-sized lightweight hand-held device.

The major distinctions between Project Gutenberg, Project Laurens Jz. Coster (henceforth: Project Coster) and the Digitale Bibliotheek der Nederlandse Letteren (DBNL) in terms of literary content are:

  • Project Gutenberg also produces non-fiction, magazines, and translations of foreign classics
  • Project Coster seems to have most of the Dutch classics
  • Project DBNL has in-copyright works

All three projects carry some of the major public domain classics, and all three projects carry obscure novels.

There are some differences in process that may or may not matter to you, depending on your needs. The DBNL claims copyrights on all of its works, regardless of whether they are really in the public domain or not. I tend to regard copyright notices on public domain works as declarations of intent to bite, and will stay away from them.

Project Coster seems to be “dead”. I e-mailed with its head honcho Marc van Oostendorp a couple of years ago, and he as good as confirmed that nothing was happening at Project Coster. Perhaps that has changed in the meantime; at least someone is still taking care of the hosting. On the other hand the broken image on its homepage may be a gentle reminder that you need not look for new versions of old books there.

Project Gutenberg takes all of its works from volunteers, and most of them from a volunteer organisation called Distributed Proofreaders. What’s that to you? Well, if you have scans of public domain books, you might try and run them through Distributed Proofreaders. They’ll do a large part of the error correction and formatting, leaving the stitching together of the pages to you.

Although the DBNL and Project Coster do not release data on the size of their catalogues, sampling their databases leads me to believe that their catalogues are bigger than Project Gutenberg’s, which does release such data.

At the time of writing Project Gutenberg is about to hit 300 etext numbers for Dutch works, which equates approximately to 300 unique works (there are a few bundled works there that are also available separately).

This just in: when checking the DBNL link, I noticed they now prominently feature a rich linguistics section on their front page.

Distributed translation experiment, two years later

Summary: two years ago, I asked people on the internet to help me create a public domain translation of a public domain source text, Poe’s The Tell-tale Heart. The goal was to help establish whether it was possible for a disparate group of translators to create a literary translation. You will find both a description of the experiment and the results below.

CP: Soft Cell’s Tainted Love

Coolest song on the eighties channel, mainly for the complete change in mood right smack in the middle.

Edit: I am talking about the Tainted Love/Where did our Love Go? version. The mood change is the switch to Where did our Love Go.

F.A.S.T. wants to swap courts for ISPs

The British “Federation Against Software Theft,” a sort of RIAA for software, wants ISPs to determine whether their paying customers are file sharers. Until now F.A.S.T. had to go through that most horrid form of mediation: the legal system. The organisation’s boss John Lovelock thinks that to “go through the courts and get a court order […] is […] awfully long-winded [and] archaic.” Dutch internet lawyer Remy Chavannes comments: “The vigilantes of F.A.S.T. are frustrated […] and would like to play judge. […] But even on the internet taking the law in your own hands is not a solution.”


Via the Iusmentis blog (Dutch).