Archive for the 'Project Gutenberg' Category

Read more antique texts

Saturday, January 10th, 2009

This posting was originally in Dutch, as it wouldn’t work in English on so many levels.

Enfin! At last you get finished (klaar komen). You always end up more or less getting finished by car. With dirty hands you sit at your wheel again, panting and still sweating from the toil, and with a worried face you stare at the sky and the heavens, which are already beginning to take on their evening colour.

Cyriël Buysse, De vrolijke tocht. Now the common folk also have something fun to read that was written more than 2 years ago.

Shame on you, Branko!

Werewolf? Wehr-wolf? Werwolf?

Sunday, November 30th, 2008

Taking, then, the actual existence of werwolves to be an established fact, it is, of course, just as impossible to state their origin as it is to state the origin of any other extraordinary form of creation. Every religious creed, every Occult sect, advances its own respective views—and has a perfect right to do so, as long as it advances them as views and not dogmatisms.

I, for my part, bearing in mind that everything appertaining to the creation of man and the universe is a profound mystery, cannot see the object on the part of religionists and scientists in being arbitrary with regard to a subject which any child of ten will apprehend to be one whereon it is futile to do other than theorize. My own theory, or rather one of my own theories, is that the property of transmutation, i.e., the power of assuming any animal guise, was one of the many properties—including second sight, the property of becoming invisible at will, of divining the presence of water, metals, the advent of death, and of projecting the etherical body—which were bestowed on man at the time of his creation; and that although mankind in general is no longer possessed of them, a few of these properties are still, in a lesser degree, to be found among those of us who are termed psychic.

From Werwolves by Elliott O’Donnell (1912).

Barbara Tozier also produced the fictional Wagner, the Wehr-Wolf by George W. M. Reynolds.

Also blogging elsewhere

Friday, August 15th, 2008

Although my posting frequency here never has been a thing to brag about much, lately it has dropped below the “once a week” that I unconsciously saw as a minimum. This is not because of the dreaded blogging fatigue, but because I’ve joined a couple of other blogs—which I must have written about once or twice before, so let this be just a gentle reminder.

Most of my time goes to 24 Oranges, weird and wonderful news about the Netherlands (English). (Or: just my postings.)

I used to post about twice a week at the Teleread blog, but since 2007 my Teleread posting frequency has also suffered. At first that was because of lots of paid work, but when I had more time later it went to 24 Oranges. (Or: just my postings.)

Finally, the past few weeks I have had four guest blogs up at the Iusmentis blog, which is Arnoud Engelfriet’s blog about the meeting of technology and law. Writing mainly about copyright and Project Gutenberg, I have posted the following items there (in Dutch):

I will try and translate, and then post these four entries either here or at Teleread, when I have the time. I put a lot of research into these postings, so it would be a pity to limit them to speakers of Dutch. Also, the readers of the Iusmentis blog have added some valuable comments that could use a larger audience.

Getting a little bit back from Elsevier

Tuesday, April 8th, 2008

The British-Dutch mega-publisher Reed Elsevier spent more than 3 million dollars in “bribes” (lobbying fees) in the US last year. What does the publisher hope to get back for this money? It probably won’t be a more balanced and more honest form of copyright. The US politicians that were bolstered by this “support” have been bullying most of the rest of the world into accepting ever stronger and more bizarre forms of copyright. Those countries unwilling to participate are threatened with economic sanctions.

On January 1 of this year it had been more than 70 years since Herman Robbers, son of Elsevier founder Jacob G. Robbers, died. In our current climate copyrights last insanely long, but not forever. To be precise, in the Netherlands copyrights last until 70 full calendar years after the death of the author. On January 1 of this year I uploaded Herman Robbers’ De Vreemde Plant (The Strange Plant) to The Internet Archive. Please consider that a tiny remuneration from Elsevier for whatever copyright hell it’s going to unleash on Dutch citizens.
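The arithmetic behind that date can be sketched in a few lines of Python. This is only an illustration of the “70 full calendar years after the death of the author” rule as stated above; the function name is made up for this example, and the death year 1937 is an assumption, being the year implied by the January 1, 2008 upload.

```python
def first_public_domain_year(death_year: int, term_years: int = 70) -> int:
    """Return the year on whose January 1 a work becomes public domain."""
    # The term covers `term_years` full calendar years after the year of
    # death, so protection runs through December 31 of
    # death_year + term_years, and the work is free the next day.
    return death_year + term_years + 1


# Assumed death year 1937 (hypothetical input, consistent with the post).
print(first_public_domain_year(1937))
```

Note that the exact day of death within the year does not matter under this rule; only the calendar year does.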

(Lobbying story via Teleread.)

Ontboezemingen by Gabriël

Thursday, February 28th, 2008

Last week I posted a book to Project Gutenberg that I had talked about earlier (“Haddockisms”): Ontboezemingen by Gabriël, Carel van Nievelt’s pseudonym. Van Nievelt was a writer of fantasy and travel stories. Oddly enough he does appear from time to time in translated collections, but he has almost been forgotten in the Netherlands. Only his stories about the Dutch East Indies (what is now Indonesia) have recently been reprinted in their original language.

His fame declined during his lifetime. As Metamorfoze, the digitization project of the Dutch national library, writes:

[...] Van Nievelt was not popular with the Tachtigers [a literary movement that made l'art pour l'art, Branko]. They thought him old-fashioned, pathetic and sentimental.

[But] in his productive years he was a well-read author, and literary historians and critics paid much attention to his work: “The novelist Van Nievelt is Somebody,” a reviewer wrote in De Gids in 1884. But after that his fame faded quickly, and oblivion remained.

Snatch! Thanks to Project Gutenberg his name lives on a little longer. Ontboezemingen (Confidences) is Van Nievelt’s first book, and it contains a number of short stories and one farcical play. There are a number of stories about his travels to and time in India, and three love-letters (he continuously calls young women “nonnas”, the Italian for “grannies”). The play appears to be referenced earlier, when he describes how he got so bored at sea that he wrote a play, and he and his friends performed it, to pass the time.

With the help of countless volunteers I have transcribed the two song fragments in the book into LilyPond format, which means you can turn them into anything you want: Project Gutenberg has PDF and MIDI files of both songs. According to Van Nievelt the songs are supposed to be local, Indonesian compositions, but that is doubtful, as they follow Western chord progressions. The second tune (Gamelan) sounds suspiciously like the first few notes of the theme tune to Dallas, by the way.

Haddockisms

Saturday, December 15th, 2007

(Due to the untranslatability of some words, this entry was originally in Dutch; the insults below are kept in Dutch, with rough glosses.)

For some reason I associate creative, child-friendly insults so strongly with Hergé’s Captain Haddock that when I came across such insults in a book from 1869, I did not even notice how remarkable that was. Nowadays you can call anybody a biscuit tin or a waffle iron without causing any eyebrow-raising worth mentioning.

The book is Ontboezemingen by Gabriël (pseudonym of Carel van Nievelt), and in a play two friends playfully hit each other with their hats; one tries to deliver a “serious” monologue, the other keeps interrupting him with insults: boekworm! (bookworm) … kinderkanibaal! (child cannibal) … hutspotverknoeier! (hotchpotch botcher) … mottige foliant! (moth-eaten folio) … vogelverschrikker! (scarecrow)

Distributed translation experiment, conclusions

Friday, December 7th, 2007

A couple of lessons I learned from my distributed translation experiment:

1. Don’t worry about volunteers showing up. Initially nobody seemed to be interested in participating, but after a while somewhere from ten to twenty people turned up, which was more than enough for my purposes. I had advertised my experiment in four places: this blog, the Dutch forum at Distributed Proofreaders, a chatty general purpose Usenet group, and a mailing list for (non-literary) professional translators. OK, so do worry, a lot. :) Thing is, if you’ve made something interesting, people will come and take a peek.

2. Don’t just dabble. I set up the site as minimally as possible using the very simple UseMod wiki. UseMod is great because it is so small; you can easily modify it if you have simple needs. Unfortunately, spammers found out about the site rather quickly and began hitting it heavily. If I had used better developed software, such as MediaWiki, I could probably have turned on all kinds of anti-spam measures that were now not available to me, and that would have been too much work to develop. Even then I could probably have switched to MediaWiki, but that seemed too much work to me for a simple experiment. In hindsight that would have allowed me to keep the experiment running, so it’s a pity I chose not to take that path.

3. Don’t underestimate your volunteers. I had assumed that the level of quality would be fairly high, but perhaps a little too consistent; and in order to remedy this I had planned to add a few bad translations myself (remember, the experiment was to measure differences in consistency). Not necessary, it turned out. The quality of submitted translations was both high and varied.

4. Let your volunteers find things out for themselves. I had planned a translation dictionary, but nobody used the pages I set up for that. No need to provide your volunteers with things you think they would need, only provide them with what they actually need.

Looking at other translation projects:

5. There is more than one way to skin a cat. My experiment was set up to find out what happens when different volunteers tackle one paragraph at a time. That idea was borrowed from Distributed Proofreaders, where volunteers work on one page at a time. My fear was that you cannot slowly build a literary translation when every translated paragraph ends up with a different style (Wikipedia syndrome). My hope was that you could solve this problem by having post-processors try to smooth out the differences.

Harry auf Deutsch worked this way; volunteers would each get assigned a small bunch of pages; then chapter managers would iron out the differences chapter-wide, and a book manager would do something similar for an entire book.

I have since seen another distributed translation project that takes a radically different approach. Although volunteers there are still free to tackle a work one paragraph at a time, in practice they work on much more, sometimes even on entire novels at a time. The difference is that they limit themselves in the quality levels they try to achieve. The first volunteer or set of volunteers uses software to generate a machine translation. The second volunteer for a work tries to produce a rough translation from the machine translation. The third tries to clean up that rough translation a bit.

Buffer states are just anvil states

Tuesday, November 27th, 2007

“Buffer States are just anvil States.”

H.G. Wells in his essay “Holland’s Future”, in Current History, A Monthly Magazine: The European War, March 1915.

Dutch e-books from Project Gutenberg, DBNL and Project Laurens Jz Coster

Friday, November 16th, 2007

About a month ago I promised I would blog a bit about the difference between the major Dutch projects for public domain e-books.

I’m talking about:

  • books
  • in electronic format
  • with the copyrights expired
  • in Dutch
  • available for free
  • over the internet
  • in a format that allows mix, rip and burn.

That’s a pretty narrow subset of all literature ever created, but it works for me, because I’m Dutch, I can read, I have an internet connection, and I don’t like others to dictate what I should and should not do with that which I download. Also I don’t mind reading off a screen as long as that screen is attached to a pocket-sized lightweight hand-held device.

The major distinctions between Project Gutenberg, Project Laurens Jz. Coster (henceforth: Project Coster) and the Digitale Bibliotheek der Nederlandse Letteren (DBNL) in terms of literary content are:

  • Project Gutenberg also produces non-fiction, magazines, and translations of foreign classics,
  • Project Coster seems to have most of the Dutch classics,
  • the DBNL also has in-copyright works.

All three projects carry some of the major public domain classics, and all three projects carry obscure novels.

There are some differences in process that may or may not matter to you, depending on your needs. The DBNL claims copyrights on all of its works, regardless of whether they are really in the public domain or not. I tend to regard copyright notices on public domain works as declarations of intent to bite, and will stay away from them.

Project Coster seems to be “dead”. I e-mailed with its head honcho Marc van Oostendorp a couple of years ago, and he as good as confirmed that nothing was happening at Project Coster. Perhaps that has changed in the meantime; at least someone is still taking care of the hosting. On the other hand the broken image on its homepage may be a gentle reminder that you need not look for new versions of old books there.

Project Gutenberg takes all of its works from volunteers, and most of them from a volunteer organisation called Distributed Proofreaders. What’s that to you? Well, if you have scans of public domain books, you might try and run them through Distributed Proofreaders. They’ll do a large part of the error correction and formatting, leaving the stitching together of the pages to you.

Although the DBNL and Project Coster do not release data on the size of their catalogues, sampling their databases leads me to believe that their catalogues are bigger than that of Project Gutenberg, which does release such data.

At the time of writing Project Gutenberg is about to hit 300 etext numbers for Dutch works, which equates approximately to 300 unique works (there are a few bundled works there that are also available separately).

This just in: when checking the DBNL link, I noticed they now prominently feature a rich linguistics section on their front page.

Distributed translation experiment, two years later

Thursday, November 15th, 2007

Summary: two years ago, I asked people on the internet to help me create a public domain translation of a public domain source text, Poe’s The Tell-tale Heart. The goal was to help establish whether it was possible for a disparate group of translators to create a literary translation. You will find both a description of the experiment and the results below.
