Saturday, June 2, 2012

Yellow Marker Poetics


I am sitting with a used copy of the 1996 4th Ed. of Gary Geddes's compilation of 20th Century poetry and poetics.  The publisher allowed him to add a note to the Preface especially for the 4th edition.  That note comes shortly after he advises the reader (student?) that he agrees with the (then living) “Czech (sic) poet Cseslaw Milosz that poetry is [...]”

So what about an electronic edition?  The Lithuanian-Polish Czesław Miłosz, then teaching at Berkeley, might have a footnote, a reference – even a correction.

The reader who preceded me was not concerned with annotation or glosses - a yellow marker pen sufficed.

Did editor and acclaimed poet Geddes, even after 3 editions, not yet realize that he, too, might benefit from an editor?

Let the newly-minted teaching assistants and lecturers belly-ache: if the e-book has a competent editor and a facility for annotation we may not fall as far short of the mark as some paper editions of the recent past.



Sunday, April 29, 2012

Japanese-English Dictionaries for Poetry


Given the importance of Kanji which are not of Chinese origin, it is striking that a large standard Japanese-English on-line dictionary lacked the English words for fish at the stages of their growth, such as 'fry' and 'fingerling'.

So, neophytes, you are forewarned: if you are interested in Japanese poetry, it is worth dipping into an English translation of a good book by a native Japanese linguist on the subject of the Kanji.

This is one more reason to be interested in the topic of web page markup for poetry in translation, as the standard pop-up translation plugins for browsers are based on those default internet dictionary resources.

One source of Kanji information in English is ... Wiktionary!  Try en.wiktionary.org/wiki/




Tuesday, April 24, 2012

When are translations transitive?

At Tatoeba.org I find the assertion that if sentence A in language 'a' is translated into sentence B in language 'b', and someone then translates B into sentence C in language 'c', there is thereby a network of translation links which includes A - C.

This is, of course, the curse of Google Translate, where an error in Russian - English is perpetuated as an error in Russian - German, Russian - French etc., as all pass through English as their way-station.
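The difference between a direct translation and one reached only through a way-station can be made explicit by counting hops in the link network. What follows is a minimal sketch in Python, not a description of how Tatoeba stores its links: the function name link_distances, the sentence labels and the treatment of links as two-way are all assumptions made for illustration.

```python
from collections import deque


def link_distances(direct_links, start):
    """Breadth-first search over translation links.

    Returns a dict mapping each reachable sentence to the number of
    translation hops separating it from `start`.
    Assumption: links are treated as two-way.
    """
    neighbours = {}
    for a, b in direct_links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in hops:
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    return hops


# Hypothetical labels: A was translated into B, and B into C.
links = [("A", "B"), ("B", "C")]
hops = link_distances(links, "A")

# Sentences more than one hop away were never translated from A itself.
indirect = sorted(s for s, d in hops.items() if d > 1)
```

Here C ends up in the same network as A, but at two hops; a reader or a tool that keeps the hop count can at least treat A - C as hearsay rather than as a translation.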

One symptom may be that tatoeba.org relies on ISO 639-3 language codes in which Nihon-go is 'jpn' and Castilian Spanish is not 'esp' but 'spa'.  Notably deutsch and français escape as 'deu' and 'fra' rather than 'ger' and 'fre'.

Perhaps the parlour game of rumour is out of favour.

One simple Japanese sentence
イソップ 童話 に 『 すっぱい 葡萄 』 という 話が あり ます。
is translated into English as
In Aesop's Fables is a story called "Sour Grapes".
and not
In Aesop's Fables there is a story called "Sour Grapes".
or
There is a story called "Sour Grapes" in Aesop's Fables.
My own comment to the "owner" of the translation follows:
There is a story called "The Fox and the Grapes" in Aesop's Fables.
The story name is in quotes: but what is standard English today for the book title?
Note that we are not using the characters now designated as English start and end quote and English apostrophe ... (yes, there is a Unicode character for 3 dots as well.)
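For anyone who wants to see the characters in question, the short Python sketch below looks up their code points and official Unicode names using the standard unicodedata module. The list names and the helper describe are my own labels, chosen only for illustration.

```python
import unicodedata

# The plain keyboard characters used in the sentences above.
TYPEWRITER = ['"', "'"]

# The typographic counterparts: opening and closing double quotes,
# opening and closing single quotes (the latter doubles as the
# apostrophe), and the single-character ellipsis.
TYPOGRAPHIC = ["\u201c", "\u201d", "\u2018", "\u2019", "\u2026"]


def describe(chars):
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


for code, name in describe(TYPEWRITER + TYPOGRAPHIC):
    print(code, name)
```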

Monday, April 23, 2012

edict and edict2 Japanese-English dictionary files

I have placed two utf-8 encoded HTML pages of edict and edict2 dictionaries at

 http://kanji.aule-browser.com/edict-utf-8.html  
 http://kanji.aule-browser.com/edict2-utf-8.html

for a reader to assess whether these two widely-used files might be more useful as multiple files.

One option is to have a separate file of the obscure and the archaic.

Splits which once appeared sensible, such as separate medical and electronics files, are less obviously so if the dictionary is to be of use at Medtronic or elsewhere in high-tech medicine.

It is not true that a single CSV file is more useful than multiple files, any more than XML is more obviously useful than JSON or YAML.

As for processing the edict2 file: I will begin with the Icon language in either its Unicon or Object Icon variants.
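To make the idea of a separate file for the obscure and the archaic concrete, here is a rough sketch in Python (not Icon). It assumes the usual EDICT-style line layout of headword(s), an optional bracketed reading, and slash-separated glosses, and it assumes that rare entries carry an (arch) or (obsc) tag inside a gloss. The function names and the two sample lines are made up for illustration; the real files should be checked against their own documentation before relying on any of this.

```python
import re

# Assumed line layout: HEADWORDS [READINGS] /gloss/gloss/.../
ENTRY_RE = re.compile(
    r"^(?P<head>\S+)\s+(?:\[(?P<kana>[^\]]*)\]\s+)?/(?P<body>.*)/\s*$"
)

# Assumed tags marking archaic or obscure entries.
RARE_TAGS = {"arch", "obsc"}


def parse_entry(line):
    """Split one dictionary line into headwords, readings and glosses."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None  # header, comment or unexpected layout
    return {
        "headwords": m.group("head").split(";"),
        "readings": m.group("kana").split(";") if m.group("kana") else [],
        "glosses": [g for g in m.group("body").split("/") if g],
    }


def is_rare(entry):
    """True if any parenthesised tag group in a gloss names a rare tag."""
    for gloss in entry["glosses"]:
        for group in re.findall(r"\(([^)]*)\)", gloss):
            if RARE_TAGS & {tag.strip() for tag in group.split(",")}:
                return True
    return False


def split_entries(lines):
    """Partition parsed entries into (common, rare) lists."""
    common, rare = [], []
    for line in lines:
        entry = parse_entry(line)
        if entry is not None:
            (rare if is_rare(entry) else common).append(entry)
    return common, rare


# Made-up sample lines in the assumed layout, not copied from the files.
sample = [
    "息 [いき] /(n) breath/",
    "古語 [こご] /(n,arch) sample archaic gloss/",
]
common, rare = split_entries(sample)
```

Each list could then be written to its own file, which makes it easy to judge by size and content whether the split is worth having.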

Friday, March 16, 2012

Pierrot lunaire


Here is an alternate markup of the verses (German with English) of Hartleben's rendering of the Albert Giraud poem.  Each markup "element" can be defined to suit the purpose (suppress the English translation, place the English in a pop-up responding to a mouse click, or some other effect).  Note that r= is used to encode the repeating lines so that they can be presented with some indication or none (r=0 is a non-repeating line).  n= gives optional line-numbering.

The markup is defined using the Curl web content language from www.curl.com and my example markup can be viewed as ordinary HTML text at poets.aule-browser.com/hartleben-pierrot.html.

The markup can be implemented as macros, procedures, text-formats, text-format procedures or as new Curl syntax or any mix of the above as suits the text presentation task.

[ the entire sequence is 21 verses comprising 273 lines ]
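
As a rough illustration of how r= and n= might drive presentation, here is a minimal sketch in Python rather than Curl. It is not the markup on the page mentioned above: the Line class, the render function, its option names and the placeholder verse text are all hypothetical, and treating any non-zero r as "this line repeats" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Line:
    german: str
    english: str
    n: int = 0  # optional line number; 0 means unnumbered
    r: int = 0  # repeat marker; 0 means a non-repeating line


def render(lines, show_english=True, mark_repeats=True, show_numbers=False):
    """Render verse lines as plain text according to the chosen options."""
    out = []
    for line in lines:
        prefix = f"{line.n:>3} " if show_numbers and line.n else ""
        marker = " *" if mark_repeats and line.r else ""
        out.append(f"{prefix}{line.german}{marker}")
        if show_english:
            # Indent the translation beneath the line it belongs to.
            out.append(f"{' ' * len(prefix)}    {line.english}")
    return "\n".join(out)


# Placeholder text, not the poem itself.
verse = [
    Line("erste Zeile", "first line", n=1, r=1),
    Line("dritte Zeile", "third line", n=3),
]

german_only = render(verse, show_english=False)
numbered = render(verse, show_numbers=True)
```

The same list of lines yields German only, German with English, numbered or unnumbered text, with repeats flagged or not, which is the point of keeping r= and n= in the markup instead of baking one presentation into the page.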


Thursday, March 15, 2012

Graphic folly in text presentation on the web


There was a movement, in the Adobe Flash vein, to have non-Latin script on the web replaced by a graphic image in the browser.  Why would that be folly?

I would suggest installing PeraPera or some such plug-in for the Firefox browser and then placing your cursor in the text of a Japanese poem.  If the poem is presented as text characters (for Haiku, that will sometimes mean more simple Hiragana than complex Kanji), you have a good chance of capturing the sense, provided you have a smattering of Japanese grammar and a pop-up tool offering multiple senses for the Kanji.

If the poem is in a Java applet as a graphic, or embedded as Flash, or is an image in PNG or JPG or some such format, you will not have access to pop-up hints from tools such as PeraPera.

Some of the tendency to use graphics may have been based on a 'web-myth' that Unicode could not be adequate for the graphemes which can be identified in the great variety of scripts in use around the globe. This was nonsense.  Unicode after 2.0 is in no way like the old code-pages for character-encoding. Visit unicode.org if in doubt. Verify any claim to the contrary made by eccentric or Luddite Eastern Language professors who are not themselves computer linguists (linguists specializing in computer science language encoding, or computer scientists specializing in linguistics in the area of phonograms as graphemes).  Be prepared to learn at least three new technical terms with a very specific sense in the case of Unicode.

In one of my favourite web languages, Curl (from www.curl.com), there is the TextShape class to hold an unbroken sequence of characters. But a series of characters as a single graphic is not the same as a series of graphical characters within a single container. Since a Shape is also a container for Shapes, this can be a subtle point.

The use of a single graphic need not completely defeat annotation and obtaining a gloss, if the image is accompanied by text as an alternative.  However, for an unusual Kanji variant in a classical text, this is unlikely to be the answer as we are now back to the very issue at hand: how best to present infrequently encountered or otherwise difficult graphemes in text presentations. The snake has taken hold of its tale but need not begin swallowing.

Here are a few related Kanji for Albrecht Haushofer's "Der Vater" from his posthumous sonnets:
    and   and, of course,  魚釣

The above kanji characters (the last is a two-kanji compound, or jukugo) should be visible if your browser view has char-encoding set to AUTO-DETECT or to UTF-8.
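
Why the encoding setting matters can be seen by looking at what is actually stored. The Python sketch below takes the compound 魚釣, prints each character's code point and UTF-8 bytes, and then decodes the same bytes under two different assumptions. The variable names are illustrative only.

```python
compound = "魚釣"

# Each kanji has one code point, stored as several bytes in UTF-8.
for ch in compound:
    print(ch, f"U+{ord(ch):04X}", ch.encode("utf-8").hex(" "))

raw = compound.encode("utf-8")

# Read as UTF-8, the bytes give back the two kanji.
as_utf8 = raw.decode("utf-8")

# Read as a single-byte Western code page, every byte becomes its own
# character and the kanji are lost, although no byte has changed.
as_latin1 = raw.decode("iso-8859-1")

print(len(as_utf8), "characters when read as UTF-8")
print(len(as_latin1), "characters when read as ISO-8859-1")
```

A graphic avoids this problem only by giving up the text altogether; stating the encoding correctly keeps both the glyphs and the means to look them up.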

Monday, March 5, 2012

Indolence at Interest

A recent reading of James Thomson's 1748 The Castle of Indolence in honour of the Beaverbrook's The Fountain of Indolence (Turner, 1775-1851) passed over the poet's remarks on interest without comment (I chose to say nothing, being more interested in Whistler).

But today I have a Kanji contra-EP. Pound on usury meets his match in the Japanese Kanji for breath and respiration: ( iki, oki ) as in profit ( ri ) and profitable interest 利息 or interest income.

Dividend-paying stocks, anyone?  Teach our children to save or to invest wisely?  Deride speculators and speculation.

Did Eliot often contradict Keynes when that American heard him in Bloomsland?  Arthur Waley was there, after all.

adam, Atem, ahem

息 - musuko ; soku  ( son, penis ) colloquial, non-formal, familial speech

Today I must be feeling like a lettuce, but one that is いきのいい

Sunday, February 26, 2012

TEI Mandelstam


Today I was at a U Virginia e-text web page for Mandelstam: the HTML declares the character encoding as iso-8859-1 but it is not.  To my amazement the characters appear by setting the browser view to use char encoding KOI8-R.  Setting to Western Windows-1252 gave what at first appeared to be Cyrillic in column three (the columns are 'phonetic' Russian, English transl. and Russian) but in truth was Greek.

Here is their page note:
Creation of machine-readable version: Bruce A. McClelland
Creation of digital images: Bruce A. McClelland, Electronic Text Center, University of Virginia
Conversion to TEI2-conformant markup: University of Virginia Library Electronic Text Center
University of Virginia Library
Charlottesville, Va.
What to do when so much money has been spent so unwisely?
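
A cheap check at publication time would catch this kind of mislabelling. The Python sketch below decodes the same bytes under several candidate encodings and picks the one that yields the highest share of Cyrillic letters. It is a crude heuristic under stated assumptions, not a general detector: the function names and the candidate list are mine, the sample word stands in for the page's real bytes, and the score cannot tell one Cyrillic code page from another (KOI8-R bytes read as Windows-1251 would also come out as Cyrillic letters, only the wrong ones).

```python
def cyrillic_ratio(text):
    """Share of characters that fall in the basic Cyrillic block."""
    if not text:
        return 0.0
    hits = sum(1 for ch in text if "\u0400" <= ch <= "\u04ff")
    return hits / len(text)


def best_guess(raw, candidates=("iso-8859-1", "cp1252", "koi8_r")):
    """Return the candidate encoding whose decoding looks most Cyrillic."""
    return max(
        candidates,
        key=lambda enc: cyrillic_ratio(raw.decode(enc, errors="replace")),
    )


# Stand-in for a page that is stored as KOI8-R but declared iso-8859-1.
raw = "Мандельштам".encode("koi8_r")

declared = raw.decode("iso-8859-1")  # what the declaration produces
guessed = best_guess(raw)            # what the bytes suggest
print(declared)
print(guessed, raw.decode(guessed))
```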



Saturday, February 18, 2012

Swinburne Virtual Library


Here is another grotesque example to illustrate why I prefer Curl to HTML for poetry markup: "Before Parting" cannot display without a stanza being broken by page numbering.

There is a pop-up item for document information, but no means to suppress the intrusive pagination.

To convince yourself, ask your web browser to view/display the page source. Hmmm.