
		4.1. Ken Whistler on Adscripts

Ken Whistler sent me an email on 2006-11-21 clarifying the complicated history of adscript iotas in Unicode and its ISO predecessor. I reproduce it here with permission.

==========================================================

In reading through the page, I noted a couple paragraphs that have the history a little garbled, to wit:

"The glyphs ELOT gave for the capital versions of the subscript iota in Unicode 1.0 followed the Subscript tradition: the iota was subscripted, rather than adscripted, under the capital letters. ..."

"Acceptable or no, though, the classicists have had their way. By version 2.0 of Unicode, the code chart glyphs were presented with adscript, rather than subscript iotas. ELOT went a step further, and actually got Unicode to rename the characters, from ypogegrammeni (subscript) to prosgegrammeni (adscript): U+1FBC is now Greek Capital Letter Alpha With Prosgegrammeni."

This contains several errors.

Here is the actual history.

Unicode 1.0

Unicode 1.0 was produced in reaction to what SC2/WG2 was doing in drafts for ISO/IEC 10646 at the time. The Greek approach in Unicode 1.0 was simple, and assumed that all polytonic Greek would be represented by sequences of base characters and diacritics. There were no precomposed polytonic Greek characters in it at all.

With respect to iota, in particular, it contained both:

U+0345 GREEK NON-SPACING IOTA BELOW

which was expected to be used for polytonic Greek, but also the spacing clone for it:

U+037A GREEK SPACING IOTA BELOW

The spacing clones for diacritics were not such a hot idea in general, but were needed for mapping various preexisting standards.

Unicode 1.1

Unicode 1.1 was the result of the great merger of Unicode 1.0 and the failed DIS-1 for ISO/IEC 10646. The merger resulted in the synchronized repertoires and the continuing cooperation we have been living with for 13 years now. Greek was hit particularly hard in the merger, because the DIS for ISO/IEC 10646 contained the polytonic Greek repertoire that ELOT had proposed, and because ELOT also wanted some other modifications to the Greek block from Unicode 1.0.

The character names for the polytonic Greek repertoire additions were those from the DIS for ISO/IEC 10646. And the finally approved 10646 names were adopted wholesale into the Unicode Standard. So we ended up in Unicode 1.1 with:

U+1F88 GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI

That was the Unicode 1.1 name -- its original name in Unicode -- and it has never been changed in the Unicode Standard since.

Unicode 1.1 was published as a names list and associated discussion, but without character charts, so Unicode 1.1 was actually agnostic about the glyphs for the polytonic Greek "PROSGEGRAMMENI" -- as to whether they would be displayed with the subscript tradition or the adscript tradition.

The decompositions provided in the names list for Unicode 1.1 told a somewhat different story, however. This was well before canonical decompositions were made immutable by being tied to the normalization algorithm. (That happened only as of Unicode 3.0.) And in fact Unicode 1.1 notes:

"A large number of mappings were introduced by the very large number of additional characters in Unicode 1.1 from ISO/IEC 10646-1. Should errors be found in the mapping list in the future, errata notices will be made through Unicode, Inc. and on the corresponding unicode.org FTP site."

[That was, by the way, a backhanded slap at the very large collection of precomposed Latin letters, the Arabic ligatures and presentation forms and, yes, all the polytonic Greek -- those were the major source of the requirements for all the new mappings.]

At any rate, for the iota forms, Unicode 1.1 has:

U+1F80 GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI

canonical decomposition: [03B1]&[0313]&[0345]

U+1F88 GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI

canonical decomposition: [0391]&[0313]&[0399]

So for Unicode 1.1, those decompositions listed a combining iota subscript for the lowercase, but a capital iota for the capital form. So taken at face value, the mappings, which were created by Mark Davis, would have reflected the capital iota tradition, rather than subscript or adscript. But at that point in time, the UTC hadn't formally worked through all the casing and decomposition issues involving Greek.

The history up to that point in 10646 was, however, interesting.

DIS 10646-1.2

The actual DIS 10646-1.2 (the 2nd DIS, the one that succeeded and resulted in ISO/IEC 10646-1:1993), did have code charts, of course. And for polytonic Greek, those used subscripted iotas for the uppercase letters. And their names in the DIS 2 were as follows:

U+1F90 GREEK CAPITAL LETTER ALPHA WITH IOTA BELOW AND PSILI

[Note that was in the DIS 2 -- both the code point and name changed as a result of ballot comments before the final standard was published.]

ELOT's input on this question came in comments on the earlier failed DIS 10646-1.1, which they did not in fact vote on at the time. But in a letter to the SC2 secretariat, dated 1991-06-04, ELOT stated (in a letter signed by E. Melagrakis):

"Our comments concern the recent version of ISO/DIS 10646, with the proposed inclusion of UNICODE in it. Because of the problems we have encountered a. in the proposed inclusion of non-advancing characters b. in the formulation of the Greek tables, our position against the possibility of this version of ISO/DIS 10646 to become an ISO standard is negative

"However, our position will change to positive, if you both accept our comments on subject a and replace the existing Greek tables with the new ones which you will find hereby attached."

In the comments with the attached table, Melagrakis continues:

"Our approach to the problem is that we must identify all the characters of each language and if possible include them in a basic plane of ISO 10646, as they are, without tricks between composed and non-composed characters. ...

"Greece is not opposite to the idea of including UNICODE into ISO 10646. If ISO 10646 has to change to incorporate UNICODE, so must UNICODE to incorporate the whole range of characters for the Greek language, as they are listed below.

"If non-advancing characters have to be included into the ISO 10646, then enough space must be provided for each language to include its own non-advancing characters, in the basic plane."

It was that input from ELOT that led to incorporation of the entire polytonic set into DIS 10646-1.2.

However, the Table 1 that accompanies that input from ELOT contains hand-drawn glyphs for all the Greek characters to be added, and shows the uppercase letters with iota in adscript form. And the names they suggested were:

088 GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI

I would have to dig deeper than I can easily do right now to reconstruct the exact details of how this Greek input, which came in response to the failure of DIS 10646-1.1 and the work then underway to create the merger, was transformed into the DIS 10646-1.2 draft, but I recall having a hand in it. At the time, I think Masami Hasegawa was still the editor (just before Bruce Paterson took over), but he needed a lot of help from the Unicode folks to actually produce the new DIS document, since most of the new content was from Unicode 1.0. Joe Becker contributed fonts and I believe actually typeset the charts for the ballot document, and I worked with my then database of Unicode 1.0 and added in all the required additional characters from WG2. I then provided typeset versions of the names list pages to Masami for the ballot document.

What I don't recall is who decided that the DIS 10646-1.2 draft (dated December, 1991) would have the subscript iotas and character names "WITH IOTA BELOW", rather than what Melagrakis had requested in June, 1991. I suspect, though, that the names were created to match the glyphs in the font.

It soon became moot, of course, because Greek comments on the DIS 10646-1.2 draft required that the names be changed to "AND PROSGEGRAMMENI", and at that point every vote on 10646 mattered.

I can't locate a printed copy of ISO/IEC 10646:1993, the finally published standard, right now, so I can't verify absolutely that the glyphs in the printed standard were modified from the draft to show the adscripts, although I suspect so, since that would have been ELOT's request, to go with the name fixes. Easy enough to pin down the facts, if someone with ISO/IEC 10646:1993 on a shelf will take a look, though.

Unicode 2.0

Unicode 2.0 was the first printing of Unicode code charts subsequent to the Unicode/10646 merger. It used cobbled together fonts that were completely different from the Unicode 1.0 font.

For polytonic Greek, Unicode 2.0 used subscript iotas for the uppercase letters in question. The names of the characters were, of course, unchanged from Unicode 1.1.

The canonical decompositions for polytonic Greek were updated, however. So Unicode 2.0 has:

U+1F80 GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI

canonical decomposition: 1F00 + 0345 (--> 03B1 + 0313 + 0345)

U+1F88 GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI

canonical decomposition: 1F08 + 0345 (--> 0391 + 0313 + 0345)

This change was made as a result of analysis of the casing issues and a realization that having 0345 <--> 0399 for casing was going to cause problems.
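The Unicode 2.0 decompositions quoted above can still be checked against current data, since they have not changed since. The following is a minimal sketch in Python, assuming the standard `unicodedata` module; its tables reflect whichever Unicode version the interpreter ships with, not Unicode 2.0 itself, and the helper name `describe` is made up for illustration. It also shows the case mapping from 0345 to 0399 that is still present in the data.

```python
import unicodedata

def describe(ch):
    """Return (name, one-step decomposition, full canonical decomposition).

    Hypothetical helper for illustration; not part of any standard API.
    """
    one_step = unicodedata.decomposition(ch)      # mapping as listed in the character data
    full = unicodedata.normalize("NFD", ch)       # fully decomposed string
    full_hex = " ".join(f"{ord(c):04X}" for c in full)
    return unicodedata.name(ch), one_step, full_hex

small = describe("\u1F80")    # small alpha with psili and ypogegrammeni
capital = describe("\u1F88")  # capital alpha with psili and prosgegrammeni

for name, one_step, full_hex in (small, capital):
    print(f"{name}: {one_step} (--> {full_hex})")

# The combining ypogegrammeni uppercases to a full capital iota.
ypogegrammeni_upper = "\u0345".upper()
print(f"U+0345 uppercases to U+{ord(ypogegrammeni_upper):04X}")
```

Both precomposed letters decompose to a sequence ending in 0345, so the capital iota appears only through case mapping, not through decomposition.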

So it would be reasonable to presume that as of Unicode 2.0 the standard had shifted to a consistent subscript position on the iotas -- at least for processing. And I think Unicode 2.0 said the right thing about it:

"The non-spacing mark ypogegrammeni (also know as iota-subscript in Engish [sic hehe]) can be applied to the vowels alpha, eta, and omega to represent historic diphthongs. This mark appears as a small iota below the vowel. When applied to uppercase vowels, it can also be rendered as a small iota at the lower right-hand corner of the vowel."

So basically, for presentation, the standard wasn't taking sides, but acknowledged both subscript and adscript renderings.

Unicode 3.0

Unicode 3.0 (2000) was synchronized with ISO/IEC 10646:2000 (the second edition). The important thing to note is that the second edition of 10646 was done with code charts created by using the same technology that was being used to print the code charts for Unicode 3.0. As a result, there was a forced convergence of the glyphs used between the two standards -- the two sets of code charts were produced with the same fonts, but with different settings for the format for the pages and the names list generation.

It was during the process of ironing out all the accumulated differences in glyphs between Unicode 2.0 and ISO/IEC 10646:1993 (and subsequent amendments to 10646) that the polytonic Greek fonts were shifted over to using adscript iotas for the glyphs.

The decompositions were unchanged from Unicode 2.0, and became immutable as of Unicode 3.0 because of normalization.
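What that immutability buys in practice is that the precomposed capital and the corresponding sequence of base letter plus combining marks remain canonically equivalent, so normalization converts each into the other. A small sketch in Python, again assuming the `unicodedata` module bundled with the interpreter (the variable names are illustrative only):

```python
import unicodedata

precomposed = "\u1F88"            # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
sequence = "\u0391\u0313\u0345"   # ALPHA + COMBINING COMMA ABOVE + COMBINING GREEK YPOGEGRAMMENI
reordered = "\u0391\u0345\u0313"  # same marks, entered in the other order

nfd = unicodedata.normalize("NFD", precomposed)          # decompose
nfc = unicodedata.normalize("NFC", sequence)             # recompose
nfc_reordered = unicodedata.normalize("NFC", reordered)  # reorder marks, then recompose

print(nfd == sequence)               # decomposition yields the three-character sequence
print(nfc == precomposed)            # composition yields the single code point
print(nfc_reordered == precomposed)  # mark order does not matter after normalization
```

Whether a font then draws the iota as a subscript or an adscript is a separate matter of presentation and is not affected by normalization.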

The description was modified in Unicode 3.0 as a result of various feedback on the topic, as well as in an effort to explain why the charts shifted over to adscript glyphs for the uppercase. The text from 2.0 was changed to indicate:

"... When applied to a single uppercase vowel, the iota does not appear as a subscript, but is instead normally rendered as a regular lowercase iota to the right of the uppercase vowel. ..."

So the text switched over from presumptively favoring the subscript rendition (while acknowledging adscript) to presumptively favoring adscript rendition.

Summary

So to sum up, here are the points that I think need correcting.

As best I can tell, ELOT always advocated for the adscript forms and the PROSGEGRAMMENI names.

Unicode 1.0 didn't have precomposed polytonic Greek at all. These arrived in Unicode 1.1, and when they did, already had PROSGEGRAMMENI names which never changed.

Unicode 2.0 had subscript iota glyphs for the capitals shown in the charts.

It was Unicode 3.0 that switched to the adscript iota glyphs that have been in the standard since, and the change was forced as much by the technical need to use a single font to print both Unicode 3.0 and ISO/IEC 10646:2000 as by any determination to favor classicist typographical tradition over modern Greek typographical tradition. The issue was discussed during this time period, but the subscript forms might as well have stayed in the charts except for the need to present (and justify) the adscript forms for 10646.

Nick Nicholas, opoudjis [AT] optusnet . com . au
Created: 2006-11-28; Last revision: 2006-11-28
URL: http://www.opoudjis.net/unicode/ken_adscripts.html
