12. Maybes
Here I list all the codepoints that may in future need to be considered or reconsidered for inclusion in Unicode, along with my opinion, based on my current knowledge, on whether they should be included. The Reference column points to the section of this site where each case is discussed more extensively.
Character | Codepoints | Reference | Status | Discussion | Evaluation |
---|---|---|---|---|---|
Tonos | 1 | 1.4 | Disunifiable | Between Unicode 1 and 2, the Greek monotonic accent was moved from U+030D Combining Vertical Line Above, and was unified with U+0301 Combining Acute Accent. This resulted from a legislative decision in 1986 that, despite practice to date, the monotonic accent would henceforth be the acute. In theory, the decision could be rescinded, and there is the slight possibility that researchers may want to differentiate between monotonic and polytonic accents. | Very unlikely. Too much computing infrastructure is invested in the post-1986 status quo, and the differentiation between polytonic and monotonic in a corpus is best done through tagging. |
Smooth breathings on capital upsilon | 4 | 2.3.1 | Composed | No provision has been made for precomposed codepoints with that combination; this can occur in psilotic ancient dialects. | Unnecessary. The characters can be composed, and Unicode is committed to not expanding the number of precomposed codepoints. |
Circumflex on epsilon and omicron (lowercase) | 2 | 2.3.2 | Composed | No provision has been made for precomposed codepoints with that combination; this can occur in epigraphy and diplomatic editions. | Unnecessary. The characters can be composed, and Unicode is committed to not expanding the number of precomposed codepoints. |
Circumflex on capitals | 7 | 2.3.3 | Composed | No provision has been made for precomposed codepoints with that combination; this can occur in reproductions of early modern typography, where all-caps words could bear accents. | Unnecessary. The characters can be composed, and Unicode is committed to not expanding the number of precomposed codepoints. |
Non-canonical subscripts (lower + upper) | 2 | 2.3.4 | Composed | No provision has been made for precomposed codepoints with that combination; this can occur in editions of mediaeval philology, addressing the early deletion of iota after upsilon. | Unnecessary. The characters can be composed, and Unicode is committed to not expanding the number of precomposed codepoints. |
Smooth breathing on capital rho (lower + upper) | 2 | 2.3.5 | Composed | No provision has been made for precomposed codepoints with that combination; this can occur in psilotic ancient dialects. | Unnecessary. The characters can be composed, and Unicode is committed to not expanding the number of precomposed codepoints. |
Corinthian EI (lower + upper) | 2 | 3.4 | Unrepresented | The Corinthian local alphabet developed a distinct character for /eː/, which in all other alphabets was written with the digraph epsilon iota. | Unlikely. It is conventional to represent this letter with the digraph in transcriptions of Corinthian. Only occasionally is a single letter used, in discussion of the alphabet; but the shape of the Corinthian EI (identical to normal epsilon) makes the adoption of a distinct codepoint problematic. |
Greek tack heta (lower + upper) | 2 | 5.4-5.5 | Unrepresented | There is a widespread convention for using a left half capital-heta (cased) to represent the distinct /h/ letter in many epichoric alphabets in linguistics and epigraphy. | Necessary. The heta is equivalent to the rough breathing, but the point of using the codepoint is to represent /h/ as a letter rather than a diacritic. A conflation with the other transcription, as Latin H, would be impractical. |
Greek right tack heta (lower + upper) | 2 | 5.1 | Unrepresented | Short-lived letter used in Southern Italian epigraphy to represent the absence of /h/, corresponding to the modern smooth breathing. | Possibly necessary. Same rationale as for tack heta, but unlike tack heta its usage is extremely rare, if there is any at all. |
Modern Iota Yot | 1 | 6.4.2-6.4.3 | Unrepresented | No glyph is available from Unicode codepoints to display upside-down iota circumflex, the representation of non-syllabic iota in Greek dialectology and 19th-century vernacular typography. | Unnecessary. The handwritten and capital form of non-syllabic iota makes it clear this is a surrogate for U+032F Combining Inverted Breve Below. At most, this should be treated as a glyph ligature issue. |
Modern Upsilon Yot | 1 | 6.4.2-6.4.3 | Unrepresented | No glyph is available from Unicode codepoints to display upside-down upsilon circumflex, the representation of non-syllabic upsilon in Greek dialectology and 19th-century vernacular typography. | Unnecessary. The handwritten and capital form of non-syllabic iota makes it clear this is a surrogate for U+032F Combining Inverted Breve Below. At most, this should be treated as a glyph ligature issue. |
Eteocretan Tsade | 1 | 6.6.3 | Unrepresented | One letter in the small Eteocretan corpus has been argued by Duhoux to be a reflex of Phoenician tsade rather than mu. | Unnecessary. Eteocretan is undeciphered, and the letter is unique and thus idiosyncratic; at any rate, if Duhoux's hypothesis is correct, the letter should be conflated with san, the Greek reflex of tsade. |
Archaic Sampi (lower + upper) | 2 | 8.1.1 | Unrepresented | Short-lived letter used in Ionic for /ss/, which gave rise to the numeric symbol. | Possible. As with the other numerals, the archaic letter is glyphically distinct. On the other hand, it is not clear there is any tradition of using it in modern typography, and the corpus it applies to is small. |
Zigzag Iota (lower + upper) | 2 | 8.1.2 | Unrepresented | Character appearing in Sikinos, and in the non-Hellenic scripts of Lemnian, Phrygian, and Eteocretan | Probably unnecessary. Current research indicates that all instances should be conflated with extant letters—chi, iota, yot, or zeta. |
Tsan (lower + upper) | 2 | 8.1.3 | Unrepresented | Appears in one inscription of Arcadian, representing /ts/ as a distinct reflex of Indo-European *kʷ. | Possible. The use of this character is severely restricted, and it is often transcribed with sigma and a diacritic. Nonetheless the character is important in the history of Greek phonology, and is occasionally cited as is in linguistic work. It may be possible to conflate the letter with san. |
Ou ligature (upper + lower) | 2 | 8.2.2 | Composed | Widespread ligature | Unnecessary. Does not cease being a ligature, at least in Greek. |
Capital Yot | 1 | 9.5 | Unavailable | Arguably an uppercase equivalent to lowercase yot is needed; some font designers have already provided one. | Possible. Uppercase yot could be covered by Latin J; then again, so could lowercase yot. |
IPA Beta | 1 | 10.1.3 | Unavailable | Like the other IPA characters of Greek origins, IPA beta is typographically distinct from Greek beta—with a bottom serif | Probably unnecessary. The serif was sufficient motivation to disunify IPA gamma or phi, but it is too late, given widespread Unicode usage, to do the same in this instance. |
IPA phi | 1 | 10.1.3 | Unavailable | The disunification has already been argued for, but only on the basis of avoiding script mixing. | Unnecessary. Script mixing is allowed by current UTC thinking. It is now too late, given widespread Unicode usage, to entertain disunification. |
IPA theta | 1 | 10.1.3 | Unavailable | The disunification has already been argued for, but only on the basis of avoiding script mixing. | Unnecessary. Script mixing is allowed by current UTC thinking. It is now too late, given widespread Unicode usage, to entertain disunification. |
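
The "Composed" and "Disunifiable" rows above can be checked against the normalization data shipped with a programming language. The following is only a minimal sketch, not part of the original discussion: it assumes Python's standard `unicodedata` module, and the helper name `nfc_codepoints` and the choice of sample sequences are mine. It shows that capital upsilon with smooth breathing and epsilon with circumflex stay as two codepoints under NFC (no precomposed form exists), whereas capital upsilon with rough breathing composes to a single codepoint, and that the polytonic acute (oxia) on alpha normalizes to the monotonic tonos codepoint.

```python
import unicodedata


def nfc_codepoints(sequence: str) -> list[str]:
    """Return the codepoints of `sequence` after canonical composition (NFC)."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", sequence)]


# Capital upsilon + U+0313 COMBINING COMMA ABOVE (smooth breathing).
UPSILON_PSILI = "\u03A5\u0313"
# Capital upsilon + U+0314 COMBINING REVERSED COMMA ABOVE (rough breathing).
UPSILON_DASIA = "\u03A5\u0314"
# Lowercase epsilon + U+0342 COMBINING GREEK PERISPOMENI (circumflex).
EPSILON_CIRCUMFLEX = "\u03B5\u0342"

samples = [
    ("capital upsilon, smooth breathing", UPSILON_PSILI),
    ("capital upsilon, rough breathing", UPSILON_DASIA),
    ("epsilon, circumflex", EPSILON_CIRCUMFLEX),
]
for label, seq in samples:
    # Two codepoints means the sequence has no precomposed equivalent.
    print(label, nfc_codepoints(seq))

# Tonos row: alpha with oxia (polytonic block) versus alpha with tonos.
ALPHA_OXIA = "\u1F71"
ALPHA_TONOS = "\u03AC"
print(unicodedata.normalize("NFC", ALPHA_OXIA) == ALPHA_TONOS)
print([f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", ALPHA_TONOS)])
```

Under normalization, then, the monotonic and polytonic acutes are already indistinguishable at the codepoint level, which is one reason a corpus that needs the distinction would have to carry it in tagging.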
Nick Nicholas, opoudjis [AT] optusnet . com . au
Created: 2004-11-07; Last revision: 2004-11-07
URL: http://www.opoudjis.net/unicode/maybes.html