Latin and Cyrillic fonts: adapted glyphs and OpenType/Graphite features (stylistic sets) [REVISED]

  1. IPA
    I propose to add a stylistic set for IPA.
    This should:
    a) use the two-storey U+0061 ⟨a⟩ in all styles, even italic, to maintain the imperative distinction between basic Latin ⟨a⟩ and ⟨ɑ⟩ of Unicode’s “IPA Extensions block”.
    b) use the preferred IPA glyphs where appropriate. As principle 3 of the “Handbook of the International Phonetic Association” (1999:159) states: “… The non-roman symbols of the IPA have, as far as possible, been made to harmonize with the roman letters. For instance, the Greek letters included in the IPA are roman adaptions; as the ordinary shape of the Greek letter β does not harmonize with roman type, in the IPA it has been given the form β [β with a serif is shown here, Lili]…” Apart from ⟨β⟩, this may also concern Greek ⟨θ⟩ and ⟨χ⟩, depending on font design. (Greek ⟨λ⟩ also has an IPA number, 295, but is not used in IPA.)

  2. Accented non-decomposable Latin and Cyrillic capital letters
    I encountered this problem with the Cyrillic Montenegrin capital letters ⟨З́⟩ and ⟨С́⟩ which are not composable and, following Unicode policy, never will be… In fonts like “Gentium” the acute accent above (Latin and Cyrillic) capital letters looks different from the one above x-height letters: it is less steep and less high. I think it is appropriate to use such “capital accents” above all Latin and Cyrillic capital letters that are not composable, as such accents are used in various writing systems around the world.

  3. Hànyǔ Pīnyīn
    There is a Romanization for Mandarin Chinese called Hànyǔ Pīnyīn. It is more than a transliteration. Rather, there is a PRC national standard that defines spelling rules for writing text in Hànyǔ Pīnyīn (word boundaries, capitalization rules, etc.). It might be appropriate to add a stylistic set with the following features:
    a) The combining accents of ⟨Ế ế Ề ề⟩ should be stacked one above the other, unlike in Vietnamese.
    b) The preferred glyph for the acute accent is rising from the lower left to the upper right, unlike in French etc., to be suggestive of the rising tone contour (as the three other tone accents of Hànyǔ Pīnyīn are). This means that the “thicker” part of the acute accent is on the lower left and its point at the upper right. The characters concerned are ⟨´ ˊ ́ Á á É é Ế ế Í í Ḿ ḿ Ń ń Ŋ́ ŋ́ Ó ó Ú ú Ǘ ǘ⟩. (Tone accents above ⟨Ê ê M m N n Ŋ ŋ⟩ are rare but do exist.) Again, the glyphs of accents above capital letters may be different in some fonts.
    c) The one-storey form is preferred for ⟨a⟩ and ⟨g⟩. This includes the accented letters ⟨ā á ǎ à⟩.
    d) A glyph reminiscent of lowercase ⟨ŋ⟩ is preferred for capital ⟨Ŋ⟩. This is an extremely rare letter in Hànyǔ Pīnyīn that is only used in shorthand style, cf. 汉语拼音方案, table 3, note (6).

Hello Charlie -

Most of what you propose can be done using existing features. There are small differences between how our four Latin/Cyrillic families handle some of these, but I’ll refer to Gentium in most cases. See the features documentation for Andika and Gentium Plus.

I propose to add a stylistic set for IPA.
This should:
a) use the two-storey U+0061 ⟨a⟩ in all styles, even italic, to maintain the imperative distinction between basic Latin ⟨a⟩ and ⟨ɑ⟩ of Unicode’s “IPA Extensions block”.

In Andika you can turn this on with ss13 (Double-story a). In the other fonts (Gentium Plus, Charis SIL, Doulos SIL) you can use ss05 (Slant italic specials) to get that form.
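
As a rough sketch of how those two features might be applied on a web page, assuming the fonts are installed or loaded elsewhere: the class names `ipa-andika` and `ipa-serif` are made up for illustration, and only the feature tags come from the paragraph above.

```css
/* Andika: ss13 selects the double-story a (per the reply above). */
.ipa-andika {
  font-family: Andika, sans-serif;
  font-feature-settings: "ss13" 1;
}

/* Gentium Plus, Charis SIL, Doulos SIL: ss05 (Slant italic specials). */
.ipa-serif {
  font-family: "Gentium Plus", serif;
  font-style: italic;
  font-feature-settings: "ss05" 1;
}
```

Because `font-feature-settings` applies to the element rather than to whichever font happens to be picked from a stack, each rule here names a single SIL family.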

b) use the preferred IPA glyphs where appropriate. As principle 3 of the “Handbook of the International Phonetic Association” (1999:159) states: “… The non-roman symbols of the IPA have, as far as possible, been made to harmonize with the roman letters. For instance, the Greek letters included in the IPA are roman adaptions; as the ordinary shape of the Greek letter β does not harmonize with roman type, in the IPA it has been given the form β [β with a serif is shown here, Lili]…” Apart from ⟨β⟩, this may also concern Greek ⟨θ⟩ and ⟨χ⟩, depending on font design. (Greek ⟨λ⟩ also has an IPA number, 295, but is not used in IPA.)

The seriffed beta is cv14. The theta and chi are already what is shown in the IPA charts. I may be misunderstanding your request. Could you give visual examples of what you would like to see different about these characters?

Accented non-decomposable Latin and Cyrillic capital letters
I encountered this problem with the Cyrillic Montenegrin capital letters ⟨З́⟩ and ⟨С́⟩ which are not composable and, following Unicode policy, never will be… In fonts like “Gentium” the acute accent above (Latin and Cyrillic) capital letters looks different from the one above x-height letters: it is less steep and less high. I think it is appropriate to use such “capital accents” above all Latin and Cyrillic capital letters that are not composable, as such accents are used in various writing systems around the world.

This only applies to Gentium Plus. We would like to be more consistent in using the ‘low-profile’ diacritics for all the capital combinations. Thanks for pointing this example out.

Hànyǔ Pīnyīn
There is a Romanization for Mandarin Chinese called Hànyǔ Pīnyīn. It is more than a transliteration. Rather, there is a PRC national standard that defines spelling rules for writing text in Hànyǔ Pīnyīn (word boundaries, capitalization rules, etc.). It might be appropriate to add a stylistic set with the following features:
a) The combining accents of ⟨Ế ế Ề ề⟩ should be stacked one above the other, unlike in Vietnamese.

This is already the case. The Viet-style stacking is turned on with cv75. Are you seeing something different? Can you send a screenshot of what is not stacking properly (and tell us the exact character sequence you’re using)?
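
If the Vietnamese-style arrangement is wanted for Vietnamese text only, something like the following might work. This is a sketch that assumes the page marks Vietnamese passages with `lang="vi"` and that a value of 1 turns the feature on; text without the second rule keeps the default stacking described above.

```css
/* Default (including Pinyin): no feature needed, accents stack vertically. */
body {
  font-family: "Gentium Plus", serif;
}

/* Vietnamese passages: turn on the Viet-style arrangement via cv75. */
:lang(vi) {
  font-feature-settings: "cv75" 1;
}
```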

b) The preferred glyph for the acute accent is rising from the lower left to the upper right, unlike in French etc., to be suggestive of the rising tone contour (as the three other tone accents of Hànyǔ Pīnyīn are). This means that the “thicker” part of the acute accent is on the lower left and its point at the upper right. The characters concerned are ⟨´ ˊ ́ Á á É é Ế ế Í í Ḿ ḿ Ń ń Ŋ́ ŋ́ Ó ó Ú ú Ǘ ǘ⟩. (Tone accents above ⟨Ê ê M m N n Ŋ ŋ⟩ are rare but do exist.) Again, the glyphs of accents above capital letters may be different in some fonts.

I have not seen or heard of this before. Can you show some examples and give references?

c) The one-storey form is preferred for ⟨a⟩ and ⟨g⟩. This includes the accented letters ⟨ā á ǎ à⟩.

There seems to be no agreement that this is preferred for Pinyin (for example see the Wikipedia article). But if you want this it’s already controllable with features ss01, ss11, ss12, ss13, ss14.

d) A glyph reminiscent of lowercase ⟨ŋ⟩ is preferred for capital ⟨Ŋ⟩. This is an extremely rare letter in Hànyǔ Pīnyīn that is only used in shorthand style, cf. 汉语拼音方案, table 3, note (6).

See the options available through cv43.

I hope that helps - Victor

Thank you very much, Victor,

  1. IPA
    I was thinking of situations where it is unknown which SIL font is ultimately selected by the software to render the text. For example, in HTML I might define an embedded font (call it “CharisEmbedded”) that is only used if no other SIL font is installed on the user device (e.g. by specifying “font-family: ‘Charis SIL’, ‘Gentium Plus’, ‘Doulos SIL’, Andika, CharisEmbedded, serif;”). The easiest way to ensure IPA’s ⟨a⟩–⟨ɑ⟩ distinction is maintained would be to turn on a single “unified” feature for all IPA-capable SIL fonts.

  2. “Low-profile” and “high-profile” diacritics
    I appreciate that improvement. Once you have a list of “at least capital-height” glyphs and one of the correspondences between the glyphs for “low-profile” and “high-profile” diacritics, it should not be too time-consuming to create OpenType subtables that do the replacements. I am looking forward to the new font version.

  3. Pinyin
    See section “Typographic form” of Wikipedia article “Acute accent” for that Pinyin-style acute accent. Beside the text, there is also an illustration that shows that none of the fonts produced for Chinese (“Adobe HeiTi Std” by California-based Adobe as well as “SimHei” and “SimSun” by Beijing-based ZhongYi Electronics) has an acute accent tapering Western-style. Probably at the time “Adobe HeiTi Std” was produced Ken Lunde, now technical director of the Unicode Consortium, was responsible for the decisions made in the design of Adobe’s East Asian fonts, so you might want to ask him for details of Chinese typographic tradition. (BTW, the same three fonts also boast single-storey ⟨á⟩.)

Carpe diem,
Lili

Oh, replace “HTML” with “CSS” in the above reply. Sorry for the inconvenience.

Luckily, SIL’s fonts are highly consistent in their OpenType features. But as a general rule of thumb, if you need non-generic OpenType features, control which font is used and apply font-specific rules as required.

The font stack you suggest: “font-family: ‘Charis SIL’, ‘Gentium Plus’, ‘Doulos SIL’, Andika, CharisEmbedded, serif;”

Life would be simpler if you reduce the font stack to

font-family: 'Charis SIL', CharisEmbedded, serif;

In theory, that could be reduced to font-family: 'Charis SIL', serif; by using local() in @font-face.

The overall design would be more controlled and a single set of features could be applied to the root element or body, although personally I prefer to apply the OpenType features via a language pseudo-selector with the appropriate BCP47 language tag.
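
A sketch of that setup, under some assumptions: the file path `fonts/CharisSIL-Regular.woff2` is hypothetical, only the regular face is declared, the language tag `und-fonipa` (using the BCP 47 variant subtag `fonipa` for IPA) is one possible choice, and `ss05` is taken from the earlier reply about the italic ⟨a⟩.

```css
/* Prefer an installed copy; otherwise download the embedded file. */
@font-face {
  font-family: "Charis SIL";
  src: local("Charis SIL"),
       url("fonts/CharisSIL-Regular.woff2") format("woff2"); /* hypothetical path */
}

body {
  font-family: "Charis SIL", serif;
}

/* Apply features only to text tagged as IPA, e.g. <span lang="und-fonipa">. */
:lang(und-fonipa) {
  font-feature-settings: "ss05" 1;
}
```

The `:lang()` rule only matches elements whose language tag is `und-fonipa` or begins with `und-fonipa-`, so tags such as `en-fonipa` would need their own selector.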

Lili -

  1. IPA
    I was thinking of situations where it is unknown which SIL font is ultimately selected by the software to render the text…The easiest way to ensure IPA’s ⟨a⟩–⟨ɑ⟩ distinction is maintained would be to turn on a single “unified” feature for all IPA-capable SIL fonts.

I’ll add this to our list of requests. However, since the symbols are accessible through other features, it may not be a very high priority, and I can’t guarantee we will ever add it. We haven’t had any other linguists request this behavior for italic. When there is a clear need to distinguish the two characters, most people present the text in upright form only, where the distinction is clear.

  3. Pinyin
    See section “Typographic form” of Wikipedia article “Acute accent” for that Pinyin-style acute accent. Beside the text, there is also an illustration that shows that none of the fonts produced for Chinese (“Adobe HeiTi Std” by California-based Adobe as well as “SimHei” and “SimSun” by Beijing-based ZhongYi Electronics) has an acute accent tapering Western-style. Probably at the time “Adobe HeiTi Std” was produced Ken Lunde, now technical director of the Unicode Consortium, was responsible for the decisions made in the design of Adobe’s East Asian fonts, so you might want to ask him for details of Chinese typographic tradition. (BTW, the same three fonts also boast single-storey ⟨á⟩.)

Thanks. However, those are examples of Latin within East Asian fonts, where there are also a lot of other changes (punctuation, spacing, etc.) that affect the style of the Latin text but would normally not apply in a Latin font. I would be uncomfortable creating a special ‘Pinyin’ stylistic set because there is no agreement that Pinyin requires it. If we did support this, we would likely make it a special feature just for the acute and grave. I’m not sure what we’d call it. I will add that to the list of requests, but cannot guarantee that we’ll add it.

Pinyin is a Latin alphabet orthography, just like English, Polish or Vietnamese orthography. Contrary to Chinese character orthography, it doesn’t use punctuation marks or spacing that differ in any way from the ones used in English or Dutch. (This means that punctuation marks and spacing around them differ from French, Spanish, or German usage.)
Also, there is not the slightest graphic difference between the Pinyin and the French/Spanish/Portuguese grave accent, which are all falling: they start on the upper left, and the pen is raised on the lower right.

Thanks - that’s helpful.