Character display issues in Microsoft Excel and Word

Sometimes characters do not display correctly in one or more programs due to language differences and renderings…

Recently, we have been receiving many languages from Asia, particularly the India region. We have found that there are font display issues when the translation text is exported to Microsoft programs such as Word or Excel, because these languages use combinations of letters and combining marks. Unfortunately, these languages display correctly in Paratext (no errors are detected or shown), so we do not know that there is an issue until after we export the text. In other words, Paratext displays combining characters correctly but other programs do not.

This issue is common with texts from the India region (especially Devanagari).

https://hosanna-my.sharepoint.com/:b:/g/personal/purquidez_fcbhmail_org/EaNFxfMrLedLhJ4wjtygQIsBbkQH4R_U_E3GIxTZyU6hIQ?e=4LvPLJ

Have you tried using LibreOffice Calc (instead of Excel)? That might be a viable solution, but even if not, it might help identify where the problem lies.
(PS: Love the Luke 5:26 quote!)


Hello Peter,

Short answer: I think you want the Devanagari character U+0910.


Putting a vowel sign (in this case U+0947) on an independent vowel (U+090F) is not a valid combination. Vowel signs are used to modify the inherent vowel in a consonant cluster.
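If it helps to see exactly which code points are in the text, here is a minimal Python sketch that lists them by name. The helper name `describe` is made up for illustration; it only relies on the standard `unicodedata` module. It also shows that NFC normalization does not turn the two-character sequence into the single letter, so normalizing the text will not fix this for you.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

typed = "\u090F\u0947"   # independent vowel E followed by vowel sign E
intended = "\u0910"      # the single letter AI

for entry in describe(typed) + describe(intended):
    print(entry)

# NFC leaves the two-code-point sequence unchanged, so it must be fixed explicitly.
print(unicodedata.normalize("NFC", typed) == typed)
```

Pasting a problem word into `describe(...)` is a quick way to confirm whether the text contains the sequence or the single letter before exporting.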

So why does it work in some apps and not others? The issue here comes down to two “competing” rendering systems. One is OpenType, which is driven by Microsoft's DirectWrite/Uniscribe rendering system in their products. The other is the Graphite rendering system created by us (SIL) and used in SIL apps and the LibreOffice suite. Microsoft will not render any combination of characters that it considers invalid, which is why you are getting the dotted circles. The SIL model is more liberal when it comes to rendering combinations, but in this case I think the Graphite code should have been more restrictive. I’ll need to look into that once I work on the Annapurna project later this year.

Summary: If you replace the U+090F U+0947 sequence with the U+0910 character in Paratext and then export to Word or Excel, it should work. If for some reason it doesn’t work, then there is an issue with the export process.
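If the text is available as a plain Unicode string (for example, after export), the substitution could be sketched in Python along these lines. This is only an illustration, not a Paratext feature: the names `REPLACEMENTS` and `fix_vowel_sequences` are hypothetical, and the map contains only the one pair discussed in this thread.

```python
# Discouraged sequence -> preferred single code point.
# Assumption: only the pair from this thread is listed; add others only after verifying them.
REPLACEMENTS = {
    "\u090F\u0947": "\u0910",  # LETTER E + VOWEL SIGN E -> LETTER AI
}

def fix_vowel_sequences(text, replacements=REPLACEMENTS):
    """Return (fixed_text, number_of_replacements_made)."""
    count = 0
    for bad, good in replacements.items():
        count += text.count(bad)
        text = text.replace(bad, good)
    return text, count

sample = "\u090F\u0947\u0938\u093E"  # a word typed with the two-code-point sequence
fixed, changed = fix_vowel_sequences(sample)
print(changed, [f"U+{ord(ch):04X}" for ch in fixed])
```

Returning the count makes it easy to check how many places were changed before trusting the result.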

Jon Coblentz
Writing Systems Technology engineer


Peter

Your document shows three screenshots. In the first screenshot you seem to have font substitution, so I am curious what the second font is.

In the next two the characters don’t combine.

Andrew


Looks like a spelling mistake in the original text. And Paratext is substituting a font.

I assume U+0910: DEVANAGARI LETTER AI


new link https://hosanna-my.sharepoint.com/:b:/g/personal/purquidez_fcbhmail_org/ER8Kr8NyehJBjskWm4Hs8JgBUFz8VPH0SaAqmx4RTtwIVw?e=G01l7M

You should replace the U+090F U+0947 sequence with the U+0910 character in Paratext, as I mentioned in my earlier reply. The U+090F U+0947 combination is not valid, and the Graphite code for Annapurna SIL needs to be corrected so that it does not render it. We’ll work on correcting that later this year.

As Andrew pointed out, there is font substitution happening when you export. The spreadsheet screenshots show a font different from Annapurna SIL. That is a secondary issue. You need to use U+0910 to correct the primary issue.

Jon Coblentz
Writing Systems Technology engineer

For this situation, I agree with @joncoblentz that you are seeing the output from Microsoft DirectWrite as it processes the OpenType in whatever font you are using. You should keep in mind that Microsoft DirectWrite is not the only OpenType implementation. HarfBuzz is another one, which is used in Chrome (but maybe not on iOS) and in Microsoft Edge and other programs as well.

Table 12-1 on page 450 of the chapter on Devanagari in the Unicode Standard lists several combinations of characters that should not be used. The combination of code points that you are trying to use is listed there, along with many other combinations to avoid.
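To catch this kind of sequence before exporting, a rough Python check like the one below might help. It is a heuristic sketch, not a transcription of the table: it simply flags any independent vowel in the range U+0904..U+0914 (my assumption for "independent vowel") that is directly followed by a character whose Unicode name starts with "DEVANAGARI VOWEL SIGN". It may not cover every entry in the table, and the function names are made up for illustration.

```python
import unicodedata

def is_independent_vowel(ch):
    # Assumption: only the core range U+0904..U+0914 is treated as independent vowels.
    return 0x0904 <= ord(ch) <= 0x0914

def is_vowel_sign(ch):
    return unicodedata.name(ch, "").startswith("DEVANAGARI VOWEL SIGN")

def find_suspect_sequences(lines):
    """Yield (line_no, column, sequence) for each independent vowel
    directly followed by a vowel sign. Columns are 1-based code point positions."""
    for line_no, line in enumerate(lines, start=1):
        for col in range(len(line) - 1):
            first, second = line[col], line[col + 1]
            if is_independent_vowel(first) and is_vowel_sign(second):
                yield line_no, col + 1, first + second

text_lines = ["\u0910\u0938\u093E", "\u0915 \u090F\u0947"]
for line_no, column, seq in find_suspect_sequences(text_lines):
    codes = " ".join(f"U+{ord(ch):04X}" for ch in seq)
    print(f"line {line_no}, column {column}: {codes}")
```

Anything it reports should still be checked by eye against the table before changing the text.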

perfect! thanks for your input.

Here is the Excel spreadsheet with the Annapurna SIL font

Here it is after the character sub
