Is it possible to develop a keyboard without any Unicode implementation?

Hi everyone,

Is it possible to construct custom letters/symbols for developing a keyboard?
I’m looking to create a keyboard for the Vietnamese script, quốc âm tân tự.
This is a native Vietnamese script created in the 1800s during the Nguyen Dynasty.
However, it never gained traction due to the encroachment of French colonialism.

Currently, there is no Unicode implementation, and I doubt there will be one any time soon.
However, it is my dream, along with many other kindred spirits, to help make this script accessible to more Vietnamese people.
I’m hoping Keyman could be the catalyst to bootstrap a sort of grassroots beginning.

I’d greatly appreciate any insight.
Thank you, xin cảm ơn~

Thanks for your question.

Do you have a font that can be used to display this script? Perhaps it assigns the characters of this script to Unicode values in the private use area. In that case, Keyman could be used to create a keyboard that could type those characters.

If the font assigns the characters of this script to Unicode values assigned to different characters (that is, a “hacked” font), then you might still be able to use Keyman to create a keyboard to produce these hacked code points, which would only display the characters you want if that particular font were used.

If this doesn’t answer your question, please write again.

The best approach would be to assign the characters in the script to codepoints in one of the private use areas. I would avoid the Private Use Area in the Basic Multilingual Plane, so use either SPUA-A (U+F0000–U+FFFFD) or SPUA-B (U+100000–U+10FFFD).
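To make the three ranges concrete, here is a minimal Python sketch that reports which private use range a character falls in. The range boundaries are the ones defined by the Unicode Standard; the names `PUA_RANGES` and `pua_block` are just made up for this illustration.

```python
import unicodedata

# Private use ranges as defined by the Unicode Standard.
PUA_RANGES = {
    "BMP PUA": (0xE000, 0xF8FF),
    "SPUA-A": (0xF0000, 0xFFFFD),
    "SPUA-B": (0x100000, 0x10FFFD),
}


def pua_block(ch):
    """Return the name of the private use range containing ch, or None."""
    cp = ord(ch)
    for name, (start, end) in PUA_RANGES.items():
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    for ch in ("\uE000", "\U000F0000", "\U00100000", "a"):
        # Private use characters have the general category "Co".
        print(f"U+{ord(ch):04X}", pua_block(ch), unicodedata.category(ch))
```

A check like this can be handy for confirming that a font or keyboard under development only uses codepoints from the range you intended.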

Modify any fonts to use the codepoint assignments and develop a keyboard based on them. It is an approach we use when we need to create tools to use before writing an encoding proposal.
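One way to keep the font and the keyboard in agreement is to generate the assignments once and reuse the same table for both. The sketch below is only an illustration: the glyph names are hypothetical placeholders, `assign_codepoints` is not part of Keyman or any font tool, and it assumes you simply want consecutive codepoints starting at the beginning of SPUA-B.

```python
def assign_codepoints(glyph_names, start=0x100000, end=0x10FFFD):
    """Map each glyph name to the next consecutive codepoint in [start, end]."""
    if len(set(glyph_names)) != len(glyph_names):
        raise ValueError("glyph names must be unique")
    if len(glyph_names) > end - start + 1:
        raise ValueError("not enough private use codepoints for this glyph list")
    return {name: start + i for i, name in enumerate(glyph_names)}


# Hypothetical glyph names, for illustration only.
glyphs = ["glyph_a", "glyph_b", "glyph_c"]
table = assign_codepoints(glyphs)

if __name__ == "__main__":
    # Print a tab-separated table that a font editor and a keyboard
    # source file could both be written against.
    for name, cp in table.items():
        print(f"{name}\tU+{cp:06X}")
```

Keeping the table in one place also gives you a starting point for a mapping file later, if the script is ever encoded and the text needs converting.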

But using the PUA assumes text is written left to right, progressing from top to bottom. It also assumes no complex rendering is needed.

I know that Unicode has created character sets for many (obscure) ancient orthographic systems. I don’t know what that process is like, but I do imagine it would be quite involved.

That said, I think many people would be very much interested to see what the orthography looks like!

Perhaps this Tom Scott video may inspire someone:

I read on Wikipedia that SIL is one of the organizations that is helping to standardize the PUA codepoints. How do we get a suggestion for which codepoints to use for a new script if we want to avoid overlap?

PUA codepoint assignments are often standardized internally within organizations. SIL is not involved, as far as I know, in any effort for public agreement on PUA assignments. It’s called ‘private’ for a reason :slight_smile:

If it’s a conlang, then ConScript Unicode Registry, also referenced on that page, is your best option.

As @Marc says, you cannot expect a standardised PUA, but we created a map for use within SIL and within our own fonts. We then used that as a roadmap for getting characters into Unicode, and we have systematically removed the PUA codes from our fonts and switched to the correct Unicode codepoints as they were added to the standard. You can look here if you are interested: SIL’s Private Use Area (PUA) and this is how we divided up the BMP PUA blocks: We are definitely not trying to work with outside organizations to standardise it. We want people to submit proposals to Unicode so they are truly standard.

@scho Have the responses above answered your question? Let us know if you need further assistance; otherwise this topic will be closed in three weeks.

Avoid the PUA characters in the Basic Multilingual Plane, i.e. don’t use U+E000–U+F8FF.

There are two other PUA codespaces: U+F0000–U+FFFFD and U+100000–U+10FFFD.

Personally, I’d be inclined to use the U+100000–U+10FFFD codespace.
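One practical consequence of using either supplementary codespace is that every character sits outside the Basic Multilingual Plane, so it takes four bytes in UTF-8 and a surrogate pair in UTF-16. A quick Python check, using U+100000 purely as an example codepoint:

```python
# U+100000 is the first codepoint of the U+100000–U+10FFFD codespace.
ch = "\U00100000"

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")

print(len(ch))      # 1: Python strings count codepoints, not code units
print(utf8.hex())   # f4808080: four bytes in UTF-8
print(utf16.hex())  # dbc0dc00: a high and a low surrogate in UTF-16
```

It is worth testing the target applications with a character like this early on, since software that assumes one UTF-16 code unit per character may mishandle it.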