How to map one keystroke to a cluster of Unicode characters

I’m trying to modify an existing keyboard (the Telugu Winscript (NLCI) keyboard).

In this keyboard, typing ‘ksh’ produces U+0C15 U+0C4D U+0C37 (క్ష).
To minimize keystrokes, I want a single letter to do what ‘ksh’ does. Let’s say the single letter is ‘f’. Pressing ‘f’ must produce U+0C15 U+0C4D U+0C37 (క్ష).

I understand how to map multiple keystrokes to one Unicode character, but I’m lost when trying to map one keystroke to a cluster of several Unicode characters. Hardcoding each letter that can be formed from ‘ksh’ is one way, but that is not in the spirit of programming. If I succeed in this, I also want to extend it to the nukta (U+0C3C) in the future, so I don’t want to hardcode. Any input is appreciated.

Thank you.

I believe that this is the code you need:

+ [K_F] > U+0C15 U+0C4D U+0C37

or

+ [K_F] > 'క్ష'

Hello Matthew. Are you asking whether inserting + [K_F] > U+0C15 U+0C4D U+0C37 (or + [K_F] > 'క్ష') gives me the desired output?

I believe that either version (the version with letters or the version with Unicode values) will give the desired output. Choose the version which is easier for you to read.

Let me know if it doesn’t work.

That maps f to the single form only. I also want all the vowel-sign (matra) forms to work after f.
Example of desired output:
f → క్ష
fA → క్షా
fi → క్షి
fI → క్షీ
fu → క్షు
fU → క్షూ
etc…

I tried:

store(vowelKeys) "AiIuUReEYoOVMH"
store(vowelMatras) "ాిీుూృెేైొోౌంః"

+ 'f' > 'క' '్' 'ష' dk(ksha)
dk(ksha) + any(vowelKeys) > 'క' '్' 'ష' index(vowelMatras, 2)

Now when I type fA or fi, I get క్షక్షా and క్షక్షి instead of క్షా and క్షి.

Please let me know if my question is still unclear.

Ok, that’s helpful context. What’s happening is that the rule for f fires as soon as f is pressed, before A or i is typed.

Once the letters have been output, they become part of the context (the part of a rule before the +).

This is one possible (simplified) way to do it:

+ [K_F] > 'క్ష'
'క్ష' + 'A' > 'క్షా'
'క్ష' + 'i' > 'క్షి'

The letter ‘f’ outputs ‘క్ష’ and the vowels replace ‘క్ష’ with the full form.

I’m on my phone and can’t examine all of the characters. If you’re only ever adding one character/diacritic for the vowel, you could write it with the store and index method you proposed, but that may become difficult because each index() outputs only a single character from its store.

To be honest, I’m not sure what is going on with your dead keys (I’ve never seen them in the output side), but I’m proposing using the output of ‘f’ as the context of your vowel rules. That should work.

This should work, replacing the output of ‘f’ with the new full string. Because ‘క్ష’ is three characters of context, the vowel key is at position 4 for index().

store(vowelKeys)   "AiIuUReEYoOVMH"
store(vowelMatras) "ాిీుూృెేైొోౌంః"

+ 'f' > 'క్ష'
'క్ష' + any(vowelKeys) > 'క' '్' 'ష' index(vowelMatras, 4)
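
A variant that keeps the dead key from the earlier attempt could look like the sketch below. It assumes the same two stores, and the dead key name ksha is simply the one used in that attempt. The doubled క్షక్ష appears to come from the second rule emitting the cluster again while only the dead key, not the already typed క్ష, was in its context. Here the second rule emits just the vowel sign.

```
c Sketch only: assumes these rules live in the keyboard's existing "using keys" group.
store(vowelKeys)   "AiIuUReEYoOVMH"
store(vowelMatras) "ాిీుూృెేైొోౌంః"

c 'f' emits the cluster (U+0C15 U+0C4D U+0C37) plus an invisible dead key marker.
+ 'f' > U+0C15 U+0C4D U+0C37 dk(ksha)

c The marker is context position 1 and the vowel key is position 2.
c Only the marker is replaced, so the cluster already in the text stays untouched.
dk(ksha) + any(vowelKeys) > index(vowelMatras, 2)
```

With these two rules, fA would be expected to give క్షా and fi to give క్షి, since the vowel sign is appended to the existing cluster rather than to a second copy of it.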

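I got the following error:
KM0200A: Invalid token found character offset: 4token

+ [K_F] > 'క్ష'
'క్ష' + 'A' > 'క్షా'
'క్ష' + 'i' > 'క్షి'

Hardcoding is an option, as you mentioned earlier, but there are two problems:
  1. For all the other consonants, the long vowels can be typed either as a capital letter or as a doubled letter: A or aa, I or ii, U or uu, etc. If I hardcode as above, I would only capture the capital-letter forms.
  2. The Telugu language doesn’t have sounds such as z, f, and a (as in apple). One workaround is to use a nukta (U+0C3C) under existing letters. For example, a nukta under జ (ja) gives జ with a dot below it, pronounced za. Implementing this runs into the same challenge that we are facing with క్ష: since the nukta is not part of the official language, any letter with a nukta below it is a sequence of at least two distinct Unicode characters. Hence, I want to avoid hardcoding.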
  2. Telugu language doesn’t have sounds such as z, f, and a(as in apple). One workaround to overcome this is use nukta ( U+093C) under existing letters. For example nukta under జ(ja) would become జ with dot below it and it would be pronounced as za. To implement such practice, we would again run into challenge that we are facing with క్ష. Since nukta is not part of official language, any letter with a nukta below will become a combination of at least two distinct Unicode characters. Hence, I want to avoid hardcoding.