How to deal with variable length strings in Keyman Developer?

I am working on incorporating Tamil numbers into a keyboard layout using Keyman Developer, and I’ve implemented the following Keyman Developer code:

c *** numbers
store(numberKeys) '123456789'
c *** 1-௧ 2-௨ 3-௩ 4-௪ 5-௫ 6-௬ 7-௭ 8-௮ 9-௯
store(tamilNumbers) U+0BE7 U+0BE8 U+0BE9 U+0BEA U+0BEB U+0BEC U+0BED U+0BEE U+0BEF
c *** 0-௦ 10-௰ 100-௱ 1000-௲
store(tamilTens) U+0BE6 U+0BF0 U+0BF1 U+0BF2

c adding 0
'`' + '0' > '௦'
'`' + any(numberKeys) > index(tamilNumbers, 2)

c adding 10, 100, 1000
'௧' + '0' > '௰'
any(tamilNumbers) + '0' > index(tamilNumbers, 1) '௰'
'௰' + '0' > '௱'
'௱' + '0' > '௲'
'௲' + '0' > '௰௲'
'௰௲' + '0' > '௱௲'
'௱௲' + '0' > '௲௲'

any(tamilNumbers) + any(numberKeys) > context index(tamilNumbers, 2)
any(tamilTens) + any(numberKeys) > index(tamilTens, 1) index(tamilNumbers, 2)

The Tamil numeral system does not work by simply writing the digits one after another, so I use a backtick prefix to switch into Tamil numerals. For example, `0 produces Tamil zero, `1 produces Tamil 1, and so on. To write a number such as 254 in Tamil, it needs to be expressed as 200 50 4. My code handles this correctly: typing `200504 gives ‘௨௱௫௰௪’. Similarly, typing `1000100103 produces ‘௲௱௰௩’ for 1113, which is the expected result.

The issue arises when we need to represent numbers greater than 10^3. In Tamil, symbols such as ‘௰௲’ (10^4), ‘௱௲’ (10^5), ‘௲௲’ (10^6), ‘௰௲௲’ (10^7), ‘௱௲௲’ (10^8), ‘௲௲௲’ (10^9), ‘௱௲௲௲’ (10^11), ‘௰௲௲௲௲’ (10^13), ‘௲௲௲௲௲’ (10^15), ‘௱௲௲௲௲௲’ (10^17), ‘௰௲௲௲௲௲௲’ (10^19), ‘௲௲௲௲௲௲௲’ (10^21), ‘௰௲௲௲௲௲௲௲௲’ (10^25), and so on are used.

In essence, I need to automate the process of handling variable length strings. When there is a sequence of ‘௲’ characters,
for example, ‘something+௲௲…’, and I press the ‘0’ key, it should change to ‘something+௰௲௲…’.
If I have ‘something+௰௲௲…’ and press ‘0’, it should become ‘something+௱௲௲…’,
and if I have ‘something+௱௲௲…’ and press ‘0’, it should transform into ‘something+௲௲௲…’.

My question is, how can I automate this process of handling variable-length strings in my Keyman Developer code?

I don’t believe we support variable-length handling directly, or any way to temporarily shift the caret position… but you should still be able to define something useful for the vast majority of cases.

Here’s my first-pass stab at it:

c *** numbers
store(numberKeys) '123456789'

c *** 1-௧ 2-௨ 3-௩ 4-௪ 5-௫ 6-௬ 7-௭ 8-௮ 9-௯
store(tamilNumbers) U+0BE7 U+0BE8 U+0BE9 U+0BEA U+0BEB U+0BEC U+0BED U+0BEE U+0BEF

c *** 0-௦ 10-௰ 100-௱ 1000-௲
store(tamilTens)     U+0BE6 U+0BF0 U+0BF1 U+0BF2
store(tamilTensNext) U+0BF0 U+0BF1 U+0BF2 U+0BF2

c [your rules here]

c Trigger special handling of ௲ + 0 first...
c https://help.keyman.com/developer/language/reference/use
U+0BF2 + '0' > U+0BF2 U+0BE6 use(tensShifter)
c Then rotation of the 'tens' symbol when not already ௲.
any(tamilTens) + '0' > index(tamilTensNext, 1)


group(tensShifter)

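c Rules are sorted by context length (longest first), then by line order:
c https://help.keyman.com/developer/language/guide/rules#rule-order
c > Rules have a special order.  They are ordered first by length of context, with longest context first, and then by line order...
c any(tamilTens) also matches ௲, so within each context length the all-௲ rule
c has to come before the any(tamilTens) rule. Otherwise ௲௲௦ would match
c any(tamilTens) ௲ ௦ and come back unchanged as ௲௲.

c Context length 2: ௲ ௦ > ௰௲
         U+0BF2 U+0BE6 \
> U+0BF0 U+0BF2

c Context length 3: ௲௲ ௦ > ௰௲௲ first, then rotate the symbol before one ௲.
         U+0BF2 U+0BF2 U+0BE6 \
> U+0BF0 U+0BF2 U+0BF2

  any(tamilTens)          U+0BF2 U+0BE6 \
> index(tamilTensNext, 1) U+0BF2

c Context length 4: ௲௲௲ ௦ > ௰௲௲௲ first, then rotate the symbol before two ௲.
         U+0BF2 U+0BF2 U+0BF2 U+0BE6 \
> U+0BF0 U+0BF2 U+0BF2 U+0BF2

  any(tamilTens)          U+0BF2 U+0BF2 U+0BE6 \
> index(tamilTensNext, 1) U+0BF2 U+0BF2

c Context length 5: rotate the symbol before three ௲.
  any(tamilTens)          U+0BF2 U+0BF2 U+0BF2 U+0BE6 \
> index(tamilTensNext, 1) U+0BF2 U+0BF2 U+0BF2

c ... and so on.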

While I don’t believe you can get truly variable-length string processing… you can define a pair of rules for each order of magnitude following this pattern within group(tensShifter) as I’ve defined it here. I’d imagine the cases where you run into 10^30 and beyond to be extremely rare, let alone 10^100 or beyond… so just add pairs of rules deep enough for all practical use cases. That said, there may be a limit on how deep you can go in that regard, though I don’t see anything documented for this on the help site if so.
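As an illustration of extending the pattern by one more level, the next pair (four U+0BF2) might look like the sketch below. It assumes the tamilTens and tamilTensNext stores defined earlier and belongs inside group(tensShifter). One caveat worth checking in your own testing: since rules of equal context length are tried in line order and any(tamilTens) also matches ௲, the all-௲ rule here probably needs to sit above the three-U+0BF2 any(tamilTens) rule, which has the same context length.

```keyman
c With four U+0BF2 (sketch only; continues the pattern in group(tensShifter))...
c ௲௲௲௲ ௦ > ௰௲௲௲௲
         U+0BF2 U+0BF2 U+0BF2 U+0BF2 U+0BE6 \
> U+0BF0 U+0BF2 U+0BF2 U+0BF2 U+0BF2

c ௰௲௲௲௲ ௦ > ௱௲௲௲௲, and ௱௲௲௲௲ ௦ > ௲௲௲௲௲
  any(tamilTens)          U+0BF2 U+0BF2 U+0BF2 U+0BF2 U+0BE6 \
> index(tamilTensNext, 1) U+0BF2 U+0BF2 U+0BF2 U+0BF2
```

Each further level adds one more U+0BF2 to both the context and the output of the two rules.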

It may also be worth taking a look at what other, already-submitted Tamil keyboards have done in this regard: Keyman for Tamil99.

I see that a few of them don’t make full use of the shift layer, so adding a key that can be used directly for such very long numbers may make more sense. Granted, I’m not a Tamil speaker, but I imagine shifting layer once and tapping 33 times is better than mashing the 0 key 100 times to represent 10^100… and it would provide a workaround for the rare cases that do need numbers larger than whatever your stopping point is for the rule pattern suggested above.
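A rough sketch of that idea is a single rule that binds a spare key combination to ௲, so it can be tapped repeatedly. The choice of Shift+0 is purely an assumption for illustration and is not taken from any existing Tamil keyboard. The rule reuses the tamilTens store from the code above and would sit in the main group that uses keys.

```keyman
c Hypothetical binding, for illustration only:
c Shift+0 appends another ௲ after an existing Tamil multiplier symbol.
any(tamilTens) + [SHIFT K_0] > context U+0BF2
```

Because the rule requires a tamilTens character in the context, Shift+0 would keep its usual output everywhere else.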
