Lexical Model - Customised Search Key term - Is it a benefit?

Dear Lexical Model experts,

I am trying to get my head around whether touch-screen keyboard users would benefit from a customised search-term-to-key function. (I have copied the help file below.)

From what I understand, it is helpful with non-composed characters, like:
U+025B + U+0301 (ɛ́) and several more like … ŋ́, Ɛ́, etc. In the language I am working on, these are very frequent.

What do you advise me to do?

Thanks in advance for any help you can give,
Bart.

PS : https://help.keyman.com/developer/13.0/guides/lexical-models/advanced/search-term-to-key

Search term to key

To look up words quickly, the trie model creates a search key: it takes the latest word (as determined by the word breaker) and converts it into an internal form. The purpose of this internal form is to make searching for a word work as expected, regardless of things such as accents, diacritics, letter case, and minor spelling variations. The internal form is called the key. Typically, the key is in lowercase and lacks all accents and diacritics. For example, the key for “naïve” is naive and the key for “Canada” is canada.

The form of the word that is stored is “regularized” through the use of a key function, which you can define in TypeScript code.

Note: this function runs both on every word when the wordlist is compiled and on the input, whenever a suggestion is requested. This way, whatever a user types is matched to something stored in the lexical model, without the user having to type things in a specific way.

The key function takes a string, which is the raw search term, and returns a new string, which is the “regularized” key. As an example, consider the default key function, that is, the key function that is used if you do not specify one.
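A key function with the behaviour described above (lowercase, no accents or diacritics) could be sketched in TypeScript roughly as follows. This is an illustration under that description, not a verbatim copy of the shipped default; the name `searchTermToKey` follows the help page's URL, and the constant name is made up for the example.

```typescript
// Sketch of a key function that lowercases a term and strips diacritics.
// Assumption: illustrative only, not the exact default shipped with Keyman.
function searchTermToKey(term: string): string {
  // Matches the Combining Diacritical Marks block (U+0300 to U+036F).
  const COMBINING_DIACRITICAL_MARKS = /[\u0300-\u036f]/g;

  // Lowercase each code point individually, which sidesteps
  // context-sensitive lowercasing rules in some scripts.
  const lowercased = Array.from(term).map(c => c.toLowerCase()).join('');

  // NFKD splits letters from their combining marks, e.g. "ï" -> "i" + U+0308.
  const decomposed = lowercased.normalize('NFKD');

  // Remove the combining marks, leaving only the base letters.
  return decomposed.replace(COMBINING_DIACRITICAL_MARKS, '');
}

console.log(searchTermToKey('naïve'));  // prints "naive"
console.log(searchTermToKey('Canada')); // prints "canada"
```

Because the same function is applied to the wordlist and to the typed input, both sides are reduced to the same key before they are compared.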

Hi @Bart_Eenkhoorn!

I am the designer of the search-term-to-key functionality.

First I would try it without a search-term-to-key function. By default, it already makes assumptions about letter casing and stripping diacritics, which makes typing in certain orthographies much quicker! We have written the default to hopefully address many languages’ requirements.

Once you test using the default and find that it is not matching words the way you expect, only then would I customize the search-term-to-key function.

Does this make sense?

Hello Eddie, thanks for replying. OK, sure, we will try the basics first. But if search-term-to-key functions like a fifth gear for us, I would like to know how to engage it instead of staying in 4th gear all the time. :slightly_smiling_face:

Sure! Does this mean improving the documentation? We can give general guidelines of how to write your own function. Would that help?

Hi Eddie. I don’t know if extra documentation would help. I would need to understand what I am doing :slightly_smiling_face:
For instance, when typing the letter ɛ, I am presented with ɛ́ɛ́ and rɔ as alternatives by the model. I would have expected ɛ́ɛ́ or ɛ̀ɛ̀, and not rɔ. Can the function you are talking about take care of that?

Hi Bart,

“For instance, when typing the letter ɛ I am presented with ɛ́ɛ́ and rɔ as alternatives by the model.”

Honestly, that surprises me!

“I would have expected ɛ́ɛ́ or ɛ̀ɛ̀ […]”

Correct; that is what should happen.

“[…] and not rɔ.”

This is the weird part that should not happen.


I honestly don’t know what’s going on in this case, but yes, the default search-term-to-key should be able to handle “ɛ́ɛ́” and “ɛ̀ɛ̀”.
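To make the expectation concrete, here is a rough TypeScript sketch of why a lowercase-and-strip-diacritics key would match those two words when ɛ is typed. The helpers `keyOf` and `sharesKeyPrefix` are hypothetical names for this example, and the prefix check is a simplification: the actual trie lookup and the corrections feature do more than this.

```typescript
// Hypothetical helper: lowercase, decompose (NFKD), drop combining marks.
// Assumption: approximates the default key behaviour described in this thread.
function keyOf(term: string): string {
  return term.toLowerCase().normalize('NFKD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical helper: does the word's key begin with the typed text's key?
function sharesKeyPrefix(typed: string, word: string): boolean {
  return keyOf(word).startsWith(keyOf(typed));
}

const typed = '\u025B';                    // ɛ
const acute = '\u025B\u0301\u025B\u0301';  // ɛ́ɛ́ (U+025B + U+0301, twice)
const grave = '\u025B\u0300\u025B\u0300';  // ɛ̀ɛ̀ (U+025B + U+0300, twice)
const other = 'r\u0254';                   // rɔ

console.log(keyOf(acute) === keyOf(grave));  // prints true: both keys are "ɛɛ"
console.log(sharesKeyPrefix(typed, acute));  // prints true
console.log(sharesKeyPrefix(typed, grave));  // prints true
console.log(sharesKeyPrefix(typed, other));  // prints false
```

Under these assumptions, rɔ would not be reached through the key alone, so something other than key matching would have to be producing it.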

Feel free to DM me so you can send me your model files, and I can try to figure out what’s going wrong!

This could be suggesting a correction (“fat-finger”). Try turning off ‘corrections’ in the Keyman app prediction settings.

OK, I can try. Can you tell me in which file I can make the change? Can you give an example?

In the Keyman app, go to Settings, Installed Languages, [your language], and then turn off Enable corrections.

If you are using Keyboard App Builder, I’m not sure if that setting is exposed or not.

OK thanks, I’ll try it

