Word order in Lexical-Models

I wrote a lexical model for the Esperanto language. Among others, it contains the following words (the number on the right is the number of occurrences of the word in the source text):

manĝi 180
manĝas 138
manĝaĵo 129
manĝis 103
manĝos 53
manĝu 31
manĝo 30

But when I type “manĝ” on the Keyman Developer Keyboard Test Site, the suggestions are “manĝi”, “manĝaĵo”, “manĝo”, in that order, instead of “manĝi”, “manĝas”, “manĝaĵo” as the frequencies above would imply. Why?


That certainly sounds odd. Would it be possible for you to send us a copy of your model so far so that we can attempt to reproduce the problem?

Also, which keyboard were you using when testing the model?

I attempted to reproduce your problem with just the information you’ve provided here, but…

The order stays the same once I type manĝ. (Raw keystrokes: mang, then x to add the ^.)

I get this both with our current 17.0.238-alpha release and the 16.0.144-stable release. (You’re more likely using the latter, but I figured I’d check with both.)

The words and frequencies you provided above were the only data in my test model.

The easiest way for the problem above to occur would be if those words appear multiple times in the wordlist and the sum of the listed frequencies for each word across all appearances lined up to the order you’ve noticed. That said… you probably wouldn’t be here asking the question if this were the case; I’d imagine you’d have noticed if the words had duplicate entries.
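To make the duplicate-entry scenario concrete, here is a minimal Python sketch. It is not Keyman's actual implementation, only an illustration of how summing counts per word could reorder suggestions. The function name `rank_suggestions` and the two duplicate entries (manĝaĵo 20, manĝo 115) are hypothetical and made up purely for this example; only the first seven word/count pairs come from the post above.

```python
from collections import Counter

def rank_suggestions(entries, prefix, limit=3):
    """Sum counts per word, then return the top matches for a prefix.

    Illustrative only: this is NOT how Keyman ranks suggestions internally.
    """
    totals = Counter()
    for word, count in entries:
        totals[word] += count  # duplicate entries accumulate here
    matches = [(w, c) for w, c in totals.items() if w.startswith(prefix)]
    # Highest total first; ties broken alphabetically for determinism.
    matches.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in matches[:limit]]

# Word/count pairs as listed in the original post.
wordlist = [
    ("manĝi", 180),
    ("manĝas", 138),
    ("manĝaĵo", 129),
    ("manĝis", 103),
    ("manĝos", 53),
    ("manĝu", 31),
    ("manĝo", 30),
]

# HYPOTHETICAL duplicate entries, invented for this illustration only.
with_duplicates = wordlist + [("manĝaĵo", 20), ("manĝo", 115)]

clean_order = rank_suggestions(wordlist, "manĝ")
duplicate_order = rank_suggestions(with_duplicates, "manĝ")
print(clean_order)
print(duplicate_order)
```

With the list as posted, the top three follow the listed frequencies. With the made-up duplicates, manĝaĵo totals 149 and manĝo totals 145, so both overtake manĝas (138), which matches the order described in the question.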

Past that… all I can really do is speculate, since I can’t reproduce the issue you’re having.

If you can get me a copy of the model source and the keyboard (or just the keyboard’s ID, if it’s one we already distribute), that should allow me to reproduce the issue and get you a better answer.

A screenshot of the issue may also help - note how you can see which keyboard I’m using (esperuni) in the screenshot above.

@joshua_horton Thank you for your prompt reply. I rebuilt everything and now it works. For reference, I used the “Esperanto Plus” keyboard and this “Esperanto” lexical model.

The words of the lexical model are taken from this list. Do you know if there is a better one somewhere?

av.eo.esperanto.model.js.zip (49.3 KB)

We’re glad the issue is resolved for you, @anvaolon.

We looked at several websites for word lists similar to the one you used:

  1. 1000 Most Common Esperanto Words
  2. 6000 Most Frequent Esperanto Words (Requires an additional application, Anki, to run)

We hope these help you add more words to the lexical model. This topic is resolved.
Thank you!

@mengheng Thank you.

