It’s not clear what your issue here actually is. Why do you want to avoid wordbreaking on a space? Surely you only want that to occur occasionally, not all the time?
Keep in mind that if it worked “correctly”, always joining words when there was an intervening space, each sentence above would be “just one word”.
To be extra clear:
“It’s not clear what your issue here actually is” - the system would treat this as just one word.
“Why do you want to avoid wordbreaking on a space” - same here - just one word.
The sentence-ending punctuation would likely keep the sentences from merging, as would commas, which is why I left them out of the quoted text in this example.
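To make the point above concrete, here is a rough TypeScript sketch. It is not Keyman's actual word breaker; `breakOnSpaces` and `breakOnPunctuationOnly` are hypothetical helpers made up for this post, and the only thing they show is how the token count changes once spaces stop counting as word boundaries.

```typescript
// Hypothetical helper: treat any run of whitespace as a word boundary.
function breakOnSpaces(text: string): string[] {
  return text.split(/\s+/).filter((w) => w.length > 0);
}

// Hypothetical helper: never break on spaces, only on sentence-ending
// punctuation and commas (the behaviour being asked about).
function breakOnPunctuationOnly(text: string): string[] {
  return text
    .split(/[.?!,]+/)
    .map((w) => w.trim())
    .filter((w) => w.length > 0);
}

const sample =
  "It's not clear what your issue here actually is. " +
  "Why do you want to avoid wordbreaking on a space?";

const spaced = breakOnSpaces(sample);
const unspaced = breakOnPunctuationOnly(sample);

console.log(spaced.length);   // number of space-separated words
console.log(unspaced.length); // number of "words" when spaces never break
console.log(unspaced);
```

With spaces as boundaries you get ordinary words that a wordlist can cover. Without them, each whole sentence becomes a single "word", and a wordlist would have to contain every sentence you might ever type.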
I seriously doubt that this is what you actually want. We expect you to give us a wordlist, not a “sentence list,” after all. A “sentence list” would be much harder for you to provide… and also way less performant or usable.
If this somehow is what you actually want… well, our models aren’t designed to work that way. It’s a limitation of the current system.
If you’re just looking to have a few small phrases appear as predictions - say, “tha” predicting “thank you” - that should be fine to include in the word list. It’ll only appear as a prediction when typing “thank” due to current limitations, but if it is high-frequency enough, “thank you” should actually appear as a prediction.
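As a rough illustration of why frequency matters here, the TypeScript sketch below ranks wordlist entries that start with the typed prefix by their count. This is only an assumption-laden toy, not how Keyman's predictive text engine actually works; the `Entry` type, the `predict` function, and all of the counts are invented for the example.

```typescript
// Hypothetical wordlist entry: the text to suggest plus a frequency count.
type Entry = { text: string; count: number };

// Made-up counts, purely for illustration.
const wordlist: Entry[] = [
  { text: "that", count: 900 },
  { text: "thank you", count: 500 },
  { text: "thanks", count: 300 },
  { text: "than", count: 200 },
];

// Toy predictor: keep entries beginning with the prefix, sort by
// descending count, and return at most `limit` suggestions.
function predict(prefix: string, entries: Entry[], limit: number): string[] {
  return entries
    .filter((e) => e.text.slice(0, prefix.length) === prefix)
    .sort((a, b) => b.count - a.count)
    .slice(0, limit)
    .map((e) => e.text);
}

const top = predict("tha", wordlist, 3);
console.log(top);
```

In this toy, the phrase entry “thank you” makes the top three only because its count is high relative to the other “tha…” entries. Give it a low count and it drops off the suggestion list.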
I have just uploaded my Taiwanese model.
Although this model is not perfect yet, it is a big step forward for Keyman models. Because there are so few existing Keyman lexical models, every new model is progress.
I have tried my best to program the resource file.
I hope Keyman lexical models will keep getting better. It is 2025, but to me the Keyman lexical model still feels like it is at the stage of 2004 to 2008: very rough. Keyman needs more engineers to develop the lexical model into a practical and effective tool.
I totally agree that lexical models need more work. For the last month or so, we have had one engineer working on lexical models, after several years of having no one available to work on them – because we have very limited resources. It’s going to take time to improve them – one engineer can only do so much.
We work with the resources we’ve got to implement functionality as best as we can – lexical models are just one small part of what we do. And without additional resources, we can’t just magically do the 17 years’ worth of development that we apparently need to add to Keyman to get lexical models to a ‘2025’ level.
By the way, what language are you using to code the Keyman lexical model resource file?
It seems like it is not an ordinary language such as Python or C, but a specialized one.
I also want to ask what export default means in the final statement.
Can I modify this statement so that it exports something else?