Software recommendation for a small-scale community dictionary development project

Dear all,

Which software and configuration would you recommend for the following setting?

  • I have a FieldWorks (FLEx) project on a language, in which I gloss texts and build a dictionary along the way, and to which I want to add the results of word collection by community members
  • there will be, for the time being, no larger dictionary or rapid word collection workshops
  • rather, I will provide one community member with a laptop, and he will occasionally do small word collection sessions on his own or with a few other speakers (mostly offline)
  • the working language is Portuguese (this includes the semantic fields and stimulus questions)
  • It would be ideal to be able to collect more than one form per lexical entry so as to determine word and inflection class membership; just one citation form may be ambiguous
  • The community member should be able to either upload the latest version easily or save it in a format that can be sent by email or uploaded on some cloud drive (or to The Language Depot, if that is not too technical)
  • Being able to add audio recordings of citation forms (and, in a second step, sample sentences) would be great
  • Version control is another issue – when integrating results in FLEx, one does not want to create lots of doubled entries…

I understand that WeSay would have been an appropriate tool, but it is outdated (inactive for more than 5 years); also, I was not able to install a Portuguese version of WeSay.

Browsing through the software catalogues at SIL, my guess is that nowadays “The Combine” may be the most appropriate tool, but does it meet the requirements (Portuguese version for elicitation, ideally also for the interface; offline use; possibly providing more than one word form)?

Or is there another tool, ideally compatible with FLEx, out there that would meet the requirements? I am open to third-party software if there are possible paths to integrate the results into the FLEx database without copying-and-pasting individual data fields.

Thanks in advance for your advice!
Seb

Hi Seb, I’m a software developer on the lexical tools team at SIL.

Based on your requirements, Fieldworks is the only tool that meets all of them. Using Language Depot (now called Lexbox) is easy once it’s been set up the first time, and if you’re also working in Fieldworks, that will make collaboration easier. That said, Fieldworks can be overwhelming when you just want to edit and add entries. Unfortunately, this means you’ll need to compromise.

The Combine meets many of your requirements, including Portuguese support and audio recordings; however, it only works online (there is an offline option for workshops, but that’s not practical in this situation).

Language Forge is another option. However, it’s more limited than The Combine; it is also online-only and does not have a Portuguese version. It does, however, have closer integration with Fieldworks, which allows two-way sync. Like WeSay, though, it’s outdated and is not being actively developed. My team is responsible for supporting Language Forge, and we’re focusing our efforts on Lexbox and Fieldworks Lite.

Fieldworks Lite is a new tool my team is working on. It’s currently in alpha, so it’s not yet ready for your community member. It’s designed as a companion tool to Fieldworks (and a WeSay replacement) that is easy to use and works on Windows, Mac, Android and iOS. It’s compatible with Fieldworks and will sync directly with your Fieldworks project via Lexbox. That said, it’s lacking some features that are important for you at the moment: it does not have Portuguese support, and it does not support audio yet (it will eventually). We’re hoping to have a beta in the next couple of months; I can let you know if you’re interested in checking it out.

A couple of questions. You mentioned:

collect more than one form per lexical entry so as to determine word and inflection class membership; just one citation form may be ambiguous

and

possibly providing more than one word form

Does Fieldworks serve this need with multiple writing systems? Or is there another way you might represent this in Fieldworks?

How would you rank these requirements in terms of importance for you or this project?

  • Audio recordings
  • Portuguese language in the interface
  • Mobile app (instead of using a laptop)
  • Ease of use
  • Offline use

Looking forward to your response, thanks.

Dear Kevin, all,

thanks a lot for your reply. I will answer below in between your text.

Based on your requirements, Fieldworks is the only tool that meets all of them. Using Language Depot (now called Lexbox) is easy once it’s been set up the first time, and if you’re also working in Fieldworks, that will make collaboration easier. That said, Fieldworks can be overwhelming when you just want to edit and add entries. Unfortunately, this means you’ll need to compromise.

I see. I will have some questions about the usage of Fieldworks in this endeavor, at the end.

The Combine meets many of your requirements, including Portuguese support and audio recordings; however, it only works online (there is an offline option for workshops, but that’s not practical in this situation).

Right, this is what I understood. I am surprised that SIL, which knows about the conditions in remote areas, develops software that can only be used online. Yes, the internet is becoming accessible in more and more remote areas (including in indigenous villages in Brazil), but it is, and will be for many years to come, too unreliable and often too slow to base your entire workflow on online access.

Language Forge is another option. However, it’s more limited than The Combine; it is also online-only and does not have a Portuguese version. It does, however, have closer integration with Fieldworks, which allows two-way sync. Like WeSay, though, it’s outdated and is not being actively developed. My team is responsible for supporting Language Forge, and we’re focusing our efforts on Lexbox and Fieldworks Lite.

Fieldworks Lite is a new tool my team is working on. It’s currently in alpha, so it’s not yet ready for your community member. It’s designed as a companion tool to Fieldworks (and a WeSay replacement) that is easy to use and works on Windows, Mac, Android and iOS. It’s compatible with Fieldworks and will sync directly with your Fieldworks project via Lexbox. That said, it’s lacking some features that are important for you at the moment: it does not have Portuguese support, and it does not support audio yet (it will eventually). We’re hoping to have a beta in the next couple of months; I can let you know if you’re interested in checking it out.

Absolutely, thanks. Sounds interesting.

A couple of questions. You mentioned:

collect more than one form per lexical entry so as to determine word and inflection class membership; just one citation form may be ambiguous

and

possibly providing more than one word form

Does Fieldworks serve this need with multiple writing systems? Or is there another way you might represent this in Fieldworks?

What I want to collect is not only the citation form of a word, but also one or two diagnostic inflected forms which reveal the shape of the underlying stem or lexeme, for which the citation form in Awetí is often ambiguous. I would not use different writing systems for that, but rather (mis-)use some note field.

How would you rank these requirements in terms of importance for you or this project?

From 0 (not relevant) to 5 (indispensable):

  • Audio recordings – 3
  • Portuguese language in the interface – 5
  • Mobile app (instead of using a laptop) – 4
  • Ease of use – 4
  • Offline use – 5

Looking forward to your response, thanks.

As to using FLEx for my project, I understand that with frequent synchronization with Lexbox I can always integrate my speaker colleague’s input into my project without creating duplicate entries. Fine.

I also know how to change the user interface to Portuguese; also fine.

I believe the correct “working area” and functionality would be “Lexicon > Collect Words”. There I get the general ontological field (semantic domain) and a description as well as the questions and examples.

Unfortunately, I am not allowed to send a screenshot here…

Here come some questions:

  1. Can I configure for ONLY the Portuguese content to be shown, instead of the English and Portuguese versions?
  2. In the word collection area, I tried to add a column for the grammatical class. But the field remains greyed out and inaccessible. What am I doing wrong?
  3. Earlier collected words are also greyed out and inaccessible, so there are no corrections possible? Or am I doing something wrong?
  4. I understand that there should be a way to collect audio recordings with FLEx (according to the feature list at lingtransoft.info/apps/flex-fieldworks-language-explorer), but how? Could that be part of the word collection process?

Thanks again, and in advance, for answering these questions. And let me know whether I should rather address them to the FLEx discussion list.

Sebastian

Sebastian,
I just wanted to make sure you know that the Dictionary-Making and Lexicography Courses have been translated into Portuguese. They may be helpful to the people you work with. The Portuguese courses are online; however, this venue is not allowing me to post the link.

Verna
DLS Coordinator

Here’s the link for the Portuguese courses that Verna mentioned: https://sites.google.com/sil.org/dls-course/home#h.ngwnu7jwmeoa

Thanks a lot for the link.
I will post the questions regarding FLEx in the FLEx forum, then.
Thanks again!
Sebastian

@sebdrude You described your offline word-collection need as follows:

The Combine may still be useful for that situation. I’ll email you more details about how The Combine can be hosted on a laptop for offline use. Once it is installed on a laptop, other nearby phones and computers can collect words in the same project.