Importing CSV file


#1

I have a dictionary in CSV format that I’d like to build an app for. Is there a simple way to get this into LIFT format in any of the Windows apps? If not, I guess I’ll have to build the XML file based on the standard found on GitHub. Many thanks in advance for the help.

p.s. this is for Inuttitut of Nunavik, Canada


#2

If Standard Format is an acceptable stop on your journey from CSV, check out SheetSwiper, which converts a simple spreadsheet to Standard Format. I.e., you’d open your CSV in a spreadsheet program, add a header row with labels, and then send the spreadsheet through SheetSwiper. The resulting SFM file can be imported into FLEx, which can export LIFT.

SFM looks like this:

\lx aidano
\ps part
\ge maybe
\de a semi-polite refusal to answer a question
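
For anyone who would rather script this step, here is a minimal Python sketch of the same idea. It is not SheetSwiper and does not claim to match its behaviour; it simply assumes the header row of the CSV already holds the SFM markers without the backslash (`lx`, `ps`, `ge`, `de`) and that each row is one complete entry. The function name `csv_to_sfm` is made up for illustration.

```python
import csv
import io


def csv_to_sfm(csv_text):
    """Turn CSV text into SFM text.

    Assumption: the header row contains the SFM markers without the
    backslash, and every data row is one self-contained entry.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines = []
        for marker, value in row.items():
            if marker is None:
                # Extra cells beyond the header row have no marker; skip them.
                continue
            value = (value or "").strip()
            if value:
                # Empty cells produce no field at all.
                lines.append(f"\\{marker} {value}")
        if lines:
            records.append("\n".join(lines))
    # A blank line separates entries.
    return "\n\n".join(records) + "\n"


sample = (
    "lx,ps,ge,de\n"
    "aidano,part,maybe,a semi-polite refusal to answer a question\n"
)
print(csv_to_sfm(sample))
```

Run on the sample row, this prints the four-line record shown above. Real data with multiple senses per entry needs more care than this flat one-row-per-entry layout.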

#3

awesome, I will try that out. Peace!


#4

I have a similar issue, but I am a Linux user (so while FieldWorks is indeed available, I am looking for a way to achieve what SheetSwiper would have done). Could you give me any advice on how to reach the same goal, i.e. obtaining a Standard Format file from an xlsx or csv?


#6

Sorry to anyone who saw my mangled message. I guess it doesn’t work to send a message with images via email!

Make a backup of your lexicon before doing this!!!

Open a CSV file in LibreOffice (LO) Calc and insert an extra column before each data column, then fill the new columns with the standard format marker that applies to the data column that follows. Here’s a sample (containing totally made-up data, sorry). You will have to determine which SF markers apply to each column of the table. Delete any heading row so you don’t get a lexicon entry containing the headings, or leave the SF code cells blank in the heading row, as here:

My spreadsheet here will have problems if I fill in an entire column with an SF marker. Can you see the problem? If I enter a ‘\lx’ on every line, the extra senses that should be attached to word2 and word3 will become separate entries, not connected to the correct lexeme, and those blank lexeme fields will be floating. You must know how your lexical database is structured; if it’s a complex one this “simple” conversion method might not work.
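
To make the blank-lexeme problem concrete, here is a rough Python sketch of one way to avoid it: emit a marker only when its cell actually contains something, and start a new entry only when the lexeme cell is filled. The column names, the `Sense` column, and the `COLUMN_MARKERS` mapping are assumptions for illustration; adjust them to your own table.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column names to SF markers.
COLUMN_MARKERS = {
    "Lexeme": "lx",
    "Sense": "sn",
    "Part of Speech": "ps",
    "Gloss": "ge",
}


def rows_to_sfm(csv_text):
    """Convert rows to SFM; rows with a blank Lexeme continue the previous entry."""
    out = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, marker in COLUMN_MARKERS.items():
            value = (row.get(column) or "").strip()
            if not value:
                # No data in this cell, so no marker: this is what keeps
                # extra senses from turning into empty \lx entries.
                continue
            if marker == "lx" and out:
                out.append("")  # blank line before each new entry
            out.append(f"\\{marker} {value}")
    return "\n".join(out) + "\n"


sample = """Lexeme,Sense,Part of Speech,Gloss
word1,,N,one
word2,1,N,two
,2,,twins
,3,V,double
word3,1,N,three
"""
print(rows_to_sfm(sample))
```

With this made-up sample, senses 2 and 3 stay under word2 because their rows have no lexeme, so no `\lx` line is written for them.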

So, when you have each SF element marked, save the file as CSV again, then open the CSV in Writer. Do a Replace in Writer like this. Be sure to tick the Regular expressions box:

I ended up with what’s below. Then I replaced all the commas with a space (but be careful if you have commas in your lexicon fields; you can make a regex for that but I don’t have time to play with it at the moment). Then remove extra spaces with Find: ‘ {1,}’ (a space is the first character) and Replace with a single space (keep Regular expressions ticked in the Find box).

Delete the column header information if it’s still there, and save the resulting SF file as Text Only.

,Lexeme,Part of Speech,Gloss,Example Ref,Example Vern.,Example En.

\lx,word1,
\ps,N,
\ge,one,
\xr,BoatStory 1.2,
\xv,We untied the word1.,
\xe,Etc …

\lx,word2,
\sn,1,
\ps,N,
\ge,two,
\xr,
\xv,
\xe,
,
\sn,2,
\ge,twins,
,
\sn,3,
\ps,V,
\ge,double,

\lx,word3,
\sn,1,
\ps,N,
\ge,three,
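
If commas inside your lexicon fields are a worry, the last step can also be done with a short script instead of find-and-replace. This is only a sketch: it assumes the intermediate file looks like the listing above (marker in the first cell, data in the second) and that any field containing a comma is wrapped in double quotes, as spreadsheet programs generally do when saving CSV. The function name `marker_csv_to_sfm` and the example sentence are made up. Note that, unlike the manual method, it drops markers that have no data.

```python
import csv
import io


def marker_csv_to_sfm(text):
    """Convert 'marker,value,' CSV lines into SFM lines.

    The csv module handles quoted fields, so commas inside a value survive.
    """
    out = []
    for cells in csv.reader(io.StringIO(text)):
        cells = [c.strip() for c in cells]
        if not cells or not cells[0].startswith("\\"):
            # Blank lines, ',' spacer rows, and the heading row carry no marker.
            continue
        marker = cells[0]
        value = " ".join(c for c in cells[1:] if c)
        if not value:
            continue  # design choice: skip markers with no data
        if marker == "\\lx" and out:
            out.append("")  # blank line between entries
        out.append(f"{marker} {value}")
    return "\n".join(out) + "\n"


sample = r''',Lexeme,Part of Speech,Gloss

\lx,word1,
\ps,N,
\xv,"We untied the word1, then left.",
\xr,
,
\lx,word2,
\ge,two,
'''
print(marker_csv_to_sfm(sample))
```

Save the printed result as a plain text file, the same as you would after the Writer steps.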


#7

Thank you very much, incredibly detailed and helpful answer!