Bloom 4.6 Beta update (Fall 2019):
See post below for alternate method.
As of Bloom 4.0, the built-in way to make a talking book is to record one sentence at a time, from within the Bloom Talking Book Tool. In some situations, you can get better-sounding results by just recording the whole book in some recording software, then adding that audio to your Bloom book.
If you need to do this, but the following is too complicated/technical, we could build this into Bloom and make it much easier. If that’s what you need, consider adding a Feature Request (or voting on one if someone already did it).
Prepare the Bloom book
a. Open the book in Bloom
b. Open the Talking Book Tool
c. Step through each page that is covered by the recording, without recording anything. Skip pages that are not in the recording. (This allows Bloom to segment the pages and assign ids to spans.)
Run the Bloom HTML through Aeneas
Configure Aeneas for your language, then tell it to process the Bloom HTML file together with your recording and produce a json file. Use a command line like this (with the <> bits replaced):
python -m aeneas.tools.execute_task "<path to your audio file>.wav" "<path to the html file for your book>.htm" "task_language=<your language code>|is_text_type=unparsed|is_text_unparsed_id_regex=i?[0-9a-f]+|is_text_unparsed_class_regex=^audio-sentence$|os_task_file_format=json|is_text_unparsed_id_sort=unsorted" "<path to where you want the json>.json"
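If you run this step for several books, the long option string is easy to mistype. Here is a minimal Python sketch that assembles the same options and passes them to the same aeneas.tools.execute_task module. The function names build_config and run_aeneas are made up for this illustration. The option keys and values are copied from the command above. It assumes that python on your path is the one where Aeneas is installed.

```python
import subprocess


def build_config(language_code, output_format="json"):
    """Return the Aeneas option string from the command above."""
    options = [
        ("task_language", language_code),
        ("is_text_type", "unparsed"),
        ("is_text_unparsed_id_regex", "i?[0-9a-f]+"),
        ("is_text_unparsed_class_regex", "^audio-sentence$"),
        ("os_task_file_format", output_format),
        ("is_text_unparsed_id_sort", "unsorted"),
    ]
    return "|".join(f"{key}={value}" for key, value in options)


def run_aeneas(audio_path, html_path, language_code, output_path):
    """Run the same command as above; raises if Aeneas reports failure."""
    command = [
        "python", "-m", "aeneas.tools.execute_task",
        audio_path, html_path, build_config(language_code), output_path,
    ]
    # A list (rather than one shell string) avoids quoting problems
    # with the | and ^ characters in the option string.
    subprocess.run(command, check=True)


# Example call (placeholders, replace with your own paths and language code):
# run_aeneas("book.wav", "book.htm", "eng", "book.json")
```

Calling build_config with a language code returns exactly the third argument of the command line above, so you can print it and compare before running anything.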
Convert the json into an audacity labels file.
Currently this requires a little stand-alone command-line program we wrote, and an accompanying dll. You can get them from here. Put both files in the same folder, and either add that folder to your path or make it your current directory. Use a command line like this:
AeneasToAudacityIdLabels <path to the json from step 2>.json .txt
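To make it clearer what this conversion involves, here is a rough Python sketch of the same idea. It is not the AeneasToAudacityIdLabels program and may differ from it in details. It assumes that the Aeneas json has a top-level "fragments" list whose entries carry "begin", "end" and "id", and that an Audacity labels file is one line per label with start time, end time and label text separated by tabs. The function names are hypothetical.

```python
import json


def aeneas_json_to_labels(json_text):
    """Turn Aeneas json text into Audacity label lines (start, end, id)."""
    sync_map = json.loads(json_text)
    lines = []
    for fragment in sync_map["fragments"]:
        # Assumed: Aeneas stores the times as strings of seconds.
        begin = float(fragment["begin"])
        end = float(fragment["end"])
        # The span id becomes the label, so each exported file is named after its id.
        lines.append(f"{begin:.6f}\t{end:.6f}\t{fragment['id']}\n")
    return "".join(lines)


def convert_file(json_path, labels_path):
    """Read the json file and write the labels file next to it (or anywhere)."""
    with open(json_path, encoding="utf-8") as source:
        labels = aeneas_json_to_labels(source.read())
    with open(labels_path, "w", encoding="utf-8") as target:
        target.write(labels)


# Example call (placeholders):
# convert_file("book.json", "book-labels.txt")
```

The important design point is that the label text is the span id rather than the sentence text, since Bloom has to match each exported audio file to a span by name.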
Split up the audio file into one file per segment:
a. Run Audacity, open your audio file.
b. Import the labels file created by Aeneas.
c. Choose File/Export multiple. You seem to have to click OK for each sentence, which is a pain…there may be a way to turn that off. Save the exported files in a folder called audio inside the book folder.
Open Bloom and check things in the talking book tool.
a. You can re-record any sentence that didn’t come out right.
b. Pay particular attention to the book title. I found that Aeneas made three wav files for the title, two more-or-less empty and one good one. All three have the same name, except that two of them have -2 and -3 added. Keep the good one and, if necessary, rename it to remove the extra bit.
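If the book has many sentences, hunting for those extra -2 and -3 files by eye gets tedious. The following Python sketch only lists them, together with their sizes; it does not rename or delete anything. The function name is made up, and it assumes that a genuine file name never ends in a dash followed by one or two digits.

```python
import re
from collections import defaultdict
from pathlib import Path


def find_numbered_duplicates(audio_folder):
    """Group files such as name.wav, name-2.wav, name-3.wav under 'name'.

    Returns {base name: [(file name, size in bytes), ...]} for every base
    name that has at least one '-<number>' variant.
    """
    groups = defaultdict(list)
    bases_with_variants = set()
    for path in Path(audio_folder).iterdir():
        if not path.is_file():
            continue
        match = re.fullmatch(r"(.+)-(\d{1,2})", path.stem)
        base = match.group(1) if match else path.stem
        if match:
            bases_with_variants.add(base)
        groups[base].append((path.name, path.stat().st_size))
    return {base: sorted(groups[base]) for base in bases_with_variants}


# Example call (placeholder path):
# for base, files in find_numbered_duplicates("MyBook/audio").items():
#     print(base, files)
```

The smallest files in a group are likely to be the more-or-less empty ones, but listen to each before deciding which to keep.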
There are various ways we can streamline this process…this is just a starting point. One is that step 3 (converting the json into an Audacity labels file) can probably be omitted. Try changing the Aeneas command in step 2 to:
python -m aeneas.tools.execute_task "<path to your audio file>.wav" "<path to the html file for your book>.htm" "task_language=<your language code>|is_text_type=unparsed|is_text_unparsed_id_regex=i?[0-9a-f]+|is_text_unparsed_class_regex=^audio-sentence$|os_task_file_format=tsv|is_text_unparsed_id_sort=unsorted" "<path to where you want the audacity labels file>.txt"
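Since this shortcut is untested, it may be worth checking the resulting file before importing it into Audacity. The Python sketch below does a basic sanity check. It assumes that the tsv output has three tab-separated fields per line (start time, end time, span id), which is also the layout assumed here for an Audacity labels file. The function name check_labels is hypothetical.

```python
def check_labels(labels_text):
    """Return a list of problems found in the labels text (empty if none)."""
    problems = []
    for number, line in enumerate(labels_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append(
                f"line {number}: expected 3 tab-separated fields, found {len(fields)}"
            )
            continue
        begin_text, end_text, label = fields
        try:
            begin, end = float(begin_text), float(end_text)
        except ValueError:
            problems.append(f"line {number}: start or end is not a number")
            continue
        if end < begin:
            problems.append(f"line {number}: end time is before start time")
        if not label.strip():
            problems.append(f"line {number}: missing span id")
    return problems


# Example call (placeholder path):
# with open("book-labels.txt", encoding="utf-8") as labels_file:
#     print(check_labels(labels_file.read()))
```

An empty list means every line has the expected shape. It does not tell you whether Aeneas aligned the sentences well, so you still need to check things in the Talking Book Tool.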