How to make a Bloom Talking Book from a single mp3

JohnHatton · November 1, 2017, 5:20pm

UPDATE: Bloom 4.7+ can import recordings:
See post below for details.

Bloom 4.6 Beta update (Fall 2019):
See post below for alternate method.

Original post:
As of Bloom 4.0, the built-in way to make a talking book is to record one sentence at a time, from within the Bloom Talking Book Tool. In some situations, you can get better-sounding results by just recording the whole book in some recording software, then adding that audio to your Bloom book.

The following requires Aeneas, Bloom, and Audacity.

If you need to do this, but the following is too complicated/technical, we could build this into Bloom and make it much easier. If that’s what you need, consider adding a Feature Request (or voting on one if someone already did it).

1. Prepare the Bloom book
a. Open the book in Bloom.
b. Open the Talking Book Tool.
c. Step through the pages that are in the recording, without recording anything and skipping any pages that were not recorded. (This allows Bloom to segment the pages and assign ids to spans.)
2. Run the Bloom HTML through Aeneas
Configure Aeneas for your language and tell it to process the Bloom HTML file and your recording and make a json file. Use a command line like this (with the <> bits replaced):

python -m aeneas.tools.execute_task "<path to your audio file>.wav" "<path to the html file for your book>.htm" "task_language=<your language code>|is_text_type=unparsed|is_text_unparsed_id_regex=i?[0-9a-f]+|is_text_unparsed_class_regex=^audio-sentence$|os_task_file_format=json|is_text_unparsed_id_sort=unsorted" "<path to where you want the json>.json"
3. Convert the json into an Audacity labels file
Currently this requires a little stand-alone command-line program we wrote, and an accompanying dll. You can get them from here. Put both files in the same folder, and either add that folder to your path or make it your current directory. Use a command line like this:
AeneasToAudacityIdLabels <path to the json from step 2>.json .txt
4. Split up the mp3 into one file per segment
a. Run Audacity and open your audio file.
b. Import the labels file created in step 3.
c. File/Export multiple. You seem to have to click OK for each sentence, which is a pain…there may be a way to turn that off. Save the exports in a folder called audio in the book folder.
5. Open Bloom and check things in the Talking Book Tool
a. You can re-record any sentence that didn’t come out right.
b. Pay particular attention to the book title. I found that I ended up with three wav files for the title, two more-or-less empty and one good one. They have the same names except that two of them add -2 and -3. Keep the good one and rename it without the extra bit if necessary.
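
For anyone curious what the json-to-labels conversion involves, here is a rough Python sketch. It is not the AeneasToAudacityIdLabels program mentioned above, just an illustration. It assumes the Aeneas json has a top-level "fragments" list whose entries carry "begin", "end", and "id" fields, and that Audacity label files are plain text with one tab-separated start, end, and label per line. The function name and the sample data are made up for the example.

```python
import json


def aeneas_json_to_audacity_labels(json_text):
    """Turn an Aeneas json sync map into Audacity label-track text.

    Assumed input shape: {"fragments": [{"begin": "0.000", "end": "1.520", "id": "..."}]}
    Output: one "start<TAB>end<TAB>label" line per fragment, labelled with the span id.
    """
    sync_map = json.loads(json_text)
    lines = []
    for fragment in sync_map["fragments"]:
        begin = float(fragment["begin"])
        end = float(fragment["end"])
        # The label is the span id, so each exported file is named after its span.
        lines.append(f"{begin:.3f}\t{end:.3f}\t{fragment['id']}\n")
    return "".join(lines)


# Made-up sample data, only to show the shape of the input and output.
sample = '{"fragments": [{"begin": "0.000", "end": "1.520", "id": "i1a2b", "lines": ["Hello."]}]}'
print(aeneas_json_to_audacity_labels(sample), end="")
```

Using the span id as the label matters because Audacity's Export Multiple can name each file after its label, which is how the exported audio ends up matched to the spans in the book.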

There are various ways we can streamline this process…this is just a starting point. One is that step 3 can probably be omitted. Try changing step 2 to:

python -m aeneas.tools.execute_task "<path to your audio file>.wav" "<path to the html file for your book>.htm" "task_language=<your language code>|is_text_type=unparsed|is_text_unparsed_id_regex=i?[0-9a-f]+|is_text_unparsed_class_regex=^audio-sentence$|os_task_file_format=tsv|is_text_unparsed_id_sort=unsorted" "<path to where you want the audacity labels file>.txt"
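
If you want to script this, the command above can be assembled in Python and handed to subprocess as a list, which sidesteps shell quoting problems with the | and $ characters in the configuration string. This is only a sketch: the function name and the example file names (book.wav, book.htm, labels.txt) are made up, and it assumes Aeneas is installed for whichever Python the `python` command runs.

```python
def build_aeneas_command(audio_path, html_path, language, output_path, output_format="tsv"):
    """Build the argument list for aeneas.tools.execute_task (hypothetical helper)."""
    # Same settings as the command line in the post, joined with "|".
    config = "|".join([
        f"task_language={language}",
        "is_text_type=unparsed",
        "is_text_unparsed_id_regex=i?[0-9a-f]+",
        "is_text_unparsed_class_regex=^audio-sentence$",
        f"os_task_file_format={output_format}",
        "is_text_unparsed_id_sort=unsorted",
    ])
    return ["python", "-m", "aeneas.tools.execute_task",
            audio_path, html_path, config, output_path]


# Example with made-up file names; prints the configuration string only.
command = build_aeneas_command("book.wav", "book.htm", "eng", "labels.txt")
print(command[5])

# To actually run it (requires Aeneas to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```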

andrew_polk · August 27, 2019, 11:22pm

Now that Bloom 4.6 is in beta, here is an alternative, slightly less technical (but still manual and tedious) method:

Split your mp3 file into one file per text box.
In your Bloom book, open the Talking Book Tool.
Check the box for “Record by whole text box, then let Bloom split it into sentences later.”
Hold the record button long enough to record a dummy recording. This is about one second.
In the folder where the Bloom book is stored on disk (My Documents/Bloom/Collection/Book), go into the audio folder.
See the .mp3 file which has been generated from your dummy recording.
- It will be helpful to sort by the “date created” so the new ones are at the top of the list.
Replace this file with your pre-recorded audio file for this text box using Bloom’s generated file name.
- Click on the dummy mp3 file.
- Press the F2 key and copy the file name.
- Delete the dummy mp3 file.
- Move the real (pre-recorded) mp3 file into the folder.
- Press F2 for that file and paste the Bloom-generated file name.
Leave the page in Bloom (by clicking on another page). Return to the page (by clicking on it).
Click the “Split” button.
- If you have not installed aeneas, you will get a message with a link to the installer. Install it now.
Repeat for each text box in the book.
There is quite a bit in the help files about this splitting process and how you can improve it, especially if you are not using a Roman-based orthography. Click the help link on the Talking Book Tool and click “Split Recordings”.
Before going into the Publish tab in Bloom, delete all the dummy .wav files which got added in the audio folder.

If you do not care about sentence-level highlighting, you can simply skip the splitting step.
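
If you have many text boxes, the copy-name, delete, move, and rename steps above could be scripted. The following is a minimal sketch under a few assumptions: the dummy recording is the most recently modified .mp3 in the book's audio folder (the steps above sort by date created, which is usually the same file right after recording), and it is fine to overwrite that file in place. The function name is made up. The sketch copies the pre-recorded file rather than moving it, so the original stays where it was.

```python
import shutil
from pathlib import Path


def replace_newest_dummy(audio_folder, real_recording):
    """Overwrite the newest .mp3 in audio_folder with real_recording.

    The dummy file keeps its Bloom-generated name; only its contents change.
    Returns the path of the file that was replaced.
    """
    audio_folder = Path(audio_folder)
    candidates = list(audio_folder.glob("*.mp3"))
    if not candidates:
        raise FileNotFoundError(f"No .mp3 files found in {audio_folder}")
    # Assumption: the dummy just recorded is the most recently modified file.
    dummy = max(candidates, key=lambda path: path.stat().st_mtime)
    shutil.copyfile(real_recording, dummy)
    return dummy


# Example call (the paths are placeholders, not real locations):
#   replace_newest_dummy("<path to the book folder>/audio", "<path to your pre-recorded>.mp3")
```

Run it once per text box, right after making the dummy recording for that box, and check the returned path before moving on to the next one.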

andrew_polk · December 7, 2020, 2:49pm

Sorry this didn’t get updated earlier.
With Bloom 4.7 and following, you can Import Recordings directly for each text box and then split them if desired.

Click the Help link in the Talking Book Tool for more details.