My Review of Spoken.Press Audio
An article featured in one of the many newsletters I subscribe to was an interview with one of the guys behind Spoken. The site is labeled as "AI-enabled", but that can be misleading. Some of the voices are definitely human (a few even specify that they're Spoken staff members), but some, at least to my ears, are AI-generated. The "enabled" part, I believe, refers more to analyzing a manuscript, which, believe me, is quite time-consuming.
I joined Spoken as a beta tester, and thus it was free. Once everything is in place, the site will charge a fee for its services, and people like me are the guinea pigs for ensuring that everything ends up in place. That's a fair trade-off.
Audiobook creation is prohibitively expensive. If an author wants to go the AI route, they do so at their peril. I personally wouldn't listen to an AI narrator; I've heard some (I believe KDP even has a pilot program) and they're pretty bad. Hiring a talented voice actor is...well...only for independently wealthy writers or those who don't mind plunging into debt. That's why Spoken's beta program piqued my interest. I went into it with no expectations; it was an experiment, a diversion, really. Mostly, I was curious.
A user is instructed to upload a document of either 14,000 or 15,000 words (can't remember or easily verify), which to me seemed rather useless. Yes, I could upload a few chapters of my manuscript, but the point of that was lost on me. Would there be a way to merge all my chapters together if I uploaded them separately? That question was more complicated than I bargained for. All of my novellas exceed that word count, so I couldn't even play around with one of those. In the end, I threw caution to the wind and tried uploading my entire 140,000-plus word manuscript. It worked.
Oops ~ I need to backtrack. Before I could upload, I was instructed to insert specific tags at every section break: <INSTALLMENTMARKER>. Naturally, I inserted them in the wrong spot; I think the instruction was "before the section break", so I inserted them on the line above the chapter titles, and Spoken got snippy with me and made me go back and redo them. Forty chapters, so needless to say, that was a pain in the ass. And now I've got a manuscript with a bunch of tags that'll need to be deleted. Nevertheless...
Spoken's analysis took a really long time, but it was, after all, a really long document. It even told me to just leave and they'd let me know when it was done. In the end, it identified a lot ~ a whole lot ~ of characters, which proved to be a problem when it came time to assign voices. Some of the characters were inanimate objects, such as "ATM machine", which, yes, did display a message, but could not be called a "character". Other characters were generic, such as "a girl" or "the woman". To me, they didn't warrant having voices assigned, especially since many of them only spoke two or three words, plus I had no idea who they were. And, of course, sometimes in place of a proper name, I referred to a character another way in the manuscript, just to break up the monotony. Thus, Paula was sometimes "the singer", which Spoken identified as a separate character. Fortunately, there is a merge option for this situation, especially helpful to me since my main character, for example, goes by two different names throughout the course of the story.
My problem with assigning voices to unknown people was that I didn't know who they were at first. Spoken asks for a description of each minor character in order to generate voice suggestions, but until I heard the actual words spoken, I was proceeding blindly. So, I ended up inputting a description such as, "This is a man in his early thirties with no discernible accent, blah blah blah" ~ you have to input a certain number of words in the description field. Then, once I heard the words spoken, I realized it was a woman, not a man, and in fact was a character I'd already assigned a voice to under a different name. It was very tedious. Again, not everyone deserved a voice. If they only uttered two words in the entire story, assigning them a voice seemed unwarranted. In the end, Spoken probably identified at least thirty separate characters. Even if they had a proper name, I often didn't remember who they were without going back to my manuscript. They were just "names" for a particular scene.
Now to the voices themselves. I was very satisfied with the choices Spoken presented to me for each character. It offered three "best matches", followed by a listing of other options, although many of them were premium voices, which I was unable to choose. As I got further into assigning voices, I realized that one of its top three matches was almost always the best, and I rarely bothered exploring others. My story is heavily weighted on the male side, and I initially worried that there wouldn't be enough variation among the matches to distinguish each character, but I did quite well choosing unique voices, overall. The app does make it clear which voices are AI, which are voice actors, and which are "we dragged some of our employees in to record themselves".
Let me just say from the outset that I only ever imagined my story to have one narrator throughout, so this exercise was jarring. I worked on choosing voices intermittently across a few days, and it was exhausting. You know how it is when you've done repetitive tasks and by the time you approach the end, you barely care anymore? I won't, of course, know how my impatience affected the outcome until I've listened to the entire audio, but let's just say, it's going to be a while. The audio runs more than ten hours. So far, I've managed to play back the first couple of chapters, and they're short chapters. Part of my reluctance to proceed is trepidation, and the other part is that I've read this thing so many damn times already that I can barely stand it.
However, here are my initial takeaways. As with all AI, it struggles with pronunciation. I've encountered worse with Microsoft Word's text-to-speech functionality, in which the narrator couldn't pronounce my main character's name, and time will tell if I run into the same issue with Spoken. But right in Chapter One, the narrator couldn't pronounce Visine, which is minor and I can live with it. It hasn't mispronounced any other words...yet.
As for multiple character voices, I'm still not sure I'm on board with it. So far, only a few characters have spoken, and Irma from the truck stop is appropriately grouchy, which is what I was going for. Burt (an important player) isn't quite right, not as calm and serene as he should be, but again, I'm okay with it...for now. His voice was difficult to nail down. None of the three best options I was given was perfect.
Right now it's impossible for me to provide a complete review, but I will add to this post as I progress with my listening. One has to bear in mind that this is a free service (for now), so certain accommodations must be made. I'm questioning whether I should have made my audiobook public without first listening to it, but I am assuming I can change that.
Please stay tuned.
EDITED TO ADD: Some of the pacing is not ideal. Spoken does advise entering some prompts in order to change the pacing. I will consider those, but man, this is a long, long story, and fixing everything that's not quite right is a bit more than I bargained for.
UPDATE: I deleted it and started over. It wasn't so much that I was dissatisfied with the multiple voices, but that when I listened to the first couple of chapters yesterday, something seemed off. When Leah was exploring the town of Chance, she passed the saloon and only remarked that it smelled of nicotine-infused cedar. She didn't even mention that it was a saloon; she just walked past. Something was missing. Sure enough, when I formatted the manuscript, I left off a whole line. This iteration sounded like a half-thought. Worse, I did it with every version of my book ~ both ebook and paperback. Stupid, stupid! So, I had to re-upload both the Kindle document and the paperback template to KDP. How many copies have gone to people with this stupid omission?
Anyway, I do feel that the audio version should only have a single narrator. The original sounded too clunky. So now Spoken is converting my audio into its new format. More to come.