Meta has launched an “open” implementation of the viral generate-a-podcast feature in Google’s NotebookLM.
Called NotebookLlama, the project uses Meta’s own Llama models for much of the processing, unsurprisingly. Like NotebookLM, it can generate back-and-forth, podcast-style digests of text files uploaded to it.
NotebookLlama first creates a transcript from a file, e.g. a PDF of a news article or blog post. Then, it adds “more dramatization” and interruptions before feeding the transcript to open text-to-speech models.
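To make that flow concrete, here is a minimal Python sketch of the three stages described above: write a transcript, rewrite it with more drama, then voice each line. It is not NotebookLlama's actual code. The function names, the prompts, and the "Speaker: line" script format are all assumptions made for illustration, the models are replaced with simple stubs, and file parsing (such as PDF extraction) is left out.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces: a text model maps a prompt to text, and a
# text-to-speech engine maps (speaker, line) to raw audio bytes.
TextModel = Callable[[str], str]
TtsEngine = Callable[[str, str], bytes]


def parse_turns(script: str) -> List[Tuple[str, str]]:
    """Split 'Speaker: line' rows into (speaker, line) pairs."""
    turns = []
    for row in script.splitlines():
        if ":" in row:
            speaker, line = row.split(":", 1)
            turns.append((speaker.strip(), line.strip()))
    return turns


def make_podcast(source_text: str, writer: TextModel,
                 dramatizer: TextModel, tts: TtsEngine) -> List[bytes]:
    """Run the three stages in order and return one audio clip per turn."""
    draft = writer("Write a two-host podcast transcript about:\n" + source_text)
    script = dramatizer("Add dramatization and interruptions:\n" + draft)
    return [tts(speaker, line) for speaker, line in parse_turns(script)]


# Stand-in stubs so the sketch runs without any model or audio library.
def stub_writer(prompt: str) -> str:
    return "Host A: Meta released NotebookLlama.\nHost B: What does it do?"


def stub_dramatizer(prompt: str) -> str:
    draft = prompt.split("\n", 1)[1]  # drop the instruction line
    return draft + "\nHost A: Wait, hold on, let me explain!"


def stub_tts(speaker: str, line: str) -> bytes:
    return f"[{speaker}] {line}".encode("utf-8")


clips = make_podcast("article text", stub_writer, stub_dramatizer, stub_tts)
```

In a real setup, the two text stubs would presumably be replaced by calls to Llama models and the speech stub by an open text-to-speech model. Keeping each stage as a plain callable makes it easy to swap in a stronger model at any one step.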
The results don’t sound nearly as good as NotebookLM. In the NotebookLlama samples I’ve listened to, the voices have a very obviously robotic quality to them, and tend to talk over each other at odd points.
But the Meta researchers behind the project say that the quality could be improved with stronger models.
“The text-to-speech model is the limitation of how natural this will sound,” they wrote on NotebookLlama’s GitHub page. “[Also,] another approach of writing the podcast would be having two agents debate the topic of interest and write the podcast outline. Right now we use a single model to write the podcast outline.”
NotebookLlama isn’t the first attempt to replicate NotebookLM’s podcast feature. Some projects have had more success than others. But none, not even NotebookLM itself, has managed to solve the hallucination problem that dogs all AI. That is to say, AI-generated podcasts are bound to contain some made-up stuff.