HCI Deep Dive

Using Generative AI for Audio Creation

I’ve been experimenting with Google’s NotebookLM to create a podcast about Human-Computer Interaction (HCI), and I have to say, it’s surprisingly useful! Using its Audio Overview feature, I’ve uploaded several HCI publications, and NotebookLM has turned them into conversations.

I also summarized most of my own publications. Have a listen.

Google just announced that Audio Overviews can now be customized. I’m pretty excited about this feature, as the standard summarization is a bit too casual for me. Here is an example from the podcast without customization:

Notice that the speakers use “like” and other rather informal language a lot. They also tend to finish each other’s sentences. It would be nice if one speaker could be the interviewer and the other the expert.

For the later podcast episodes, I tried several customization prompts. Some were longer, introducing the roles of the two speakers in more detail. Yet I found shorter ones to work better. Listen to the difference when prompting with “Use formal language. Don’t use the word ‘like.’”

I also started looking into open-source tools for similar tasks and found Podcastfy. Brave new world for podcasters.

I’ll continue experimenting. In the meantime, you may want to listen to some HCI Deep Dive conversations.