Content-Based Speech and Music Editing Tools for Audio Stories

Steve Rubin

UC Berkeley

Tuesday, June 10, 2014
12:30 p.m., Conference Room 5A

Audio stories are an engaging form of communication that combines speech and music into compelling narratives. Existing audio editing tools force story producers to manipulate speech and music tracks via tedious, low-level waveform editing. I will present my work on developing new tools that allow producers to work at a much higher semantic level. These tools address several challenges in creating audio stories, including (1) navigating and editing speech, (2) selecting appropriate music for the score, and (3) editing music to complement the speech. We have used these tools to create audio stories from a variety of raw speech sources, including scripted narratives, interviews, and political speeches. I will also present our most recent work, a system that further simplifies the editing process by generating full, emotionally relevant audio scores for stories using only a small set of emotion labels on the input speech and music.

Bio:

Steve Rubin is a third-year computer science PhD student at UC Berkeley, advised by Maneesh Agrawala. His research focuses on combining HCI with music and media editing algorithms, and he is also interested in information visualization and design. He graduated from Williams College in 2011 with a BA in computer science and mathematics.