Interactive Approaches to Video Lecture Assessment
Korbinian Riedhammer
ICSI
Tuesday, October 30, 2012
12:30 PM, Conference Room 5A
Abstract
Folks who were here last winter prior to ASRU might be familiar with the title of this talk. But don't be misled, I'll have something new for you. In this talk, I will give an overview of the FAU Lecture Browser, which I developed in the context of my thesis. I will start out with the description of a novel data set: the LME Lectures are a corpus of two series of graduate-level computer science lectures with 18 recordings each. The courses cover topics in medical image processing and pattern analysis/machine learning. The roughly 40 hours of speech were manually transcribed, and one particular lecture was annotated with key phrases by five human raters. Using this data set, I trained three different speech recognizers using regular continuous, multi-codebook semi-continuous, and subspace Gaussian mixture models, which show a word error rate (WER) of about 10%. I will then briefly describe the key phrase extraction and automatic ranking, which was compared against the five raters on that one lecture recording. Finally, I will talk about a small usability study where 10 students were asked to perform a certain task, with and without the proposed lecture browser. Although the number of participants is limited, the numbers are interesting: the users who had the interface could complete the task about 30% faster than the control group, while maintaining about the same accuracy.
Speaker Bio
Korbinian received his Diploma in computer science in 2007 from the Univ. Erlangen-Nuremberg in Germany, where he focused on speech and natural language processing. In 2008, he was a visiting scholar at ICSI working on automatic speech summarization and contributed to the CALO project (which is now partly found in Siri). In 2009, he continued his graduate work at the Univ. Erlangen, now focusing on speech recognition. In 2010, Korbinian was again with ICSI for a few months working on harmonicity-based features for ASR in reverberant environments. Just recently, in August 2012, he received his Ph.D. from the Univ. Erlangen. In his thesis, he combined the results he had obtained since 2008 to form a prototype of an interactive lecture browser, making use of speech recognition, key phrase extraction and ranking, automatic summarization, and visualization. In October 2012, Korbinian went for the hat-trick and joined ICSI again as a postdoc funded by the DAAD. He is now working on ASR and keyword spotting in the SWORDFISH project.