Project OUCH has been completed, and the final report is available here.
The central idea behind this project is that if we want to improve recognition performance through acoustic modeling, then we should first quantify how the current best model, the hidden Markov model (HMM), fails to adequately model speech data and how these failures impact recognition accuracy. We are undertaking a diagnostic analysis that is an essential component of statistical modeling but, for various reasons, has been largely ignored in the field of speech recognition. In particular, we believe that previous attempts to improve upon the HMM have largely failed because this diagnostic information was not readily available. In our initial research, we are using simulation and a novel sampling process to generate pseudo test data that deviate from the HMM in a controlled fashion. These processes allow us to generate pseudo data that, at one extreme, agree with all of the model's assumptions and, at the other extreme, deviate from the model in exactly the way real data do. In between, we precisely control the degree of data/model mismatch. By measuring recognition performance on these pseudo test data, we are able to quantify the effect of this controlled data/model residual on recognition accuracy.
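To make the "agrees with all of the model's assumptions" extreme concrete, here is a minimal Python sketch of simulating data from an HMM. It is an illustration only, not the project's code: the function name `sample_hmm`, the two-state model, and all parameter values are hypothetical, and one-dimensional Gaussian emissions stand in for real acoustic features. The novel sampling process used for the other extreme is not described here and is not sketched.

```python
import random


def sample_hmm(init, trans, means, stds, length, seed=0):
    """Draw (states, observations) from a toy Gaussian HMM.

    Data generated this way satisfy the HMM assumptions by construction:
    the state sequence is first-order Markov, and each observation
    depends only on the current state.
    """
    rng = random.Random(seed)
    n_states = len(init)
    states, observations = [], []
    # Draw the initial state from the initial distribution.
    state = rng.choices(range(n_states), weights=init)[0]
    for _ in range(length):
        states.append(state)
        # Emit an observation that depends only on the current state.
        observations.append(rng.gauss(means[state], stds[state]))
        # Move to the next state using the current state's transition row.
        state = rng.choices(range(n_states), weights=trans[state])[0]
    return states, observations


# Hypothetical two-state model, chosen only for illustration.
INIT = [1.0, 0.0]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
MEANS = [0.0, 5.0]
STDS = [1.0, 1.0]

states, obs = sample_hmm(INIT, TRANS, MEANS, STDS, length=20, seed=42)
```

Running a recognizer on data like this shows how it behaves when the data/model mismatch is zero, which is the baseline against which the controlled deviations are measured.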