ICSI BEARS Open House
Thursday, February 14, 2013
2:00 - 5:00pm
The ICSI BEARS Open House is held in conjunction with the Berkeley EECS Annual Research Symposium. ICSI scientists will be on hand throughout the afternoon to discuss and demonstrate their latest research in networking and security, bioinformatics and computational biology, artificial intelligence, speech and natural language processing, machine learning, multimedia location estimation, and computer vision.
Featured talks begin at 2:00 and will give visitors an overview of the work being done in five of our research areas.
From 3:30 to 5:00, enjoy the following interactive research demonstrations and light refreshments while chatting with ICSI scientists about their work.
- Ad-Hoc Wireless Microphone Array presented by Chuck Wooters
- Video2GPS: A Demo of Multimodal Location Estimation on Flickr Videos presented by Jaeyoung Choi
- Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location presented by Luke Gottlieb
- The SSL Tree of Trust presented by Johanna Amann
- An Interactive Lecture Browser presented by Korbinian Riedhammer
- Finding the Sweet Spot: Big Data and Energy presented by Ekaterina Gonina
- MetaNet presented by Srini Narayanan
- DPM-Based Realtime Object Detection presented by Daniel Goehring
To learn more about BEARS 2013 and to register, visit www.eecs.berkeley.edu/BEARS. Please RSVP to [email protected].
Research Demonstration Abstracts:
Ad-Hoc Wireless Microphone Array
by Chuck Wooters, Speech Researcher
VoIP-based online meetings, such as those held over Skype, WebEx, or Adobe Connect, are painful. Remote participants often have a difficult time understanding what people are saying because the audio quality is so poor. Commercial solutions exist that can provide high-quality audio for remote meeting participants. However, these solutions are expensive, and since a VoIP-based meeting can take place just about anywhere there is a good wifi signal, this expensive equipment is not always available. Thus remote participants typically suffer through online meetings, straining to understand what people are saying.
These days it is likely that most, if not all, meeting participants possess some sort of internet-connected device (e.g., a smartphone, tablet, or laptop). Since these devices have built-in microphones and network connectivity, it is fairly simple to record from each device's microphone and stream the audio, in real time, over a local network connection. When multiple audio streams are collected simultaneously, techniques such as delay-and-sum beamforming can be used to enhance the audio.
We have built a demonstration system consisting of an audio-streaming app (running on both iOS and Android) and a signal-processing app (running on a laptop). The audio-streaming apps communicate with the signal-processing app over a wireless network connection via UDP. The signal-processing app receives the audio streams from the smart devices, aligns them using time delays estimated with the Generalized Cross Correlation with Phase Transform (GCC-PHAT) algorithm, and sums the aligned streams. Thus, by making use of existing hardware (the smart devices and laptops that are typically present during a meeting), we have created an ad hoc wireless microphone array capable of providing an enhanced audio signal to remote meeting participants.
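The core signal-processing step can be sketched in a few lines of NumPy. This is a minimal illustration, assuming equal-length, roughly synchronized 16 kHz streams; the function names and parameters here are ours, not those of the demo system:

```python
import numpy as np

def gcc_phat(sig, ref, fs=16000, max_tau=0.01):
    """Estimate the time delay of `sig` relative to `ref` via GCC-PHAT."""
    n = len(sig) + len(ref)
    # Cross-power spectrum, whitened by the phase transform (PHAT):
    # keeping only phase information sharpens the correlation peak.
    R = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cc = np.fft.irfft(R / (np.abs(R) + 1e-12), n=n)
    max_shift = int(fs * max_tau)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs  # delay in seconds

def delay_and_sum(streams, fs=16000):
    """Align every stream to the first one and average them."""
    out = np.asarray(streams[0], dtype=float).copy()
    for sig in streams[1:]:
        shift = int(round(gcc_phat(sig, streams[0], fs) * fs))
        out += np.roll(sig, -shift)  # circular shift: acceptable for a short buffer
    return out / len(streams)
```

Averaging the aligned streams reinforces the speech signal while uncorrelated noise partially cancels, which is what makes the combined output easier for remote listeners to understand.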
Video2GPS: A Demo of Multimodal Location Estimation on Flickr Videos
by Jaeyoung Choi, Audio and Multimedia Researcher
We demonstrate an approach to determining the geo-coordinates of where Flickr videos were recorded, using both textual metadata and visual cues. The underlying system has been tested on the MediaEval 2012 Placing Task evaluation data, which consists of 5,000 unfiltered "wild" test videos, and is able to locate 14 percent of the videos to within 10 m. The demo traces the improvement of the algorithms ICSI has developed over the past three years.
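At a high level, a multimodal system like this cascades its evidence sources. The sketch below is our own illustration of that idea, not the actual ICSI pipeline; `tag_model`, `visual_index`, and `GLOBAL_PRIOR` are hypothetical stand-ins for trained components:

```python
from statistics import median

# Hypothetical fallback location, e.g., the densest region in the training data.
GLOBAL_PRIOR = (40.7, -74.0)

def estimate_location(tags, keyframe_features, tag_model, visual_index):
    """Illustrative multimodal fallback: textual metadata first,
    then visual matching, then a global prior."""
    # 1. Textual cues: coordinates associated with each tag in training data.
    candidates = [tag_model[t] for t in tags if t in tag_model]
    if candidates:
        lats, lons = zip(*candidates)
        return (median(lats), median(lons))
    # 2. Visual cues: nearest geotagged neighbor of the video's keyframes.
    match = visual_index.nearest(keyframe_features)
    if match is not None:
        return match  # (lat, lon) of the closest visual match
    # 3. No usable evidence: fall back to the global prior.
    return GLOBAL_PRIOR
```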
Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location
by Luke Gottlieb, Audio and Multimedia Researcher
We will demonstrate the methods we have developed for finding Mechanical Turk participants qualified to manually annotate the geo-location of random videos from the Web. We require high-quality annotations for this project because we are attempting to establish a human baseline for future comparison with machine systems. Unlike a standard Mechanical Turk task, which is usually easy for humans but difficult or impossible for machines, this task is difficult for both.
Related Paper: http://www.icsi.berkeley.edu/icsi/publication_details?n=3344.
The SSL Tree of Trust
by Johanna Amann, Networking and Security Visiting Scholar
In collaboration with ten large network sites, we have been collecting information about 15 billion SSL connections since the beginning of 2012, including 32 million unique SSL certificates extracted from the online activity of about 300,000 users in total. We use this data to provide a public notary service that anyone can query about certificates they encounter. Furthermore, to better understand the relationships between root and intermediate Certificate Authorities (CAs), we used our data set to create the "Tree of Trust," an interactive graph visualizing the global relationships between CAs. Our demonstration will show the Tree of Trust, highlight interesting parts of it, and report on the current state of the SSL ecosystem.
For more, see http://notary.icsi.berkeley.edu/#the-tree-of-trust.
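The notary can be queried over DNS. Below is a minimal sketch, assuming the dnspython library and a hash-as-subdomain query interface; see notary.icsi.berkeley.edu for the authoritative query format and the exact fields returned:

```python
import hashlib
import dns.resolver  # dnspython

def query_notary(der_cert):
    """Look up a certificate (DER-encoded bytes) in the ICSI notary."""
    name = hashlib.sha1(der_cert).hexdigest() + ".notary.icsi.berkeley.edu"
    try:
        answer = dns.resolver.resolve(name, "TXT")
    except dns.resolver.NXDOMAIN:
        return None  # the notary has never seen this certificate
    # The TXT record carries the notary's observations for this certificate
    # (assumed here to include fields such as first/last/times seen).
    return answer[0].to_text()
```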
An Interactive Lecture Browser
by Korbinian Riedhammer, Speech Visiting Scholar
A growing number of universities and other educational institutions provide recordings of lectures and seminars as an additional resource for students. In contrast to educational films, which are scripted, directed, and often shot by film professionals, these plain recordings are typically not post-processed in an editorial sense. Thus, the videos often contain long periods of inactivity or silence, unnecessary repetitions, and corrections of prior mistakes. We demonstrate the FAU Video Lecture Browser, a Web-based platform for the interactive assessment of video lectures that helps close the gap between a plain recording and a useful e-learning resource by displaying automatically extracted and ranked key phrases on an augmented timeline based on stream graphs. In a pilot study, users of the interface completed a topic localization task about 29 percent faster than users provided with the video only, while achieving about the same accuracy. User interactions can be logged on the server to collect data for evaluating the quality of the phrases and rankings, and for training systems that produce customized phrase rankings.
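As a rough illustration of the phrase-ranking idea, a TF-IDF-style score over the time segments of a lecture could look like the following; the actual browser uses its own ranking features, so treat this purely as a stand-in:

```python
import math
from collections import Counter

def rank_phrases(segments):
    """Rank phrases by a TF-IDF-style score: frequent overall, but
    down-weighted if they appear in every segment of the lecture.
    `segments` is a list of phrase lists, one per time segment."""
    n = len(segments)
    tf = Counter(p for seg in segments for p in seg)       # total occurrences
    df = Counter(p for seg in segments for p in set(seg))  # segments containing p
    score = {p: tf[p] * math.log(n / df[p]) for p in tf}
    return sorted(score, key=score.get, reverse=True)
```

Placing the top-ranked phrases on the timeline at the segments where they occur gives viewers the topical landmarks they need to jump directly to the material of interest.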
Finding the Sweet Spot: Big Data and Energy
by Ekaterina Gonina, Speech Researcher
Traditionally, multimedia analysis algorithms have been optimized for accuracy and speed. With the advent of massive amounts of consumer-produced video, a new factor comes into play: energy. Using new hardware donated by Intel, we show in this presentation that energy deserves to be treated as a first-class optimization criterion: small changes in an algorithm can cause large differences in energy consumption, especially at scale.
MetaNet
by Srini Narayanan, Artificial Intelligence Director
In the MetaNet project, researchers from ICSI, UC San Diego, the University of Southern California, Stanford, and UC Merced are building a system capable of understanding metaphors used in American English, Iranian Persian, Russian as spoken in Russia, and Mexican Spanish. The team includes computer scientists, linguists, psychologists, and cognitive scientists. The system extracts linguistic manifestations of metaphor (expressions whose meaning relies on conceptual metaphors such as Life is a Journey) from texts in the four languages and understands them automatically. Using frame semantics among other tools, researchers will represent the relationships among metaphors, and between expressions' literal and metaphorical meanings, as networks within a metaphor repository. The repository will allow users to browse, navigate, annotate, and modify these networks and will include links to linguistic manifestations. In conjunction, cognitive linguists and neuroscientists will test how metaphor affects thinking and emotion in order to evaluate the effectiveness of the metaphor repository and the methodologies developed by the team. The project will provide information for an analysis of the role metaphor plays in how people from different cultural backgrounds make judgments and decisions.
For information about the MetaNet project, see https://www.icsi.berkeley.edu/icsi/gazette/2012/05/metanet-project.
DPM-Based Realtime Object Detection
by Daniel Goehring, Audio and Multimedia Visiting Scholar
Object detection and classification is an important field within computer vision; its subtasks include deriving the position of an object within the image and classifying it. In this demo we show an object detection approach applied to 30 different household objects. The approach is based on the Deformable Parts Model (DPM) algorithm, which uses HOG features and a linear SVM to find the locations of an object's parts. We use sparselets and a CUDA implementation to run our algorithm at 3-7 Hz.
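The heart of such a detector is sliding a learned linear filter over a HOG feature map and taking the peak response. The minimal single-scale sketch below shows only that scoring step; the real system adds part filters with deformation costs, a feature pyramid over scales, sparselet reconstruction, and the CUDA implementation that makes it run at interactive rates:

```python
import numpy as np

def score_filter(hog_map, filt, bias=0.0):
    """Cross-correlate a learned SVM filter with a HOG feature map.
    hog_map: (H, W, D) array of HOG cells; filt: (fh, fw, D) SVM weights.
    The peak of the returned response map is the detection hypothesis."""
    H, W, D = hog_map.shape
    fh, fw, _ = filt.shape
    resp = np.empty((H - fh + 1, W - fw + 1))
    for y in range(resp.shape[0]):
        for x in range(resp.shape[1]):
            # Dot product between the filter and the window of HOG cells.
            resp[y, x] = np.sum(hog_map[y:y + fh, x:x + fw] * filt) + bias
    return resp
```

In practice this correlation is the computational bottleneck, which is why techniques such as sparselets (approximating many filters with a shared sparse dictionary) and GPU parallelism are what bring a 30-object detector into the realtime range.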