Drinking from a Multimedia Firehose: How to Deal with Billions of Daily Video Posts
At the ICSI Research Review, Gerald Friedland spoke about the challenges of performing empirical research on the astronomically large data set that is consumer-produced video. Here is Dr. Friedland's abstract:
Consumer-produced videos are the fastest-growing type of content on the Internet. YouTube claims that 72 hours of video are uploaded to its Web site alone every minute. Because the videos capture parts of the world, they are potentially useful for qualitative and quantitative empirical research on a larger scale than has ever been possible before. A major prerequisite to making social media videos usable for "field studies" is efficient and unbiased (e.g., keyword-independent) retrieval. More importantly, retrieval needs to go beyond simply finding objects to detecting more abstract concepts, such as "taking care of a car" or "winning a game that is not a card game." Research on such a large corpus requires the creation of methods that exploit as many cues as possible from different modalities. ICSI has begun using novel acoustic methods to complement computer vision approaches. This talk summarizes ICSI's research and progress in this area.
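
To make the idea of combining cues from different modalities a bit more concrete, here is a minimal, hypothetical sketch of late fusion: per-video confidence scores from an acoustic concept classifier and a visual concept classifier are combined with a weight before thresholding. The function names, weights, and scores below are illustrative assumptions for this post, not ICSI's actual system.

```python
# Illustrative sketch only: a simple late-fusion scheme that combines
# per-video concept scores from an acoustic model and a visual model.
# The weights and scores here are hypothetical, not taken from ICSI's pipeline.

def fuse_scores(audio_score: float, visual_score: float, audio_weight: float = 0.4) -> float:
    """Weighted late fusion of two per-concept confidence scores in [0, 1]."""
    return audio_weight * audio_score + (1.0 - audio_weight) * visual_score


def detect_concept(videos, threshold: float = 0.5):
    """Return the IDs of videos whose fused score exceeds a threshold.

    `videos` is an iterable of (video_id, audio_score, visual_score) tuples,
    where the two scores would come from separately trained acoustic and
    visual concept classifiers.
    """
    hits = []
    for video_id, audio_score, visual_score in videos:
        if fuse_scores(audio_score, visual_score) >= threshold:
            hits.append(video_id)
    return hits


if __name__ == "__main__":
    # Hypothetical scores for a concept such as
    # "winning a game that is not a card game".
    sample = [
        ("vid_001", 0.8, 0.3),  # strong acoustic cue (crowd noise), weak visual cue
        ("vid_002", 0.1, 0.2),
        ("vid_003", 0.6, 0.7),
    ]
    print(detect_concept(sample))  # -> ['vid_001', 'vid_003']
```

In this toy setup, a video with weak visual evidence can still be retrieved when the audio channel carries a strong cue, which is the sense in which acoustic methods can complement computer vision approaches.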
To learn more about the multimedia research being done at ICSI, read these recent papers or browse our publications database:
- Name That Room: Room Identification Using Acoustic Features in a Recording
- There is No Data Like Less Data: Percepts for Video Concept Detection on Consumer-Produced Media
- Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location
- Content-based Privacy for Consumer-Produced Multimedia
- Multimodal Location Estimation of Consumer Media – Dealing with Sparse Training Data
- Cybercasing the Joint: Language Technologies, Multimedia Retrieval, and Online Privacy
- Narrative Theme Navigation for Sitcoms Supported by Fan-Generated Scripts
- On the Applicability of Speaker Diarization to Audio Concept Detection for Multimedia Retrieval
- Video2GPS: A Demo of Multimodal Location Estimation on Flickr Videos
- Acoustic Super Models for Large Scale Video Event Detection
- Automatic Tagging and Geo-Tagging in Video Collections and Communities