Abstract
In the Multimedia Event Detection (MED) 2014 evaluation [20], the SRI Aurora team participated in the 000Ex, 010Ex, and 100Ex tasks with a full system evaluation. The Aurora system extracts multimodal features from videos, including motion, static image, and audio features, and represents each video with Bag-of-Words (BOW) and Fisher Vector models. In addition, we explored various high-level concept features. Beyond the action concept features and SIN features, we explored deep-learning-based semantic features using both the DeCAF and OverFeat implementations. The deep-learning features perform well for MED, but they are not well suited to Multimedia Event Recounting (MER). We also further studied semi-supervised automatic annotation to expand our set of action concepts. To distinguish event categories efficiently and effectively, we introduced a linear SVM into our system, together with a feature-mapping technique that approximates the Histogram Intersection Kernel for the BOW video model. All modalities are fused by an ensemble of classifiers that includes techniques such as logistic regression, SVR, and boosting. Overall, we achieved satisfactory results. For the MER task, we developed an approach that breaks down the evidence behind each MED decision by examining the SVM-based event detector.
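The abstract does not say which feature mapping is used to approximate the Histogram Intersection Kernel. As a rough illustration of the general idea only, the sketch below uses a simple unary (quantized) encoding, an assumption and not necessarily the Aurora mapping. Each histogram bin is expanded into `levels` coordinates so that a plain dot product between mapped vectors approximates the sum of bin-wise minima, which is what lets a linear SVM stand in for a kernel SVM. All function names and the example histograms are hypothetical.

```python
import math

def unary_feature_map(hist, levels=64):
    """Expand a histogram with bin values in [0, 1] into an explicit vector.

    Each bin becomes `levels` coordinates: the first floor(value * levels)
    are 1/sqrt(levels), the rest are 0. The dot product of two mapped bins
    is then min(floor(x * levels), floor(y * levels)) / levels, which
    approximates min(x, y). (Illustrative encoding, assumed here.)
    """
    scale = 1.0 / math.sqrt(levels)
    mapped = []
    for value in hist:
        ones = min(levels, int(value * levels))  # quantize the bin value
        mapped.extend([scale] * ones + [0.0] * (levels - ones))
    return mapped

def intersection_kernel(a, b):
    """Exact histogram intersection: sum of bin-wise minima."""
    return sum(min(x, y) for x, y in zip(a, b))

def dot(u, v):
    """Plain linear kernel on the mapped vectors."""
    return sum(x * y for x, y in zip(u, v))

# Two hypothetical L1-normalized BOW histograms.
a = [0.50, 0.25, 0.25, 0.00]
b = [0.10, 0.40, 0.30, 0.20]

exact = intersection_kernel(a, b)
approx = dot(unary_feature_map(a), unary_feature_map(b))
print(exact, approx)
```

With this encoding the per-bin error is below `1 / levels`, so accuracy is traded against the dimensionality of the mapped vector (`len(hist) * levels`).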
Original language | English (US) |
---|---|
State | Published - 2020 |
Event | 2014 TREC Video Retrieval Evaluation, TRECVID 2014 - Orlando, United States; Duration: Nov 10 2014 → Nov 12 2014 |
Conference
Conference | 2014 TREC Video Retrieval Evaluation, TRECVID 2014 |
---|---|
Country/Territory | United States |
City | Orlando |
Period | 11/10/14 → 11/12/14 |
ASJC Scopus subject areas
- Information Systems
- Signal Processing
- Electrical and Electronic Engineering