Detection of Important Scenes in Baseball Videos via Bidirectional Time Lag Aware Deep Multiset Canonical Correlation Analysis
Title: | Detection of Important Scenes in Baseball Videos via Bidirectional Time Lag Aware Deep Multiset Canonical Correlation Analysis |
Authors: | Hirasawa, Kaito | Maeda, Keisuke | Ogawa, Takahiro | Haseyama, Miki |
Keywords: | Videos | Feature extraction | Sports | Visualization | Blogs | Social networking (online) | Correlation | Unsupervised important scene detection | time lag aware canonical correlation maximization | anomaly detection | generative adversarial network |
Issue Date: | 18-Jun-2021 |
Publisher: | IEEE (Institute of Electrical and Electronics Engineers) |
Journal Title: | IEEE Access |
Volume: | 9 |
Start Page: | 84971 |
End Page: | 84981 |
Publisher DOI: | 10.1109/ACCESS.2021.3088284 |
Abstract: | A novel method for detection of important scenes in baseball videos based on correlation maximization between heterogeneous modalities via bidirectional time lag aware deep multiset canonical correlation analysis (BiTl-dMCCA) is presented in this paper. The proposed method enables detection of important scenes by collaboratively using baseball videos and their corresponding tweets. The technical contributions of this paper are twofold. First, since there are time lags between not only "tweets and corresponding multiple previous events" but also "events and corresponding multiple following posted tweets", the proposed method considers these bidirectional time lags. Specifically, the representation of such bidirectional time lags into the derivation of their covariance matrices is newly introduced. Second, the proposed method adopts textual, visual and audio features calculated from tweets and videos as multi-modal time series features. Important scenes are detected as abnormal scenes via anomaly detection based on a generative adversarial network using multi-modal features projected by BiTl-dMCCA. The proposed method does not need any training data with annotation. Experimental results obtained by applying the proposed method to actual baseball matches show the effectiveness of the proposed method. |
Type: | article |
URI: | http://hdl.handle.net/2115/82485 |
Appears in Collections: | Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc |
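The abstract describes building bidirectional time lags into the covariance matrices used for correlation maximization. As a rough illustration of that idea only, the sketch below computes cross-covariance matrices between two time series at both positive and negative lags. It is not the derivation from the paper: the function name `lagged_cross_covariances`, the `max_lag` parameter, and the choice to return one matrix per lag are all assumptions made for this example, and the deep network, the multiset extension, and the GAN-based anomaly detection are not covered.

```python
import numpy as np


def lagged_cross_covariances(x, y, max_lag):
    """Cross-covariance of x[t] with y[t + k] for k in -max_lag..max_lag.

    Hypothetical helper for illustration; not the paper's formulation.
    x has shape (T, dx), y has shape (T, dy), and max_lag must be < T - 1.
    Returns a dict mapping each lag k to a (dx, dy) matrix.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    if y.shape[0] != n:
        raise ValueError("x and y must have the same number of time steps")
    if not 0 <= max_lag < n - 1:
        raise ValueError("max_lag must satisfy 0 <= max_lag < T - 1")

    # Center each series once over the full time range.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)

    covs = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            # y lags behind x: pair x[t] with the later sample y[t + k].
            xs, ys = xc[: n - k], yc[k:]
        else:
            # x lags behind y: pair x[t] with the earlier sample y[t + k].
            xs, ys = xc[-k:], yc[: n + k]
        covs[k] = xs.T @ ys / (len(xs) - 1)
    return covs
```

In this sketch, positive lags pair a sample of one series with later samples of the other, and negative lags do the reverse, so both directions of delay are visible. How the lagged matrices are combined before correlation maximization is left open here, since the record does not specify it.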