A novel method for video shot boundary detection using CNN-LSTM approach

Citation:

Benoughidene A, Titouna F. A novel method for video shot boundary detection using CNN-LSTM approach. International Journal of Multimedia Information Retrieval [Internet]. 2022;11.

Abstract:

Due to the rapid growth of digital video and the massive increase in video content, there is an urgent need for efficient automatic video content analysis mechanisms for tasks such as summarization, retrieval, and classification. All of these applications depend on shot boundary detection. This paper proposes a novel dual-stage approach for cut transition detection that can withstand certain illumination and motion effects. Firstly, we present a deep neural network model that combines a pre-trained model with a long short-term memory (LSTM) network and the Euclidean distance metric. Two parallel pre-trained models sharing the same weights extract the spatial features. These features are then fed to the LSTM and the Euclidean distance metric to classify each frame pair as similar or dissimilar. To train the model, we generated a new database from online videos containing 5000 frame pairs with two labels (similar, dissimilar) for training and 1000 frame pairs for testing. Secondly, we adopt a segment selection process to find candidate segments that may contain shot boundaries. This preprocessing step can help improve the accuracy and speed of the video shot boundary detection (VSBD) algorithm. Cut transition detection based on the similarity model is then conducted to identify the shot boundaries in the candidate segments. Experimental results on the standard databases TRECVid 2001, TRECVid 2007, and RAI show that the proposed approach achieves better detection rates than state-of-the-art SBD methods in terms of the F1 score.
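
The second stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors stand in for the CNN-LSTM outputs, and the function names, the segment length `seg_len`, the `threshold` value, and the rule of flagging at most one cut per candidate segment are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def candidate_segments(features, seg_len, threshold):
    """Yield (start, end) frame indices whose endpoint features differ strongly.

    Assumption: a segment is a candidate when the distance between its
    first and last frame exceeds the threshold.
    """
    for start in range(0, len(features) - 1, seg_len):
        end = min(start + seg_len, len(features) - 1)
        if euclidean(features[start], features[end]) > threshold:
            yield start, end

def detect_cuts(features, seg_len=4, threshold=1.0):
    """Return indices i where a cut is flagged between frame i-1 and frame i."""
    cuts = []
    for start, end in candidate_segments(features, seg_len, threshold):
        # Inside a candidate segment, look for the largest adjacent-frame jump.
        dists = [euclidean(features[i], features[i + 1]) for i in range(start, end)]
        best = max(range(len(dists)), key=dists.__getitem__)
        if dists[best] > threshold:
            cuts.append(start + best + 1)
    return cuts

# Toy embeddings (hypothetical): frames 0-4 form one shot, frames 5-9 another.
frames = [[0.0, 0.1 * i] for i in range(5)] + [[5.0, 0.1 * i] for i in range(5, 10)]
print(detect_cuts(frames))  # prints [5]
```

Skipping segments whose endpoints look alike is what lets this kind of preprocessing save time, since adjacent-frame comparisons run only inside the candidate segments.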
