We present a content-based retrieval method for long surveillance videos in wide-area (airborne) and near-field [closed-circuit television (CCTV)] imagery. Our goal is to retrieve video segments that match user-defined events of interest, with a focus on detecting objects moving along routes. The sheer size of surveillance videos and the remote locations where they are acquired necessitate highly compressed representations that are also meaningful for supporting user-defined queries. To address these challenges, we archive long surveillance video through lightweight processing based on low-level local spatiotemporal extraction of motion and object attributes. These are then hashed into an inverted index using locality-sensitive hashing. This local approach allows for query flexibility and leads to significant gains in compression. Our second task is to extract partial matches to user-created queries and assemble them into full matches using dynamic programming (DP). DP assembles the indexed low-level features into a video segment that matches the query route by exploiting causality. We examine CCTV and airborne footage, whose low contrast makes motion extraction more difficult. We generate robust motion estimates for airborne data using a tracklet-generation algorithm, while we use the Horn–Schunck approach to generate motion estimates for CCTV. Our approach handles long routes, low contrast, and occlusion. We derive bounds on the rate of false positives and demonstrate the effectiveness of the approach for counting, motion pattern recognition, and abandoned-object applications.
@article{castanon16,
title = {Retrieval in Long Surveillance Videos Using User-Described Motion and Object Attributes},
author = {Castanon, Greg and Elgharib, Mohamed and Jodoin, Pierre-Marc and Saligrama, Venkatesh},
journal = {IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)},
volume={26},
number={12},
pages={2313--2327},
year={2016},
}