A Two Stream Siamese Convolutional Neural Network for Person Re-identification
Person re-identification is an important task in video surveillance systems. It can be formally defined as establishing the correspondence between images of a person taken from different cameras at different times. In this paper, we present a two stream convolutional neural network where each stream is a Siamese network. This architecture can learn spatial and temporal information separately. We also propose a weighted two stream training objective function which combines the Siamese cost of the spatial and temporal streams with the objective of predicting a person's identity. We evaluate our proposed method on the publicly available PRID2011 and iLIDS-VID datasets and demonstrate the efficacy of our proposed method. On average, the top rank matching accuracy is 4% higher than the accuracy achieved by the cross-view quadratic discriminant analysis used in combination with the hierarchical Gaussian descriptor (GOG+XQDA), and 5% higher than the recurrent neural network method.