Behaviour recognition approaches that depend on object tracking have been introduced and extensively investigated. Automated analysis of online and offline video for object tracking has attracted growing interest because of the role it plays in detecting unusual behaviour. Object tracking in online (live) video, however, is costly because of the hardware required to flag unusual behaviour with minimal delay. The proposed system is designed to find unusual behaviour in offline video by obtaining 3D object-level information to track people and luggage in the raw footage. Training sets are used to make detection more accurate. First, a blob-matching technique extracts object and inter-object motion features. Second, people involved in security violations are detected: events such as abandoned and stolen objects, fighting, fainting, and loitering are tracked and recognised.
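The blob step described above can be sketched in a minimal form: foreground pixels are obtained by thresholded background subtraction, grouped into blobs by connected-component labelling, and blobs are then matched across consecutive frames by nearest centroid to yield a crude motion feature. This is an illustrative sketch only, not the paper's implementation; the function names, the 4-connectivity flood fill, and the greedy matching rule are all assumptions made for the example.

```python
import numpy as np

def extract_blobs(frame, background, thresh=30, min_area=4):
    """Background subtraction followed by 4-connected component
    labelling; returns bounding boxes (x0, y0, x1, y1) of blobs.
    A minimal stand-in for a real blob-detection stage."""
    fg = np.abs(frame.astype(int) - background.astype(int)) > thresh
    labels = np.zeros(fg.shape, dtype=int)
    blobs, next_label = [], 1
    for y in range(fg.shape[0]):
        for x in range(fg.shape[1]):
            if fg[y, x] and labels[y, x] == 0:
                # iterative flood fill over the 4-neighbourhood
                stack, pixels = [(y, x)], []
                labels[y, x] = next_label
                while stack:
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if (0 <= ny < fg.shape[0] and 0 <= nx < fg.shape[1]
                                and fg[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = next_label
                            stack.append((ny, nx))
                if len(pixels) >= min_area:  # drop speckle noise
                    ys, xs = zip(*pixels)
                    blobs.append((min(xs), min(ys), max(xs), max(ys)))
                next_label += 1
    return blobs

def centroid(b):
    x0, y0, x1, y1 = b
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def match_blobs(prev_blobs, curr_blobs):
    """Greedy nearest-centroid matching between consecutive frames;
    each match's displacement serves as a simple motion feature."""
    matches = []
    for i, pb in enumerate(prev_blobs):
        px, py = centroid(pb)
        best, best_d = None, float("inf")
        for j, cb in enumerate(curr_blobs):
            cx, cy = centroid(cb)
            d = (cx - px) ** 2 + (cy - py) ** 2
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            matches.append((i, best))
    return matches
```

A production system would instead use a statistical background model (e.g. a mixture of Gaussians) and handle occlusion, but the pipeline shape is the same: subtract, label, match.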


Keywords: blob-to-object matching, behaviour recognition, occlusion, HOG, SIFT, background subtraction







All Rights Reserved © 2012 IJARCSEE

This work is licensed under a Creative Commons Attribution 3.0 Unported License.