Online recognition of human actions based on RWE patterns applied to dynamic windows of invariant moments

López, Dennis Romero; Neto, Anselmo Frizera; Bastos, Teodiano Freire

doi:10.1016/j.riai.2013.09.009

Abstract

This paper presents a methodology for online human action recognition in video sequences. It describes an efficient way of using invariant moments as image descriptors, computed on silhouettes obtained by processing depth maps. Windows of size 4 (that is, 4 frames) are compared quickly by computing the Mahalanobis distance over the one invariant-moment sequence identified as the least sensitive to capture noise and the most stable in the absence of movement. This comparison provides fast detection of the idle/motion state, which in turn allows dynamically growing intervals (windows) to be captured for later processing that retains the temporal and frequency properties of the signal they contain. The Haar wavelet transform is applied with three decomposition levels to compute the Relative Wavelet Energy (RWE) and the Slope Sign Change (SSC), yielding 11-dimensional patterns. In the experiments, 97% of 4 movements captured online were recognized correctly, and 10 movements taken from the Muhavi-MAS database were recognized with a recognition rate of 94.2%.
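The feature-extraction step named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes an orthonormal Haar transform, pads odd-length windows by repeating the last sample, counts the final approximation band together with the three detail bands when normalizing the energies, and uses the common EMG-style definition of SSC with a threshold of zero. All function names are hypothetical.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar DWT.

    Assumption: an odd-length input is padded by repeating its last sample.
    """
    if len(x) % 2:
        x = list(x) + [x[-1]]
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def relative_wavelet_energy(x, levels=3):
    """Return the energy share of each band, ordered [D1, ..., D_levels, A_levels].

    Assumption: the last approximation band is included in the total energy.
    """
    approx = list(x)
    bands = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(energies)
    return [e / total for e in energies]


def slope_sign_changes(x, threshold=0.0):
    """Count the samples at which the slope of x changes sign."""
    return sum(
        1
        for i in range(1, len(x) - 1)
        if (x[i] - x[i - 1]) * (x[i] - x[i + 1]) > threshold
    )


def window_features(x, levels=3):
    """Concatenate the RWE values and the SSC count for one captured window."""
    return relative_wavelet_energy(x, levels) + [float(slope_sign_changes(x))]
```

With three levels this sketch produces five values per sequence (four energy shares plus one SSC count). The abstract does not say how the 11-dimensional patterns are assembled, so the mapping from these values to the reported pattern is left open here.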

Keywords:

Computer Vision

Depth Maps

Human Action Recognition

Relative Wavelet Energy

Mahalanobis Distance
