
T11 – Multimodal Sensing for Healthcare, Sports, and Entertainment

Imagine these questions: Wouldn't it be great to monitor the whereabouts of a near and dear one afflicted with an illness such as Parkinson's or Alzheimer's (which may cause them to get lost)? Wouldn't it be great if you could exercise the right sets of muscles and improve your sports performance? Or know which subtle action you are doing wrong that is causing your golf swing to send the ball completely off target? Wouldn't it be nice to "walk into" a video game and "punch the monster" without using any joystick or remote controller? Or play a music title by waving your hand at the computer screen?

Answering these questions typically requires sensing of human body joint motions. With advances in camera technology, 3D cameras also provide depth information, or "z-pixels." More advanced techniques use data from 3D cameras in order to better sense human gestures and respond to them. For instance, ZCam [6] has sensors that are able to measure the depth of each captured pixel using a principle called Time-Of-Flight. It obtains 3D information "by emitting pulses of infra-red light to all objects in the scene and sensing the reflected light from the surface of each object." The objects in the scene are then ordered in layers along the Z axis, which yields a grayscale depth map that a game or any other software application can use. In terms of depth resolution, it can detect 3D motion and volume down to 0.4 inches, while capturing full-color, 1.3-megapixel video at 60 frames per second. In a similar manner, advances in Body Sensor Network (BSN) technology provide several ways in which human gestures, and in some cases intentions, can be recognized. Examples of such devices include accelerometers (for tracking human motions), electromyograms (EMG, for measuring muscular activity), and electroencephalograms (EEG, for brain activity monitoring). As one can easily visualize, more than one body sensor, as well as video, can be used simultaneously. Such multimodal sensing can improve the ease of use, accuracy, and efficiency of human motion/gesture recognition.
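The time-of-flight principle can be sketched in a few lines: a light pulse travels to the object and back, so the per-pixel depth is half the round-trip time multiplied by the speed of light. This is an illustrative sketch only, not ZCam's actual implementation; the function names and the sample timing values are hypothetical.

```python
# Time-of-flight depth sketch: distance = (speed of light * round-trip time) / 2,
# because the measured time covers the path to the object AND back.
C = 299_792_458.0  # speed of light in m/s

def tof_depth(round_trip_seconds):
    """Depth in metres for one pixel, given the measured round-trip time."""
    return C * round_trip_seconds / 2.0

def depth_map(round_trip_times):
    """Convert a 2D grid of per-pixel round-trip times into a depth map,
    which orders the objects in the scene in layers along the Z axis."""
    return [[tof_depth(t) for t in row] for row in round_trip_times]

# A hypothetical object ~1.5 m away reflects the pulse after about 10 ns:
print(round(tof_depth(10e-9), 3))  # → 1.499
```

Real sensors refine this basic idea, e.g. by measuring the phase shift of a modulated signal rather than timing a single pulse, but the depth-from-round-trip-time relation is the same.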

Data from these sensors are typically time series, and the data from multiple sensors form multiple, multidimensional time series. Analyzing data from multiple medical sensors poses several challenges: different sensors have different characteristics, different people generate different patterns through these sensors, and even for the same person the data can vary widely depending on time and environment. Body Sensor Network (BSN) data has several similarities to other multimedia data. BSN data may have both discrete and continuous components, with or without real-time requirements. The data can be voluminous. Continuous BSN data may need signal processing techniques for recognition and interpretation.
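A common first step in analyzing such multi-sensor time series is to segment each stream into fixed-length windows, compute simple statistical features per window, and concatenate the features across sensors into one vector per window for a classifier. The sketch below illustrates this generic approach; the function names and the choice of mean/standard deviation as features are assumptions for illustration, not a method presented in the tutorial.

```python
import statistics

def window_features(samples, window=50):
    """Split one sensor's time series into fixed-length windows and
    compute (mean, population std dev) per window -- simple features
    commonly fed to a classifier."""
    feats = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        feats.append((statistics.mean(chunk), statistics.pstdev(chunk)))
    return feats

def fuse(streams, window=50):
    """Concatenate per-window features from several synchronized sensors
    (e.g. accelerometer axes, EMG channels) into one feature vector per
    window -- a simple form of multimodal fusion."""
    per_sensor = [window_features(s, window) for s in streams]
    return [sum(row, ()) for row in zip(*per_sensor)]

# Two synchronized sensors, 100 samples each -> 2 windows,
# each fused feature vector holding 4 values (2 per sensor).
accel = [float(i % 10) for i in range(100)]
emg = [1.0] * 100
vectors = fuse([accel, emg], window=50)
print(len(vectors), len(vectors[0]))  # → 2 4
```

In practice the sensors' differing sampling rates and characteristics (noted above) mean the streams must first be resampled or aligned before such fusion is meaningful.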

In this tutorial, we discuss the state of the art in video-based and sensor-based human gesture recognition. We dwell on the similarities and differences between the two approaches (vision-based and sensor-based) and evaluate the algorithms and techniques that can be employed, focusing primarily on their real-time nature. We also discuss approaches for classifying, mining, visualizing, and securing these data, and show several demonstrations of body sensor networks as well as the software that aids in analyzing the data.

Outline

The following topics will be discussed during the tutorial.

  1. Introduction: (30 minutes)
    Video-based and sensor-based human gesture recognition. Discussion on hardware components that go into these
    systems as well as the wireless communication standards for BSNs.
  2. BSN Operating System and Wireless Communication: (30 minutes)
    Presentation on IEEE 802.15 series standards for BSNs and the TinyOS architecture.
  3. Video-based Gesture Recognition Strategies: (30 minutes)
    Real-time algorithms for segmenting and classifying human motions and gestures in video data.
  4. Demonstration: (30 minutes)
    Hands-on demo of a few BSN configurations and personal trials for participants (participants will wear the BSNs and
    try them out).
  5. Data Characteristics of BSNs: (30 minutes)
    Outline of the characteristics of data from different body sensors and how these characteristics influence
    classification and mining.
  6. Strategies for Classification, Data Mining, and Visualization: (30 minutes)
    Discussion on the techniques that have been developed and their performance considerations.
  7. Demonstration of Mining & Classification Software: (30 minutes)

Organizers/Presenters

Dr. B. Prabhakaran is an Associate Professor in the Computer Science Department, University of Texas at Dallas. He has been working in the area of multimedia systems: animation & multimedia databases, authoring & presentation, resource management, and scalable web-based multimedia presentation servers. Dr. Prabhakaran received the prestigious National Science Foundation (NSF) CAREER Award in 2003 for his proposal on animation databases. He is also the Principal Investigator for a US Army Research Office (ARO) grant on 3D data storage, retrieval, and delivery. He has published several research papers in various refereed conferences and journals in this area. He is the General Co-Chair of ACM Multimedia 2011 and has served as an Associate Chair of the ACM Multimedia Conferences in 2006 (Santa Barbara, CA), 2003 (Berkeley, CA), 2000 (Los Angeles, CA), and 1999 (Orlando, FL). He has served as guest editor (special issue on Multimedia Authoring and Presentation) for the ACM Multimedia Systems journal, and is serving on the editorial board of the Multimedia Tools and Applications journal, Springer Publishers. He has also served as a program committee member for several multimedia conferences and workshops, and has presented tutorials at ACM Multimedia and other multimedia conferences. Dr. Prabhakaran has served as a visiting research faculty member with the Department of Computer Science, University of Maryland, College Park. He also served as a faculty member in the Department of Computer Science, National University of Singapore, as well as at the Indian Institute of Technology, Madras, India.
