Human pose estimation for care robots using deep learning

Fig. 1: Left: Experiment scene (this image is not used for estimation). Center: Depth data corresponding to the extracted person region. Right: Estimation result (the colors correspond to each part of the body). Credit: (c) Toyohashi University of Technology

Expectations for care robots are growing against the backdrop of declining birthrates, an aging population, and a shortage of care staff. In nursing homes and similar facilities, for example, robots are expected to check on the condition of residents while patrolling the facility. An initial estimate of a person's pose (standing, sitting, fallen, etc.) is useful when evaluating their condition, but most methods to date have relied on camera images. Such methods raise privacy concerns and are difficult to apply in dimly lit spaces. The research group (Kaichiro Nishi, a 2016 master's program graduate, and Professor Miura) has therefore developed a method of pose recognition that uses depth data alone (Fig. 1).

For poses such as standing and sitting, where body parts can be recognized relatively easily, methods and instruments that estimate poses with high precision are already available. In care settings, however, it is necessary to recognize a wider variety of poses, such as recumbent (lying down) and crouching positions, and this has remained a challenge. With the recent progress of deep learning (a technique using multilayer neural networks), methods for estimating complex poses from images are advancing. Deep learning requires a large amount of training data. For image data, it is relatively easy for a person to look at an image and identify each body part, and some datasets have been made public. For depth data, however, the boundaries between parts are hard to see, which makes training data difficult to generate.

To address this, the research established a method for generating a large amount of training data by combining computer graphics (CG) technology and motion capture technology (Fig. 2). The method first creates CG data of various body shapes. It then adds part information (11 parts, including the head, torso, and right upper arm) and skeleton information, including the position of each joint, to the data. This makes it possible to put the CG models into arbitrary poses simply by supplying joint angles from a motion capture system. Fig. 3 shows an example of data generated for various sitting poses.
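Posing a model from joint angles is, at its core, a forward kinematics computation: each joint's position follows from the angles and segment lengths of the joints before it. The following is a minimal planar sketch of that idea, not the research group's pipeline; the function name `forward_kinematics`, the two-segment arm, and the segment lengths are all hypothetical and chosen only for illustration.

```python
import math


def forward_kinematics(base, segment_lengths, joint_angles):
    """Return the joint positions of a planar kinematic chain.

    Each angle (radians) is relative to the previous segment;
    the first angle is measured from the x-axis.
    """
    if len(segment_lengths) != len(joint_angles):
        raise ValueError("need exactly one angle per segment")
    x, y = base
    heading = 0.0
    positions = [(x, y)]
    for length, angle in zip(segment_lengths, joint_angles):
        heading += angle  # accumulate rotation along the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


# Hypothetical arm: shoulder -> elbow -> wrist, lengths in metres.
ARM_LENGTHS = [0.30, 0.25]

# Same skeleton, two poses: only the joint angles change.
straight = forward_kinematics((0.0, 0.0), ARM_LENGTHS, [0.0, 0.0])
bent = forward_kinematics((0.0, 0.0), ARM_LENGTHS, [0.0, math.pi / 2])
```

A full 3D body model applies the same principle with a rotation per joint over a tree of bones rather than a single chain, which is why one skeleton can be reused for many poses.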

Fig. 2: Procedure for generating training data. Credit: (c) Toyohashi University of Technology

With this method, training data can be generated for any combination of body shape and pose. So far, we have created and released a total of about 100,000 pieces of data, covering sitting positions (with and without occlusions) and several recumbent poses. This data is freely available for research purposes. In the future, we will release the human models and detailed data generation procedures so that anyone can easily create training data with them. We hope that this will contribute to progress in related fields.
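Each generated sample pairs a depth image with an image of body part labels. As a rough illustration of how such a pair can come out of a single rendering pass, the sketch below draws labeled spheres with an orthographic z-buffer in plain Python. It is a toy under stated assumptions, not the released data format or the group's renderer: the function `render_depth_and_labels`, the label values, and the sphere geometry are all hypothetical.

```python
import math

BACKGROUND = 0  # label for pixels where no body part is visible


def render_depth_and_labels(spheres, width, height, pixel_size):
    """Orthographic z-buffer render of labeled spheres.

    spheres: list of (cx, cy, cz, radius, label). The camera looks
    along +z, so a smaller z is closer. Returns (depth, labels) as
    nested lists; depth is None where nothing was hit.
    """
    depth = [[None] * width for _ in range(height)]
    labels = [[BACKGROUND] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            # World coordinates of the pixel centre.
            px = (col + 0.5) * pixel_size
            py = (row + 0.5) * pixel_size
            for cx, cy, cz, radius, label in spheres:
                d2 = (px - cx) ** 2 + (py - cy) ** 2
                if d2 > radius ** 2:
                    continue  # this pixel's ray misses the sphere
                z = cz - math.sqrt(radius ** 2 - d2)  # front surface
                if depth[row][col] is None or z < depth[row][col]:
                    # Nearest surface wins; depth and label stay in sync.
                    depth[row][col] = z
                    labels[row][col] = label
    return depth, labels


# Hypothetical labels and geometry: a head slightly in front of a torso.
HEAD, TORSO = 1, 2
model = [
    (0.5, 0.25, 2.0, 0.15, HEAD),
    (0.5, 0.60, 2.1, 0.30, TORSO),
]
depth, labels = render_depth_and_labels(model, width=10, height=10, pixel_size=0.1)
```

Because the label is written at the same moment as the depth value, the two images are aligned pixel for pixel, and occlusions are resolved identically in both. That is the property that makes synthetic data attractive here: the part boundaries that are hard to annotate by eye in real depth data come for free.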

The results of this research were published in Pattern Recognition on Saturday, June 3, 2017.

Fig. 3: First row: Body part label images. Second row: Depth data. Credit: (c) Toyohashi University of Technology

More information: K. Nishi et al, Generation of human depth images with body part labels for complex human pose recognition, Pattern Recognition (2017). DOI: 10.1016/j.patcog.2017.06.006
Citation: Human pose estimation for care robots using deep learning (2017, July 11)