Hand Gesture Recognition Using Hidden Markov Models

Arpit Sharma

D.A.V.I.E.T, Jalandhar

Abstract— With recent advances in artificial intelligence, human gesture recognition has attracted special interest in computer vision and human-computer interaction. Gesture recognition is a difficult problem because the subject's location and size in the frame vary, and occlusion of the subject further hinders image recognition. Human gesture recognition is, at its core, a pattern recognition problem. This paper proposes a method to recognize time-varying human hand gestures from continuous video streams using hidden Markov models. The method can be applied to gestures by converting them into sequences of symbols.

Keywords—Hidden Markov models; neural networks; vector quantization.

I. INTRODUCTION

Human hand gestures are non-verbal actions used to express information, and humans recognize them instantly. Automatic gesture recognition from continuous video streams has a variety of potential applications, from smart surveillance to human-machine interaction.

Here, hand gesture recognition is performed automatically in a two-phase process: feature extraction and classification.

II. FEATURE EXTRACTION AND CLASSIFICATION

Feature extraction and classification is a field in which captured objects are assigned to a number of classes. These objects are referred to as patterns and generally consist of speech, text or any other kind of signal. In this paper, the patterns are hand gestures.

The process starts when the user provides input in the form of a gesture. The collected input data serves as the training data for the next step, feature extraction. Training is done using a hidden Markov model. Finally, classification is performed by a feature classifier and the output is generated.

III. HIDDEN MARKOV MODEL

Before discussing the hidden Markov model, it is necessary to understand its foundation, the Markov process. A Markov process consists of two parts:

1) A finite number of states, and

2) Transition probabilities of moving between those states.

The Markov model provides a powerful way to simulate real-world processes, provided a small number of assumptions are met (i.e. a fixed set of states, fixed transition probabilities, and the possibility of getting from any state to any other state through a series of transitions).
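The two parts of a Markov process can be illustrated with a minimal simulation sketch. The state names and transition probabilities below are illustrative assumptions, not values from this paper; each row of the matrix gives the probabilities of moving from one state to every other state.

```python
import numpy as np

# Hypothetical 3-state chain; names and probabilities are assumptions
# chosen only for illustration.
STATES = ["rest", "raise", "wave"]
TRANSITIONS = np.array([
    [0.7, 0.2, 0.1],   # from "rest"
    [0.3, 0.4, 0.3],   # from "raise"
    [0.2, 0.3, 0.5],   # from "wave"
])

def simulate_chain(transitions, start, steps, seed=0):
    """Sample a state path; each step depends only on the current state."""
    rng = np.random.RandomState(seed)
    path = [start]
    for _ in range(steps):
        current = path[-1]
        nxt = rng.choice(len(transitions), p=transitions[current])
        path.append(int(nxt))
    return path

path = simulate_chain(TRANSITIONS, start=0, steps=10)
print([STATES[s] for s in path])
```

Because every entry of the matrix is positive, any state can be reached from any other, which matches the assumption stated above.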

An HMM is a finite collection of states connected by transitions. Each state has two sets of probabilities:

(1) a transition probability, and

(2) either a discrete output probability distribution or a continuous output probability density function.
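As a minimal sketch of this definition, a discrete HMM can be written as three arrays: an initial distribution, a transition matrix and an output (emission) matrix. The sizes and numbers here are assumptions for illustration only. The sampler shows why the model is called hidden: only the emitted symbols would be observed in practice.

```python
import numpy as np

# Illustrative discrete HMM with 2 hidden states and 3 output symbols.
# All numbers are assumptions chosen for the example.
initial = np.array([0.6, 0.4])              # P(first state)
transition = np.array([[0.7, 0.3],          # P(next state | current state)
                       [0.4, 0.6]])
emission = np.array([[0.5, 0.4, 0.1],       # P(symbol | state)
                     [0.1, 0.3, 0.6]])

def sample_hmm(initial, transition, emission, length, seed=0):
    """Draw hidden states and the observed symbols they emit."""
    rng = np.random.RandomState(seed)
    states, symbols = [], []
    state = rng.choice(len(initial), p=initial)
    for _ in range(length):
        states.append(int(state))
        symbol = rng.choice(emission.shape[1], p=emission[state])
        symbols.append(int(symbol))
        state = rng.choice(len(initial), p=transition[state])
    return states, symbols

states, symbols = sample_hmm(initial, transition, emission, length=8)
print("hidden:  ", states)
print("observed:", symbols)
```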

IV. APPROACH

The application of HMMs to gesture recognition was motivated by their use in speech recognition. Gesture and speech recognition have many similarities, so HMMs can be applied here as well. Gestures vary with the subject's position, symmetry and size in the frame.

The gestures to be recognized must be defined in advance. Here, sign language is used, so its vocabulary is fixed beforehand.

This is where the HMM comes into play: each gesture is described by its own HMM, and the structure of its transition function and its output probabilities are estimated.

We pre-process the data collected in the initial steps. Pre-processing involves the short-time Fourier transform and vector quantization. The gestures are defined through the collected training data, so this data needs to be represented in a compact form.
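A minimal sketch of the short-time Fourier transform step is given below. It assumes the raw data is a one-dimensional signal, for example one hand coordinate tracked over the video frames. The frame length, hop size and test signal are hypothetical choices, not settings from this paper.

```python
import numpy as np

def stft_magnitude(signal, frame_len=16, hop=8):
    """Magnitude spectra of overlapping, Hann-windowed frames."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1))

# Hypothetical input: x-coordinate of a hand over 64 video frames,
# oscillating 4 times per 16-sample analysis frame.
t = np.arange(64)
x = np.sin(2 * np.pi * 4 * t / 16.0)

spectra = stft_magnitude(x)
print(spectra.shape)   # (number of frames, frame_len // 2 + 1)
```

Each row of `spectra` is one feature vector, which a vector quantizer can later map to a discrete symbol.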

The parameters of the HMM are estimated using the Baum-Welch algorithm. Baum-Welch is needed because the state paths in an HMM are hidden, so the estimation equations cannot be solved analytically. It iteratively searches for maximum likelihood estimates, i.e. it attempts to find the model that assigns the training data the highest likelihood. The probability of being in state ‘i’ at time ‘t’ is given by:
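In the standard forward-backward notation (an assumption here, since the symbols are not defined elsewhere in this paper), the probability of state i at time t given the whole observation sequence is usually written gamma_t(i) = alpha_t(i) beta_t(i) / sum_j alpha_t(j) beta_t(j), where alpha and beta are the forward and backward probabilities. A minimal NumPy sketch of that computation follows. The parameter values are illustrative, and the re-estimation step of Baum-Welch is not shown.

```python
import numpy as np

def state_posteriors(initial, transition, emission, observations):
    """Return gamma[t, i] = P(state i at time t | all observations)."""
    T, N = len(observations), len(initial)
    alpha = np.zeros((T, N))   # scaled forward probabilities
    beta = np.zeros((T, N))    # scaled backward probabilities
    scale = np.zeros(T)        # per-step normalizers (avoid underflow)

    alpha[0] = initial * emission[:, observations[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = alpha[t - 1].dot(transition) * emission[:, observations[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = transition.dot(emission[:, observations[t + 1]] * beta[t + 1])
        beta[t] /= scale[t + 1]

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Illustrative 2-state, 3-symbol model (assumed values).
initial = np.array([0.6, 0.4])
transition = np.array([[0.7, 0.3], [0.4, 0.6]])
emission = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])

gamma = state_posteriors(initial, transition, emission, [0, 2, 1])
print(gamma)   # one row per time step; each row sums to 1
```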

Gestures are then decoded and recognized using the Viterbi algorithm, which finds the most likely sequence of hidden states in the HMM.
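The decoding step can be sketched as follows for a discrete HMM. This is a minimal log-domain version of the Viterbi recursion under assumed model parameters, not the exact implementation used in this work.

```python
import numpy as np

def viterbi(initial, transition, emission, observations):
    """Most likely hidden-state sequence for a discrete HMM."""
    with np.errstate(divide="ignore"):     # log(0) = -inf is acceptable here
        log_pi = np.log(initial)
        log_a = np.log(transition)
        log_b = np.log(emission)
    T, N = len(observations), len(initial)
    score = np.zeros((T, N))               # best log-probability ending in each state
    back = np.zeros((T, N), dtype=int)     # best predecessor, for backtracking

    score[0] = log_pi + log_b[:, observations[0]]
    for t in range(1, T):
        candidates = score[t - 1][:, None] + log_a   # rows: from, columns: to
        back[t] = candidates.argmax(axis=0)
        score[t] = candidates.max(axis=0) + log_b[:, observations[t]]

    # Trace the best path backwards from the best final state.
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Illustrative 2-state, 3-symbol model (assumed values).
initial = np.array([0.6, 0.4])
transition = np.array([[0.7, 0.3], [0.4, 0.6]])
emission = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])

print(viterbi(initial, transition, emission, [0, 1, 2]))   # [0, 0, 1]
```

Working in the log domain replaces products of small probabilities with sums, which avoids numerical underflow on long symbol sequences.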

The figures obtained after classification are as follows:

[Figure: Representation of five]

Software Requirements: Windows 7 or higher, Python 3 or higher, NumPy 1.14, SciPy 1.0.0, OpenCV 3.4.0.

V. FEATURE EXTRACTION, VECTOR QUANTIZATION & RECOGNITION

The main goal of the feature extraction step is to simplify recognition by summarizing vast amounts of image data and then obtaining the unique properties that define the gesture’s individuality.

In the initial steps, each observation is transformed into a pattern, and the distinct components of that pattern are called features. A classifier is needed in the next step. It divides the feature space into regions such that each region corresponds to a pattern class.

Vector quantization is a lossy data compression method based on the principle of block coding. It aims to minimize the average distortion over all samples. Vector quantization is often preferred because it can achieve lower distortion than scalar quantization at the same rate.
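A minimal sketch of vector quantization is shown below. It assumes the codebook is trained with Lloyd's (k-means) iterations, a common choice that this paper does not specify. The feature vectors are synthetic and the codebook size is hypothetical.

```python
import numpy as np

def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codeword."""
    diffs = vectors[:, None, :] - codebook[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)

def train_codebook(vectors, n_codes, n_iter=20, seed=0):
    """Fit a codebook with Lloyd's (k-means) iterations."""
    rng = np.random.RandomState(seed)
    start = rng.choice(len(vectors), size=n_codes, replace=False)
    codebook = vectors[start].astype(float)
    for _ in range(n_iter):
        symbols = quantize(vectors, codebook)
        for k in range(n_codes):
            members = vectors[symbols == k]
            if len(members):               # keep the old codeword if unused
                codebook[k] = members.mean(axis=0)
    return codebook

# Synthetic 2-D feature vectors scattered around three centres.
rng = np.random.RandomState(1)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
features = np.vstack([c + 0.3 * rng.randn(30, 2) for c in centres])

codebook = train_codebook(features, n_codes=3)
symbols = quantize(features, codebook)     # discrete symbols for the HMM
print(codebook)
print(symbols[:5], symbols[30:35], symbols[60:65])
```

The resulting `symbols` array is the sequence of discrete observations that a discrete HMM consumes.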

The final step is recognition. We use the Viterbi algorithm to find the most likely sequence of states. Recognition is made by analyzing the initial and final states together. For a successful match, the initial state of the provided gesture agrees with the initial state of the recognition model, and the same holds for the final states.
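One common way to turn per-gesture HMMs into a classifier is to score the symbol sequence under every model and pick the best one. The sketch below uses the forward algorithm for scoring, which is a swapped-in technique and not the initial and final state comparison described above. The gesture names and model parameters are hypothetical.

```python
import numpy as np

def log_likelihood(model, observations):
    """Scaled forward algorithm: log P(observations | model)."""
    initial, transition, emission = model
    alpha = initial * emission[:, observations[0]]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            alpha = alpha.dot(transition) * emission[:, observations[t]]
        norm = alpha.sum()
        log_prob += np.log(norm)
        alpha = alpha / norm               # rescale to avoid underflow
    return log_prob

def recognize(models, observations):
    """Return the best-scoring gesture label and all scores."""
    scores = {name: log_likelihood(m, observations)
              for name, m in models.items()}
    return max(scores, key=scores.get), scores

# Two hypothetical left-to-right gesture models over 3 VQ symbols.
start = np.array([1.0, 0.0])
left_to_right = np.array([[0.8, 0.2], [0.0, 1.0]])
models = {
    "swipe": (start, left_to_right,
              np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])),
    "circle": (start, left_to_right,
               np.array([[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]])),
}

label, scores = recognize(models, [0, 0, 2, 2])
print(label)   # swipe
```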

VI. CONCLUSION

This paper presents a model for hand gesture recognition using hidden Markov models. Hidden Markov models were originally used in speech recognition; the results show that HMMs can be successfully applied to hand gesture recognition as well. The results can be further improved with better gesture segmentation and image noise filtering. The proposed method can be used in developing learning algorithms in fields such as robotics and human-computer interaction.
