We showcased this project as our final-year project at the university: a communication tool designed for individuals who are deaf or mute, harnessing the power of computer vision technology.
◈ 466 million people in the world have disabling hearing loss. This is over 5% of the world's population. ◈ Deaf and mute people face many problems in their lives because of their disability. The greatest problems they face are in the educational sector. ◈ Though many countries have special schools that provide primary education to students with these disabilities, higher-level education is not accessible to them.
A national survey using standard WHO methods to measure the prevalence of hearing impairment was conducted in Bangladesh in 2013. ❖ One-third of Bangladeshi people suffer from some form of hearing impairment, and one in ten of them suffers from disabling hearing loss.
It is not easy to communicate with a deaf person. ◈ You need to learn sign language to communicate with them. ◈ But learning a sign language is very difficult and time-consuming. Moreover, different languages have their own sign languages. ◈ This communication gap creates the biggest hurdle between a normal person and a deaf person. ◈ But what if we could eliminate this communication gap?
In our proposed system, a normal person and a person with hearing impairment can communicate with the help of technological means. ◈ This communication works both ways: normal person to deaf person and vice versa. ◈ Using speech recognition, the speech of a normal person is translated into text. The deaf person reads the text and understands what the speaker is trying to say. ◈ When the deaf person wants to communicate, he/she uses sign language, which is recognized by a camera and translated into on-screen text.
Our main focus is the educational sector. ◈ One of the biggest hurdles for the deaf is that they do not get a proper education because of the communication gap. ◈ Moreover, there are very few teachers with proper training to teach students with hearing impairment. ◈ Our app will eliminate the communication gap between deaf and normal people. ◈ Most importantly, deaf people will get the privilege of studying at a normal university with normal students, which was unimaginable before. ◈ With a proper education, people with hearing impairment will be able to contribute a great deal to our country's development and be an asset to our nation.
Schools and colleges in our country are now digitalized, and there is a computer and a projector in every classroom. ◈ Digital Bangladesh is one of the nation's dreams, and with Vision 2021 we believe every educational institution will be fully digitalized. ◈ Every institution just needs to install our software.
We use the Google Speech API to recognize speech and translate it into text. ◈ In our case, English speech is translated into text. ◈ The speech recognizer recognizes the speaker's speech and converts it into text. ◈ This text is shown on a screen, which helps the deaf person understand what the speaker is trying to say. Fig 3: Conversion of speech to text
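The speech-to-text step can be sketched as follows. This is a minimal sketch, not the project's exact code: it assumes the third-party SpeechRecognition package (`pip install SpeechRecognition`), whose `recognize_google` helper wraps the free Google Web Speech API; the function name and parameters are our own illustration.

```python
def listen_and_transcribe(language="en-US"):
    """Capture one utterance from the microphone and return it as text.

    Minimal sketch of the speech-to-text step, assuming the third-party
    SpeechRecognition package. The import is kept inside the function so
    the sketch stays optional when the package is not installed.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate for room noise
        audio = recognizer.listen(source)            # record one phrase
    try:
        # Send the audio to the Google Web Speech API for transcription.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # speech was unintelligible


if __name__ == "__main__":
    print(listen_and_transcribe())  # the transcript would then be shown on screen
```

In the classroom setting described above, the returned transcript would be displayed on the projector screen for the deaf student to read.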
We have used a convolutional neural network (CNN) to recognize sign language and translate it into text. ◈ This is the most important and crucial part of this project. ◈ To recognize the sign language used in Bangladesh for the English alphabet through a video-capture device, we need a dataset containing many images of the 26 letters. ◈ Using a convolutional neural network, we have trained our model so that it can recognize sign language. Fig 4: Conversion of sign language to text
Building the dataset was the most crucial and important part. ◈ Our dataset contains pictures of signs for the 26 letters (A–Z) and a Space. ◈ Altogether we have captured 20,250 images for our dataset.
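The numbers above work out to 750 images per class. A minimal sketch of the class list and a hypothetical 80/20 train/validation split (the split ratio is our assumption; the slides do not state one):

```python
import string

# 27 classes: the 26 English letters plus a Space sign, as described above.
CLASSES = list(string.ascii_uppercase) + ["SPACE"]
TOTAL_IMAGES = 20_250

per_class = TOTAL_IMAGES // len(CLASSES)      # 20,250 / 27 = 750 images per class
train_per_class = int(per_class * 0.8)        # hypothetical 80% training split -> 600
val_per_class = per_class - train_per_class   # remaining 20% for validation   -> 150

print(len(CLASSES), per_class, train_per_class, val_per_class)  # 27 750 600 150
```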
We have used a pre-trained architecture named VGG16 for training. ◈ VGG16 is a convolutional neural network model proposed by K. Simonyan and A. Zisserman of the University of Oxford in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". The model achieves 92.7% top-5 test accuracy on ImageNet, a dataset of over 14 million images belonging to 1000 classes.
We have used VGG16 as a fixed feature extractor, modified for recognizing hand gestures only. The final classifier part was removed from the model and replaced with a classifier of our own. ◈ We have also fine-tuned some of the last layers so that the network can extract high-level features properly. ◈ Augmentation was also applied to our dataset to give it versatility.
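The transfer-learning setup described above can be sketched in Keras roughly as follows. This is a minimal sketch under our own assumptions: in practice `weights="imagenet"` would load the pre-trained weights (here `weights=None` avoids the large download), and the custom classifier head, the choice of layers to unfreeze, and the augmentation parameters are all illustrative, not the project's exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 27  # 26 letters (A-Z) plus Space, as in our dataset

# Convolutional base as the feature extractor, with the original
# ImageNet classifier head removed (include_top=False).
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))

# Keep the early layers fixed; fine-tune only the last convolutional
# block so it can adapt its high-level features to hand gestures.
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

# Replace the removed classifier with a small head of our own.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Augmentation to give the dataset versatility (parameters illustrative).
augment = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=10, zoom_range=0.1,
    width_shift_range=0.1, height_shift_range=0.1)
```

Freezing the early blocks preserves the generic edge and texture filters learned on ImageNet, while fine-tuning only the last block lets the network specialize to hand shapes with far less data than training from scratch.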
We can easily tweak the model so that it works for other sign languages. ◈ To make it work for a different sign language, we just need to train the model on that sign language. ◈ In the future we want to add Bangla sign language support. ◈ We will also develop an app for smartphone operating systems.
Datta, P. G., Professor. (2017, July 13). World Health Organization, National Survey on Prevalence of Hearing Impairment in Bangladesh 2013. Retrieved May 02, 2020, from http://origin.searo.who.int/bangladesh/publications/national_survey/en/ ◈ VGG16 - Convolutional Network for Classification and Detection. (2018, November 21). Retrieved September 19, 2020, from https://neurohive.io/en/popular-networks/vgg16/