Integrated Speaker and Speech Recognition for Wheel Chair Movement Using Artificial Intelligence

Gurpreet Kaur, Mohit Srivastava, Amod Kumar

Abstract


A speech signal is the result of constrictions of the vocal tract, and different sounds are generated by different vocal tract constrictions. A speech signal carries two kinds of information: the speaker's identity and the meaning of the utterance. For specific applications of speaker and speech recognition, such as a voice-operated wheelchair, both the speaker and the speech must be recognized before the wheelchair is moved. Automation of wheelchairs is a present-day requirement, as the number of people with disabilities such as spinal injuries, amputations, and hand impairments is increasing. Such users need assistance to move their wheelchairs, and a voice-operated wheelchair is one solution. The aim of this study is to use a speaker- and speech-dependent system to control the wheelchair and minimize the risk of unwanted accidents. We propose a system in which both the speaker (the patient) and the speech (the commands) are recognized from acoustic features, namely Mel Frequency Cepstral Coefficients (MFCC). The features are optimized using the Artificial Bee Colony (ABC) algorithm to achieve good accuracy with an artificial intelligence technique as the classifier. We have tested the system on a standard dataset (TIDIGITS) and on our own dataset. The proposed work is further validated by generating control signals to actuate the wheelchair in a real-time scenario.
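As an illustration of the pipeline described above (MFCC feature extraction, ABC-based feature optimization, and an artificial-intelligence classifier), the following Python sketch shows a minimal, simplified version of such a system. It uses librosa for MFCC extraction and a scikit-learn neural-network classifier; the ABC feature-selection step is reduced to a placeholder mask, and all data, names, and parameters are illustrative assumptions rather than the authors' implementation.

# Minimal sketch (not the authors' implementation) of a joint
# speaker/command recognition pipeline:
# MFCC features -> feature subset -> neural-network classifier.
# Assumes librosa and scikit-learn; audio is simulated here for runnability.

import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

SR = 16000  # assumed sampling rate

def mfcc_features(signal, sr=SR, n_mfcc=13):
    """Return a fixed-length feature vector: mean MFCCs over all frames."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Placeholder for the ABC-optimized feature subset described in the paper:
# here every coefficient is kept (a boolean mask of all True).
feature_mask = np.ones(13, dtype=bool)

# Simulated training data: random "utterances" for 4 commands x 2 speakers.
# In the real system these would be recorded voice commands.
rng = np.random.default_rng(0)
labels, features = [], []
for speaker in ("speaker1", "speaker2"):
    for command in ("left", "right", "forward", "stop"):
        for _ in range(5):
            signal = rng.normal(size=SR).astype(np.float32)  # 1 s of noise
            features.append(mfcc_features(signal)[feature_mask])
            labels.append(f"{speaker}_{command}")  # joint speaker+command label

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(np.array(features), labels)

# A prediction such as "speaker1_left" would be mapped to a wheelchair
# control signal only when the recognized speaker is the authorized patient.
test = mfcc_features(rng.normal(size=SR).astype(np.float32))[feature_mask]
print(clf.predict([test])[0])

In such a sketch, recognizing the speaker and the command jointly (one combined label per utterance) is only one possible design; the paper's actual classifier, feature-selection procedure, and control-signal generation are described in the full text.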

 







DOI: https://doi.org/10.31449/inf.v42i4.2003

This work is licensed under a Creative Commons Attribution 3.0 License.