Integrated Speaker and Speech Recognition for Wheel Chair Movement Using Artificial Intelligence
Abstract
A speech signal results from constrictions of the vocal tract, and different constrictions produce different sounds. A speech signal carries two kinds of information: the speaker's identity and the linguistic content. For certain applications of speaker and speech recognition, such as a voice-operated wheelchair, both the speaker and the spoken command must be recognized before the wheelchair moves. Wheelchair automation is a present-day need, as the number of people with disabilities such as spinal injuries, amputations, and hand impairments is increasing; such users need assistance to move their wheelchairs, and a voice-operated wheelchair is one solution. The intention of this study is to use a speaker- and speech-dependent system to control the wheelchair and minimize the risk of unwanted accidents. We propose a system in which both the speaker (patient) and the speech (commands) are recognized based on acoustic features, namely Mel Frequency Cepstral Coefficients (MFCC). The features are optimized using the Artificial Bee Colony (ABC) algorithm to achieve good accuracy, with an artificial intelligence technique serving as the classifier. We have tested the system on a standard dataset (TIDIGITS) and on our own dataset. The proposed work is further validated by generating control signals to actuate the wheelchair in a real-time scenario.
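The MFCC front end mentioned above follows a standard pipeline: frame the signal, window it, take the power spectrum, apply a mel filterbank, and decorrelate the log energies with a DCT. A minimal NumPy sketch of that pipeline is shown below; the parameter values (sample rate, frame size, filter counts) are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def mfcc(signal, sr=8000, n_fft=256, hop=128, n_mels=20, n_ceps=13):
    """Minimal MFCC sketch: frame -> Hamming window -> power spectrum
    -> mel filterbank -> log -> DCT-II. Parameters are illustrative."""
    # Split the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] for i in range(n_frames)])
    frames = frames * np.hamming(n_fft)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel-spaced filterbank.
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    # Log mel energies, then DCT-II keeps the first n_ceps coefficients.
    logmel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return logmel @ dct.T

# Example: MFCC matrix (frames x coefficients) for a synthetic 1-second tone.
sr = 8000
t = np.arange(sr) / sr
feats = mfcc(np.sin(2 * np.pi * 440.0 * t), sr=sr)
```

In the full system, vectors like `feats` would be the candidate features that the ABC algorithm optimizes before classification.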
DOI: https://doi.org/10.31449/inf.v42i4.2003
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







