Optimizing Social Media Analytics with the DQEA Framework for Superior Data Quality Management
Abstract
This paper introduces the Data Quality Enhancement and Analytics (DQEA) Framework to enhance data quality in social media analytics by leveraging advanced data analytics tools. Departing from the previous BDMS approach, the DQEA framework addresses data quality issues such as noise, bias, and incompleteness using modern data analytics techniques. The framework's content features were validated against human coders on Amazon Mechanical Turk, achieving an inter-coder reliability score of 0.85, which indicates high agreement. Furthermore, two case studies with a large social media dataset from Tumblr were conducted to demonstrate the effectiveness of the proposed content features. In the first case study, the DQEA framework reduced data noise by 30% and bias by 25%, while increasing completeness by 20%. In the second case study, the framework improved data consistency by 35% and the overall data quality score by 28%. Comparative analysis with state-of-the-art models, including Random Forest and Support Vector Machines (SVM), showed significant improvements in data reliability and decision-making accuracy. Specifically, the DQEA framework outperformed the Random Forest model by 15% in accuracy and 20% in true positive rate, and the SVM model by 10% in error rate reduction and 18% in reliability. Overall, the DQEA framework demonstrated a 22% improvement in data quality metrics compared to existing solutions. These quantitative metrics validate the framework’s ability to enhance data quality in social media analytics, providing a robust solution for addressing critical data quality challenges. This research contributes to the field of business intelligence by offering a comprehensive and effective framework that can be easily integrated into existing data analytics workflows, ensuring more reliable and accurate decision-making processes based on social media data.
The results underscore the potential of advanced data analytics tools in transforming social media data into a valuable asset for organizations, highlighting the practical implications and future research directions in this domain.
DOI: https://doi.org/10.31449/inf.v49i3.8306
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika







