Enhancing Contextual Data Analysis through Retrieval-Augmented Fine-Tuning of Large Language Models
Abstract
This study explores the optimization of large language models (LLMs) for enhanced contextual data analysis and knowledge extraction from unstructured user-generated content, comparing open-source models (e.g., Mistral 7B, LLaMA 2) with proprietary systems (e.g., GPT-4, Gemini). We evaluate their efficiency and accuracy in processing complex datasets and introduce a novel approach that integrates Retrieval-Augmented Generation (RAG) with parameter-efficient fine-tuning techniques such as Low-Rank Adaptation (LoRA) to reduce model complexity while preserving performance. Empirical results, measured with BERTScore, ROUGE, and BLEU, show GPT-4 achieving an F1 score of 0.683; Mistral 7B, a standout open-source model, scores 0.632 while cutting computational cost by 40% and retaining 92% of accuracy, making it well suited to resource-constrained environments. These findings underscore the importance of tailoring model selection to computational and organizational needs. The research offers actionable insights for deploying AI-driven solutions that streamline data processing and advance machine learning applications, and it discusses limitations and directions for future research toward broader applicability.
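To make the combination the abstract describes concrete, the sketch below pairs a LoRA-adapted open-source model with a simple embedding-based retrieval step that supplies context to the prompt. It is a minimal illustration rather than the authors' pipeline: the Hugging Face checkpoint names, the LoRA hyperparameters (rank 8, alpha 16), and the toy review corpus are assumptions made for demonstration, using the `peft` and `sentence-transformers` libraries.

```python
# Illustrative sketch of RAG + LoRA (assumed checkpoints and hyperparameters,
# not the paper's exact configuration).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from sentence_transformers import SentenceTransformer, util

# --- LoRA: freeze the 7B base model and train only small low-rank adapters ---
base_name = "mistralai/Mistral-7B-v0.1"  # assumed open-source checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_name)
model = AutoModelForCausalLM.from_pretrained(
    base_name, torch_dtype=torch.float16, device_map="auto"
)

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters are a tiny fraction of the 7B weights

# --- RAG: retrieve the passages most similar to a query and prepend them ---
retriever = SentenceTransformer("all-MiniLM-L6-v2")
corpus = [  # stand-in for unstructured user-generated content
    "The battery drains quickly after the latest firmware update.",
    "Customer support resolved my shipping issue within a day.",
    "The mobile app crashes whenever I open the settings page.",
]
corpus_emb = retriever.encode(corpus, convert_to_tensor=True)

def rag_prompt(query: str, k: int = 2) -> str:
    """Build a prompt whose context is the top-k retrieved passages."""
    query_emb = retriever.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, corpus_emb)[0]          # cosine similarities
    top = torch.topk(scores, k=min(k, len(corpus))).indices
    context = "\n".join(corpus[int(i)] for i in top)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Example use: answer a question grounded in the retrieved reviews.
inputs = tokenizer(rag_prompt("Why does the app keep crashing?"),
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

In a setup like this, fine-tuning touches only the adapter weights while the base model stays frozen, which is the mechanism behind the kind of computational savings the abstract reports for Mistral 7B.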
DOI: https://doi.org/10.31449/inf.v49i29.8094

This work is licensed under a Creative Commons Attribution 3.0 License.