Application of Multimodal Generation Model in Short Video Content Personalized Generation
Abstract
The rise of short video platforms has increased the demand for rapidly generated personalized content. Existing systems either struggle to deliver a high level of customization or require large amounts of data, which limits real-time production. This study investigates a multimodal generation model that produces customized short video content adapted to user preferences and behavioral patterns. The objective is an integrative model that uses text, image, and audio data to create context-specific short videos for personalized entertainment. The approach first analyses user preferences from interaction data and then synthesizes corresponding video content using a novel method, a stochastic paint optimizer with an intelligent convolutional neural network (SPO-IntelliConvNet). The SPO component ensures optimal representation of multimodal content by improving feature selection and parameter tuning through stochastic search algorithms modelled after the dynamics of abstract paintings. The IntelliConvNet combines and interprets the modalities, allowing efficient personalization that is consistent with user preferences. To develop personalized content, user preference data is collected, including interactions such as video views and comments. The model employs natural language processing (NLP), audio processing, and computer vision to merge the text, image, and audio modalities. Pre-processing comprises tokenization for text, Canny edge detection for images, and Wiener filtering for audio, preparing each modality for analysis and feature extraction. Principal component analysis (PCA) then reduces the dimensionality of the features from all three modalities while preserving essential information. The proposed approach achieved superior personalized content development, leading to increased user satisfaction and engagement.
The performance of the proposed method was evaluated using BLEU-4 (0.55), ROUGE-L (0.79), METEOR (0.72), and CIDEr (0.80). The system's ability to incorporate multimodal data resulted in more precise video customization, as demonstrated by interaction metrics and user comments. The multimodal generation model thus provides an advanced solution for creating personalized short video content, improving the user experience with highly tailored content.
DOI: https://doi.org/10.31449/inf.v49i21.9838
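The text tokenization and PCA reduction steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regular-expression tokenizer, the random feature matrices, their sizes, the choice to concatenate the three modalities before reduction, and the helper names `tokenize` and `pca_reduce` are all assumptions made for illustration. Canny edge detection and Wiener filtering are omitted here; library routines such as `cv2.Canny` and `scipy.signal.wiener` could supply them.

```python
import re
import numpy as np


def tokenize(text):
    """Lowercase word-level tokenization (a simple stand-in, not the paper's tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components via SVD."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return reduced, explained[:n_components]


# Hypothetical per-modality feature matrices: one row per video clip.
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(50, 32))
image_feats = rng.normal(size=(50, 64))
audio_feats = rng.normal(size=(50, 16))

# Assumption: the modalities are concatenated before a single PCA pass.
fused = np.hstack([text_feats, image_feats, audio_feats])
reduced, explained = pca_reduce(fused, n_components=8)

print(tokenize("Great video, loved it!"))
print(fused.shape, reduced.shape)
```

Applying PCA separately to each modality before fusion would be an equally plausible reading of the abstract; the single-pass version is shown only because it is shorter.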
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika







