Semantically Aware Style-Controlled Animation Line Art Colorization Using Conditional GANs with GCN and Attention Mechanisms

Chen Li

Abstract


In traditional animation production, coloring line art is a labor-intensive and time-consuming process. In this study, image processing techniques based on generative adversarial networks (GANs) were investigated to develop an algorithm for the automatic coloring of animation line drawings. The objective was to improve production efficiency, reduce the workload of artists, and generate outputs that are natural in appearance, consistent in style, and free of color overflow across region boundaries. The proposed method was implemented within the conditional GAN framework. The generator adopted a U-Net architecture with skip connections, allowing both fine-grained details and global structural features of the line art to be captured. A spectrally normalized discriminator was used to evaluate the realism of local image regions. To improve semantic accuracy and color coherence, an attention mechanism was incorporated, enabling the model to focus on key semantic areas and learn dependencies between color regions. End-to-end training was conducted on a large-scale paired dataset of line art and corresponding colored images. A multi-task learning strategy combining perceptual loss, L1 loss, and adversarial loss was employed for optimization. Latent space interpolation was further introduced to allow limited user adjustment of color styles. Experimental results indicated that the algorithm achieved a PSNR of 30.2 dB and an SSIM of 0.94, improvements of 2.2 dB and 0.04, respectively, over the spatially-adaptive denormalization (SPADE) baseline. The FID and boundary overflow rate were reduced to 18.3 and 2.1%, also clear improvements over the baseline. With the inclusion of a structural consistency loss, graph convolutional networks, and self-attention mechanisms, the method maintained accurate boundary preservation, achieved cross-region color consistency, and supported adaptive style rendering. In summary, the algorithm addressed key limitations of existing approaches, such as color overflow, semantic inconsistency, and rigid stylization. These findings demonstrate improvements in automatic coloring quality and suggest the potential of GAN-based techniques in animation production.
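
The abstract names the loss terms but not their weighting; a plausible form of the multi-task objective, with the trade-off coefficients \lambda_1, \lambda_p, and \lambda_s treated as assumptions rather than reported values, is

\mathcal{L}_G = \mathcal{L}_{\mathrm{adv}}(G, D) + \lambda_1 \lVert y - G(x) \rVert_1 + \lambda_p \, \mathcal{L}_{\mathrm{perc}}(G(x), y) + \lambda_s \, \mathcal{L}_{\mathrm{struct}}(G(x), y)

where x is the input line art, y is the ground-truth colored image, \mathcal{L}_{\mathrm{perc}} compares feature maps from a pretrained network, and \mathcal{L}_{\mathrm{struct}} denotes the structural consistency loss mentioned above.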
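
Two of the components named in the abstract, the spectrally normalized patch discriminator and the self-attention block, can be sketched concretely. The following PyTorch fragment is a minimal illustration only; module names, channel widths, and layer counts are assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class SelfAttention(nn.Module):
    # SAGAN-style self-attention over spatial positions, letting distant
    # color regions inform each other (hypothetical configuration).
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw)
        v = self.value(x).flatten(2)                   # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection

class PatchDiscriminator(nn.Module):
    # Patch-level critic: spectral norm on every conv stabilizes adversarial
    # training; the output is a map of per-patch realism scores.
    def __init__(self, in_ch=4):  # line art (1 ch) + color image (3 ch)
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(
                spectral_norm(nn.Conv2d(cin, cout, 4, stride=2, padding=1)),
                nn.LeakyReLU(0.2, inplace=True))
        self.net = nn.Sequential(
            block(in_ch, 64), block(64, 128), block(128, 256),
            SelfAttention(256),
            spectral_norm(nn.Conv2d(256, 1, 4, stride=1, padding=1)))

    def forward(self, line_art, image):
        return self.net(torch.cat([line_art, image], dim=1))

Conditioning the discriminator on the line art as well as the colored image, in the pix2pix manner, is what makes the setup a conditional GAN rather than an unconditional one.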
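
The style adjustment described in the abstract reduces to blending two latent style codes before they condition the generator. A hedged sketch follows; the generator(line_art, z) interface and the function name are hypothetical.

def interpolate_style(generator, line_art, z_a, z_b, t=0.5):
    # t = 0 reproduces style A, t = 1 reproduces style B; intermediate
    # values of t blend the two styles.
    z = (1.0 - t) * z_a + t * z_b
    return generator(line_art, z)

Sweeping t over [0, 1] yields a family of colorings between the two reference styles, which is the limited user control the abstract refers to.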



DOI: https://doi.org/10.31449/inf.v49i29.10563

This work is licensed under a Creative Commons Attribution 3.0 License.