Dual Interactive Wasserstein GAN fostered Offset based In-Loop Filtering for HEVC

Vanishree Moji, Bharathi Gururaj, Mathivanan Murugavelu

Abstract


Deep learning is flexible and effective enough that many attempts have been made to replace components of video codecs, such as High Efficiency Video Coding (HEVC), with deep learning-based alternatives. This article proposes a Dual Interactive Wasserstein Generative Adversarial Network fostered Offset-Based In-Loop Filtering scheme for HEVC (DiWaGAN-OBILF-HEVC). Earlier deep learning-based in-loop filtering techniques only restored distorted frames; the DiWaGAN-OBILF-HEVC system instead transmits offsets as side information, which makes prediction filtering considerably more accurate while using a lower bit rate. The DiWaGAN-based sample adaptive offset filter operates in two phases: 1) classification of the error according to neighbouring image intensity values, and 2) calculation of the optimal offsets for each error category. The DiWaGAN draws inspiration from the Edge Offset (EO) type of the Sample Adaptive Offset (SAO) filter in HEVC: one network classifies the error according to the edge shape of the reconstructed signal, while the other network simultaneously predicts the optimal offset values. These offsets are fed to the decoder to improve its accuracy. The DiWaGAN-OBILF-HEVC method is implemented in PyTorch, and its efficacy is analysed with performance metrics including visual quality, PSNR, Bjontegaard rate difference (BD-rate), and average execution time. The proposed DiWaGAN-OBILF-HEVC method achieves a 16.25% higher PSNR, a 27.47% lower BD-rate, and a 15.993% lower execution time when compared with existing methods, namely efficient in-loop filtering based on enhanced deep convolutional neural networks for HEVC, offset-based in-loop filtering with a deep network in HEVC, and a deep CNN for VVC in-loop filtering, respectively.
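The SAO Edge Offset mechanism that inspires the two networks can be illustrated with the standard HEVC rule: each pixel is compared with its two neighbours along a chosen direction, assigned one of five edge categories (valley, two edge types, peak, or none), and a per-category signed offset is then added. The sketch below shows this classical, hand-crafted classification and offset step only; the function names and array layout are illustrative assumptions, not the authors' network, which learns both the classification and the offsets.

```python
import numpy as np

def sao_eo_classify(rec, direction=(0, 1)):
    """Classify each interior pixel of a reconstructed block into one of the
    five HEVC SAO Edge Offset categories along `direction` (dy, dx).
    Categories: 1 = local valley, 2/3 = concave/convex edge corner,
    4 = local peak, 0 = none (no offset applied)."""
    dy, dx = direction
    h, w = rec.shape
    p = rec[1:h - 1, 1:w - 1].astype(np.int32)                       # current pixels
    a = rec[1 - dy:h - 1 - dy, 1 - dx:w - 1 - dx].astype(np.int32)   # neighbour before
    b = rec[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].astype(np.int32)   # neighbour after
    edge_idx = 2 + np.sign(p - a) + np.sign(p - b)                   # values 0..4
    # HEVC mapping: idx 0 -> cat 1, 1 -> cat 2, 2 -> cat 0, 3 -> cat 3, 4 -> cat 4
    lut = np.array([1, 2, 0, 3, 4])
    return lut[edge_idx]

def sao_eo_apply(rec, offsets, direction=(0, 1)):
    """Add the signalled per-category offset to each classified pixel.
    `offsets` maps category 1..4 to a signed offset; category 0 is untouched."""
    cat = sao_eo_classify(rec, direction)
    out = rec.astype(np.int32).copy()
    for c, off in offsets.items():
        out[1:-1, 1:-1][cat == c] += off
    return np.clip(out, 0, 255).astype(np.uint8)
```

For example, an isolated dark pixel surrounded by brighter horizontal neighbours is classified as a valley (category 1) and nudged upward by that category's offset; in DiWaGAN-OBILF-HEVC these fixed rules are replaced by one network that classifies the edge shape and a second that predicts the offsets transmitted as side information.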



DOI: https://doi.org/10.31449/inf.v49i33.8260

This work is licensed under a Creative Commons Attribution 3.0 License.