
This Artificial Intelligence (AI) Paper From South Korea Proposes


Research on neural fields, which represent signals by mapping coordinates to their quantities (e.g., scalars or vectors) with neural networks, has grown rapidly in recent years. This has sparked interest in using the technique to handle a variety of signals, including audio, images, 3D shapes, and video. The universal approximation theorem and coordinate encoding techniques provide the theoretical foundations for accurate signal representation with neural fields. Recent work has shown their versatility in data compression, generative modeling, signal manipulation, and basic signal representation.
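As a rough illustration of what "coordinate encoding" means, the sketch below maps a scalar time coordinate to sine and cosine features and feeds them to a single linear layer. The function names, the number of frequencies, and the frequency base are hypothetical choices for illustration, not details taken from the paper.

```python
import math

def positional_encoding(t, num_frequencies=4, base=2.0):
    """Encode a scalar coordinate t as sin/cos features.

    num_frequencies and base are illustrative assumptions.
    """
    features = []
    for k in range(num_frequencies):
        freq = (base ** k) * math.pi
        features.append(math.sin(freq * t))
        features.append(math.cos(freq * t))
    return features

def tiny_field(t, weights, bias):
    """A one-layer stand-in for a neural field: a linear map over the encoding.

    A real neural field would stack several nonlinear layers here.
    """
    enc = positional_encoding(t, num_frequencies=len(weights) // 2)
    return sum(w * f for w, f in zip(weights, enc)) + bias

# Query the field at a normalized time coordinate.
value = tiny_field(0.25, weights=[0.1, -0.2, 0.3, 0.4], bias=0.0)
```

The encoding lets a small network represent high-frequency detail that a raw coordinate input would tend to smooth out.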

Figure 1: General structure of (a) pixel-wise video representations, (b) frame-wise video representations, and (c) the proposed flow-guided frame-wise representations (FFNeRV).


In frame-wise representations such as NeRV, each time coordinate is mapped to a video frame generated by a stack of MLP and convolutional layers. Compared with the basic pixel-wise neural field design, this approach considerably cut encoding time and outperformed common video compression techniques. The recently proposed E-NeRV follows this paradigm while further improving video quality. As shown in Figure 1, the authors propose flow-guided frame-wise neural representations for videos (FFNeRV). Drawing inspiration from common video codecs, they embed optical flows into the frame-wise representation to exploit temporal redundancy. FFNeRV generates a video frame by aggregating nearby frames guided by flows, which enforces the reuse of pixels from those frames. Encouraging the network not to memorize the same pixel values repeatedly across frames substantially improves parameter efficiency.
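The idea of reusing pixels from nearby frames can be sketched as backward warping with an optical flow followed by a weighted sum. This is a simplified illustration on small grayscale frames stored as nested lists. The helper names, the border clamping, and the per-frame scalar weights are assumptions, and the paper's actual aggregation scheme may differ.

```python
def bilinear_sample(frame, x, y):
    """Sample a 2D frame (list of rows) at fractional (x, y), clamping to the border."""
    h, w = len(frame), len(frame[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * frame[y0][x0] + ax * frame[y0][x1]
    bottom = (1 - ax) * frame[y1][x0] + ax * frame[y1][x1]
    return (1 - ay) * top + ay * bottom

def warp(frame, flow):
    """Backward warp: output pixel (x, y) is read from (x + dx, y + dy) in the source."""
    h, w = len(frame), len(frame[0])
    return [
        [bilinear_sample(frame, x + flow[y][x][0], y + flow[y][x][1]) for x in range(w)]
        for y in range(h)
    ]

def aggregate(warped_frames, weights):
    """Per-pixel weighted sum of warped neighbor frames (weights assumed to sum to 1)."""
    h, w = len(warped_frames[0]), len(warped_frames[0][0])
    return [
        [sum(wt * f[y][x] for wt, f in zip(weights, warped_frames)) for x in range(w)]
        for y in range(h)
    ]

# A neighbor frame whose content sits one pixel to the right of the target frame.
neighbor = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
flow = [[(1.0, 0.0)] * 3 for _ in range(2)]
warped = warp(neighbor, flow)
```

Because the warped frames already carry most of the pixel content, the network only has to store the flows and whatever the neighbors cannot explain, which is where the parameter savings come from.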


According to experimental results on the UVG dataset, FFNeRV outperforms other frame-wise methods in video compression and frame interpolation. To improve compression performance further, the authors replace the MLP with multi-resolution temporal grids of fixed spatial resolution, which map continuous temporal coordinates to corresponding latent features; this choice is motivated by grid-based neural representations. They also propose a more compact convolutional architecture: inspired by generative models that produce high-quality images and by lightweight neural networks, the flow-guided frame-wise representation uses group and pointwise convolutions. With quantization-aware training and entropy coding, FFNeRV outperforms widely used video codecs (H.264 and HEVC) and performs on par with state-of-the-art video compression algorithms. The code implementation is based on NeRV and is available on GitHub.
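To make the multi-resolution temporal grid idea concrete, the sketch below looks up a continuous time coordinate in several 1D grids of different temporal resolution, interpolates linearly within each grid, and concatenates the results. For brevity each grid entry is a plain feature vector, whereas the article describes grids that also have a fixed spatial resolution. The function names and grid sizes are hypothetical.

```python
def interpolate_grid(grid, t):
    """Linearly interpolate a 1D grid of feature vectors at normalized time t in [0, 1]."""
    n = len(grid)
    if n == 1:
        return list(grid[0])
    pos = min(max(t, 0.0), 1.0) * (n - 1)
    i0 = min(int(pos), n - 2)  # keep i0 + 1 inside the grid
    frac = pos - i0
    return [(1 - frac) * a + frac * b for a, b in zip(grid[i0], grid[i0 + 1])]

def temporal_features(grids, t):
    """Concatenate interpolated features from grids of different temporal resolutions."""
    out = []
    for grid in grids:
        out.extend(interpolate_grid(grid, t))
    return out

# Two learnable grids (values here are placeholders): a coarse one and a finer one.
coarse = [[0.0], [1.0]]
fine = [[0.0], [2.0], [4.0]]
latent = temporal_features([coarse, fine], 0.5)
```

Coarse grids capture slowly varying content while finer grids add short-term detail, and a lookup with interpolation replaces the MLP that would otherwise map time to a latent.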





Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.



