

Poster

Ultra High-Resolution Image Inpainting with Patch-Based Content Consistency Adapter

JianHui Zhang · Shen Cheng · Qirui Sun · Jia Liu · Wang Luyang · chaoyu feng · Chen Fang · LEI LEI · Jue Wang · Shuaicheng Liu


Abstract:

In this work, we present Patch-Adapter, an effective framework for high-resolution text-guided image inpainting. Unlike existing methods limited to lower resolutions, our approach achieves 4K+ resolution while maintaining precise content consistency and prompt alignment, two critical challenges in image inpainting that intensify with increasing resolution and texture complexity. Patch-Adapter leverages a two-stage adapter architecture to scale the diffusion model's resolution from 1K to 4K+ without requiring structural overhauls: (1) a Dual Context Adapter, which learns coherence between masked and unmasked regions at reduced resolutions to establish global structural consistency, and (2) a Reference Patch Adapter, which implements a patch-level attention mechanism for full-resolution inpainting, preserving local detail fidelity through adaptive feature fusion. This dual-stage architecture addresses the scalability gap in high-resolution inpainting by decoupling global semantics from localized refinement. Experiments demonstrate that Patch-Adapter not only resolves artifacts common in large-scale inpainting but also achieves state-of-the-art performance on the OpenImages and photo-concept-bucket datasets, outperforming existing methods in both perceptual quality and text-prompt adherence. The code will be open-sourced.
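Since the code is not yet released, the following is only a minimal sketch of what a patch-level reference attention layer like the one described above could look like. All names (PatchReferenceAttention), dimensions, and the learned gating used for "adaptive feature fusion" are assumptions for illustration, not the authors' implementation: full-resolution patch tokens act as queries that attend to tokens from the globally inpainted low-resolution reference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PatchReferenceAttention(nn.Module):
    """Illustrative patch-level cross-attention (assumption, not the paper's code):
    features of one full-resolution patch (queries) attend to features of the
    low-resolution globally inpainted reference (keys/values)."""

    def __init__(self, dim: int = 320, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_kv = nn.Linear(dim, dim * 2, bias=False)
        self.proj = nn.Linear(dim, dim)
        # Hypothetical learnable gate for "adaptive feature fusion":
        # blends attended reference features back into the patch features.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, patch_tokens: torch.Tensor, ref_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, N_patch, C) tokens of one high-resolution patch
        # ref_tokens:   (B, N_ref, C)   tokens of the low-resolution reference
        b, n, c = patch_tokens.shape
        h = self.num_heads
        q = self.to_q(patch_tokens).view(b, n, h, c // h).transpose(1, 2)
        k, v = self.to_kv(ref_tokens).chunk(2, dim=-1)
        k = k.view(b, -1, h, c // h).transpose(1, 2)
        v = v.view(b, -1, h, c // h).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)      # (B, h, N_patch, C/h)
        out = out.transpose(1, 2).reshape(b, n, c)
        # Residual blend controlled by the learned gate.
        return patch_tokens + torch.tanh(self.gate) * self.proj(out)


if __name__ == "__main__":
    # Toy usage: one patch of a 4K latent attends to the 1K reference tokens.
    layer = PatchReferenceAttention(dim=320, num_heads=8)
    patch = torch.randn(1, 1024, 320)   # e.g. a 32x32 token patch
    ref = torch.randn(1, 4096, 320)     # e.g. 64x64 reference tokens
    print(layer(patch, ref).shape)      # torch.Size([1, 1024, 320])
```

In a full pipeline one would tile the 4K latent into overlapping patches, run such a layer inside each denoising block per patch, and blend overlapping regions; those steps are omitted here.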
