Poster
Addressing Attribute Leakage in Text Embeddings for Image Editing with Diffusion Models
Sunung Mun · Jinhwan Nam · Sunghyun Cho · Jungseul Ok
Text-guided image editing with diffusion models enables flexible modifications, but editing multiple objects remains challenging due to unintended attribute interference, where edits affect non-target regions or mix attributes within target regions. We identify the End-of-Sequence (EOS) token embeddings as a key factor in this issue: they introduce global semantics that disrupt intended modifications. To address this, we propose Attribute-LEakage-free Editing (ALE-Edit), an approach that is both effective, because it directly addresses EOS-induced interference, and efficient, because it requires no additional fine-tuning. ALE-Edit consists of three components: (1) Object-Restricted Embedding (ORE) to localize attributes, (2) Region-Guided Blending for Cross-Attention Masking (RGB-CAM) to align attention with target regions, and (3) Background Blending (BB) to preserve structural consistency. Additionally, we introduce ALE-Bench, a benchmark that quantifies target-external and target-internal interference. Experiments show that ALE-Edit reduces unintended changes while maintaining high-quality edits, outperforming existing tuning-free methods. Our approach provides a scalable and computationally efficient solution for multi-object image editing.
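To make the three components concrete, below is a minimal, illustrative PyTorch sketch of the ideas the abstract names: overwriting EOS/padding embeddings so they cannot inject global semantics (ORE), restricting each object's cross-attention influence to its own spatial region (RGB-CAM), and filling the remaining area from a background embedding (BB). All function names, tensor shapes, and helpers here are assumptions made for illustration; this is not the authors' released implementation.

```python
# Illustrative sketch only; shapes and helper names are assumptions,
# not the ALE-Edit codebase.
import torch


def object_restricted_embedding(tok_emb: torch.Tensor,
                                last_content_idx: int,
                                null_emb: torch.Tensor) -> torch.Tensor:
    """ORE sketch: keep an object's content-token embeddings and overwrite
    the trailing EOS/padding positions, which carry global semantics and
    can leak attributes across objects, with a neutral embedding."""
    out = tok_emb.clone()                      # (seq_len, dim)
    out[last_content_idx + 1:] = null_emb      # suppress EOS-induced leakage
    return out


def cross_attention(q: torch.Tensor, k: torch.Tensor,
                    v: torch.Tensor) -> torch.Tensor:
    """Plain single-head cross-attention between image queries (pixels, dim)
    and text-token keys/values (seq_len, dim)."""
    scale = q.shape[-1] ** -0.5
    attn = torch.softmax(q @ k.transpose(-1, -2) * scale, dim=-1)
    return attn @ v


def region_blended_attention(q, obj_embs, region_masks, bg_emb):
    """RGB-CAM + BB sketch: each object's text embedding only influences its
    own binary region mask; the background embedding covers the rest."""
    union = torch.zeros(q.shape[0], 1)         # union of object regions
    out = torch.zeros_like(q)
    for emb, mask in zip(obj_embs, region_masks):
        out = out + mask * cross_attention(q, emb, emb)
        union = torch.clamp(union + mask, max=1.0)
    # BB: outside every object region, attend to the background embedding
    return out + (1.0 - union) * cross_attention(q, bg_emb, bg_emb)


# Toy usage with random tensors standing in for a UNet's cross-attention inputs.
dim, seq_len, num_pixels = 64, 77, 16 * 16
null_emb = torch.zeros(dim)
emb_a = object_restricted_embedding(torch.randn(seq_len, dim), 5, null_emb)
emb_b = object_restricted_embedding(torch.randn(seq_len, dim), 7, null_emb)
masks = [torch.zeros(num_pixels, 1), torch.zeros(num_pixels, 1)]
masks[0][:100], masks[1][100:180] = 1.0, 1.0   # disjoint object regions
q, bg = torch.randn(num_pixels, dim), torch.randn(seq_len, dim)
edited = region_blended_attention(q, [emb_a, emb_b], masks, bg)
```

The key design point the sketch mirrors is that the two sources of interference are handled separately: ORE removes leakage carried inside each embedding, while the region-masked blending prevents any embedding from acting outside its assigned area.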