Poster
ArtEditor: Learning Customized Instructional Image Editor from Few-Shot Examples
Shijie Huang · Yiren Song · Yuxuan Zhang · Hailong Guo · Xueyin Wang · Jiaming Liu
We introduce ArtEditor, a novel framework for instruction-based image editing that learns unique editing styles from few-shot examples. While image editing has seen significant advancements, customized instructional editing remains underexplored. Existing methods often rely on complex, multi-stage pipelines that are difficult to adapt to specific styles. Additionally, this domain lacks a standardized benchmark, making it difficult to evaluate progress. To address these issues, we design ArtEditor as a two-stage training framework. In the first stage, we train ArtEditor-Base, a general-purpose image editing model, on large-scale datasets to build a strong foundational capability. In the second stage, we fine-tune this model using ArtEditor-LoRA, a lightweight adaptation module, on a small dataset of before-and-after image pairs. This approach enables the model to efficiently learn distinct editing styles and techniques with minimal data. To enhance the performance of a pre-trained Diffusion Transformer (DiT) model, we introduce two key innovations: position encoding cloning and a noise-free conditioning paradigm. These techniques ensure stable and coherent edits, even when adapting to new styles. To support research in this area, we contribute the DoodleArt dataset, the first benchmark specifically designed for customized image editing. DoodleArt features six high-quality artistic styles created by professional artists and designers, providing a valuable resource for evaluating and advancing future work. Extensive experiments demonstrate that ArtEditor achieves superior performance and robustness in customized image editing. Our framework opens new possibilities for artistic creation, offering artists intuitive and flexible tools to bring their visions to life.
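The abstract does not spell out the internals of ArtEditor-LoRA, but the standard LoRA parameterization it builds on can be sketched as follows: a frozen base weight is augmented with a trainable low-rank update, so only a small number of parameters are tuned on the few-shot before-and-after pairs. All names and shapes below are illustrative, not the paper's.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA-style linear layer (illustrative sketch).

    Forward pass: y = x @ W + (alpha / r) * x @ A @ B
    W is the frozen pre-trained weight; only the low-rank
    factors A (d_in x r) and B (r x d_out) would be trained
    on the few-shot editing pairs.
    """

    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                  # frozen base weight (d_in, d_out)
        self.r, self.alpha = r, alpha
        d_in, d_out = W.shape
        self.A = rng.normal(0.0, 0.01, (d_in, r))   # small random down-projection
        self.B = np.zeros((r, d_out))               # zero-init: adapter starts as a no-op

    def forward(self, x):
        return x @ self.W + (self.alpha / self.r) * (x @ self.A @ self.B)

# Before any fine-tuning, the adapted layer reproduces the base layer exactly,
# because B is zero-initialized; training then moves only A and B.
W = np.random.default_rng(1).normal(size=(16, 8))
layer = LoRALinear(W)
x = np.random.default_rng(2).normal(size=(3, 16))
assert np.allclose(layer.forward(x), x @ W)
```

Zero-initializing `B` is the usual LoRA convention: the fine-tuned model starts exactly at the base model's behavior, which matches the paper's goal of adapting a strong general-purpose editor without destabilizing it.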