

Poster

EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model

Shengqi Dang · Yi He · Long Ling · Ziqing Qian · Nanxuan Zhao · Nan Cao


Abstract:

Recent research shows that emotions can enhance users' cognition and influence information communication. While research on visual emotion analysis is extensive, little work has addressed helping users generate emotionally rich image content. Existing work on emotional image generation relies on discrete emotion categories, making it difficult to capture complex and subtle emotional nuances accurately. Additionally, these methods struggle to control the specific content of generated images based on text prompts. In this paper, we introduce the task of continuous emotional image content generation (C-EICG) and present EmotiCrafter, a general emotional image generation model that generates images from free-form text prompts and Valence-Arousal (V-A) values. It leverages a novel emotion-embedding mapping network to fuse V-A values into textual features, enabling the capture of emotions in alignment with the intended input prompts. A novel loss function is also proposed to enhance emotion expression. Experimental results show that our method effectively generates images expressing specific emotions with the desired content and outperforms existing techniques.
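The core conditioning idea described in the abstract — mapping continuous V-A values into the text-embedding space and fusing the result with the prompt features — could be sketched roughly as follows. This is a minimal illustration only: the MLP shape, the fusion-by-addition, and all names are assumptions, not the paper's actual architecture.

```python
import numpy as np

def fuse_va_into_text_embedding(text_emb, valence, arousal, W1, b1, W2, b2):
    """Hypothetical sketch of an emotion-embedding mapping network.

    Maps a (valence, arousal) pair through a small two-layer MLP and
    adds the resulting offset to the prompt's text embedding.  Shapes:
      W1: (hidden, 2),  b1: (hidden,)
      W2: (emb_dim, hidden),  b2: (emb_dim,)
    """
    va = np.array([valence, arousal], dtype=np.float64)  # continuous V-A input
    h = np.tanh(W1 @ va + b1)                            # hidden activation
    emotion_offset = W2 @ h + b2                         # projected into embedding space
    return text_emb + emotion_offset                     # fused conditioning vector

# Toy usage with random weights (dimensions chosen arbitrarily):
rng = np.random.default_rng(0)
emb_dim, hidden = 8, 4
text_emb = rng.normal(size=emb_dim)
fused = fuse_va_into_text_embedding(
    text_emb, valence=0.7, arousal=-0.3,
    W1=rng.normal(size=(hidden, 2)), b1=np.zeros(hidden),
    W2=rng.normal(size=(emb_dim, hidden)), b2=np.zeros(emb_dim),
)
```

In a real text-to-image pipeline the fused vector would replace the plain text embedding fed to the diffusion model, so the same prompt can be rendered at different points of the V-A plane.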
