Poster
Dataset Distillation as Data Compression: A Rate-Utility Perspective
Youneng Bao · Yiping Liu · Zhuo Chen · Yongsheng Liang · Mu Li · Kede Ma
Abstract:
The "scale-is-everything" paradigm in machine learning has led to escalating computational and storage demands as datasets and models grow ever larger. Dataset distillation addresses this challenge by compressing a dataset into a compact latent representation that generates synthetic data capable of matching the performance of models trained on the original data, which we formulate as a rate-utility optimization problem. Existing dataset distillation methods fall short of Pareto optimality because they cannot jointly optimize compression rate and utility within a differentiable framework. Drawing inspiration from learned image compression (LIC), we propose a unified framework in which latent representations are modeled as optimizable parameter grids (codes) together with a generator (decoder) that transforms the codes into synthesized images. This formulation subsumes nearly all existing latent representations while explicitly modeling the rate as an optimizable term through precise entropy estimation of the latents. To quantify compression efficiency, we introduce bits per class (BPC), a novel metric for distilled datasets. Optimizing the unified latent representation under the joint rate-utility trade-off, we achieve state-of-the-art results on CIFAR-10/100 and ImageNet-128. For instance, on the ImageNet-Subset dataset, our method achieves a 170$\times$ improvement in compression rate over the baseline approach while maintaining comparable utility. The framework is compatible with most existing distillation algorithms and serves as a plug-in component that enhances rate-utility performance without further modification.
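To make the rate-utility formulation concrete, the following is a minimal numerical sketch, not the authors' implementation: it estimates the rate of latent codes under a factorized Gaussian entropy model (a simple stand-in for the learned entropy model the abstract refers to), computes the proposed bits-per-class (BPC) metric, and combines utility and rate into a single Lagrangian-weighted loss. The function names, the Gaussian assumption, and the trade-off weight `lam` are illustrative choices, not details from the paper.

```python
import numpy as np

def gaussian_rate_bits(z, mu=0.0, sigma=1.0):
    # Differentiable rate estimate: total negative log2-likelihood of the
    # latent codes under a factorized Gaussian (illustrative entropy model).
    log_pdf = -0.5 * np.log(2 * np.pi * sigma**2) - (z - mu) ** 2 / (2 * sigma**2)
    return -log_pdf.sum() / np.log(2)  # convert nats to bits

def bits_per_class(total_bits, num_classes):
    # BPC: compression cost of the distilled dataset amortized per class.
    return total_bits / num_classes

def rate_utility_loss(utility_loss, z, lam=0.01):
    # Joint objective: utility term (e.g., the matching loss of the host
    # distillation algorithm) plus a Lagrangian-weighted rate term.
    rate_bits = gaussian_rate_bits(z)
    return utility_loss + lam * rate_bits, rate_bits

rng = np.random.default_rng(0)
z = rng.standard_normal((10, 64))  # 10 classes x 64-dim latent codes (toy sizes)
loss, rate = rate_utility_loss(utility_loss=2.5, z=z)
print(f"rate = {rate:.1f} bits, BPC = {bits_per_class(rate, 10):.1f}")
```

In the full framework both the codes and the entropy model parameters would be updated by gradient descent on this joint loss, tracing out a rate-utility curve as `lam` varies.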