Rescaling Intermediate Features Makes Trained Consistency Models Perform Better
- Junyi Zhu,
- Zinan Lin,
- Enshu Liu,
- Xuefei Ning,
- Matthew B. Blaschko
ICLR (TinyPapers) 2024
In the domain of deep generative models, diffusion models are renowned for their high-quality image generation but are constrained by intensive computational demands. To mitigate this, consistency models have been proposed as a computationally efficient alternative. Our research reveals that post-training rescaling of internal features can enhance the one-step sample quality of these models without incurring detectable computational overhead. This optimization is evidenced by a clear improvement in Fréchet Inception Distance (FID): for example, with our rescaled consistency distillation (CD) model, FID decreases from 6.2 to 5.2 on ImageNet and from 10.9 to 9.5 on LSUN-Cat. Closer inspection of the generated images suggests that this enhancement may originate from improved visual detail and clarity.
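To make the idea concrete, the sketch below illustrates one way post-training feature rescaling could be applied at inference time, using PyTorch forward hooks to multiply the outputs of selected sub-modules by a scalar. This is a minimal, hedged illustration, not the paper's implementation: the model interface, layer names, and scale value are assumptions.

```python
# Hypothetical sketch: rescale intermediate features of a trained consistency
# model at inference time via forward hooks. The model class, layer names, and
# the scale factor shown in the usage example are illustrative assumptions.
import torch
import torch.nn as nn


def add_rescale_hooks(model: nn.Module, layer_names, scale: float):
    """Multiply the outputs of selected sub-modules by `scale`, post-training."""
    handles = []
    for name, module in model.named_modules():
        if name in layer_names:
            # The hook rescales the module's output without changing any weights.
            handles.append(
                module.register_forward_hook(lambda m, inp, out, s=scale: out * s)
            )
    return handles  # keep the handles so the hooks can be removed later


# Example usage (all names hypothetical):
# model = load_pretrained_consistency_model()        # trained CD model
# handles = add_rescale_hooks(model, {"mid_block"}, scale=1.1)
# sample = model(torch.randn(1, 3, 64, 64))          # one-step generation
# for h in handles:
#     h.remove()                                     # restore original behavior
```

Because the hooks only multiply activations by a constant during the forward pass, they add no parameters and essentially no compute, which matches the paper's claim of no detectable overhead.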