Image-to-image translation
https://arxiv.org/abs/1512.04150 https://arxiv.org/abs/2104.01538
Multimodal Unsupervised Image-to-Image Translation
Purpose
While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. (Abstract)
Method
Functional summary: Given an input image and a style variable $z$ sampled from a Gaussian distribution, the model produces a target-domain image whose content is determined by the input image and whose style, drawn from the target domain's style space, is determined by $z$. Different values of $z$ yield different output images in the target domain. However, the method is still limited to translation between a single pair of domains. Supporting multiple domains would require a conditional generator, i.e. one conditioned on the target domain (some later work may address this).
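The functional summary can be illustrated with a toy sketch. This is not the MUNIT architecture: the fixed random linear maps below are hypothetical stand-ins for the trained content encoder and target-domain decoder, and the names `encode_content`, `decode`, and `translate` are assumptions made for illustration. It only shows the data flow: one content code from the input image, combined with different sampled $z$, gives different outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; real models work on image tensors, not flat vectors.
CONTENT_DIM, STYLE_DIM, IMAGE_DIM = 16, 8, 32

# Hypothetical stand-ins for trained networks: fixed random linear maps.
W_content = rng.normal(size=(CONTENT_DIM, IMAGE_DIM)) / np.sqrt(IMAGE_DIM)
W_scale = rng.normal(size=(CONTENT_DIM, STYLE_DIM))
W_shift = rng.normal(size=(CONTENT_DIM, STYLE_DIM))
W_decode = rng.normal(size=(IMAGE_DIM, CONTENT_DIM)) / np.sqrt(CONTENT_DIM)


def encode_content(image):
    """Map a flattened source-domain image to a content code."""
    return np.tanh(W_content @ image)


def decode(content, z):
    """Modulate the content code with style code z, then decode to the target domain."""
    scale = 1.0 + 0.1 * (W_scale @ z)   # style-dependent per-feature scale
    shift = 0.1 * (W_shift @ z)         # style-dependent per-feature shift
    return np.tanh(W_decode @ (scale * content + shift))


def translate(image, z):
    """Source image + sampled style code -> target-domain image."""
    return decode(encode_content(image), z)


source_image = rng.normal(size=IMAGE_DIM)

# Two style codes sampled from a standard Gaussian prior.
z1 = rng.standard_normal(STYLE_DIM)
z2 = rng.standard_normal(STYLE_DIM)

out1 = translate(source_image, z1)
out2 = translate(source_image, z2)

print("same content, different style -> outputs differ:",
      not np.allclose(out1, out2))
```

The scale-and-shift modulation is only a simple way to inject $z$ into the decoder in this sketch. The single `W_decode` also reflects the limitation noted above: one decoder serves one target domain, so handling several domains would need either one decoder per domain or an extra domain-conditioning input.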
Network Structure