Image-to-Image Translation
https://arxiv.org/abs/1512.04150 https://arxiv.org/abs/2104.01538
Multimodal Unsupervised Image-to-Image Translation
Purpose
While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. (Abstract)
Method


Loss function:
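A sketch of MUNIT's full objective, reconstructed in the paper's notation (content/style encoders E_i^c, E_i^s, decoders G_i, discriminators D_i; the lambda terms are loss-weight hyperparameters). Training combines adversarial losses with image- and latent-reconstruction losses:

\min_{E_1,E_2,G_1,G_2}\;\max_{D_1,D_2}\;
  \mathcal{L}_{\text{GAN}}^{x_1} + \mathcal{L}_{\text{GAN}}^{x_2}
  + \lambda_x \big(\mathcal{L}_{\text{recon}}^{x_1} + \mathcal{L}_{\text{recon}}^{x_2}\big)
  + \lambda_c \big(\mathcal{L}_{\text{recon}}^{c_1} + \mathcal{L}_{\text{recon}}^{c_2}\big)
  + \lambda_s \big(\mathcal{L}_{\text{recon}}^{s_1} + \mathcal{L}_{\text{recon}}^{s_2}\big)

where, for example,

\mathcal{L}_{\text{recon}}^{x_1} = \mathbb{E}_{x_1}\big[\lVert G_1(E_1^c(x_1), E_1^s(x_1)) - x_1 \rVert_1\big]

\mathcal{L}_{\text{recon}}^{c_1} = \mathbb{E}_{c_1, s_2}\big[\lVert E_2^c(G_2(c_1, s_2)) - c_1 \rVert_1\big], \quad
\mathcal{L}_{\text{recon}}^{s_2} = \mathbb{E}_{c_1, s_2}\big[\lVert E_2^s(G_2(c_1, s_2)) - s_2 \rVert_1\big]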

Functional summary: Given an input image and a style variable sampled from a Gaussian distribution, the model produces a target-domain image whose content depends on the input image and whose style depends on the style variable. Different style samples therefore give different output images in the target domain. However, this method is still limited to translation between a single pair of domains; if multiple domains are desired, conditional generation would be needed (other work may address this problem).
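A minimal PyTorch-style sketch of this translation step. The module definitions here are illustrative placeholders, not MUNIT's actual networks (MUNIT injects the style code through AdaIN parameters in the decoder); the point is only that the content code comes from the input image while the style code is drawn from N(0, I), so re-sampling the style yields diverse translations.

import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    # Extracts a domain-invariant content code from an image (toy stand-in).
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=1, padding=3), nn.ReLU(),
            nn.Conv2d(64, 256, 4, stride=2, padding=1), nn.ReLU())

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    # Combines a content code with a style vector to produce a target-domain image.
    # Here the style simply rescales the content features; MUNIT uses AdaIN instead.
    def __init__(self, style_dim=8):
        super().__init__()
        self.style_to_scale = nn.Linear(style_dim, 256)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(256, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 7, stride=1, padding=3), nn.Tanh())

    def forward(self, content, style):
        scale = self.style_to_scale(style).unsqueeze(-1).unsqueeze(-1)
        return self.net(content * scale)

enc_c = ContentEncoder()
dec_2 = Decoder(style_dim=8)        # decoder of the target domain
x1 = torch.randn(1, 3, 256, 256)    # input image from the source domain

content = enc_c(x1)                 # content code taken from the input image
for _ in range(3):                  # different style samples -> different outputs
    s2 = torch.randn(1, 8)          # style code sampled from N(0, I)
    x12 = dec_2(content, s2)        # translated image: content from x1, style from s2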
Network Structure

Result

DRIT (Diverse Image-to-Image Translation via Disentangled Representations)
Method


Loss function



Result

UNIT: Unsupervised Image-to-Image Translation Networks [https://arxiv.org/pdf/1703.00848.pdf]
Method


Cross-Modal Recipe Embeddings by Disentangling Recipe Contents and Dish Styles
Method
