Image to image translation

September 21, 2023

https://arxiv.org/abs/1512.04150 https://arxiv.org/abs/2104.01538

Multimodal Unsupervised Image-to-Image Translation

Purpose

While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. (Abstract)

Method

MUNIT Method

Method overview

Loss function: Total Loss
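
For reference, a sketch of the total objective as I recall it from the MUNIT paper, written in the paper's notation rather than anything defined in these notes: $E_i$, $G_i$, $D_i$ are the encoder, decoder and discriminator of domain $i$, and $\lambda_x$, $\lambda_c$, $\lambda_s$ weight the image, content and style reconstruction terms. Check it against the paper before relying on it.

```latex
\min_{E_1, E_2, G_1, G_2} \; \max_{D_1, D_2} \;
\mathcal{L}(E_1, E_2, G_1, G_2, D_1, D_2)
  = \mathcal{L}_{\mathrm{GAN}}^{x_1} + \mathcal{L}_{\mathrm{GAN}}^{x_2}
  + \lambda_x \left( \mathcal{L}_{\mathrm{recon}}^{x_1} + \mathcal{L}_{\mathrm{recon}}^{x_2} \right)
  + \lambda_c \left( \mathcal{L}_{\mathrm{recon}}^{c_1} + \mathcal{L}_{\mathrm{recon}}^{c_2} \right)
  + \lambda_s \left( \mathcal{L}_{\mathrm{recon}}^{s_1} + \mathcal{L}_{\mathrm{recon}}^{s_2} \right)
```

The adversarial terms push translated images toward the target-domain distribution, while the three reconstruction terms require that images, content codes and style codes survive an encode/decode round trip.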

Functional summary: Given an input image and a style variable z sampled from a Gaussian distribution, the model produces a target-domain image whose content depends on the input image and whose style depends on z. Different values of z give different output images in the target domain. However, this method is still limited to translation between a single pair of domains. Handling multiple domains would require conditional generation (other work may address this problem).
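
The summary above can be illustrated with a toy NumPy sketch. This is not the MUNIT implementation: the weights are random rather than trained, the "networks" are single 1x1 linear maps, and the names `encode_content`, `adain`, `decode`, `translate`, `STYLE_DIM` and `CHANNELS` are hypothetical. It only shows the data flow, in which a content code comes from the input image, a style code z comes from a standard Gaussian, and the decoder injects the style through adaptive instance normalisation (AdaIN).

```python
import numpy as np

rng = np.random.default_rng(0)

STYLE_DIM = 8   # illustrative size of the style code z
CHANNELS = 4    # hypothetical number of content feature channels

# Hypothetical random weights standing in for trained networks.
W_content = rng.normal(size=(CHANNELS, 3))            # RGB -> content channels
W_style = rng.normal(size=(2 * CHANNELS, STYLE_DIM))  # z -> AdaIN (gamma, beta)
W_out = rng.normal(size=(3, CHANNELS))                # content channels -> RGB


def encode_content(image):
    """Map an image of shape (3, H, W) to a content code of shape (CHANNELS, H, W)."""
    return np.einsum("cd,dhw->chw", W_content, image)


def adain(content, gamma, beta, eps=1e-5):
    """Normalise each channel over space, then scale by gamma and shift by beta."""
    mean = content.mean(axis=(1, 2), keepdims=True)
    std = content.std(axis=(1, 2), keepdims=True)
    normalised = (content - mean) / (std + eps)
    return gamma[:, None, None] * normalised + beta[:, None, None]


def decode(content, style):
    """Combine a content code with a style code into a (3, H, W) image in [-1, 1]."""
    params = W_style @ style
    gamma, beta = params[:CHANNELS], params[CHANNELS:]
    features = adain(content, gamma, beta)
    return np.tanh(np.einsum("dc,chw->dhw", W_out, features))


def translate(image, style=None):
    """Translate an image; sample z from N(0, I) when no style code is given."""
    if style is None:
        style = rng.standard_normal(STYLE_DIM)
    return decode(encode_content(image), style)


# Same source image, two independently sampled style codes.
source_image = rng.uniform(-1.0, 1.0, size=(3, 16, 16))
output_a = translate(source_image)
output_b = translate(source_image)
```

Calling `translate` twice on the same image gives two different outputs because only z changes, while passing the same `style` array twice reproduces the same output. That is the one-to-many behaviour described in the summary.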

Network Structure

Result

DRIT

Method

Loss function

Result

[https://arxiv.org/pdf/1703.00848.pdf]

Method

Cross-Modal Recipe Embeddings by Disentangling Recipe Contents and Dish Styles

Method

ACME