Disentangled Representation Learning in Real-World Image Datasets via Image Segmentation Prior

Nakagawa, Nao; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki

doi:10.1109/ACCESS.2021.3101229


Hokkaido University \| Library \| HUSCAP	Advanced Search		言語

	Home
	About HUSCAP
	Open Access Policy

	Browse by Author

Browse
	Communities & Collections

	Scholarly Journals
	Theses
	Doctoral Dissertations Listed by Graduate Schools
	Conference Procs.
	Events

	HUSCAP Senior (in Japanese)

	Societies

	Downloads (country)

For university staff
	How to post your papers to HUSCAP
	Publication of theses
	Helpline about theses publication

Open Archives Compliant

You can search our collection also at:
	Google
	Google Scholar
	CiNii
	IRDB
	OAIster
	NDLTD

Hokkaido University Collection of Scholarly and Academic Papers >
Graduate School of Information Science and Technology / Faculty of Information Science and Technology >
Peer-reviewed Journal Articles, etc >

Disentangled Representation Learning in Real-World Image Datasets via Image Segmentation Prior

Files in This Item:

The file(s) associated with this item can be obtained from the following URL: https://doi.org/10.1109/ACCESS.2021.3101229

Title:	Disentangled Representation Learning in Real-World Image Datasets via Image Segmentation Prior
Authors:	Nakagawa, Nao Browse this author
	Togo, Ren Browse this author
	Ogawa, Takahiro Browse this author →KAKEN DB
	Haseyama, Miki Browse this author →KAKEN DB
Keywords:	Image segmentation
	Standards
	Task analysis
	Decoding
	Data models
	Solid modeling
	Semantics
	Alpha blend
	disentanglement
	image segmentation
	real-world image
	representation learning
Issue Date:	30-Jul-2021
Publisher:	IEEE (Institute of Electrical and Electronics Engineers)
Journal Title:	IEEE Access
Volume:	9
Start Page:	110880
End Page:	110888
Publisher DOI:	10.1109/ACCESS.2021.3101229
Abstract:	We propose a novel method that can learn easy-to-interpret latent representations in real-world image datasets using a VAE-based model by splitting an image into several disjoint regions. Our method performs object-wise disentanglement by exploiting image segmentation and alpha compositing. With remarkable results obtained by unsupervised disentanglement methods for toy datasets, recent studies have tackled challenging disentanglement for real-world image datasets. However, these methods involve deviations from the standard VAE architecture, which has favorable disentanglement properties. Thus, for disentanglement in images of real-world image datasets with preservation of the VAE backbone, we designed an encoder and a decoder that embed an image into disjoint sets of latent variables corresponding to objects. The encoder includes a pre-trained image segmentation network, which allows our model to focus only on representation learning while adopting image segmentation as an inductive bias. Evaluations using real-world image datasets, CelebA and Stanford Cars, showed that our method achieves improved disentanglement and transferability.
Rights:	© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Type:	article
URI:	http://hdl.handle.net/2115/82697
Appears in Collections:	情報科学院・情報科学研究院 (Graduate School of Information Science and Technology / Faculty of Information Science and Technology) > 雑誌発表論文等 (Peer-reviewed Journal Articles, etc)

OAI-PMH ( junii2 , jpcoar_1.0 )

- Hokkaido University