Query is GAN: Scene Retrieval with Attentional Text-to-image Generative Adversarial Network

Yanagi, Rintaro; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki

doi:10.1109/ACCESS.2019.2947409


This item is licensed under: Creative Commons Attribution 4.0 International

Files in This Item:	ACCESS.2019.2947409.pdf (PDF, 2.42 MB)

Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/76167

Title:	Query is GAN: Scene Retrieval with Attentional Text-to-image Generative Adversarial Network
Authors:	Yanagi, Rintaro
	Togo, Ren
	Ogawa, Takahiro
	Haseyama, Miki
Keywords:	Scene retrieval
	deep learning
	generative adversarial network
	text-to-image translation
Issue Date:	14-Oct-2019
Publisher:	IEEE
Journal Title:	IEEE Access
Volume:	7
Start Page:	153183
End Page:	153193
Publisher DOI:	10.1109/ACCESS.2019.2947409
Abstract:	Scene retrieval from input descriptions has been one of the most important applications with the increasing number of videos on the Web. However, this is still a challenging task since semantic gaps between features of texts and videos exist. In this paper, we try to solve this problem by utilizing a text-to-image Generative Adversarial Network (GAN), which has become one of the most attractive research topics in recent years. The text-to-image GAN is a deep learning model that can generate images from their corresponding descriptions. We propose a new retrieval framework, "Query is GAN", based on the text-to-image GAN that drastically improves scene retrieval performance by simple procedures. Our novel idea makes use of images generated by the text-to-image GAN as queries for the scene retrieval task. In addition, unlike many studies on text-to-image GANs that mainly focused on the generation of high-quality images, we reveal that the generated images have reasonable visual features suitable for the queries even though they are not visually pleasant. We show the effectiveness of the proposed framework through experimental evaluation in which scene retrieval is performed from real video datasets.
Rights:	https://creativecommons.org/licenses/by/4.0/
Type:	article
URI:	http://hdl.handle.net/2115/76167
Appears in Collections:	Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc
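
The retrieval idea summarized in the abstract (a text description is turned into a generated image, and that image serves as the query against video scenes) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the feature vectors are made-up numbers standing in for visual features of a generated image and of scene frames, the function names are hypothetical, and cosine similarity is simply one common choice of similarity measure, since the abstract does not name one.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_scenes(
    query_feature: Sequence[float],
    scene_features: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank scenes by similarity to the query feature, best match first."""
    scored = [
        (scene_id, cosine_similarity(query_feature, feature))
        for scene_id, feature in scene_features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature of an image generated from a text description.
# In a real system this would come from a text-to-image GAN followed by
# a visual feature extractor; both are outside this sketch.
query = [0.9, 0.1, 0.0]

# Hypothetical visual features of candidate video scenes.
scenes = {
    "scene_a": [1.0, 0.0, 0.0],
    "scene_b": [0.0, 1.0, 0.0],
    "scene_c": [0.5, 0.5, 0.0],
}

ranking = rank_scenes(query, scenes)
print([scene_id for scene_id, _ in ranking])  # ['scene_a', 'scene_c', 'scene_b']
```

Because query and candidates are compared in the same visual feature space, the generated image only needs to carry usable visual features; it does not need to look realistic, which matches the abstract's remark about images that are "not visually pleasant".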

Submitter: 藤後廉
