Cross-Modal Image Retrieval Considering Semantic Relationships With Many-to-Many Correspondence Loss

Files in This Item:

The file(s) associated with this item can be obtained from the following URL: https://doi.org/10.1109/ACCESS.2023.3239858


Title: Cross-Modal Image Retrieval Considering Semantic Relationships With Many-to-Many Correspondence Loss
Authors: Zhang, Huaying
Yanagi, Rintaro
Togo, Ren
Ogawa, Takahiro
Haseyama, Miki
Keywords: Semantics
Image retrieval
Measurement
Training data
Extraterrestrial measurements
Convolutional neural networks
Recurrent neural networks
Cross-modal image retrieval
many-to-many correspondences
multimedia information retrieval
semantic similarity
Issue Date: 25-Jan-2023
Publisher: IEEE (Institute of Electrical and Electronics Engineers)
Journal Title: IEEE Access
Volume: 11
Start Page: 10675
End Page: 10686
Publisher DOI: 10.1109/ACCESS.2023.3239858
Abstract: A cross-modal image retrieval method that explicitly considers semantic relationships between images and texts is proposed. Most conventional cross-modal image retrieval methods retrieve target images by directly measuring the similarities between candidate images and query texts in a common semantic embedding space. However, such methods tend to focus on the one-to-one correspondence of each predefined image-text pair during training, while other semantically similar images and texts are ignored. By considering the many-to-many correspondences between semantically similar images and texts, a common embedding space can be constructed that preserves semantic relationships, allowing users to accurately find more images related to the input query texts. Thus, in this paper, we propose a cross-modal image retrieval method that considers semantic relationships between images and texts. The proposed method calculates the similarities between texts as semantic similarities to acquire these relationships. We then introduce a loss function that explicitly constructs the many-to-many correspondences between semantically similar images and texts from their semantic relationships. We also propose an evaluation metric that assesses whether each method can construct an embedding space that considers the semantic relationships. Experimental results demonstrate that the proposed method outperforms conventional methods in terms of this newly proposed metric.
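
For illustration, the sketch below shows one way a many-to-many correspondence loss of the kind described in the abstract could look. It is not the authors' implementation: text-text cosine similarity in the shared embedding space stands in for the paper's semantic similarity computation, and the temperature and softmax-based soft targets are assumptions. Rather than treating only the diagonal (predefined) image-text pairs as positives, it weights all pairs by how semantically similar their texts are.

```python
import torch
import torch.nn.functional as F

def many_to_many_loss(img_emb: torch.Tensor,
                      txt_emb: torch.Tensor,
                      temperature: float = 0.07) -> torch.Tensor:
    """Hypothetical many-to-many correspondence loss (sketch).

    img_emb, txt_emb: (N, D) embeddings of N paired images and texts
    projected into a common space.
    """
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)

    # Cross-modal similarity logits between all images and all texts.
    logits = img_emb @ txt_emb.t() / temperature

    # Semantic relationships: text-text similarities turned into a soft
    # target distribution, so semantically similar off-diagonal pairs
    # are also treated as (soft) positives.
    with torch.no_grad():
        sem = txt_emb @ txt_emb.t()                      # (N, N)
        targets = F.softmax(sem / temperature, dim=-1)

    # Soft cross-entropy in both retrieval directions
    # (text-to-image and image-to-text).
    loss_i2t = -(targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()
    loss_t2i = -(targets.t() * F.log_softmax(logits.t(), dim=-1)).sum(-1).mean()
    return (loss_i2t + loss_t2i) / 2
```

With targets equal to the identity matrix this reduces to an ordinary one-to-one contrastive loss; the soft targets are what encode the many-to-many correspondences.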
Rights: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Type: article
URI: http://hdl.handle.net/2115/88543
Appears in Collections: Graduate School of Environmental Science / Faculty of Environmental Earth Science > Peer-reviewed Journal Articles, etc
