Multi-modal shared module that enables the bottom-up formation of map representation and top-down map reading

Noguchi, Wataru; Iizuka, Hiroyuki; Yamamoto, Masahito

doi:10.1080/01691864.2021.1993334



Files in This Item:	advanced_robotics.pdf (PDF, 2.04 MB)

Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/87060

Title:	Multi-modal shared module that enables the bottom-up formation of map representation and top-down map reading
Authors:	Noguchi, Wataru
	Iizuka, Hiroyuki
	Yamamoto, Masahito
Keywords:	Cognitive map
	multimodal learning
	predictive learning
	deep neural networks
	symbol grounding
Issue Date:	2-Nov-2021
Publisher:	Taylor & Francis
Journal Title:	Advanced Robotics
Volume:	36
Issue:	1-2
Start Page:	85
End Page:	99
Publisher DOI:	10.1080/01691864.2021.1993334
Abstract:	Humans create internal models of an environment (i.e. cognitive maps) through subjective sensorimotor experiences and can also understand spatial locations by looking at an external map as a symbol of an environment. We simulate the development of the cognitive map from sensorimotor experiences and grounding of the external map in a single deep neural network model. Our proposed network has a shared module that processes the features of multiple modalities (i.e. vision, hearing, and touch) and even external maps in the same manner. The multiple modalities are encoded into feature vectors by modality-specific encoders, and the encoded features are processed by the same shared module. The proposed network was trained to predict the sensory inputs of a simulated mobile robot. After the predictive learning, the spatial representation was developed in the internal states of the shared module, and the same spatial representation was used for predicting multiple modalities, including the external map. The network can also perform spatial navigation by associating the external map with the cognitive map. This implies that the external maps are grounded in subjective sensorimotor experiences, being bridged through the developed internal spatial representation in the shared module.
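The data flow described in the abstract (modality-specific encoders feeding one shared module whose internal state is then decoded into a prediction for any modality) can be illustrated with a small sketch. This is not the authors' network: the layer sizes, the simple tanh recurrence, the class and function names, and the random untrained weights are all assumptions made for illustration, and no training loop is shown.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]


def apply_linear(weights, x):
    """Matrix-vector product using plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class SharedModuleNet:
    """Hypothetical stand-in: per-modality encoders and decoders around one shared recurrent module."""

    def __init__(self, modality_dims, feature_dim=8, state_dim=16, seed=0):
        rng = random.Random(seed)
        self.state_dim = state_dim
        # One encoder and one decoder per modality (assumed linear here).
        self.encoders = {m: make_linear(d, feature_dim, rng)
                         for m, d in modality_dims.items()}
        self.decoders = {m: make_linear(state_dim, d, rng)
                         for m, d in modality_dims.items()}
        # A single set of shared weights, reused for every modality.
        self.w_in = make_linear(feature_dim, state_dim, rng)
        self.w_rec = make_linear(state_dim, state_dim, rng)

    def step(self, state, modality, observation):
        """Encode one observation and update the shared internal state."""
        feature = [math.tanh(v)
                   for v in apply_linear(self.encoders[modality], observation)]
        drive = apply_linear(self.w_in, feature)
        recur = apply_linear(self.w_rec, state)
        return [math.tanh(a + b) for a, b in zip(drive, recur)]

    def predict(self, state, modality):
        """Decode the shared state into a prediction for one modality."""
        return apply_linear(self.decoders[modality], state)


def prediction_error(predicted, target):
    """Mean squared error, the kind of signal predictive learning would minimize."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Assumed input sizes; the external map is treated as one more modality.
dims = {"vision": 12, "hearing": 4, "touch": 3, "map": 9}
net = SharedModuleNet(dims)
state = [0.0] * net.state_dim
state = net.step(state, "vision", [0.5] * 12)
state = net.step(state, "map", [1.0] * 9)
pred = net.predict(state, "touch")
```

The point of the sketch is that `w_in` and `w_rec` are the same objects regardless of which modality supplied the observation, so any spatial representation that emerged during training would live in a state shared by vision, hearing, touch, and the external map.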
Rights:	This is an Accepted Manuscript of an article published by Taylor & Francis in Advanced Robotics on 02 Nov 2021, available online: https://www.tandfonline.com/doi/10.1080/01691864.2021.1993334
Type:	article (author version)
URI:	http://hdl.handle.net/2115/87060
Appears in Collections:	Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc

Submitter: Noguchi, Wataru (野口渉)
