A study on machine learning for personalized prediction of human perception toward visual stimuli

諸戸, 祐哉


Files in This Item:	Yuya_Moroto.pdf (PDF, 44.98 MB)

Please use this identifier to cite or link to this item: https://doi.org/10.14943/doctoral.k16015

Related Items in HUSCAP:

Abstract of Dissertation and Summary of Dissertation Review (論文内容及び審査の要旨)
A study on machine learning for personalized prediction of human perception toward visual stimuli [an abstract of dissertation and a summary of dissertation review]

Title:	A study on machine learning for personalized prediction of human perception toward visual stimuli
Other Titles:	視覚刺激に対する人間の知覚を個別予測するための機械学習に関する研究
Authors:	諸戸, 祐哉 (Moroto, Yuya)
Issue Date:	25-Mar-2024
Publisher:	Hokkaido University
Abstract:	This thesis summarizes studies on the construction of machine learning models specific to personalized prediction of human perception toward visual stimuli. Machine learning has attracted significant attention as a means of assisting humans because of its high potential, and it continues to meet expectations in various fields. In particular, since the development of deep learning technologies such as convolutional neural networks and recurrent neural networks, machine learning models have been able to solve more complex tasks by learning from large amounts of data. Recent studies have progressed to foundation models such as contrastive language-image pre-training and the generative pre-trained transformer, and researchers have focused on how to construct models that can effectively learn from big data and perform several tasks with a single model. In other words, they aim to develop generalized models. Although this direction is one way to advance machine learning, another important direction, from the perspective of human assistance, is the development of machine learning models that can be tuned for each individual. For instance, user satisfaction in video-sharing services can be improved by personalizing the multimedia content recommender system. Therefore, the personalization of machine learning can be an effective direction of advancement. Person-specific information is needed as a clue for training machine learning models to suit each individual. One kind of person-specific information is the biological information obtained from humans. To introduce such information into machine learning models, human perception should act as a mediator, as it does in actual human information processing. However, it is difficult to implement this directly in existing models for tasks such as content recommendation and information retrieval, since machine learning merely recognizes patterns in the inputs and outputs and may ignore human perception.
Hence, studies on predicting human perception have been conducted to personalize machine learning indirectly. Concretely, previous studies have predicted emotion and attention, as forms of human perception, from brain activity and gaze data, which are data representing biological information (hereafter, biological data). Although these studies have used machine learning models as prediction models, the models do not necessarily consider the properties specific to biological data, since their architectures were not designed specifically for biological data. In contrast to the general data used in computer vision and natural language processing, biological data are difficult to handle because of unique properties such as individual differences. Therefore, there is great demand for rethinking machine learning models so that they suit biological data.
This thesis focuses on three perspectives related to the inherent properties of biological data. The first perspective is the volume of data obtained from each individual. Biological data vary widely among individuals, and data obtained from different individuals are difficult to handle in a uniform manner. Hence, machine learning models need to be trained from a limited amount of data in order to reflect individual differences. The second perspective is the relationship between stimuli and the human responses to them. Humans constantly receive and perceive a variety of stimuli in their daily lives, and biological data reflect such stimuli. To predict human perception effectively, not only biological data but also the contents of the stimuli should be considered. The third perspective is mutual complementation through the collaborative use of several types of biological data. Advancements in sensor technologies enable the easy and simultaneous acquisition of various types of biological data. Each type of biological data represents a different aspect of the human response, and human perception can be predicted more precisely by using them collaboratively than by using any one of them alone.
The purpose of this thesis is to construct machine learning models that can predict personalized human perception by incorporating the above perspectives. This thesis targets human perception toward visual stimuli, since several studies show that visual information is the most important to humans. Concretely, this thesis tackles three themes, each constructing machine learning models that incorporate one of the above perspectives. First, to address the problem of data volume, we focus on the similarities of biological data between individuals. For predicting human attention toward visual stimuli such as images, we propose a new method for detecting individuals whose biological data patterns are similar to those of the target individual. Moreover, we construct a machine learning model that uses the data obtained from similar individuals to predict the perception of the target individual. Second, to analyze the relationship between visual stimuli and biological data, we focus on constructing a uniform representation of visual contents and gaze data, including the region watched by the individual. Finally, we propose new feature integration methods for handling several types of biological information, since biological data are generally pre-processed to calculate features suitable for each type of data before being input to machine learning models. When calculating the features of gaze data, we adopt the representation based on the second perspective in order to consider both visual contents and biological data. In this way, we propose machine learning models suitable for biological data and show the effectiveness of focusing on the above inherent perspectives.
This thesis consists of six chapters. Chapter 1 describes the research background and the proposition of this thesis. Chapter 2 describes related work and the problems to be solved in this thesis. Chapter 3 presents methods for few-shot personalized saliency prediction, which is the task of predicting the regions in images that individuals gaze at. Chapters 4 and 5 focus on human emotions as perceptions. Chapter 4 presents methods for classifying images into emotional categories using gaze data. Chapter 5 presents methods for multi-modal human emotion recognition based on various types of biological information. Finally, Chapter 6 concludes this thesis and clarifies future directions.
In summary, this thesis presents several machine learning methods for personalized prediction of human perception toward visual stimuli. To construct machine learning models specific to personalized prediction of human perception, the proposed methods incorporate the similarities of biological data between individuals and mutual complementation between different types of biological information. Furthermore, we confirm the effectiveness of the proposed methods through experiments on datasets derived from personally acquired raw data and on openly available datasets.
Conferring University:	Hokkaido University (北海道大学)
Degree Report Number:	甲第16015号
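Degree Level:	Doctoral (博士)
Degree Discipline:	Information Science (情報科学)
Examination Committee Members:	(Chief Examiner) Professor 長谷山美紀, Specially Appointed Professor 荒木健治, Specially Appointed Professor 坂本雄児, Professor 土橋宜典, Professor 小川貴弘
Degree Affiliation:	Graduate School of Information Science and Technology (情報科学院), Information Science major (情報科学専攻)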
Type:	theses (doctoral)
URI:	http://hdl.handle.net/2115/92394
Appears in Collections:	課程博士 (Doctorate by way of Advanced Course) > 情報科学院 (Graduate School of Information Science and Technology) 学位論文 (Theses) > 博士（情報科学） (Doctorate, Information Science)
