Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation
Title: | Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation |
Authors: | Zhu, He | Togo, Ren | Ogawa, Takahiro | Haseyama, Miki |
Keywords: | visual question generation | medical image analysis | medical informatics | computer vision | natural language processing |
Issue Date: | 17-Jan-2023 |
Publisher: | MDPI |
Journal Title: | Sensors |
Volume: | 23 |
Issue: | 3 |
Start Page: | 1057 |
Publisher DOI: | 10.3390/s23031057 |
Abstract: | Auxiliary clinical diagnosis has been researched to solve unevenly and insufficiently distributed clinical resources. However, auxiliary diagnosis is still dominated by human physicians, and how to make intelligent systems more involved in the diagnosis process is gradually becoming a concern. An interactive automated clinical diagnosis with a question-answering system and a question generation system can capture a patient's conditions from multiple perspectives with less physician involvement by asking different questions to drive and guide the diagnosis. This clinical diagnosis process requires diverse information to evaluate a patient from different perspectives to obtain an accurate diagnosis. Recently proposed medical question generation systems have not considered diversity. Thus, we propose a diversity learning-based visual question generation model using a multi-latent space to generate informative question sets from medical images. The proposed method generates various questions by embedding visual and language information in different latent spaces, whose diversity is trained by our newly proposed loss. We have also added control over the categories of generated questions, making the generated questions directional. Furthermore, we use a new metric named similarity to accurately evaluate the proposed model's performance. The experimental results on the Slake and VQA-RAD datasets demonstrate that the proposed method can generate questions with diverse information. Our model works with an answering model for interactive automated clinical diagnosis and generates datasets to replace the process of annotation that incurs huge labor costs. |
Type: | article |
URI: | http://hdl.handle.net/2115/88557 |
Appears in Collections: | Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc |