Exploratory Causal Analysis of Open Data : Explanation Generation and Confounder Identification
Title: | Exploratory Causal Analysis of Open Data : Explanation Generation and Confounder Identification |
Authors: | Song, Jing | Oyama, Satoshi | Kurihara, Masahito |
Keywords: | causal analysis | confounding | crowdsourcing | open data |
Issue Date: | 20-Jan-2020 |
Publisher: | Fuji Technology Press |
Journal Title: | Journal of Advanced Computational Intelligence and Intelligent Informatics |
Volume: | 24 |
Issue: | 1 |
Start Page: | 142 |
End Page: | 155 |
Publisher DOI: | 10.20965/jaciii.2020.p0142 |
Abstract: | Open data are becoming increasingly available in various domains, and many organizations rely on data to make decisions. Such decision making requires care in distinguishing correlations from causal relationships. Among data analysis tasks, causal relationship analysis is especially complex because of unobserved confounders. For example, to correctly analyze the causal relationship between two variables, the possible confounding effect of a third variable should be considered. In the open-data environment, however, it is difficult to consider all possible confounders in advance. In this paper, we propose a framework for exploratory causal analysis of open data, in which possible confounding variables are collected from a large volume of open data and incrementally tested. To the best of the authors' knowledge, no framework has been proposed that incorporates data for possible confounders into the causal analysis process. This paper shows an original way to expand causal structures and generate reasonable causal relationships. The proposed framework accounts for the effect of possible confounding in causal analysis by first using a crowdsourcing platform to collect explanations of the correlation between variables. Keywords are then extracted from the explanations using natural language processing methods. The framework searches for related open data according to the extracted keywords. Finally, the collected explanations are tested using several automated causal analysis methods. We conducted experiments using open data from the World Bank and the Japanese government. The experimental results confirmed that the proposed framework enables causal analysis while considering the effects of possible confounders. |
Type: | article |
URI: | http://hdl.handle.net/2115/87709 |
Appears in Collections: | Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc |
Submitter: Oyama, Satoshi (小山 聡)
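The abstract's central idea, that a correlation between two variables should be re-examined once a possible confounder is taken into account, can be illustrated with a small sketch. The record does not say which automated causal analysis methods the paper uses, so this example substitutes a first-order partial correlation as a simple stand-in. All function names, variable names, and the synthetic data are assumptions made for illustration; none of them come from the paper or from the World Bank or Japanese government datasets.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def synthetic_example(n=2000, seed=0):
    """Hypothetical data: z drives both x and y; x has no direct effect on y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [2.0 * zi + rng.gauss(0, 1) for zi in z]
    y = [-1.5 * zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


x, y, z = synthetic_example()
r_xy = pearson(x, y)
r_xy_given_z = partial_correlation(x, y, z)
print(f"corr(x, y)     = {r_xy:.3f}")
print(f"corr(x, y | z) = {r_xy_given_z:.3f}")
```

In this synthetic setup the raw correlation between x and y is strong, while the correlation after controlling for z is close to zero, which is the pattern one would expect if z confounds the relationship. A partial correlation captures only linear dependence and does not by itself establish causal direction.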