How to Handle “Missing Values” in Linguistic Typology : A Pitfall in the Statistical Modelling Approach

Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/77605

Title: How to Handle “Missing Values” in Linguistic Typology : A Pitfall in the Statistical Modelling Approach
Authors: Ono, Yohei
Keywords: Computational Linguistics; Descriptive Linguistics; Probability Axioms; Statistical Imputation; The World Atlas of Language Structures
Issue Date: 20-Mar-2020
Publisher: 日本北方言語学会
Journal Title: 北方言語研究
Journal Title(alt): Northern Language Studies
Volume: 10
Start Page: 61
End Page: 82
Abstract: Statistical typology has two main streams: one that learns from WALS data using probability distributions, and one that elucidates WALS data without them. The two streams differ in three respects: (1) which WALS data are selected for analysis; (2) the purpose of applying statistical methods to WALS data; and (3) the choice of statistical methods that follows from the first two points. This paper focuses on the first stream, referred to here as the “statistical modelling approach”, and examines, from the viewpoint of linguistic materials, whether a probability distribution can be applied to “missing values” in WALS, taking Ainu, Chukchi, Khalkha, and Navajo as examples. The results show that “missing values” are not treated in the context of linguistic materials but are instead made to conform to statistical notions, which is what allows information scientists and statisticians to apply probability functions and probabilistic modelling. The statistical modelling approach therefore does not learn what WALS data represent in terms of substantive linguistic knowledge, and it distorts WALS data in the statistical context. This calls into question the foundations of statistical typology under the statistical modelling approach: statistical typology should first address how the missing values in WALS are handled when probability functions are used. The findings also indicate that interdisciplinary research between the humanities and information science/statistics requires information scientists and statisticians to explain their research using linguistic concepts, and linguists to explain theirs using concepts from information science/statistics. Doing so would enable each field to respond to the other, with appropriate feedback from substantive knowledge, and support constructive, complementary studies.
Type: bulletin (article)
URI: http://hdl.handle.net/2115/77605
Appears in Collections: 北方言語研究 = Northern Language Studies > No. 10
