
From one star to three stars : Upgrading legacy open data using crowdsourcing

Files in This Item:
dsaa2015.pdf (2.25 MB, PDF)
Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/65226

Title: From one star to three stars : Upgrading legacy open data using crowdsourcing
Authors: Oyama, Satoshi
Baba, Yukino
Ohmukai, Ikki
Dokoshi, Hiroaki
Kashima, Hisashi
Issue Date: 2015
Publisher: IEEE (Institute of Electrical and Electronics Engineers)
Citation: Data Science and Advanced Analytics (DSAA), 2015. 36678 2015. IEEE International Conference on, ISBN: 978-1-4673-8273-1
Start Page: 1
End Page: 9
Publisher DOI: 10.1109/DSAA.2015.7344801
Abstract: Despite recent open data initiatives in many countries, a significant percentage of the data provided is in non-machine-readable formats like image format rather than in a machine-readable electronic format, thereby restricting their usability. This paper describes the first unified framework for converting legacy open data in image format into a machine-readable and reusable format by using crowdsourcing. Crowd workers are asked not only to extract data from an image of a chart but also to reproduce the chart objects in spreadsheets. The properties of the reconstructed chart objects give their data structures including series names and values, which are useful for automatic processing of data by computer. Since results produced by crowdsourcing inherently contain errors, a quality control mechanism was developed that improves the accuracy of extracted tables by aggregating tables created by different workers for the same chart image and by utilizing the data structures obtained from the reproduced chart objects. Experimental results demonstrated that the proposed framework and mechanism are effective.
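Illustration (not from the paper): the abstract says tables created by different workers for the same chart image are aggregated, but it does not give the aggregation rule. The sketch below shows one simple possibility under stated assumptions: every worker's table has the same shape, numeric cells are combined with the median, and text cells (such as series names) are combined by majority vote. The function name aggregate_tables and the sample data are hypothetical, and the paper's actual mechanism also uses the structure of the reproduced chart objects, which is not modeled here.

```python
from collections import Counter
from statistics import median


def aggregate_tables(tables):
    """Combine same-shaped tables from several workers, cell by cell.

    Assumption for this sketch: numeric cells take the median across
    workers, and all other cells take the most common value.
    """
    n_rows = len(tables[0])
    n_cols = len(tables[0][0])
    for table in tables:
        if len(table) != n_rows or any(len(row) != n_cols for row in table):
            raise ValueError("all tables must have the same shape")

    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cells = [table[r][c] for table in tables]
            if all(isinstance(v, (int, float)) for v in cells):
                # The median is robust to a single mistyped number.
                row.append(median(cells))
            else:
                # Majority vote for labels such as series names.
                row.append(Counter(cells).most_common(1)[0][0])
        result.append(row)
    return result


# Hypothetical tables typed by three workers from the same chart image.
worker_tables = [
    [["Year", "Sales"], [2013, 120], [2014, 150]],
    [["Year", "Sales"], [2013, 120], [2014, 105]],  # digits transposed
    [["Year", "Sale"], [2013, 121], [2014, 150]],   # header typo
]
aggregated = aggregate_tables(worker_tables)
```

With an odd number of workers, a single worker's typo in a cell is outvoted by the other two, which is the intuition behind aggregating redundant crowd work.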
Conference Name: IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Conference Sequence: 2015
Conference Place: Paris
Rights: © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Type: proceedings (author version)
URI: http://hdl.handle.net/2115/65226
Appears in Collections: Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc

Submitter: Oyama, Satoshi (小山 聡)
