
Why Does Large Batch Training Result in Poor Generalization? A Comprehensive Explanation and a Better Strategy from the Viewpoint of Stochastic Optimization

Files in This Item:
Tomoumi Takase.pdf (2.41 MB, PDF)
Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/71558

Title: Why Does Large Batch Training Result in Poor Generalization? A Comprehensive Explanation and a Better Strategy from the Viewpoint of Stochastic Optimization
Authors: Takase, Tomoumi
Oyama, Satoshi
Kurihara, Masahito
Keywords: Non-convex optimization
Gradient descent
Neural network
Batch training
Randomized algorithm
Issue Date: Jul-2018
Publisher: MIT Press
Journal Title: Neural Computation
Volume: 30
Issue: 7
Start Page: 2005
End Page: 2023
Publisher DOI: 10.1162/neco_a_01089
PMID: 29652590
Abstract: We present a comprehensive framework of search methods, such as simulated annealing and batch training, for solving non-convex optimization problems. These methods search a wider range of the parameter space by gradually decreasing the randomness added to the standard gradient descent method. The formulation we define on the basis of this framework can be applied directly to neural network training, yielding an effective approach that gradually increases the batch size during training. We also explain why large batch training degrades generalization performance, a question left open by previous studies.
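
The strategy the abstract outlines, growing the batch size over the course of training the way simulated annealing lowers its temperature, can be illustrated with a minimal sketch. This is not the authors' published code: the toy least-squares task, the linear growth schedule, and names such as b_min and b_max are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: recover w_true from noisy linear measurements.
n, d = 1024, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def minibatch_grad(w, Xb, yb):
    # Gradient of the mean squared error on one minibatch.
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

w = np.zeros(d)
lr = 0.05
epochs = 30
b_min, b_max = 8, 256  # small batches early (noisy, exploratory),
                       # large batches late (stable, fine-grained)

for epoch in range(epochs):
    # Grow the batch size linearly over training; the linear schedule
    # and these endpoints are illustrative assumptions, not the paper's.
    b = int(b_min + (b_max - b_min) * epoch / (epochs - 1))
    perm = rng.permutation(n)
    for start in range(0, n, b):
        idx = perm[start:start + b]
        w -= lr * minibatch_grad(w, X[idx], y[idx])
    loss = float(np.mean((X @ w - y) ** 2))
    print(f"epoch {epoch:2d}  batch size {b:3d}  train MSE {loss:.5f}")

Small batches early in training inject large gradient noise and therefore explore broadly; enlarging the batch late in training reduces that noise and allows fine convergence, which parallels the annealing correspondence the abstract describes.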
Rights: © 2018 Massachusetts Institute of Technology
Relation: https://www.mitpressjournals.org/loi/neco
Type: article
URI: http://hdl.handle.net/2115/71558
Appears in Collections:情報科学院・情報科学研究院 (Graduate School of Information Science and Technology / Faculty of Information Science and Technology) > 雑誌発表論文等 (Peer-reviewed Journal Articles, etc)

Submitter: 高瀬 朝海 (Takase, Tomoumi)
