Mistake bounds on the noise-free multi-armed bandit game

This item is licensed under: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

Files in This Item:
infcomp-lata2016v4c.pdf (372.65 kB, PDF)
Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/83379

Title: Mistake bounds on the noise-free multi-armed bandit game
Authors: Nakamura, Atsuyoshi
Helmbold, David P.
Warmuth, Manfred K.
Keywords: Computational learning theory
Online learning
Bandit problem
Mistake bound
Issue Date: Dec-2019
Publisher: Elsevier
Journal Title: Information and computation
Volume: 269
Start Page: 104453
Publisher DOI: 10.1016/j.ic.2019.104453
Abstract: We study the {0, 1}-loss version of adaptive adversarial multi-armed bandit problems with α (≥ 1) lossless arms. For this problem, we show a tight bound of K - α - Θ(1/T) on the minimax expected number of mistakes (1-losses), where K is the number of arms and T is the number of rounds. © 2019 Elsevier Inc. All rights reserved.
Rights: © 2019. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Type: article (author version)
URI: http://hdl.handle.net/2115/83379
Appears in Collections: Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc

Submitter: Nakamura, Atsuyoshi (中村 篤祥)
