- Introduction
- Papers
- Some aspects of the sequential design of experiments
- Adversarial multi-armed bandit, Auer, Cesa-Bianchi, Freund and Schapire (1995)
- Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems (2006)
- Bayesian multi-armed bandit (2010)
- Analysis of Thompson Sampling for the Multi-armed Bandit Problem (2012)
- Bandits With Heavy Tail (2013)
- Matroid Bandits: Fast Combinatorial Optimization with Learning (2014)
Quite a while ago I think I wrote that deep learning is fine but I also need to work on the theory side, and since it now looks like I actually need to solve a bandit problem, here is a quick memo. For the moment I just want to understand the theory, though there is a lot of it.
Introduction
バンディットアルゴリズム入門と実践
www.slideshare.net
I’m a bandit
This blog, titled "I'm a bandit," does, as you would expect from the name, present the equations in full detail.
Thompson Sampling
Algorithm
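To make the idea concrete, here is a minimal sketch of Beta-Bernoulli Thompson Sampling: sample a success probability for each arm from its Beta posterior and play the argmax. The three-arm toy environment and its success probabilities are assumptions for illustration, not taken from the resources above.

```python
import random

def thompson_sampling(pull, n_arms, n_rounds):
    """Beta-Bernoulli Thompson Sampling with a uniform Beta(1, 1) prior."""
    successes = [0] * n_arms
    failures = [0] * n_arms
    total_reward = 0
    for _ in range(n_rounds):
        # Draw one sample per arm from its posterior Beta(1 + s_i, 1 + f_i)
        samples = [random.betavariate(1 + successes[i], 1 + failures[i])
                   for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        reward = pull(arm)  # Bernoulli reward in {0, 1}
        total_reward += reward
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return total_reward, successes, failures

# Toy simulation: 3 arms with (assumed) success probabilities
probs = [0.2, 0.5, 0.8]
random.seed(0)
reward, s, f = thompson_sampling(
    lambda a: int(random.random() < probs[a]), n_arms=3, n_rounds=2000)
```

Because the posterior for the best arm concentrates, the sampler ends up pulling arm 2 most of the time while still occasionally exploring the others.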
On the various formulations of the bandit problem
pdf: here
Slides: Junya Honda, Assistant Professor, Graduate School of Frontier Sciences, The University of Tokyo. FIT2013
Introduction to Bandits: Algorithms and Theory
Tutorial slides from ICML 2011, Bellevue (WA), USA. The main topics are:
- Stochastic bandits
- Adversarial bandits
- Many-armed bandit problem
- Linear bandits
- Lipschitz bandits
- Bandits in trees
Application example: recommendation
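For the stochastic setting at the top of that list, a minimal UCB1 sketch may help fix the idea: play the arm maximizing the empirical mean plus a confidence bonus. The two-arm Bernoulli environment is an assumption for the toy run, not from the tutorial itself.

```python
import math
import random

def ucb1(pull, n_arms, n_rounds):
    """UCB1: play arm argmax_i mean_i + sqrt(2 ln t / n_i)."""
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(n_rounds):
        if t < n_arms:
            arm = t  # play each arm once to initialize
        else:
            arm = max(range(n_arms),
                      key=lambda i: means[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running mean
    return counts, means

# Toy simulation: two Bernoulli arms with (assumed) success probabilities
random.seed(1)
probs = [0.3, 0.6]
counts, means = ucb1(
    lambda a: int(random.random() < probs[a]), n_arms=2, n_rounds=3000)
```

The confidence bonus shrinks as an arm is pulled more, so the suboptimal arm is only sampled logarithmically often while the better arm dominates the play count.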
Papers
Listed in chronological order; I picked the ones that seem to be the minimum needed to read the blog above.
Some aspects of the sequential design of experiments
Robbins, Herbert. "Some aspects of the sequential design of experiments." Herbert Robbins Selected Papers. Springer New York, 1985. 169-177.
Adversarial multi-armed bandit, Auer, Cesa-Bianchi, Freund and Schapire (1995)
Auer, Peter, et al. "Gambling in a rigged casino: The adversarial multi-armed bandit problem." Foundations of Computer Science, 1995. Proceedings., 36th Annual Symposium on. IEEE, 1995.
Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems (2006)
Even-Dar, Eyal, Shie Mannor, and Yishay Mansour. "Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems." Journal of machine learning research 7.Jun (2006): 1079-1105.
Bayesian multi-armed bandit (2010)
Scott, Steven L. "A modern Bayesian look at the multi-armed bandit." Applied Stochastic Models in Business and Industry 26.6 (2010): 639-658.
Analysis of Thompson Sampling for the Multi-armed Bandit Problem (2012)
Agrawal, Shipra, and Navin Goyal. "Analysis of Thompson Sampling for the Multi-armed Bandit Problem." COLT. 2012.
Bandits With Heavy Tail (2013)
Bubeck, Sébastien, Nicolo Cesa-Bianchi, and Gábor Lugosi. "Bandits with heavy tail." IEEE Transactions on Information Theory 59.11 (2013): 7711-7717.
Matroid Bandits: Fast Combinatorial Optimization with Learning (2014)
Kveton, Branislav, et al. "Matroid bandits: Fast combinatorial optimization with learning." arXiv preprint arXiv:1403.5045 (2014).