Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2/3):235–256, 2002.
(9) J.-Y. Audibert, R. Munos, and C. Szepesvári. Tuning bandit algorithms in stochastic environments. In Proceedings of the 18th International Conference on Algorithmic Learning Theory, 2007.
(10) W. Jouini, C. Moy, and J. Palicot. Upper confidence bound algorithm for opportunistic spectrum access with sensing errors. In CrownCom'11, Osaka, Japan, June 2011.
(11) W. Jouini, D. Ernst, C. Moy, and J. Palicot. Upper confidence bound based decision making strategies and dynamic spectrum access. In IEEE International Conference on Communications (ICC), Cape Town, South Africa, May 2010.