Bandit Algorithms

Szepesvari, Csaba; Lattimore, Tor

Cambridge University Press

07/2020

536

Hardcover

English

9781108486828

15 to 20 days

1070

Description not available.
1. Introduction; 2. Foundations of probability; 3. Stochastic processes and Markov chains; 4. Finite-armed stochastic bandits; 5. Concentration of measure; 6. The explore-then-commit algorithm; 7. The upper confidence bound algorithm; 8. The upper confidence bound algorithm: asymptotic optimality; 9. The upper confidence bound algorithm: minimax optimality; 10. The upper confidence bound algorithm: Bernoulli noise; 11. The Exp3 algorithm; 12. The Exp3-IX algorithm; 13. Lower bounds: basic ideas; 14. Foundations of information theory; 15. Minimax lower bounds; 16. Asymptotic and instance-dependent lower bounds; 17. High-probability lower bounds; 18. Contextual bandits; 19. Stochastic linear bandits; 20. Confidence bounds for least squares estimators; 21. Optimal design for least squares estimators; 22. Stochastic linear bandits with finitely many arms; 23. Stochastic linear bandits with sparsity; 24. Minimax lower bounds for stochastic linear bandits; 25. Asymptotic lower bounds for stochastic linear bandits; 26. Foundations of convex analysis; 27. Exp3 for adversarial linear bandits; 28. Follow the regularized leader and mirror descent; 29. The relation between adversarial and stochastic linear bandits; 30. Combinatorial bandits; 31. Non-stationary bandits; 32. Ranking; 33. Pure exploration; 34. Foundations of Bayesian learning; 35. Bayesian bandits; 36. Thompson sampling; 37. Partial monitoring; 38. Markov decision processes.