Regret Minimization in Games with Incomplete Information


Martin Zinkevich, Michael Johanson, Michael Bowling and Carmelo Piccione. Regret Minimization in Games with Incomplete Information. In Advances in Neural Information Processing Systems 20 (NIPS 2007), pages 1729-1736, 2008.

Abstract

Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold’em with as many as 10^12 states, two orders of magnitude larger than previous methods.

Notes

This was the first paper to introduce Counterfactual Regret Minimization (CFR), which quickly became the foundation of all of the University of Alberta Computer Poker Research Group's later work. This algorithm provides an efficient approach for approximating Nash equilibrium strategies in very large extensive games, allowing us to tackle much larger games than was possible using the standard linear programming technique. Strategies generated with CFR were used in the 2007 Annual Computer Poker Competition and 2007 Man-vs-Machine Poker Championship matches.


This paper is also described as part of my Master's thesis. In that work, I attempted to describe the algorithm's theory and implementation in a less formal but more intuitive way. The thesis also describes the state-space abstraction process that is required to simplify Texas Hold'em games to the point where CFR can be applied, and thus gives an end-to-end view of the steps required to make a strong poker agent.


If you are interested in implementing CFR, I would suggest reading our more recent work on Public Chance Sampled CFR, as it gives a more intuitive explanation of optional sampling schemes and provides a much more useful pseudocode description of the algorithm. The pseudocode in this paper isn't very useful, and doesn't mention the critical step of updating the average strategy.
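To illustrate the averaging step mentioned above, here is a small sketch of the per-information-set bookkeeping that a CFR implementation typically maintains: regret matching over cumulative counterfactual regrets, plus the accumulated strategy sum whose normalization is the approximate equilibrium. All names here are my own illustration, not the paper's notation, and the tree traversal that computes the counterfactual action values is omitted.

```python
def regret_matching(regrets):
    """Return a strategy proportional to the positive cumulative regrets."""
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    # No positive regret yet: fall back to the uniform strategy.
    n = len(regrets)
    return [1.0 / n] * n


class InfoSetNode:
    """Cumulative regrets and average-strategy sum for one information set."""

    def __init__(self, num_actions):
        self.cum_regret = [0.0] * num_actions
        self.cum_strategy = [0.0] * num_actions  # weighted by reach probability

    def current_strategy(self):
        return regret_matching(self.cum_regret)

    def update(self, action_values, reach_prob):
        """Accumulate counterfactual regret and the average-strategy sum.

        action_values[a] is the counterfactual value of taking action a at
        this information set; reach_prob is the updating player's own
        probability of reaching it under the current strategy profile.
        """
        strategy = self.current_strategy()
        node_value = sum(s * v for s, v in zip(strategy, action_values))
        for a, v in enumerate(action_values):
            self.cum_regret[a] += v - node_value
            # The critical averaging step: the approximate equilibrium is the
            # normalized cum_strategy, NOT the final current_strategy().
            self.cum_strategy[a] += reach_prob * strategy[a]

    def average_strategy(self):
        total = sum(self.cum_strategy)
        if total > 0:
            return [s / total for s in self.cum_strategy]
        n = len(self.cum_strategy)
        return [1.0 / n] * n
```

The point of the sketch is the distinction between the two accumulators: the current strategy from regret matching oscillates from iteration to iteration, and only the reach-weighted average converges to a Nash equilibrium in self-play.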


Links

  • ACPC Forums thread where I can answer questions about this paper.
  • My MSc Thesis, which summarizes the CFR and RNR papers.
  • The Monte-Carlo CFR (MCCFR) paper by Lanctot et al., which generalizes the "poker specific" sampling technique in this paper to a broad family of sampling techniques. This gives us many ways to tune CFR for faster convergence, depending on the game that we're applying it to.
  • The Public Chance Sampled CFR paper, which is a variant of MCCFR that converges even faster in many games, including heads-up limit Texas hold'em. I think this paper has the best description of CFR for someone who wants to implement it.

BibTeX

@incollection{2007-nips-cfr,
  title = {Regret Minimization in Games with Incomplete Information},
  author = {Martin Zinkevich and Michael Johanson and Michael Bowling and Carmelo Piccione},
  booktitle = {Advances in Neural Information Processing Systems 20 (NIPS)},
  editor = {J.C. Platt and D. Koller and Y. Singer and S. Roweis},
  publisher = {MIT Press},
  address = {Cambridge, MA},
  pages = {1729--1736},
  year = {2008}
}