Esse paper to DeepStack, caso seja reprodutível, pode representar um avanço significativo em relação a todo eixo em que a Inteligência Artificial está hoje, em especial em problemas de informação assimétrica.
Como os autores salientam, jogos de Damas, Xadrez e Go partem de um princípio básico de que a informação é simétrica entre os jogadores; em outras palavras, há um determinado determinismo em relação às ações dos adversários.
O Poker por sua vez tem como principal característica ser um jogo em que há um algo grau de não-determinismo seja no River, na mão (cartas) dos oponentes, bem como no tão famigerado blefe (que não passa de um bom problema estocástico).
De qualquer maneira, para quem é especialista ou não em AI ou Machine Learning vale a pena conferir a modelagem e os resultados do Deep Stack.
Abstract: Artificial intelligence has seen a number of breakthroughs in recent years, with games often serving as significant milestones. A common feature of games with these successes is that they involve information symmetry among the players, where all players have identical information. This property of perfect information, though, is far more common in games than in real-world problems. Poker is the quintessential game of imperfect information, and it has been a longstanding challenge problem in artificial intelligence. In this paper we introduce DeepStack, a new algorithm for imperfect information settings such as poker. It combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition about arbitrary poker situations that is automatically learned from selfplay games using deep learning. In a study involving dozens of participants and 44,000 hands of poker, DeepStack becomes the first computer program to beat professional poker players in heads-up no-limit Texas hold’em. Furthermore, we show this approach dramatically reduces worst-case exploitability compared to the abstraction paradigm that has been favored for over a decade
Conclusions: DeepStack is the first computer program to defeat professional poker players at heads-up nolimit Texas Hold’em, an imperfect information game with 10160 decision points. Notably it achieves this goal with almost no domain knowledge or training from expert human games. The implications go beyond just being a significant milestone for artificial intelligence. DeepStack is a paradigmatic shift in approximating solutions to large, sequential imperfect information games. Abstraction and offline computation of complete strategies has been the dominant approach for almost 20 years (29,36,37). DeepStack allows computation to be focused on specific situations that arise when making decisions and the use of automatically trained value functions. These are two of the core principles that have powered successes in perfect information games, albeit conceptually simpler to implement in those settings. As a result, for the first time the gap between the largest perfect and imperfect information games to have been mastered is mostly closed. As “real life consists of bluffing… deception… asking yourself what is the other man going to think” (9), DeepStack also has implications for seeing powerful AI applied more in settings that do not fit the perfect information assumption. The old paradigm for handling imperfect information has shown promise in applications like defending strategic resources (38) and robust decision making as needed for medical treatment recommendations (39). The new paradigm will hopefully open up many more possibilities.