An imperfect-information game is a game in which the players hold asymmetric information. Such games are more common in real life than perfect-information games. In recent years, artificial intelligence for imperfect-information games such as poker has made significant progress. The success of superhuman poker AIs such as Libratus and DeepStack has drawn researchers' attention to poker. However, the lack of open-source code has limited the development of Texas hold'em AI to some extent.
This project introduces DecisionHoldem, a high-level AI for heads-up no-limit Texas hold'em. It uses safer depth-limited solving with diverse opponent ranges to reduce the exploitability of its strategy. DecisionHoldem consists of two main parts: the blueprint strategy and real-time search.
In the blueprint strategy part, DecisionHoldem first applies hand abstraction and action abstraction to obtain an abstracted game. It then runs the linear CFR algorithm on the abstracted game tree to compute the blueprint strategy, which took 3 to 4 days on a workstation with 48 CPU cores[4]. The total number of iterations is about 200 million.
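To make the linear weighting concrete, here is a minimal Python sketch of linear CFR on a small zero-sum matrix game. It is not the project's blueprint code, which works on the abstracted hold'em tree in C++ with sampling and multithreading. The function names and the `BIASED_RPS` payoff matrix (rock-paper-scissors where a rock win pays double) are assumptions made up for this illustration. The only idea carried over is that iteration t contributes to the cumulative regrets and to the average strategy with weight t.

```python
def regret_matching(regrets):
    """Map cumulative regrets to a strategy; uniform if no regret is positive."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def linear_cfr(payoff, iterations):
    """Linear CFR on a zero-sum matrix game (illustrative, hypothetical helper).

    payoff[i][j] is the row player's gain and the column player's loss.
    Returns the linearly weighted average strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    avg_row, avg_col = [0.0] * n_rows, [0.0] * n_cols
    for t in range(1, iterations + 1):
        row = regret_matching(regret_row)
        col = regret_matching(regret_col)
        # Value of each pure action against the opponent's current strategy.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(p * v for p, v in zip(row, row_values))
        col_ev = sum(p * v for p, v in zip(col, col_values))
        # Linear weighting: iteration t counts t times as much as iteration 1.
        for i in range(n_rows):
            regret_row[i] += t * (row_values[i] - row_ev)
            avg_row[i] += t * row[i]
        for j in range(n_cols):
            regret_col[j] += t * (col_values[j] - col_ev)
            avg_col[j] += t * col[j]
    return ([x / sum(avg_row) for x in avg_row],
            [x / sum(avg_col) for x in avg_col])


# Toy game (assumption): rows and columns are rock, paper, scissors;
# its equilibrium for both players is (1/4, 1/2, 1/4).
BIASED_RPS = [[0, -1, 2], [1, 0, -1], [-2, 1, 0]]

if __name__ == "__main__":
    row_strategy, col_strategy = linear_cfr(BIASED_RPS, 20000)
    print(row_strategy, col_strategy)
```

The returned average strategies, not the last-iteration strategies, are the ones that approach equilibrium; in a blueprint setting the same averaging happens at every information set of the abstracted tree.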
In the real-time search part, we propose a depth-limited solving algorithm for subgames that is safer than Modicum's: it takes more of the opponent's possible private-hand ranges into account at off-tree nodes. This algorithm significantly improves the AI's playing strength by reducing the exploitability of its strategy. The details of the algorithm will be described in forthcoming articles.
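Exploitability measures how much a best-responding opponent gains against a strategy. The sketch below computes it for a zero-sum matrix game, assuming the common convention of averaging the two players' best-response gains. The function name and the rock-paper-scissors payoff matrix are illustrative assumptions, not part of this repository, and the project's own computation works on the full game tree rather than on a matrix.

```python
def exploitability(payoff, row, col):
    """Average best-response gain against the profile (row, col).

    payoff[i][j] is the row player's gain in a zero-sum matrix game.
    The result is 0 at a Nash equilibrium and positive otherwise.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Best the row player can do against the fixed column strategy.
    row_best = max(sum(payoff[i][j] * col[j] for j in range(n_cols))
                   for i in range(n_rows))
    # Lowest payoff the column player can hold the fixed row strategy to.
    col_best = min(sum(payoff[i][j] * row[i] for i in range(n_rows))
                   for j in range(n_cols))
    return (row_best - col_best) / 2.0


# Toy game (assumption): standard rock-paper-scissors.
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

if __name__ == "__main__":
    uniform = [1.0 / 3.0] * 3
    always_rock = [1.0, 0.0, 0.0]
    print(exploitability(RPS, uniform, uniform))
    print(exploitability(RPS, always_rock, always_rock))
```

The uniform profile is the equilibrium of this toy game, so a best responder gains nothing against it, while the always-rock profile loses a full unit per round to paper.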
To evaluate DecisionHoldem, we played it against Slumbot and OpenStackTwo. Slumbot is the champion of the 2018 Annual Computer Poker Competition and the only high-level poker AI currently available. Over about 20,000 games against Slumbot, DecisionHoldem's average profit exceeded 730 mbb/h (milli-big-blinds per hand), and it ranked first on the leaderboard on November 26, 2021 (DecisionHoldem's name on the ranking is zqbAgent)[2,3]. OpenStackTwo, built into the OpenHoldem Texas hold'em confrontation platform, is a reproduced version of DeepStack. Over about 2,000 games against OpenStackTwo[1], DecisionHoldem's average profit exceeded 700 mbb/h.
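The profit figures are expressed in mbb/h, that is, thousandths of a big blind won per hand, so 730 mbb/h corresponds to 0.73 big blinds per hand. A minimal sketch of the conversion follows; the function name and the input numbers are made up for illustration and are not taken from the project's match logs.

```python
def mbb_per_hand(total_chips_won, hands, big_blind):
    """Convert total chip winnings to milli-big-blinds per hand (mbb/h)."""
    if hands <= 0 or big_blind <= 0:
        raise ValueError("hands and big_blind must be positive")
    return 1000.0 * total_chips_won / (big_blind * hands)


if __name__ == "__main__":
    # Hypothetical session: 150 chips won over 10 hands with a big blind of 100.
    print(mbb_per_hand(150, 10, 100))
```

Normalizing by the big blind makes win rates comparable across platforms that use different blind sizes.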
To promote the development of artificial intelligence for imperfect-information games, we have open-sourced the relevant code of DecisionHoldem, together with tools for playing against Slumbot, OpenHoldem, and humans[5]. We also provide a simple Leduc poker program that helps in understanding the algorithm framework and its mechanism.
$ git clone https://git.openi.org.cn/chenhao/DecisionHoldem.git
$ git clone https://github.com/AI-Decision/DecisionHoldem.git
sevencards_strength.bin
preflop_hand_cluster.bin
flop_hand_cluster.bin
turn_hand_cluster.bin
river_hand_cluster.bin
blueprint_strategy.dat
These data can be obtained through Baidu Netdisk.
Link: https://pan.baidu.com/s/157n-H1ECjEryAx0Z03p2_w
Extraction code: q1pv
$ cd DecisionHoldem/PokerAI
$ g++ Main.cpp -o Main.o -std=c++11 -mcmodel=large -lpthread
$ ./Main.o 0
$ cd DecisionHoldem/PokerAI
$ g++ Main.cpp -o Main.o -std=c++11 -mcmodel=large -lpthread
$ ./Main.o 1
AlascasiaHoldem.so and blueprint.so provide an interface that lets the agent play against other agents or humans in real game scenarios.
The GUI application is based on PyPokerGUI.
$ cd DecisionHoldem/PokerAI/
$ python DecisionHoldem/pypokergui/server/poker.py 8000
AlascasiaHoldem.so must be in the directory "DecisionHoldem/PokerAI/".
https://www.slumbot.com/#
On November 26, 2021, DecisionHoldem, registered as zqbAgent, ranked first on the leaderboard.
$ cd DecisionHoldem/PokerAI/
$ python DecisionHoldem/pypokergui/play_with_slumbot.py
http://holdem.ia.ac.cn/#/battle
$ cd DecisionHoldem/PokerAI/
$ python DecisionHoldem/pypokergui/play_with_ia_v4.py 888891 2 Bot 2000 OpenStackTwo
The Agent_against_OpenStackTwo file records each of the 2,000 games, including our agent's probability for each action, the opponent's actions, and the game state.
├── Poker # game tree code
│ ├── Node.h # data structure of each node in the game tree
│ ├── Bulid_Tree.h # traverses every possible hole card, community card, and legal action to build the game tree
│ ├── Exploitability.h # computes the exploitability of the game tree policy
│ ├── Save_load.h # saves the game tree policy to a file and loads a file to rebuild the game tree
│ └── Visualize_Tree.h # visualizes the game tree
│
├── util # utilities
│ ├── Engine.h # computes the game result (the winner and the chips each player receives) and gets the cluster of a player's hand
│ ├── Exploitability.h # computes the best-response strategy
│ ├── ThreadPool.h # multithreading control
│ └── Randint.h # random number generator class
│
├── Poker # foundation classes of the poker game
│ ├── Card.h # card class; card ids range from 0 to 51
│ ├── Deck.h # deck class; contains 52 cards
│ ├── Player.h # player class; attributes include initial chips, bet chips, and small or big blind
│ ├── Table.h # table class; attributes include players, pot, and deck
│ └── State.h # game state; contains every player's infoset and legal actions
│
├── Depth_limit_Search.h # real-time search algorithm for each subgame
├── Multi_Blureprint.h # multithreaded blueprint MCCFR algorithm
└── BlueprintMCCFR.cpp # single-threaded blueprint MCCFR algorithm
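The listing notes that card ids range from 0 to 51. As an illustration of how such an id can encode a card, here is a small Python sketch. The specific layout (rank = id // 4, suit = id % 4, with ranks ordered 2 to A and suits ordered c, d, h, s) is an assumption chosen for this example; Card.h may use a different ordering, so check it before relying on this mapping.

```python
RANKS = "23456789TJQKA"  # assumed rank order, lowest to highest
SUITS = "cdhs"           # assumed suit order: clubs, diamonds, hearts, spades


def card_to_str(card_id):
    """Decode an id in 0..51 into a two-character string such as 'As'."""
    if not 0 <= card_id < 52:
        raise ValueError("card id must be in 0..51")
    return RANKS[card_id // 4] + SUITS[card_id % 4]


def str_to_card(text):
    """Encode a two-character card string back into its id (hypothetical layout)."""
    return RANKS.index(text[0]) * 4 + SUITS.index(text[1])


if __name__ == "__main__":
    deck = [card_to_str(i) for i in range(52)]
    print(deck[:4], deck[-4:])
```

A compact integer id keeps cards cheap to store and compare, which matters when a game tree enumerates many hole-card and board combinations.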
$ cd GraphViz/bin
$ dot -Tpng blueprint_subnode.stgy > temp.png
The GUI is based on the PyPokerGUI project, which can be found here:
https://github.com/ishikota/PyPokerGUI
Demo project:
https://github.com/zqbAse/PokerAI_Sim
[1] www.holdem.ia.ac.cn
[2] www.slumbot.com
[3] https://github.com/ericgjackson/slumbot2017/issues/11
[4] Development environment: a workstation with an Intel(R) Xeon(R) Gold 6240R CPU and 512 GB of RAM.
[5] Currently, some components are provided only as compiled files; their source code will be released in the near future.
The project leader is Junge Zhang, and the main contributors are Dongdong Bai and Qibin Zhou. Kaiqi Huang also co-supervises the project. In recent years, the team has been devoted to reinforcement learning, multi-agent systems, and decision-making intelligence.
If you use DecisionHoldem in your research, please cite the following paper.
@article{zhou2022decisionholdem,
title={DecisionHoldem: Safe Depth-Limited Solving With Diverse Opponents for Imperfect-Information Games},
author={Zhou, Qibin and Bai, Dongdong and Zhang, Junge and Duan, Fuqing and Huang, Kaiqi},
journal={arXiv preprint arXiv:2201.11580},
year={2022}
}