Year: 2022 Volume: 30 Issue: 6 Pages: 2303 - 2318 Text Language: English DOI: 10.55730/1300-0632.3940 Index Date: 09-12-2022

Learning to play an imperfect information card game using reinforcement learning

Abstract:
Artificial intelligence and machine learning are widely used in many areas, and gaming is one of the most popular. Games offer a wide variety of scenarios and types, which makes them well-suited testbeds for machine learning and artificial intelligence. This study aims to develop a self-learning intelligent agent to play Hearts, one of the most popular trick-taking card games in the world. Hearts is an imperfect information card game. In addition to having a huge state space, it poses many extra challenges due to its nature. To ease the development process, the agent developed in this study was divided into subagents, each of which was assigned a part of the game. The experimental results reveal that the developed agent can compete against some rule-based Hearts agents and human Hearts players.
Keywords: Artificial intelligence, machine learning, reinforcement learning, supervised learning, neural networks

Document Type: Article Article Type: Research Article Access Type: Open Access
APA DEMIRDOVER B, Baykal Ö, ALPASLAN F (2022). Learning to play an imperfect information card game using reinforcement learning. Turkish Journal of Electrical Engineering and Computer Sciences, 30(6), 2303 - 2318. 10.55730/1300-0632.3940
Chicago DEMIRDOVER BUGRA KAAN,Baykal Ömer,ALPASLAN FERDANUR Learning to play an imperfect information card game using reinforcement learning. Turkish Journal of Electrical Engineering and Computer Sciences 30, no.6 (2022): 2303 - 2318. 10.55730/1300-0632.3940
MLA DEMIRDOVER BUGRA KAAN,Baykal Ömer,ALPASLAN FERDANUR Learning to play an imperfect information card game using reinforcement learning. Turkish Journal of Electrical Engineering and Computer Sciences, vol.30, no.6, 2022, pp. 2303 - 2318. 10.55730/1300-0632.3940
AMA DEMIRDOVER B,Baykal Ö,ALPASLAN F Learning to play an imperfect information card game using reinforcement learning. Turkish Journal of Electrical Engineering and Computer Sciences. 2022; 30(6): 2303 - 2318. 10.55730/1300-0632.3940
Vancouver DEMIRDOVER B,Baykal Ö,ALPASLAN F Learning to play an imperfect information card game using reinforcement learning. Turkish Journal of Electrical Engineering and Computer Sciences. 2022; 30(6): 2303 - 2318. 10.55730/1300-0632.3940
IEEE DEMIRDOVER B,Baykal Ö,ALPASLAN F "Learning to play an imperfect information card game using reinforcement learning." Turkish Journal of Electrical Engineering and Computer Sciences, 30, pp. 2303 - 2318, 2022. 10.55730/1300-0632.3940
ISNAD DEMIRDOVER, BUGRA KAAN et al. "Learning to play an imperfect information card game using reinforcement learning". Turkish Journal of Electrical Engineering and Computer Sciences 30/6 (2022), 2303-2318. https://doi.org/10.55730/1300-0632.3940