Our project aims to train an AI agent to play Hanabi and achieve high scores using reinforcement learning. We will use PPO and A2C to optimize decision-making in a cooperative, partially observable environment.
Github: Source code