Metroid-II-RL

Reinforcement learning plays Metroid II. 'Nuff said.

Current state

I've changed the model plenty of times and a lot is still changing. I'll document it later.

PyBoy provides "game wrappers" for various games to make AI work easier. I am working on implementing one for Metroid II and will eventually get my code pulled into the project. For now, I have enough implemented for it to work.

Currently, a pixel-based observation approach is used. Given the environment, a tile-based approach could also be used, and that may be much faster. However, it has a lot of its own issues, and those may or may not be explored here.

Tangent about timing info

Through testing and some math, roughly 216,000 iterations correspond to an hour of "real life" game time. This was calculated by timing 1000 iterations, once at human speed and once at full emulator speed. The output of the script was as follows:

Human Time: 16.650566339492798
Machine time: 0.2655789852142334
Machine is 62.69534589139785 times faster
time per step HUMAN: 0.016650566339492797
time per step FAST: 0.0002655789852142334
One hour of human gameplay = 216208.8620049702

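The math for this was simply seconds_per_hour / (measured_time / iterations), i.e. 3600 / (16.65 / 1000), which is roughly 216,000. That works out to about 60 iterations per second of human-speed play.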
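The arithmetic above can be reproduced with a few lines of Python. This is only a sketch that plugs in the measured values from the script output; the function and variable names are mine for illustration and are not taken from the actual timing script.

```python
SECONDS_PER_HOUR = 3600


def steps_per_hour(measured_seconds: float, iterations: int) -> float:
    """Return how many iterations fit into one hour at the measured pace."""
    seconds_per_step = measured_seconds / iterations
    return SECONDS_PER_HOUR / seconds_per_step


# Values copied from the script output above (1000 iterations each).
human_time = 16.650566339492798
machine_time = 0.2655789852142334
iterations = 1000

per_hour = steps_per_hour(human_time, iterations)
speedup = human_time / machine_time

print(f"One hour of human gameplay = {per_hour:.1f} iterations")
print(f"Machine is {speedup:.2f} times faster")
```

The same function works for any other measurement run, so the numbers can be re-checked after changing frame skip or emulator settings.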

TODO

I'm focusing much more on the model, as well as the "real AI" portion of the project. I'm currently tweaking and tuning my reward function a ton.

Currently, the agent makes it out of the main portion of the level pretty reliably. However, it runs straight through enemies and has learned not to shoot at all.

At this point, a pull request has been made for PyBoy, and many changes need to be made to that repository, but before I do that, I want to focus more on the training and AI portion of things.

Agent Milestones

Explore starting area
Stop shooting randomly and "spazzing out"
Get out of starting area relatively quickly
Avoid enemies just outside starting area
Kill enemies just outside starting area
Drop down through first major shaft (requires downward jump shooting)
Find first Metroid
Kill first Metroid

Model

Custom Environment

The environment is pretty much set. It's plenty good enough to start training.

Misc

Containerize the program to make running on other machines easy (either with conda, potentially Docker, or a setup script)
Overall, clean up the code a LOT, including but not limited to a complete restructuring for the sake of PufferLib. (I'm using three different repos: my PufferLib fork, my PyBoy fork, and my Gymnasium environment.)

History

The majority of this project's initial work (January - February) was dedicated to many, many iterations of the PyBoy wrapper, action space, and overall implementation. Wrappers, libraries, oh my! At this point, I'm switching gears to a 90/10 split between AI and "backend" development. Up to this point, most of the training was "proof of concept", and many bugs and kinks were worked out along the way before switching to more AI-focused development.

The first real model simply used the observation of the screen. This worked great for getting past the initial

Comedy of errors

A list of extremely silly mistakes

The agent couldn't actually jump! My original implementation of the button presses was simply tapping the button for a single frame. I needed to re-implement the emulator interfacing to check the next action and release only the buttons that are not in it.
At one point, I had my exploration reward scale with the amount explored, i.e. for every N new coordinate points, we'd give N*W reward. The reward very quickly ballooned, and the agent would just get stuck going to the right.
After shrinking the screen by a factor of four, the CNN layers left an output that was too small: it got reduced to one single feature! This is sub-optimal for AI.
Gymnasium can't handle (width, height) for images even if they're black and white. The shape needs to be (width, height, 1) even though it's grayscale.
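The jump fix from the first item can be sketched roughly like this. It is a minimal illustration, not the code in metroid_env.py: the ButtonHolder class and the press/release callbacks are hypothetical stand-ins for whatever calls actually talk to the emulator. The idea is to remember which buttons are held, release only the ones missing from the incoming action, and press only the ones that are new, so a button that appears in consecutive actions stays down.

```python
from typing import Callable, Iterable, List, Set, Tuple


class ButtonHolder:
    """Keeps buttons held across steps instead of tapping them for one frame.

    `press` and `release` are hypothetical callbacks standing in for the
    real emulator interface.
    """

    def __init__(self, press: Callable[[str], None], release: Callable[[str], None]):
        self._press = press
        self._release = release
        self.held: Set[str] = set()

    def apply(self, action: Iterable[str]) -> None:
        wanted = set(action)
        # Release buttons that were held but are not part of this action.
        for button in sorted(self.held - wanted):
            self._release(button)
        # Press buttons that are new; already-held buttons are left alone.
        for button in sorted(wanted - self.held):
            self._press(button)
        self.held = wanted


# Record the calls instead of driving a real emulator.
log: List[Tuple[str, str]] = []
holder = ButtonHolder(
    press=lambda b: log.append(("press", b)),
    release=lambda b: log.append(("release", b)),
)

holder.apply(["a", "right"])  # start a jump while moving right
holder.apply(["a"])           # keep holding jump, stop moving
holder.apply([])              # let go of everything
print(log)
```

In the recorded log, "a" is pressed once and released only on the third step, which is what lets a jump last longer than a single frame.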
