this program is largely incomplete and not in a presentable state.
please be mindful when sharing it.
however, feel free to copy any snippets of code you find useful.

TODOs (that i can remember right now):
- finish implementing backprop
- replace the evolution strategy algorithm with
  something that uses backprop, like PPO
- settle on a network architecture
- normalize and/or embed sprite inputs
- fix lag-frames skipped-inputs bug
- detect frames when Mario is in a controllable state
- fix offscreen sprites sometimes being visible to network
- add some detection for enemies later in the game
- compute how many input neurons the network needs instead of hardcoding it
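
a rough sketch of two of the items above (normalizing sprite inputs and
computing the input count instead of hardcoding it). this is not code from
this program: every name and size here is hypothetical, the 256x240 screen
size assumes an NES picture, and the sprite layout (x, y, kind) is a guess.

```python
# hypothetical sketch; none of these names or sizes come from the program itself.

FEATURES_PER_SPRITE = 3  # assumed layout: (x, y, kind)


def input_size(tile_cols, tile_rows, max_sprites, extra_inputs=0):
    """count input neurons from the observation layout instead of hardcoding."""
    return tile_cols * tile_rows + max_sprites * FEATURES_PER_SPRITE + extra_inputs


def normalize_sprite(x, y, kind, screen_w=256, screen_h=240, num_kinds=256):
    """scale raw sprite values to roughly [0, 1].

    screen_w and screen_h assume an NES picture; num_kinds is an assumed
    upper bound on sprite ids.
    """
    return (x / screen_w, y / screen_h, kind / num_kinds)


def build_inputs(tiles, sprites, max_sprites):
    """flatten tiles, then append up to max_sprites normalized sprites.

    unused sprite slots are zero-padded so the length never changes.
    """
    inputs = [float(t) for row in tiles for t in row]
    for i in range(max_sprites):
        if i < len(sprites):
            inputs.extend(normalize_sprite(*sprites[i]))
        else:
            inputs.extend((0.0,) * FEATURES_PER_SPRITE)
    return inputs


tiles = [[0, 1], [1, 0]]
sprites = [(128, 120, 64)]
vector = build_inputs(tiles, sprites, max_sprites=2)
expected = input_size(tile_cols=2, tile_rows=2, max_sprites=2)
```

since the network's first layer would be sized from input_size() and the
vector built by build_inputs() always has that length, changing the tile
window or sprite count would only mean changing the arguments.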

naive:
- learn any combination of buttons, starting from title screen
- learn to run network without frameskip
- learn other games