Frogger
# About
Frogger is an arcade game developed in 1981 by Konami. For our implementation we simplified the game, made it turn-based, gave it a randomly generated environment, and added the Q-learning algorithm so the frog can learn to play.
## How the game works
The goal of the game is to get Frogger - the pixelated frog that spawns at the bottom of the window - across the road without colliding with a car. Whenever Frogger moves to one of the four adjacent patches, all cars advance one block within their row - or sometimes 0 or 2 blocks, because each row's speed deviates slightly from one block per turn. Frogger loses if a car moves onto his patch, and wins if he reaches the yellow row at the top. After each death or win, a new street is generated randomly.
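The "sometimes 0 or 2 blocks" effect comes from cars tracking a fractional lane position whose rounded value is displayed. A small Python sketch (for illustration only - the model itself is NetLogo; Python's `round` breaks .5 ties to even while NetLogo rounds them up, which doesn't change the effect):

```python
def block_after(turns, velocity):
    """Displayed block of a car after `turns` of Frogger's moves:
    the fractional lane position (turns * velocity) rounded to a patch."""
    return round(turns * velocity)

# Per-row velocity is close to 1 (the model draws it from roughly 0.9..1.1),
# so rounding makes a car usually advance 1 block per turn, sometimes 0 or 2.
slow_blocks = [block_after(t, 0.9) for t in range(1, 11)]
fast_blocks = [block_after(t, 1.1) for t in range(1, 11)]

# Per-turn displacements, starting from block 0:
slow_deltas = [b - a for a, b in zip([0] + slow_blocks, slow_blocks)]
fast_deltas = [b - a for a, b in zip([0] + fast_blocks, fast_blocks)]
print(slow_deltas)  # mostly 1s with an occasional 0
print(fast_deltas)  # mostly 1s with an occasional 2
```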
# Basic Usage (For Q-learning)
* Press `✍` for Setup
* If you have a saved Q-matrix, you can press the `⛁Q↑`-button and choose the file you want to use. Wait until the success message appears.
* Make sure continuous updates are on; otherwise Frogger's movement won't be shown.
* Make sure the chosen strategy is `AI`.
* Press `↻` to watch Frogger play the game continuously; using `Go` you can watch him take a single step.
# Measurement of success and our results
As a measure of success, we tried two approaches. The first was the reward the agent achieved; but reward alone didn't let us compare different reward configurations, so we added a second measure: the average number of rows the frog manages to cross within a certain number of episodes.
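The second measure can be sketched as a rolling average (in Python for illustration; the model implements this in NetLogo as `score-history-add`, and the class name here is ours):

```python
from collections import deque

class ScoreHistory:
    """Rolling average of rows reached over the last N episodes,
    mirroring the model's score-history-add procedure."""

    def __init__(self, history_length):
        self.scores = deque(maxlen=history_length)  # drops the oldest entry
        self.average_max = 0.0

    def add(self, rows_reached):
        """Record one episode's score and return the current average."""
        self.scores.append(rows_reached)
        average = sum(self.scores) / len(self.scores)
        self.average_max = max(self.average_max, average)
        return average

history = ScoreHistory(history_length=3)
for rows in [4, 8, 6]:
    history.add(rows)
avg = history.add(1)   # oldest score (4) dropped: (8 + 6 + 1) / 3
print(avg)             # 5.0
```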
A Frogger controlled by a human player reaches the goal most of the time; the score (rows reached) when playing casually is about 12.5/16 for me.
A random Frogger on average doesn't even reach the second row; it scores about 0.6/16.
A Frogger using the forward strategy averages 4.6/16 rows.
One instance of Frogger with a field of view of 2 to the front and the sides and 1 to the back reached an average score of 6.2/16 rows per episode after 100k ticks of learning, evaluated with `epsilon` = 0. (Larger fields of view were not feasible: with one more patch to the side, NetLogo ran out of memory.)
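The memory blow-up is easy to estimate: the state records which visible patches hold an approaching car, so the number of distinct states grows roughly like 2 to the power of the number of visible patches. A rough Python illustration (a loose upper bound of our own; it ignores the edge encoding that is also part of the state):

```python
def visible_patches(forward, sideways, backward):
    """Patch count of the rectangular field of view around the frog."""
    return (forward + backward + 1) * (2 * sideways + 1)

def state_upper_bound(forward, sideways, backward):
    """Loose upper bound on car-occupancy states in the field of view."""
    return 2 ** visible_patches(forward, sideways, backward)

# The reported configuration: 2 forward, 2 sideways, 1 backward.
print(visible_patches(2, 2, 1))      # 20 patches
print(state_upper_bound(2, 2, 1))    # 1048576 occupancy patterns
# One more patch to each side: 28 patches, ~268 million patterns.
print(state_upper_bound(2, 3, 1))
```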
It was very interesting to see the Q-learning Frogger discover tactics such as 'jumping over cars', which exploits a small flaw we found ourselves when testing the game: if a car is approaching Frogger and is on the tile next to him, it is possible to jump in the direction of the car and land on the other side.
We had expected a better result, with a higher rate of crossing the road without dying, but we are still happy to see that it works and beats the forward strategy. Some of Frogger's runs were pretty amazing; others failed badly.
# Configuration
## Strategies
There are four different modes Frogger can run in. They can be selected in the dropdown menu below the direction keys.
### AI
Frogger uses Q-learning to improve his skills.
### Forward
Frogger always chooses to go forward. A surprisingly good strategy.
### Random
Frogger chooses his direction randomly.
### User
You can play Frogger using the `WASD`-keys. This mode is chosen automatically when you press one of them.
The Q-matrix will be updated in every mode, so the frog can - for example - learn by watching you play.
## Saving and restoring the Q-matrix & the map
Use `⛁Q↓` to save and `⛁Q↑` to restore a Q-Table from a file you can choose. The Q-Table file is a CSV file with each line representing one entry in the Q-Table: the state-action key followed by its Q-value.
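The save/restore round trip can be sketched as follows (in Python for illustration; the model uses NetLogo's `csv` and `table` extensions, with `read-from-string` to parse keys back - function names here are ours):

```python
import ast
import csv
import io

def save_q(q_table):
    """Serialize {(state, action): value} to CSV text, one entry per line:
    the key as text in the first column, the Q-value in the second."""
    out = io.StringIO()
    writer = csv.writer(out)
    for key, value in q_table.items():
        writer.writerow([repr(key), value])
    return out.getvalue()

def restore_q(csv_text):
    """Parse the CSV text back into a Q-table dictionary,
    re-reading each key from its string form."""
    q_table = {}
    for key_text, value_text in csv.reader(io.StringIO(csv_text)):
        q_table[ast.literal_eval(key_text)] = float(value_text)
    return q_table

q = {((1, 0), "up"): 0.5, ((0, 2), "left"): -1.0}
assert restore_q(save_q(q)) == q  # lossless round trip
```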
Every `auto-save-interval` episodes, the Q-Table is saved to a file in the directory the application was started from. Using `autosave-filename` you can rename this file.
Use `⛁W↓` to save and `⛁W↑` to restore a generated map. This includes the row types, the velocity per row and the frequency of cars. Everything is saved in CSV format, each line representing one row (in reversed order).
Keep in mind that both the Q-Table and the map will be reset by `✍`.
## Success Measurement
### Upper plot
This plot shows the reward of every episode.
### Lower plot
This plot shows the average reward over the last `small-average-size` episodes using a red pen and over the last `big-average-size` episodes using a blue pen.
### `average rows managed`
This output shows the average number of rows the frog managed to cross within the last `score-history-length` episodes.
## Q-Learning configuration
### Q-Learning
#### Reward
You can adjust the reward values for different actions and results: `reward-win` is the reward when the frog reaches the yellow row, `reward-die` the (negative) reward when the frog dies; and `reward-left`, `reward-right`, `reward-up` and `reward-down` are the rewards for choosing a direction to move in.
#### Alpha, Gamma, Epsilon
By changing `alpha` you can adjust the rate of the frog learning, and by `gamma` the importance of later rewards.
Frogger is epsilon-greedy: with probability `epsilon` he chooses a random direction, otherwise the most promising one.
If `alpha-decay` and `epsilon-decay` are lower than 1, `alpha` and `epsilon` are reduced every tick by multiplying them with this factor. Pressing the `Reset decayed values`-button resets them to `alpha_0` and `epsilon_0`.
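What these parameters control can be sketched compactly (in Python for illustration; the model itself is NetLogo, and the names here are ours): `alpha` scales the update, `gamma` discounts future reward, `epsilon` drives random exploration, and the decay factors shrink `alpha` and `epsilon` every tick.

```python
import random

ACTIONS = ["up", "down", "left", "right"]

def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))

def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Tabular update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in ACTIONS)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)

q = {}
q_update(q, "s0", "up", 1.0, "s1", alpha=0.5, gamma=0.9)
print(q[("s0", "up")])   # 0.5 * (1.0 + 0.9 * 0.0 - 0.0) = 0.5

# Per-tick decay toward pure exploitation / frozen values:
alpha, epsilon = 0.5, 0.3
alpha_decay, epsilon_decay = 0.999, 0.999
alpha, epsilon = alpha * alpha_decay, epsilon * epsilon_decay
```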
#### Field of vision
The field of vision can be controlled using `see-forward`, `see-sideways` and `see-backward`.
#### `change-map`
By using `change-map` you can toggle whether the level is reset (with new row types etc.) after every death or win.
# Created by
* Albert de Wit
* Florian Magg
* Konstatin Fickel
* Tobias Peter
(ordered alphabetically)
# Code

```
extensions [table csv]

breed [froggers frogger]
breed [cars car]
breed [spawners spawner]

cars-own [ velocity direction position-in-lane ]
spawners-own [ frequency velocity direction time-till-spawn ]
froggers-own [ reward ]
patches-own [ pdirection ]

globals [
  q-table
  ; The last state
  s
  tactics-user-direction
  total-reward
  ; Not direction of cars!
  ; Those are +-1
  DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT
  ROW-TYPE-START ROW-TYPE-LEFT ROW-TYPE-RIGHT ROW-TYPE-GOAL
  row-types
  ROW-TYPE ROW-VELOCITY ROW-FREQUENCY ROW-DELAY
  score-history score-history-average score-history-average-max
  reward-history-small reward-history-small-average
  reward-history-big reward-history-big-average
  ; Computed possible positions of cars relative to frogger
  see-points
]

; ################### SETUP

to setup
  clear-all
  init-constants
  reset-map true
  reset-ticks
end

to init-constants
  set q-table table:make
  set s 0
  set alpha alpha_0
  set epsilon epsilon_0
  set reward-history-small []
  set reward-history-small-average 0
  set reward-history-big []
  set reward-history-big-average 0
  set score-history []
  set score-history-average 0
  set score-history-average-max 0
  set DIRECTION-UP 1
  set DIRECTION-DOWN 2
  set DIRECTION-LEFT 3
  set DIRECTION-RIGHT 4
  set ROW-TYPE-START 0
  set ROW-TYPE-LEFT -1
  set ROW-TYPE-RIGHT 1
  set ROW-TYPE-GOAL 2
  set ROW-TYPE 0
  set ROW-VELOCITY 1
  set ROW-FREQUENCY 2
  set ROW-DELAY 3
  init-state-constants
end

to init-state-constants
  let rows see-forward + see-backward + 1
  let cols see-sideways * 2 + 1
  let x 0 - see-sideways
  let y see-forward
  set see-points []
  repeat rows [
    set x 0 - see-sideways
    repeat cols [
      set see-points lput (list x y) see-points
      set x (x + 1)
    ]
    set y (y - 1)
  ]
end

to reset-map [new-map]
  ask cars [die]
  ask spawners [die]
  ask froggers [die]
  set-default-sprites
  if new-map = true [ generate-map ]
  create-map
  create-frogger 1
  initiate-spawners
  repeat (max-pxcor * 2 + 1) [
    spawn-cars
    move-cars
  ]
end

to set-default-sprites
  set-default-shape froggers "frogger"
  set-default-shape cars "car1"
  set-default-shape spawners "spawner"
end

to generate-map
  set row-types (list (list ROW-TYPE-START 0 0))
  repeat max-pycor - 1 [
    let rtype ((random 2) * 2 - 1)
    let rvelocity 1 + (random-float 0.2 - 0.1)
    let rfrequency (random 4) + 2
    let rdelay (rfrequency * (random(2) + 1))
    set row-types lput (list rtype rvelocity rfrequency rdelay) row-types
  ]
  set row-types lput (list ROW-TYPE-GOAL 0 0) row-types
end

to-report get-row-property [row-number property]
  report (item property (item row-number row-types))
end

to create-map
  ask patches [
    let rtype (get-row-property pycor ROW-TYPE)
    if (rtype = ROW-TYPE-RIGHT) [
      set pcolor gray - 4
      set pdirection ROW-TYPE-RIGHT
    ]
    if (rtype = ROW-TYPE-LEFT) [
      set pcolor black
      set pdirection ROW-TYPE-LEFT
    ]
    if (rtype = ROW-TYPE-START) [ set pcolor violet ]
    if (rtype = ROW-TYPE-GOAL) [ set pcolor yellow ]
  ]
end

to create-frogger [number]
  create-froggers number [
    set xcor 0
    set ycor 0
    set reward 0
  ]
end

to initiate-spawners
  let y 1
  while [y < max-pycor] [
    create-spawners 1 [
      set ycor y
      set color blue
      ifelse ([pdirection] of patch-here = ROW-TYPE-RIGHT) [
        set color red
        set direction ROW-TYPE-RIGHT
        set xcor min-pxcor
      ] [
        set color green
        set direction ROW-TYPE-LEFT
        set xcor max-pxcor
      ]
      set velocity (get-row-property pycor ROW-VELOCITY)
      set frequency (get-row-property pycor ROW-FREQUENCY)
      set time-till-spawn (get-row-property pycor ROW-DELAY)
    ]
    set y (y + 1)
  ]
end

to spawn-cars
  ask spawners [
    if ([time-till-spawn] of self <= 0) [
      hatch-cars 1 [
        set xcor ([xcor] of myself)
        set ycor ([ycor] of myself)
        set position-in-lane xcor + random-float (abs (1 - velocity))
        set color green
        ifelse (random 2 = 1) [ set shape "car1" ] [ set shape "car2" ]
        facexy (xcor + direction) ycor
        set direction ([direction] of myself)
      ]
      set time-till-spawn (frequency * (random(2) + 1))
    ]
    set time-till-spawn (time-till-spawn - 1)
  ]
end

to move-cars
  ask cars [
    set position-in-lane (position-in-lane + direction * velocity)
    if ((abs position-in-lane) > max-pxcor) [ die ]
    set xcor round position-in-lane
  ]
end

; ######################## GO

to go-forever
  if strategy = "user" [
    user-message "User can't play in 'GO FOREVER'-mode!"
    stop
  ]
  go
end

to go
  ; show map [table:get q-table ?] filter [item 0 ? = 0] table:keys q-table
  ; let collisions filter [(member? [1 1] item 0 item 0 ?) or (member? [-1 1] item 0 item 0 ?)] table:keys q-table
  ; foreach collisions [
  ;   ;show word ? ", "
  ; ]
  ; show reduce + map [table:get q-table ?] collisions / length collisions
  ; Step numbers from script page 60
  ; 1.
  if not is-list? s [set s get-state]
  let a get-q-action s
  ; 2.
  move-cars
  spawn-cars
  move-frogger a
  ifelse (not is-list? s) [
    let r [reward] of one-of froggers
    ; 4.
    update-q-table a r 0 0
    tick
    set epsilon (epsilon * epsilon-decay)
    set alpha (alpha * alpha-decay)
    auto-save-q
    reward-history-add total-reward
    set total-reward 0
    if change-map [ reset-map true ]
  ] [
    let r [reward] of one-of froggers
    let s' get-state
    ; 3.
    let a* get-q-action s'
    ; 4.
    update-q-table a r s' a*
    ; 5.
    set s s'
  ]
end

to move-frogger [q-action]
  let frogger-move 0
  ask one-of froggers [
    set frogger-move get-frogger-move q-action
    let next-xcor xcor
    let next-ycor ycor
    if (frogger-move = DIRECTION-UP) [
      set next-ycor next-ycor + 1
      set reward reward-up
    ]
    if (frogger-move = DIRECTION-LEFT) [
      set next-xcor next-xcor - 1
      set reward reward-left
    ]
    if (frogger-move = DIRECTION-RIGHT) [
      set next-xcor next-xcor + 1
      set reward reward-right
    ]
    if (frogger-move = DIRECTION-DOWN) [
      set next-ycor next-ycor - 1
      set reward reward-down
    ]
    ifelse (not ((next-ycor <= max-pycor) and (next-ycor >= min-pycor) and (next-xcor <= max-pxcor) and (next-xcor >= min-pxcor))) [
      kill-frogger
    ] [
      set xcor next-xcor
      set ycor next-ycor
    ]
    if (count cars-here > 0) [ kill-frogger ]
    if (get-row-property pycor ROW-TYPE) = ROW-TYPE-GOAL [ goal-frogger ]
    set total-reward (total-reward + reward)
  ]
end

to-report get-frogger-move [q-action]
  if strategy = "ai" [ report q-action ]
  if strategy = "user" [ report tactics-user-direction ]
  if strategy = "forward" [ report DIRECTION-UP ]
  if strategy = "random" [
    report one-of (list DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT)
  ]
end

to score-history-add [amount]
  repeat (length score-history + 1 - score-history-length) [
    set score-history (remove-item 0 score-history)
  ]
  set score-history (lput amount score-history)
  set score-history-average ((sum score-history) / (length score-history))
  if (score-history-average-max < score-history-average) [
    set score-history-average-max score-history-average
  ]
end

to reward-history-add [amount]
  repeat (length reward-history-small + 1 - small-average-size) [
    set reward-history-small (remove-item 0 reward-history-small)
  ]
  repeat (length reward-history-big + 1 - big-average-size) [
    set reward-history-big (remove-item 0 reward-history-big)
  ]
  set reward-history-small (lput amount reward-history-small)
  set reward-history-small-average ((sum reward-history-small) / (length reward-history-small))
  set reward-history-big (lput amount reward-history-big)
  set reward-history-big-average ((sum reward-history-big) / (length reward-history-big))
end

to kill-frogger
  set reward reward-die
  set s 0
  score-history-add ycor
  reset-frogger
end

to goal-frogger
  set reward reward-win
  set s 0
  score-history-add ycor
  reset-frogger
end

to reset-frogger
  set xcor 0
  set ycor 0
end

to-report get-q-action [state]
  let candidates []
  ifelse (random-float 1 <= epsilon) [
    set candidates (list DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT)
  ] [
    let keys map [list state ?] (list DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT)
    let entries map [ifelse-value (table:has-key? q-table ?) [list ? (table:get q-table ?)] [list ? 0]] keys
    let maximum max map [last ?] entries
    set candidates map [last first ?] (filter [last ? = maximum] entries)
  ]
  let i 0
  if length candidates > 1 [
    let l length candidates
    set i random l
  ]
  report item i candidates
end

to update-q-table [a r s' a*]
  if (is-list? s) [
    let sa list s a
    let s'a* list s' a*
    let Qsa ifelse-value (table:has-key? q-table sa) [table:get q-table sa] [0]
    let Qs'a* ifelse-value (table:has-key? q-table s'a*) [table:get q-table s'a*] [0]
    table:put q-table sa (Qsa + alpha * (r + gamma * Qs'a* - Qsa))
  ]
end

to-report get-state
  let rep []
  ask one-of froggers [
    let x xcor
    let y ycor
    let cars-in-visible-area cars at-points see-points
    let cars-directed-towards-frogger cars-in-visible-area with [xcor = x or (direction = 1 and xcor < x) or (direction = -1 and xcor > x)]
    let coordinates-of-approaching-cars [(list (xcor - x) (ycor - y))] of cars-directed-towards-frogger
    set rep list (coordinates-of-approaching-cars) (get-edges x y)
  ]
  report rep
end

; Determines if there are edges sideways or behind frogger within visible area
to-report get-edges [x y]
  let rep []
  ifelse x - min-pxcor > see-sideways
    [ set rep lput (see-sideways + 1) rep ]
    [ set rep lput (x - min-pxcor) rep ]
  ifelse y > see-backward
    [ set rep lput (see-backward + 1) rep ]
    [ set rep lput y rep ]
  ifelse max-pxcor - x > see-sideways
    [ set rep lput (see-sideways + 1) rep ]
    [ set rep lput (x - min-pxcor) rep ]
  report rep
end

to ui-save-q
  let file-location user-new-file
  ifelse file-location != false [
    save-q file-location
    user-message "Saved Q!"
  ] [
    user-message "Not saved!"
  ]
end

to save-q [file-location]
  if file-exists? file-location [ file-delete file-location ]
  csv:to-file file-location table:to-list q-table
end

to auto-save-q
  if ((ticks mod auto-save-interval) = 0) [
    if (autosave-filename = "") [ set autosave-filename "q-after-ticks" ]
    let file-location (word "./" autosave-filename "-" ticks ".csv")
    save-q file-location
  ]
end

to ui-restore-q
  let file-location user-file
  ifelse file-location != false [
    let q-table-from-file csv:from-file file-location
    foreach q-table-from-file [
      table:put q-table (read-from-string (item 0 ?)) (item 1 ?)
    ]
    user-message "Loaded!"
  ] [
    user-message "Not loaded!"
  ]
end

to save-map [file-location]
  if file-exists? file-location [ file-delete file-location ]
  csv:to-file file-location row-types
end

to ui-save-map
  let file-location user-new-file
  ifelse file-location != false [
    save-map file-location
    user-message "Saved Map!"
  ] [
    user-message "Not saved!"
  ]
end

to ui-restore-map
  let file-location user-file
  ifelse file-location != false [
    set row-types csv:from-file file-location
    reset-map false
    user-message "Loaded!"
  ] [
    user-message "Not loaded!"
  ]
end
```
There is only one version of this model, created almost 7 years ago by Team 2 @ Foundations of Organic Computing WS 2016 Uni Augsburg.
Attached files
File | Type | Description | Last updated
---|---|---|---
Frogger.png | preview | Frogger in Action | almost 7 years ago, by Team 2 @ Foundations of Organic Computing WS 2016 Uni Augsburg
This model does not have any ancestors.
This model does not have any descendants.
Team 2 @ Foundations of Organic Computing WS 2016 Uni Augsburg
LICENSE
[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)
Posted almost 7 years ago