Frogger
# About
Frogger is an arcade game developed in 1981 by Konami. For our implementation we simplified the game, made it turn-based, gave it a randomly generated environment, and added the Q-learning algorithm so the frog can learn to play.
## How the game works
The goal of the game is to get Frogger - the pixelated frog that spawns at the bottom of the window - across the road without colliding with a car. Whenever Frogger moves to one of the four adjacent patches, all cars advance one block within their row - or sometimes 0 or 2 blocks, because each row's speed deviates slightly from one block per turn. Frogger loses if a car moves onto his patch, and wins if he reaches the yellow row at the top. After each death or win, a new street is generated randomly.
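The "sometimes 0 or 2 blocks" effect comes from cars tracking a fractional lane position whose rounded value is displayed. A small Python sketch (for illustration only - the model itself is NetLogo; Python's `round` breaks .5 ties to even while NetLogo rounds them up, which doesn't change the effect):

```python
def block_after(turns, velocity):
    """Displayed block of a car after `turns` of Frogger's moves:
    the fractional lane position (turns * velocity) rounded to a patch."""
    return round(turns * velocity)

# Per-row velocity is close to 1 (the model draws it from roughly 0.9..1.1),
# so rounding makes a car usually advance 1 block per turn, sometimes 0 or 2.
slow_blocks = [block_after(t, 0.9) for t in range(1, 11)]
fast_blocks = [block_after(t, 1.1) for t in range(1, 11)]

# Per-turn displacements, starting from block 0:
slow_deltas = [b - a for a, b in zip([0] + slow_blocks, slow_blocks)]
fast_deltas = [b - a for a, b in zip([0] + fast_blocks, fast_blocks)]
print(slow_deltas)  # mostly 1s with an occasional 0
print(fast_deltas)  # mostly 1s with an occasional 2
```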
# Basic Usage (For Q-learning)
* Press `✍` for Setup
* If you have a saved Q-matrix, you can press the `⛁Q↑`-button and choose the file you want to use. Wait until the success message appears.
* Make sure continuous updates are on; otherwise Frogger's movement won't be shown.
* Make sure the chosen strategy is `AI`.
* Press `↻` to watch Frogger play the game continuously; using `Go` you can watch him take a single step.
# Measurement of success and our results
As a measure of success, we tried two approaches. The first was the reward the agent achieved; but reward alone didn't let us compare different reward configurations, so we added a second measure: the average number of rows the frog manages to cross within a certain number of episodes.
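The second measure can be sketched as a rolling average (in Python for illustration; the model implements this in NetLogo as `score-history-add`, and the class name here is ours):

```python
from collections import deque

class ScoreHistory:
    """Rolling average of rows reached over the last N episodes,
    mirroring the model's score-history-add procedure."""

    def __init__(self, history_length):
        self.scores = deque(maxlen=history_length)  # drops the oldest entry
        self.average_max = 0.0

    def add(self, rows_reached):
        """Record one episode's score and return the current average."""
        self.scores.append(rows_reached)
        average = sum(self.scores) / len(self.scores)
        self.average_max = max(self.average_max, average)
        return average

history = ScoreHistory(history_length=3)
for rows in [4, 8, 6]:
    history.add(rows)
avg = history.add(1)   # oldest score (4) dropped: (8 + 6 + 1) / 3
print(avg)             # 5.0
```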
A Frogger controlled by a human player reaches the goal most of the time; the score (rows reached) when playing casually is about 12.5/16 for me.
A random Frogger on average doesn't even reach the second row; it scores about 0.6/16.
A Frogger using the forward strategy averages 4.6/16 rows.
One instance of Frogger with a field of view of 2 to the front and the sides and 1 to the back reached an average score of 6.2/16 rows per episode after 100k ticks of learning, evaluated with `epsilon` = 0. (Larger fields of view were not feasible: with one more patch to the side, NetLogo ran out of memory.)
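The memory blow-up is easy to estimate: the state records which visible patches hold an approaching car, so the number of distinct states grows roughly like 2 to the power of the number of visible patches. A rough Python illustration (a loose upper bound of our own; it ignores the edge encoding that is also part of the state):

```python
def visible_patches(forward, sideways, backward):
    """Patch count of the rectangular field of view around the frog."""
    return (forward + backward + 1) * (2 * sideways + 1)

def state_upper_bound(forward, sideways, backward):
    """Loose upper bound on car-occupancy states in the field of view."""
    return 2 ** visible_patches(forward, sideways, backward)

# The reported configuration: 2 forward, 2 sideways, 1 backward.
print(visible_patches(2, 2, 1))      # 20 patches
print(state_upper_bound(2, 2, 1))    # 1048576 occupancy patterns
# One more patch to each side: 28 patches, ~268 million patterns.
print(state_upper_bound(2, 3, 1))
```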
It was very interesting to see the Q-learning Frogger discover tactics such as 'jumping over cars', which exploits a small flaw we found ourselves when testing the game: if a car is approaching Frogger and is on the tile next to him, it is possible to jump in the direction of the car and land on the other side.
We had expected a better result, with a higher rate of crossing the road without dying, but we are still happy to see that it works and beats the forward strategy. Some of Frogger's runs were pretty amazing; others failed badly.
# Configuration
## Strategies
There are four different modes Frogger can run in. They can be selected in the dropdown menu below the direction keys.
### AI
Frogger uses Q-learning to improve his skills.
### Forward
Frogger always chooses to go forward. A surprisingly good strategy.
### Random
Frogger chooses his direction randomly.
### User
You can play Frogger using the `WASD`-keys. This mode is chosen automatically when you press one of them.
The Q-matrix will be updated in every mode, so the frog can - for example - learn by watching you play.
## Saving and restoring the Q-matrix & the map
Use `⛁Q↓` to save and `⛁Q↑` to restore a Q-Table from a file you can choose. The Q-Table file is a CSV file with each line representing one entry in the Q-Table: the state-action key followed by its Q-value.
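The save/restore round trip can be sketched as follows (in Python for illustration; the model uses NetLogo's `csv` and `table` extensions, with `read-from-string` to parse keys back - function names here are ours):

```python
import ast
import csv
import io

def save_q(q_table):
    """Serialize {(state, action): value} to CSV text, one entry per line:
    the key as text in the first column, the Q-value in the second."""
    out = io.StringIO()
    writer = csv.writer(out)
    for key, value in q_table.items():
        writer.writerow([repr(key), value])
    return out.getvalue()

def restore_q(csv_text):
    """Parse the CSV text back into a Q-table dictionary,
    re-reading each key from its string form."""
    q_table = {}
    for key_text, value_text in csv.reader(io.StringIO(csv_text)):
        q_table[ast.literal_eval(key_text)] = float(value_text)
    return q_table

q = {((1, 0), "up"): 0.5, ((0, 2), "left"): -1.0}
assert restore_q(save_q(q)) == q  # lossless round trip
```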
Every `auto-save-interval` episodes, the Q-Table is saved to a file in the directory the application was started from. Using `autosave-filename` you can rename this file.
Use `⛁W↓` to save and `⛁W↑` to restore a generated map. This includes the row types, the velocity per row and the frequency of cars. Everything is saved in CSV format, each line representing one row (in reversed order).
Keep in mind that both the Q-Table and the map will be reset by `✍`.
## Success Measurement
### Upper plot
This plot shows the reward of every episode.
### Lower plot
This plot shows the average reward over the last `small-average-size` episodes using a red pen and over the last `big-average-size` episodes using a blue pen.
### `average rows managed`
This output shows the average number of rows the frog managed to cross within the last `score-history-length` episodes.
## Q-Learning configuration
### Q-Learning
#### Reward
You can adjust the reward values for different actions and results: `reward-win` is the reward when the frog reaches the yellow row, `reward-die` the (negative) reward when the frog dies; and `reward-left`, `reward-right`, `reward-up` and `reward-down` are the rewards for choosing a direction to move in.
#### Alpha, Gamma, Epsilon
By changing `alpha` you can adjust the rate of the frog learning, and by `gamma` the importance of later rewards.
Frogger is epsilon-greedy: with probability `epsilon` he chooses a random direction, otherwise the most promising one.
If `alpha-decay` and `epsilon-decay` are lower than 1, `alpha` and `epsilon` are reduced every tick by multiplying them with this factor. Pressing the `Reset decayed values`-button resets them to `alpha_0` and `epsilon_0`.
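What these parameters control can be sketched compactly (in Python for illustration; the model itself is NetLogo, and the names here are ours): `alpha` scales the update, `gamma` discounts future reward, `epsilon` drives random exploration, and the decay factors shrink `alpha` and `epsilon` every tick.

```python
import random

ACTIONS = ["up", "down", "left", "right"]

def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))

def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Tabular update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in ACTIONS)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)

q = {}
q_update(q, "s0", "up", 1.0, "s1", alpha=0.5, gamma=0.9)
print(q[("s0", "up")])   # 0.5 * (1.0 + 0.9 * 0.0 - 0.0) = 0.5

# Per-tick decay toward pure exploitation / frozen values:
alpha, epsilon = 0.5, 0.3
alpha_decay, epsilon_decay = 0.999, 0.999
alpha, epsilon = alpha * alpha_decay, epsilon * epsilon_decay
```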
#### Field of vision
The field of vision can be controlled using `see-forward`, `see-sideways` and `see-backward`.
#### `change-map`
By using `change-map` you can toggle whether the level is reset (with new row types etc.) after every death or win.
# Created by
* Albert de Wit
* Florian Magg
* Konstatin Fickel
* Tobias Peter
(ordered alphabetically)
# Code

```
extensions [table csv]

breed [froggers frogger]
breed [cars car]
breed [spawners spawner]

cars-own [ velocity direction position-in-lane ]
spawners-own [ frequency velocity direction time-till-spawn ]
froggers-own [ reward ]
patches-own [ pdirection ]

globals [
  q-table
  ; The last state
  s
  tactics-user-direction
  total-reward
  ; Not direction of cars!
  ; Those are +-1
  DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT
  ROW-TYPE-START ROW-TYPE-LEFT ROW-TYPE-RIGHT ROW-TYPE-GOAL
  row-types
  ROW-TYPE ROW-VELOCITY ROW-FREQUENCY ROW-DELAY
  score-history score-history-average score-history-average-max
  reward-history-small reward-history-small-average
  reward-history-big reward-history-big-average
  ; Computed possible positions of cars relative to frogger
  see-points
]

; ################### SETUP

to setup
  clear-all
  init-constants
  reset-map true
  reset-ticks
end

to init-constants
  set q-table table:make
  set s 0
  set alpha alpha_0
  set epsilon epsilon_0
  set reward-history-small []
  set reward-history-small-average 0
  set reward-history-big []
  set reward-history-big-average 0
  set score-history []
  set score-history-average 0
  set score-history-average-max 0
  set DIRECTION-UP 1
  set DIRECTION-DOWN 2
  set DIRECTION-LEFT 3
  set DIRECTION-RIGHT 4
  set ROW-TYPE-START 0
  set ROW-TYPE-LEFT -1
  set ROW-TYPE-RIGHT 1
  set ROW-TYPE-GOAL 2
  set ROW-TYPE 0
  set ROW-VELOCITY 1
  set ROW-FREQUENCY 2
  set ROW-DELAY 3
  init-state-constants
end

to init-state-constants
  let rows see-forward + see-backward + 1
  let cols see-sideways * 2 + 1
  let x 0 - see-sideways
  let y see-forward
  set see-points []
  repeat rows [
    set x 0 - see-sideways
    repeat cols [
      set see-points lput (list x y) see-points
      set x (x + 1)
    ]
    set y (y - 1)
  ]
end

to reset-map [new-map]
  ask cars [die]
  ask spawners [die]
  ask froggers [die]
  set-default-sprites
  if new-map = true [ generate-map ]
  create-map
  create-frogger 1
  initiate-spawners
  repeat (max-pxcor * 2 + 1) [
    spawn-cars
    move-cars
  ]
end

to set-default-sprites
  set-default-shape froggers "frogger"
  set-default-shape cars "car1"
  set-default-shape spawners "spawner"
end

to generate-map
  set row-types (list (list ROW-TYPE-START 0 0))
  repeat max-pycor - 1 [
    let rtype ((random 2) * 2 - 1)
    let rvelocity 1 + (random-float 0.2 - 0.1)
    let rfrequency (random 4) + 2
    let rdelay (rfrequency * (random(2) + 1))
    set row-types lput (list rtype rvelocity rfrequency rdelay) row-types
  ]
  set row-types lput (list ROW-TYPE-GOAL 0 0) row-types
end

to-report get-row-property [row-number property]
  report (item property (item row-number row-types))
end

to create-map
  ask patches [
    let rtype (get-row-property pycor ROW-TYPE)
    if (rtype = ROW-TYPE-RIGHT) [
      set pcolor gray - 4
      set pdirection ROW-TYPE-RIGHT
    ]
    if (rtype = ROW-TYPE-LEFT) [
      set pcolor black
      set pdirection ROW-TYPE-LEFT
    ]
    if (rtype = ROW-TYPE-START) [ set pcolor violet ]
    if (rtype = ROW-TYPE-GOAL) [ set pcolor yellow ]
  ]
end

to create-frogger [number]
  create-froggers number [
    set xcor 0
    set ycor 0
    set reward 0
  ]
end

to initiate-spawners
  let y 1
  while [y < max-pycor] [
    create-spawners 1 [
      set ycor y
      set color blue
      ifelse ([pdirection] of patch-here = ROW-TYPE-RIGHT) [
        set color red
        set direction ROW-TYPE-RIGHT
        set xcor min-pxcor
      ] [
        set color green
        set direction ROW-TYPE-LEFT
        set xcor max-pxcor
      ]
      set velocity (get-row-property pycor ROW-VELOCITY)
      set frequency (get-row-property pycor ROW-FREQUENCY)
      set time-till-spawn (get-row-property pycor ROW-DELAY)
    ]
    set y (y + 1)
  ]
end

to spawn-cars
  ask spawners [
    if ([time-till-spawn] of self <= 0) [
      hatch-cars 1 [
        set xcor ([xcor] of myself)
        set ycor ([ycor] of myself)
        set position-in-lane xcor + random-float (abs (1 - velocity))
        set color green
        ifelse (random 2 = 1) [ set shape "car1" ] [ set shape "car2" ]
        facexy (xcor + direction) ycor
        set direction ([direction] of myself)
      ]
      set time-till-spawn (frequency * (random(2) + 1))
    ]
    set time-till-spawn (time-till-spawn - 1)
  ]
end

to move-cars
  ask cars [
    set position-in-lane (position-in-lane + direction * velocity)
    if ((abs position-in-lane) > max-pxcor) [ die ]
    set xcor round position-in-lane
  ]
end

; ######################## GO

to go-forever
  if strategy = "user" [
    user-message "User can't play in 'GO FOREVER'-mode!"
    stop
  ]
  go
end

to go
  ; show map [table:get q-table ?] filter [item 0 ? = 0] table:keys q-table
  ; let collisions filter [(member? [1 1] item 0 item 0 ?) or (member? [-1 1] item 0 item 0 ?)] table:keys q-table
  ; foreach collisions [
  ;   ;show word ? ", "
  ; ]
  ; show reduce + map [table:get q-table ?] collisions / length collisions
  ; Step numbers from script page 60
  ; 1.
  if not is-list? s [set s get-state]
  let a get-q-action s
  ; 2.
  move-cars
  spawn-cars
  move-frogger a
  ifelse (not is-list? s) [
    let r [reward] of one-of froggers
    ; 4.
    update-q-table a r 0 0
    tick
    set epsilon (epsilon * epsilon-decay)
    set alpha (alpha * alpha-decay)
    auto-save-q
    reward-history-add total-reward
    set total-reward 0
    if change-map [ reset-map true ]
  ] [
    let r [reward] of one-of froggers
    let s' get-state
    ; 3.
    let a* get-q-action s'
    ; 4.
    update-q-table a r s' a*
    ; 5.
    set s s'
  ]
end

to move-frogger [q-action]
  let frogger-move 0
  ask one-of froggers [
    set frogger-move get-frogger-move q-action
    let next-xcor xcor
    let next-ycor ycor
    if (frogger-move = DIRECTION-UP) [
      set next-ycor next-ycor + 1
      set reward reward-up
    ]
    if (frogger-move = DIRECTION-LEFT) [
      set next-xcor next-xcor - 1
      set reward reward-left
    ]
    if (frogger-move = DIRECTION-RIGHT) [
      set next-xcor next-xcor + 1
      set reward reward-right
    ]
    if (frogger-move = DIRECTION-DOWN) [
      set next-ycor next-ycor - 1
      set reward reward-down
    ]
    ifelse (not ((next-ycor <= max-pycor) and (next-ycor >= min-pycor) and (next-xcor <= max-pxcor) and (next-xcor >= min-pxcor))) [
      kill-frogger
    ] [
      set xcor next-xcor
      set ycor next-ycor
    ]
    if (count cars-here > 0) [ kill-frogger ]
    if (get-row-property pycor ROW-TYPE) = ROW-TYPE-GOAL [ goal-frogger ]
    set total-reward (total-reward + reward)
  ]
end

to-report get-frogger-move [q-action]
  if strategy = "ai" [ report q-action ]
  if strategy = "user" [ report tactics-user-direction ]
  if strategy = "forward" [ report DIRECTION-UP ]
  if strategy = "random" [
    report one-of (list DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT)
  ]
end

to score-history-add [amount]
  repeat (length score-history + 1 - score-history-length) [
    set score-history (remove-item 0 score-history)
  ]
  set score-history (lput amount score-history)
  set score-history-average ((sum score-history) / (length score-history))
  if (score-history-average-max < score-history-average) [
    set score-history-average-max score-history-average
  ]
end

to reward-history-add [amount]
  repeat (length reward-history-small + 1 - small-average-size) [
    set reward-history-small (remove-item 0 reward-history-small)
  ]
  repeat (length reward-history-big + 1 - big-average-size) [
    set reward-history-big (remove-item 0 reward-history-big)
  ]
  set reward-history-small (lput amount reward-history-small)
  set reward-history-small-average ((sum reward-history-small) / (length reward-history-small))
  set reward-history-big (lput amount reward-history-big)
  set reward-history-big-average ((sum reward-history-big) / (length reward-history-big))
end

to kill-frogger
  set reward reward-die
  set s 0
  score-history-add ycor
  reset-frogger
end

to goal-frogger
  set reward reward-win
  set s 0
  score-history-add ycor
  reset-frogger
end

to reset-frogger
  set xcor 0
  set ycor 0
end

to-report get-q-action [state]
  let candidates []
  ifelse (random-float 1 <= epsilon) [
    set candidates (list DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT)
  ] [
    let keys map [list state ?] (list DIRECTION-UP DIRECTION-DOWN DIRECTION-LEFT DIRECTION-RIGHT)
    let entries map [ifelse-value (table:has-key? q-table ?) [list ? (table:get q-table ?)] [list ? 0]] keys
    let maximum max map [last ?] entries
    set candidates map [last first ?] (filter [last ? = maximum] entries)
  ]
  let i 0
  if length candidates > 1 [
    let l length candidates
    set i random l
  ]
  report item i candidates
end

to update-q-table [a r s' a*]
  if (is-list? s) [
    let sa list s a
    let s'a* list s' a*
    let Qsa ifelse-value (table:has-key? q-table sa) [table:get q-table sa] [0]
    let Qs'a* ifelse-value (table:has-key? q-table s'a*) [table:get q-table s'a*] [0]
    table:put q-table sa (Qsa + alpha * (r + gamma * Qs'a* - Qsa))
  ]
end

to-report get-state
  let rep []
  ask one-of froggers [
    let x xcor
    let y ycor
    let cars-in-visible-area cars at-points see-points
    let cars-directed-towards-frogger cars-in-visible-area with [xcor = x or (direction = 1 and xcor < x) or (direction = -1 and xcor > x)]
    let coordinates-of-approaching-cars [(list (xcor - x) (ycor - y))] of cars-directed-towards-frogger
    set rep list (coordinates-of-approaching-cars) (get-edges x y)
  ]
  report rep
end

; Determines if there are edges sideways or behind frogger within visible area
to-report get-edges [x y]
  let rep []
  ifelse x - min-pxcor > see-sideways
    [ set rep lput (see-sideways + 1) rep ]
    [ set rep lput (x - min-pxcor) rep ]
  ifelse y > see-backward
    [ set rep lput (see-backward + 1) rep ]
    [ set rep lput y rep ]
  ifelse max-pxcor - x > see-sideways
    [ set rep lput (see-sideways + 1) rep ]
    [ set rep lput (x - min-pxcor) rep ]
  report rep
end

to ui-save-q
  let file-location user-new-file
  ifelse file-location != false [
    save-q file-location
    user-message "Saved Q!"
  ] [
    user-message "Not saved!"
  ]
end

to save-q [file-location]
  if file-exists? file-location [ file-delete file-location ]
  csv:to-file file-location table:to-list q-table
end

to auto-save-q
  if ((ticks mod auto-save-interval) = 0) [
    if (autosave-filename = "") [ set autosave-filename "q-after-ticks" ]
    let file-location (word "./" autosave-filename "-" ticks ".csv")
    save-q file-location
  ]
end

to ui-restore-q
  let file-location user-file
  ifelse file-location != false [
    let q-table-from-file csv:from-file file-location
    foreach q-table-from-file [
      table:put q-table (read-from-string (item 0 ?)) (item 1 ?)
    ]
    user-message "Loaded!"
  ] [
    user-message "Not loaded!"
  ]
end

to save-map [file-location]
  if file-exists? file-location [ file-delete file-location ]
  csv:to-file file-location row-types
end

to ui-save-map
  let file-location user-new-file
  ifelse file-location != false [
    save-map file-location
    user-message "Saved Map!"
  ] [
    user-message "Not saved!"
  ]
end

to ui-restore-map
  let file-location user-file
  ifelse file-location != false [
    set row-types csv:from-file file-location
    reset-map false
    user-message "Loaded!"
  ] [
    user-message "Not loaded!"
  ]
end
```
There is only one version of this model, created almost 7 years ago by Team 2 @ Foundations of Organic Computing WS 2016 Uni Augsburg.
Attached files
File | Type | Description | Last updated
---|---|---|---
Frogger.png | preview | Frogger in Action | almost 7 years ago, by Team 2 @ Foundations of Organic Computing WS 2016 Uni Augsburg
This model does not have any ancestors.
This model does not have any descendants.
Team 2 @ Foundations of Organic Computing WS 2016 Uni Augsburg
LICENSE
[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)
Posted almost 7 years ago