Game Theory
- 21 minutes read - 4437 wordsContext
- In most real-life situations, economic agents do not operate in isolation. Their gains and losses depend not only on their own choices but also on the choices of others.
- Markets are typical examples of economic situations where social interactions matter.
- How can we study social interactions in economics?
- How do economic agents compete and coordinate with each other?
- What are the social dilemmas that arise in such situations?
Course Structure Overview
Lecture Structure and Learning Objectives
Structure
- Street Fighter Mechanics (Case Study)
- Basic Concepts
- Examples
- A Spatial Competition Application
- A Market Entry Application
- Current Field Developments
Learning Objectives
- Explain why social interactions can lead to social dilemmas.
- Explain how game theory models social interactions.
- Describe the concept of equilibrium in models with interactions.
- Illustrate the concept of equilibrium in static and dynamic market settings.
Street Fighter Mechanics
- Street Fighter II: The World Warrior is a fighting game released in 1991.
- It was originally released in arcade.
- It reestablished the arcade competition from high score chasing to one-vs-one play.
- It inspired competitive video game tournaments in the early 2000s.
- Today the e-sports market is valued at more than a billion dollars.
- Why was Street Fighter II so successful?
Command Grab
- Grapplers are (typically large) slow characters that have powerful grappling moves.
- The execution of these moves requires the avatars to be close.
- Command grabs are special grappling moves that cause a lot of damage.
- But they are very slow.
Neutral Jump
- Grabs do not work if the defender jumps vertically (neutral jump).
- Moreover, because command grabs are so slow, the defender can punish the grappler on his way down.
Normal Grab and Anti-Air
- Instead, the grappler can do a normal grab which recovers faster.
- In addition, with the fast recovery, the grappler can punish the defender in the air using a follow up, anti-air move.
Throw Technical
- An alternative option for the defender is to counter the grab with a technical counter.
- This tactic avoids the normal grab and anti-air punishment.
- However, it is vulnerable to the grappler’s command grab.
- How do players resolve this situation?
Social Dilemmas
- For some social interactions, individual interests do not always work in favor of society as a whole.
- Individual producers' interests suggest using cheap, brown instead of more expensive, green technologies.
- However, if all producers act in this way, pollution increases and the lives of everyone become worse off.
- A social dilemma is a situation in which actions taken independently by agents pursuing their individual objectives result in inferior outcomes to other outcomes that are feasible if agents coordinate.
Social Interactions
- Game theory is the main apparatus used for examining social interactions.
- A game is a description of a social interaction specifying
- the players (who is participating?),
- the feasible actions (when is someone playing? What can she do?)
- the information (what is known by players when making their decisions?)
- the payoffs (what is the outcome for each possible combination of actions?)
Common knowledge
- In the games that we will examine, the utilities and the choices of players are common knowledge.
- Common knowledge is information that is known and understood in the same way by all the players of a game.
- There is an element of infinite recursion in the idea of common knowledge.
- The agents of a game have common knowledge of a property \(P\) when they all know \(P\), they all know that they know \(P\), they all know that they all know that they know \(P\), etc.
Actions and Strategies
- Each player in a game has one or more decisions to make.
- A single choice made at a particular decision node is called an action.
- The collection of all actions of a player in a game is called a pure strategy (or simply strategy when it is understood from context that it is pure).
- In games where a player has a single decision to make, her actions and strategies coincide.
Representations of Games
- Games can be represented in various ways.
- The representations are not always interchangeable.
- Some games admit only certain representations. Others can be represented in multiple ways.
- Each representation has certain advantages.
Normal Form
- Simple games can be represented using a table documenting the primitives of the game.
- This representation is called the normal form of a game.
Extensive Form
-
The extensive form of a game is a representation in terms of a tree.
-
The extensive form depicts more information than the normal form:
- It illustrates the order in which the players act.
- It illustrates the information available to each player.
Information Set
- If a player has the same information at two (or more) nodes, the nodes are connected with a dotted line.
- An information set is a collection of decision nodes that the player making decisions cannot distinguish at the time of decision.
- The player knows that she is located at one of the nodes of the information set, but she does not know at which one of them.
Best Responses
- Given a player’s strategy, what is the best strategy with which the other player can respond?
- A best response strategy is a strategy that maximizes a player’s payoff for given strategies of the remaining players of the game.
- The best response mapping is an association that gives the strategies that maximize a player’s payoff for each combination of strategies of the remaining players of the game.
- \(B_{A}(Left) = \{ Bottom\}\), \(B_{A}(Right) = \{ Bottom\}\)
- \(B_{B}(Top) = \{ Left\}\), \(B_{B}(Bottom) = \{ Left\}\)
Dominant Strategies
- On some occasions, a player can choose a strategy that makes her better off irrespective of the strategies chosen by other players.
- A strategy that, for all strategies other players can choose, gives a higher payoff to a player compared to every other strategy available to her is called a dominant strategy.
- \(Left \succ_{B} Right\)
Nash Equilibria
- A collection of strategies, one for each player, such that each strategy constitutes a best response to the remaining players' strategies is called a Nash equilibrium.
- In short but less accurate, a Nash equilibrium is a collection of mutual best responses.
- Intuitively, a Nash equilibrium is a collection of strategies from which no one has an incentive to deviate.
- \(\mathrm{NE} = \left\{ \left\{ Bottom, Left \right\} \right\}\)
Do Nash equilibria predict the outcomes of games?
- Nash equilibria do not say how, why, or whether these strategies are reached in a game.
- The definition of Nash equilibria suggests that if they are reached, then there is no incentive for anyone to change her behavior.
Are Nash equilibria unique?
- Nash equilibria are not unique.
- Multiple situations in a game may constitute points from which no one wants to deviate.
- The payoffs of the players in different Nash equilibria can be significantly different.
- \(\mathrm{NE} = \left\{ \left\{ Bottom, Left \right\}, \left\{ Top, Right \right\} \right\}\)
Are Nash equilibria necessarily Pareto efficient?
- No. Nash equilibria can be Pareto inefficient.
- A classic example is the prisoner’s dilemma.
Spatial Competition
- There are two firms on a street.
- Points on the street are given by \([0, 1]\).
- Each firm chooses a point.
- Firms have the same cost and charge the same price.
- Customers on the street prefer the firm that is the closest.
An illustration of the game
Non equilibrium placements
- If firm \(2\) chooses \(x_{2} > \frac{1}{2}\), firm \(1\) would like to undercut by a small amount and set \(x_{1} = x_{2} - \varepsilon > \frac{1}{2}\).
- Then firm \(2\) has a profitable deviation by changing to \(x_{2} = \frac{1}{2}\).
- Thus any \(x_{2} > \frac{1}{2}\) cannot be an equilibrium.
- Similarly, any \(x_{2} < \frac{1}{2}\) cannot be an equilibrium.
- Analogous arguments hold for firm \(1\) because of symmetry.
Equilibrium placements
- Therefore, the only possible equilibrium is \(x_{1} = \frac{1}{2} = x_{2}\).
- Firms split the market and make equal profits.
- Any deviation leads to less profit for the firm that moved.
Sequential Games
- A game is called sequential if its players play sequentially instead of simultaneously.
- Nash equilibria also exist in such games.
- We can find some of them (the subgame-perfect ones) using backward induction.
- The best action of the player that acts at the last date is calculated.
- Given this best response, the best action of the player that acts at the previous to last date is calculated.
- We continue in this fashion until we have calculated the best action of the player who acts at the initial date.
Backward Induction
- \(\mathrm{NE} = \left\{ \left\{Bottom, \left(Left', Right \right)\right\}, \left\{Bottom, \left(Right', Right \right)\right\}\right\}\)
A Game of Market Entry
- Consider a market with one firm already operating and a potential entrant.
- The entrant decides whether to enter the market.
- The incumbent decides whether to follow aggressive or complying competition strategies.
- \(\mathrm{NE} = \left\{ \left\{Stay\ out, Fight \right\} \right\}\)
Current Field Developments
- Game theory has a deep theoretical basis and a wide variety of applications.
- In the last 60 years, it grew to be one of the most active research areas in many sciences.
- Current game theory models in economics involve behavioral biases and cognitive limitations.
- Game theory is also used in political sciences to analyze topics ranging from voters' behavior to war conflicts.
- Computer science uses game theory concepts and tools to develop artificial intelligence agents.
- In biology, game theory has been used to describe some social aspects of evolutionary processes.
Concise Summary
- Social dilemmas are situations in which private actions can lead to inferior social outcomes.
- Social interactions can be analyzed using game theory.
- Social interaction settings can be either static or dynamic.
- Equilibria in social interaction settings are situations in which no one has an incentive to change her behavior.
- Such equilibria are not necessarily economically efficient.
Further Reading
- Varian (2010, secs. 29.1-6)
- CORE Team (2017, secs. 4.1-5, 4.13)
- Nash (1950)
Mathematical Details
Mixed strategies
Pure strategies are collections of actions, one for each decision to be made. Sometimes the players prefer to choose strategies based on some randomization rule. Players can randomize by assigning the probabilities (weights) with which they use their pure strategies. A distribution over the player’s pure strategies is called a mixed strategy.
For example, suppose that player \(A\) has two pure strategies (\(Top\) and \(Bottom\)). The mixed strategy \((p, 1-p)\) assigns probability \(p\) to choosing \(Top\) and probability \(1-p\) to choosing \(Bottom\).
Finite games always have at least one Nash equilibrium in mixed strategies (Nash 1950). A game is finite if the numbers of players, actions, and decision nodes are finite.
The Grappler Game
Going back to the Street fighter mechanics, suppose that the payoffs in the interaction between a grappler and a defender are given by the following table.
How can we calculate the mixed strategy Nash equilibrium in this case? The grappler has two pure strategies (\(Command\ Grab\) and \(Normal\ Grab\)). The mixed strategy \((p, 1-p)\) assigns probability \(p\) to choosing \(Command\ Grab\) and probability \(1-p\) to choosing \(Normal\ Grab\). The grappler chooses these probabilities so that it makes the defender indifferent between \(Neutral\ Jump\) and \(Tech\) (why?). \[\underbrace{2p + (-2)(1-p)}_{\substack{\text{Expected payoff when}\\ \text{the defender chooses }\\ Neutral\ Jump}} = \underbrace{(-3)p + 0(1-p)}_{\substack{\text{Expected payoff when}\\ \text{the defender chooses }\\ Tech}} \implies p = \frac{2}{7}.\] Similarly the defender mixes \(Neutral\ Jump\) with \(q\) and \(Tech\) with \(1-q\), such that \[\underbrace{(-2)q + 3(1-q)}_{\substack{\text{Expected payoff when}\\ \text{the grappler chooses }\\ Command\ Grab}} = \underbrace{2q + 0(1-q)}_{\substack{\text{Expected payoff when}\\ \text{the grappler chooses }\\ Normal\ Grab}} \implies q = \frac{3}{7}.\] The Nash equilibrium of the grappler game is \[\left\{\left(\frac{2}{7}, \frac{5}{7}\right), \left(\frac{3}{7}, \frac{4}{7}\right)\right\}.\]
Rock Paper Scissors
There is no pure strategy Nash equilibrium. However, there is a symmetric mixed strategy equilibrium in which both players randomize using \(\left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\right)\).
Repeated games
Repeated games are games played at a finite or infinite number of dates, and at each date, the same strategic interaction is repeated. The strategic interaction repeated at each date is called a stage game. When there is a future, threats can be used to sustain cooperation.
For example, suppose that the prisoner’s dilemma is used as the stage game of a repeated game with an infinite time horizon. The payoff of all future dates is discounted by \(0 <\delta < 1\). Is there any room for coordination in this case? Players can coordinate by using trigger strategies: If at all previous dates the other player has denied, then deny. Otherwise, confess. If players coordinate, then their payoffs are \[u^{c,i} = -\frac{1}{1 - \delta}\] If player \(i\) deviates at the current date, then her payoff is \[u^{d,i} = -3\frac{\delta}{1 - \delta}\] Coordination can be supported if \[u^{c,i} \ge u^{d,i} \iff \delta \ge \frac{1}{3}\]
Exercises
Group A
-
Consider the following game.
- Find all the pure strategy Nash equilibria.
- How do you interpret the objective of the players in this game?
- There are two Nash equilibria in this game, namely \(\{Foo, Foo\}\) and \(\{Bar, Bar\}\).
- The objective of the players is to coordinate. The players would like to choose the same action. Games of this type are called coordination games.
-
Suppose that two players play a game, and it is known with certainty that player \(A\) is not choosing a Nash equilibrium strategy of a game. Should player \(B\) choose her Nash equilibrium strategy?
Not necessarily. In general, player \(B\) maximizes her payoff by choosing the best response strategy to player \(A\)’s strategy. If the Nash equilibrium strategy of player B is a dominant strategy, then she should choose it. Dominant strategies are best responses irrespective of the choice of other players. Therefore, if her Nash equilibrium strategy is a dominant strategy, she will still maximize her payoff by choosing it. For example, if player \(A\) chooses \(Deny\) in a prisoner’s dilemma game, since the Nash equilibrium strategy of player \(B\), namely \(Confess\), is a dominant strategy, she maximizes her payoff by choosing it.
In other games, in which her Nash equilibrium strategy is not dominant, the best response to the choice of player \(A\) can differ from the Nash equilibrium strategy and, therefore, player \(B\) might switch to a non Nash equilibrium strategy. For example, consider the game obtained by modifying the actions' labels and payoffs of the prisoner’s dilemma in the following way.
In this game, if player \(A\) chooses \(Bottom\), the best response of player \(B\) is \(Right\). As is to be expected, \(Left\) is not a dominant strategy.
-
Find the equilibrium of the following game using backward induction.
The Nash equilibrium obtained by backward induction is \[\left\{\left(Bottom, Up', Up \right), \left(Right', Right\right)\right\}\] -
Consider the game
- Find all the pure strategy Nash equilibria of the game.
- Suppose that player A plays first. Draw the extensive form of the game.
- Solve the sequential game using backward induction.
- Are all Nash equilibria of the static game present in the sequential game? Why, or why not?
-
There are two Nash equilibria given by \(\{Top, Left\}\) and \(\{Bottom, Right\}\).
-
The extensive form of the sequential game is
-
The Nash equilibrium obtained by backward induction is \(\{Top, \left(Left', Right\right)\}\).
-
Nash equilibria in which player \(A\) plays \(Bottom\) cannot be obtained by backward induction in the sequential game. This happens because player \(A\) prefers choosing \(Top\) at date \(1\) as she receives a higher payoff. Even if player \(B\) attempts to convince player \(A\) to choose \(Bottom\) by threatening that she will play \((Right',Right)\), her statement is not credible. Once player \(A\) chooses \(Top\), player \(B\) receives a greater payoff by choosing \(Left'\). In general, dynamic aspects in a game can affect the resulting the credibility of some Nash equilibria. Therefore, when analyzing strategic interactions for which time is a central component, it is important not to neglect it because doing so might lead to less refined equilibrium outcomes.
Group B
-
Consider the game
- Does the game have any pure strategy Nash equilibria?
- Find all the Nash equilibria of the game.
-
No. There is no set of strategies that constitute best responses to one another.
-
Suppose that player \(A\) randomizes by playing \(Top\) with probability \(p > 0\) and \(Bottom\) with probability \(1-p\). If this mixed strategy is a best response, it must equalize the expected payoff of the strategies of player \(B\) because otherwise, player \(B\) would choose either \(Left\) or \(Right\). The unique best response of player \(A\) to \(Left\) is \(Bottom\), which means that the mixed strategy \((p, 1-p)\) with \(p>0\) cannot be the best response to player \(B\)’s choice of \(Left\). Similarly, we can exclude the case of player \(B\) choosing \(Right\).
Therefore, for the mixed strategy to be a best response, it should hold \[ \underbrace{p \cdot 2 + (1-p) \cdot 0}_{\text{Player B’s expected payoff for $Left$}} = \underbrace{p \cdot 0 + (1-p) \cdot 2}_{\text{Player B’s expected payoff for $Right$}}. \] This implies that \(p= 1/2\).
Suppose that player B randomizes by playing \(Left\) with probability \(q > 0\) and \(Right\) with probability \(1-q\). Using similar arguments as above, the randomization of player B should equalize the payoff of the strategies of player A, namely \[ \underbrace{q \cdot 0 + (1-q) \cdot 2}_{\text{Player A’s expected payoff for $Top$}} = \underbrace{q \cdot 2 + (1-q) \cdot 0}_{\text{Player A’s expected payoff for $Bottom$}}. \] The last condition implies that \(q= 1/2\).
Combining the above, the unique Nash equilibrium of the game is given by the mixed strategies \[ \left\{\left(p = \frac{1}{2}, 1-p = \frac{1}{2}\right), \left(q = \frac{1}{2}, 1-q = \frac{1}{2}\right)\right\}. \]
-
We consider the prisoner’s dilemma as the stage game of a repeated game.
- Suppose that the stage game is repeated twice. Find the unique Nash equilibrium of the game.
- Suppose that the stage game is repeated an arbitrary finite number of times (say \(n=10^{10}\)). Find the unique Nash equilibrium of the game.
- Why is it impossible for the players to coordinate and play the Pareto efficient strategies in the above cases?
-
We can find the Nash equilibrium by backward induction. At date \(2\), the unique Nash equilibrium of the stage game is obtained when both players choose \(Confess\). Since there are no further dates in which the stage game is repeated, the players cannot coordinate in the Pareto efficient outcome by promising cooperation in the future. At date \(1\), the unique Nash equilibrium of the stage game is obtained once more for both players choosing \(Confess\). Since the only Nash equilibrium of the game at date \(2\) is the pair \(\{Confess, Confess\}\), players cannot coordinate at the Pareto efficient outcome at date \(1\) by threatening to punish non-cooperation in the future. Therefore, the unique Nash equilibrium of the game is obtained by \(\{(Confess, Confess), (Confess, Confess)\}\).
-
If the game is repeated a finite number of times, the unique Nash equilibrium is obtained when both players choose \(Confess\) at every date. This result can be obtained by backward induction. At the final date of the game, the unique Nash equilibrium of the stage game is \(\{Confess, Confess\}\), justified as in the case of the two repetitions. Suppose that we have established that the unique Nash equilibrium of the last \(k\) repetitions of the game is \[ \left\{\underbrace{(Confess, …, Confess)}_{k-times}, \underbrace{(Confess, …, Confess)}_{k-times}\right\}. \] With the argument used for date 1 in the case of two repetitions, one concludes that the unique Nash equilibrium of the \((n-k)\) -th date of the game is \(\{Confess, Confess\}\). Continuing in this respect, we conclude that the unique Nash equilibrium of the repeated game is obtained when both players choose \(Confess\) at every date.
-
When a stage game with a unique Nash equilibrium is repeated a finite number of times, the players cannot establish credible promises or threats. However, if the repeated game has an infinite horizon, then coordination strategies that deviate from stage game Nash equilibria are feasible, even if there is only a unique Nash equilibrium in the stage game. An example of this is the infinitely repeated prisoner’s dilemma. Promises and threats can be established when a stage game is repeated only a finite number of times, only if the stage game has multiple Nash equilibria, one of which can be used in coordination and one in non-coordination.
Group C
-
Consider the prisoner’s dilemma game in which if both players do not confess, they get a payoff of \(-1\); if they both confess, they get a payoff of \(-6\); and if one confesses and the other does not, the one that confessed receives a payoff of \(0\) and the other a payoff of \(-10\)
- Classify the strategy profiles into those yielding Pareto efficient and Pareto inefficient allocations.
- Show that for each strategy profile yielding a Pareto efficient allocation, there exist positive weights \(\left(\alpha, 1-\alpha\right)\) such that the weighted average of the payoffs of the players for this profile is greater or equal to the weighted average payoffs of all other strategy profiles.
- Show that for each strategy profile yielding Pareto inefficient allocations, such weights do not exist.
-
The game has three Pareto efficient allocations. One from the strategy profile for which both players do not confess, and both combinations where one player confesses and the other does not. The only Pareto inefficient allocation is the result of the strategy profile in which both players confess since the profile in which both players do not confess constitutes a Pareto improvement.
-
Let \(u_{i}\) denote the payoff of player \(i\in\{A,B\}\). We consider the cases for the strategy profiles \(\{N,N\}\), \(\{N,C\}\), and \(\{C,N\}\), where \(C\) is short for the strategy confess and \(N\) for the strategy not confess.
For the case \(\{N,N\}\), choose \(\alpha=1/2\) and let \[W_{1}(s_{Α}, s_{Β}) = \frac{1}{2}u_{Α}(s_{Α}, s_{Β}) + \frac{1}{2}u_{Β}(s_{Α}, s_{Β}).\] Then, we have \[\underbrace{W_{1}(N,N)}_{=-1} > \underbrace{W_{1}(C,N) = W_{1}(N,C)}_{=-5} > \underbrace{W_{1}(C,C)}_{-6}.\]
For the case \(\{N,C\}\), choose \(\alpha=1/20\) (any \(\alpha \le 1/10\) works) and let \[W_{2}(s_{A}, s_{B}) = \frac{1}{20}u_{A}(s_{A}, s_{B}) + \frac{19}{20}u_{B}(s_{A}, s_{B}).\] Then, we have \[\underbrace{W_{2}(N,C)}_{=-1/2} > \underbrace{W_{2}(N,N)}_{=-1} > \underbrace{W_{2}(C,C)}_{-6} > \underbrace{W_{2}(C,N)}_{-9.5}.\]
The last case \(\{C,N\}\) is symmetric to the case \(\{N,C\}\) and we can choose any \(\alpha\in[9/10, 1)\) to show the claim.
-
Let \(\alpha\) take an arbitrary value in \((0,1)\) and define \[W_{3}(s_{A}, s_{B}) = \alpha u_{A}(s_{A}, s_{B}) + (1 - \alpha)u_{B}(s_{A}, s_{B}).\] We can then calculate \[W_{3}(C, C)=-6 < -1 = W_{3}(N, N).\] Since the choice of \(\alpha\) was arbitrary, the claim follows.
References
References
Topic's Concepts
- sequential
- nash equilibrium
- dominant strategy
- best response mapping
- best response strategy
- information set
- extensive form
- normal form
- strategy
- pure strategy
- action
- common knowledge
- game
- social dilemma