Introduction to Non-Cooperative Game Theory

Gastón Llanes
Francisco Ruiz-Aliseda

August 21, 2017

1 Introduction

Managing a company involves making decisions in situations of strategic interaction, as happens when deciding about teamwork, the provision of managerial incentives, or product market competition. Strategic interaction arises when an agent's payoff depends not only on her own actions but also on the actions of the other agents with whom she interacts. For example, the payoff that Coca-Cola makes when selling coke depends not only on the price it charges, but also on the price charged by Pepsi. Analogously, Pepsi's payoff depends on its own price and on the price charged by Coca-Cola. Each company needs to understand how its incentive to undertake some set of actions interacts with the incentives of the other one, which is in principle a rather complex situation when it comes to determining what to do.

2 Ingredients of a (non-cooperative) game

The most widely used toolkit for analyzing strategic interactions that may have a dynamic component is non-cooperative game theory (NCGT). In contrast to cooperative game theory, NCGT is explicit about the actions that firms take to create and capture value, emphasizing timing and informational aspects that are not considered in cooperative-game-theoretic contexts. Probably the most critical insight of NCGT is that it forces a decision maker to think beyond her own payoff by putting herself in other decision makers' shoes so as to anticipate how they may react to her actions.

A (non-cooperative) game consists of the following elements:

• Set of players: Who is involved in the situation at hand?

• "Rules" of the game (i.e., the sequencing of actions available to players at different points in time and their information structures at such points): Which player moves when? What do they know when they move? What actions can they undertake when they move?

• Set of outcomes: For each feasible set of actions that players can take, what is the outcome?
• Set of payoffs: What is the payoff to each player for any feasible outcome?

For instance, consider a situation in which two workers, 1 and 2, must each decide whether or not to put a certain amount of effort into a given activity. Effort is costly, but it yields some product that would not be produced otherwise. The wage of each worker is tied to production, so the utility of each worker is as given by the following payoff matrix:

                Effort      No effort
  Effort        2, 2        −1, 3
  No effort     3, −1       0, 0

Note that, by convention, the payoff of the row player (worker 1 in this case) is given by the first component of each pair. For example, worker 1 earns 3 when she puts no effort into the allocated task but the other worker does. Also, there are four outcomes in this game, (Effort, Effort), (No effort, Effort), (Effort, No effort) and (No effort, No effort), with an associated payoff to both workers in each of them. Finally, it is worth noting that a plan of action is called a "strategy," so each player has two strategies in this simple game. Actions and strategies do not coincide in games in which players can take a sequence of moves, but they do coincide in the absence of dynamics, as in the static situation currently analyzed.

Some relevant assumptions underlying this illustrative situation are worth highlighting:

• A worker chooses her action without observing the other worker's effort (for example, because actions are chosen simultaneously).

• Workers cannot commit, before engaging in this interaction, to undertaking some action.

• For each feasible set of actions, a worker knows her own payoff as well as the other worker's payoff.

• Workers interact only once.

3 Nash equilibrium

In order to solve a game, we need a notion of a stable outcome from which no player wants to deviate. Such a stable outcome is called an equilibrium. In the equilibrium of a game, each player needs to understand the incentives of the other players so as to anticipate what they will do.
Note that what these other players may be anticipated to do may in turn depend on what they anticipate the player to do, and so on, so there is a sense of circularity. The standard solution concept for games is Nash equilibrium, in which this circularity is resolved because such anticipations happen to be correct by definition of equilibrium; the other requirement of Nash equilibrium is that each decision maker does what is best for her given what she anticipates others to do. These two conditions, namely that each player best-responds to what she believes the others play and that such beliefs be correct, will be useful in numerous applications of game theory.

3.1 Alternative definition of Nash equilibrium

An alternative view of the Nash equilibrium requirements is that strategies form an equilibrium when no player has an incentive to deviate from her strategy keeping the other players' strategies fixed. The absence of unilateral incentives to deviate from a strategy by any player is therefore what characterizes a Nash equilibrium.

For example, the strategy profile (Effort, Effort) cannot constitute an equilibrium because either worker could earn 3 instead of 2 by exerting no effort, keeping the other worker's effort level fixed. The strategy profile (No effort, Effort) cannot constitute an equilibrium either, because worker 2 could earn 0 instead of −1 by exerting no effort, given that worker 1 is exerting no effort. For a similar reason, the strategy profile (Effort, No effort) is not an equilibrium, because worker 1 could profitably deviate in a unilateral manner. To conclude the search for an equilibrium, note that the strategy profile (No effort, No effort) is indeed an equilibrium, because no worker has a unilateral incentive to deviate from her strategy. In such an equilibrium, each worker earns 0.

In order to introduce a more systematic way to find the Nash equilibrium of a static game, consider the following fictitious game played by Apple and Microsoft.
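Before turning to that game, note that the unilateral-deviation checks performed above can also be automated. The following is a minimal Python sketch (illustrative code, not part of the original notes) that enumerates all four strategy profiles of the workers' game and keeps those from which no player can profitably deviate:

```python
# Brute-force search for pure-strategy Nash equilibria of the effort game.
# Each entry maps a strategy profile to (worker 1's payoff, worker 2's payoff).
payoffs = {
    ("Effort", "Effort"): (2, 2),
    ("Effort", "No effort"): (-1, 3),
    ("No effort", "Effort"): (3, -1),
    ("No effort", "No effort"): (0, 0),
}
strategies = ["Effort", "No effort"]

def is_nash(s1, s2):
    """A profile is a Nash equilibrium if no player gains by deviating alone."""
    u1, u2 = payoffs[(s1, s2)]
    no_dev_1 = all(payoffs[(d, s2)][0] <= u1 for d in strategies)
    no_dev_2 = all(payoffs[(s1, d)][1] <= u2 for d in strategies)
    return no_dev_1 and no_dev_2

equilibria = [(s1, s2) for s1 in strategies for s2 in strategies if is_nash(s1, s2)]
print(equilibria)  # [('No effort', 'No effort')]
```

The only profile that survives is (No effort, No effort), matching the analysis above.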
Suppose that Apple benefits from improving its iPhone device, regardless of what Microsoft does. However, Microsoft benefits from developing a new app if and only if Apple introduces the improvements. Improving the iPhone is more profitable for Apple if Microsoft develops the new app, though. The payoff matrix with Apple as the row player and Microsoft as the column player is as follows:

                     Development    No development
  Improvement        3, 2           2, 0
  No improvement     1, −1          1, 0

The unique Nash equilibrium is (Improvement, Development), with respective payoffs for Apple and Microsoft of 3 and 2. If Apple enhances the iPhone and Microsoft develops a new app, neither of them has a unilateral incentive to deviate.

Like any other tool, game theory has its limitations, but this simple example illustrates its power well. What Microsoft would like to do depends on what it believes Apple will do: Microsoft would not develop the app if it believed that Apple would not improve the iPhone. Analyzing Apple's incentives reveals that Apple wishes to enhance the iPhone regardless of what Microsoft does, so Microsoft has no reason to believe that Apple will not improve the iPhone. Based on the belief that Apple will improve the iPhone, Microsoft will be led to develop the new app. In fact, even though Apple cannot observe Microsoft's action, it can perfectly foresee it, just as Microsoft can foresee Apple's.

3.2 Reaction functions

This way of looking at the problem suggests a useful approach to finding an equilibrium, based on so-called best-response analysis. Following up on the original definition of Nash equilibrium, a best-response function for a player indicates the optimal strategy that such a player would follow given her beliefs about the strategies followed by the other players. Best-response functions are also called reaction functions.
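To see how reaction functions work in this discrete setting, the following Python sketch (hypothetical code, not part of the original notes) tabulates each firm's best response to every belief about its rival in the Apple/Microsoft game and then intersects the two tables:

```python
# Best-response tables for the Apple/Microsoft game.
# Each entry maps a strategy profile to (Apple's payoff, Microsoft's payoff).
payoffs = {
    ("Improvement", "Development"): (3, 2),
    ("Improvement", "No development"): (2, 0),
    ("No improvement", "Development"): (1, -1),
    ("No improvement", "No development"): (1, 0),
}
A = ["Improvement", "No improvement"]    # Apple's strategies (rows)
M = ["Development", "No development"]    # Microsoft's strategies (columns)

# Apple's best response to each possible action of Microsoft ("circles"),
# and Microsoft's best response to each possible action of Apple ("crosses").
best_A = {m: max(A, key=lambda a: payoffs[(a, m)][0]) for m in M}
best_M = {a: max(M, key=lambda m: payoffs[(a, m)][1]) for a in A}

# A Nash equilibrium is a profile where both best-response tables agree.
nash = [(a, m) for a in A for m in M if best_A[m] == a and best_M[a] == m]
print(nash)  # [('Improvement', 'Development')]
```

Note that best_A maps both of Microsoft's actions to Improvement: improving is optimal for Apple regardless of its belief, which is exactly why Microsoft can safely anticipate Apple's move.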
In a Nash equilibrium, each player must be best-responding to all the other players, so one simply needs to look for the intersection of the reaction functions to find a Nash equilibrium. At the intersection of the reaction functions, all players are best-responding to what they believe the others to be playing, and such beliefs happen to be correct, thus satisfying the Nash equilibrium requirements. This approach shows that a player typically needs to understand not only her own incentives (captured by her reaction function) but also the incentives of the other players (captured by their respective reaction functions).

In our last example, Apple best-responds to any belief it has about Microsoft's strategy by improving the iPhone, which is indicated in Figure 1 by the circles.

[Figure 1: the payoff matrix with Apple's strategies I (Improvement) and NI (No improvement) as rows and Microsoft's strategies D (Development) and ND (No development) as columns; Apple's best responses are marked with circles and Microsoft's with crosses.]

If Microsoft believes that Apple will improve the iPhone, then it will best-respond by developing the app; however, if Microsoft believes that Apple will not improve the iPhone, it will best-respond by not developing the app. Microsoft's reaction function is represented by means of crosses in Figure 1. Inspecting where the reaction functions overlap, one can easily see that they do so at the only Nash equilibrium of this game: (Improvement, Development).

3.3 Nash equilibrium in games with a continuum of actions

This way of proceeding is very useful when players have richer action sets, as the following example illustrates. Suppose that Francisco and Gastón (denoted by i = 1 and i = 2, respectively) are collaborating on a research project that needs to be finished within one week. Each of them has to decide how many weekly hours to devote to the project, which we denote by ei. Each team member can only observe his own action, not the action chosen by the other one. Research quality q depends on the total number of hours put into the project and on an interaction term that captures synergies between the two team members:

q(e1, e2) = e1 + e2 + e1 e2.
The cost of choosing an effort level ei ∈ [0, 168] equals ei^2, so the total utility of team member i ∈ {1, 2} is

ui(e1, e2) = e1 + e2 + e1 e2 − ei^2.

Suppose that the Nash equilibrium is denoted by (e1*, e2*). Then e1* must maximize

u1(e1, e2*) = e1 + e2* + e1 e2* − e1^2

with respect to e1. Since this function is strictly concave (∂²u1/∂e1² = −2 < 0), the following first-order condition must hold:

1 + e2* − 2 e1* = 0.   (1)

Analogously, player 2 must be best-responding to e1* by choosing e2*, which leads to the following condition:

1 + e1* − 2 e2* = 0.   (2)

Solving equations (1) and (2) yields e1* = e2* = 1, so the unique Nash equilibrium is (e1*, e2*) = (1, 1).

3.4 Reaction functions with a continuum of actions

Paralleling what we did in the previous example, we can use reaction functions to find the Nash equilibrium of the current game by means of graphical analysis. If player 2 is believed to choose ê2, then player 1's reaction is given by the choice of e1 that maximizes u1(e1, ê2) = e1 + ê2 + e1 ê2 − e1^2. Since this function is strictly concave (∂²u1/∂e1² = −2 < 0), the first-order condition gives player 1's reaction function:

1 + ê2 − 2 e1 = 0,

so that

e1^R(ê2) = (1 + ê2)/2,

where e1^R(ê2) denotes player 1's reaction function. For example, if player 2 is believed to exert no effort, then player 1 best-responds by choosing e1^R(0) = 1/2; if player 2 is believed to choose ê2 = 1, then player 1 best-responds by choosing e1^R(1) = 1. A similar analysis yields how player 2 best-responds to a belief that player 1 chooses ê1:

e2^R(ê1) = (1 + ê1)/2.

Plotting these functions as in Figure 2, it is easy to see that they intersect at (e1*, e2*) = (1, 1), since this is the only point at which each player best-responds to the other one.
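The fixed point of the two reaction functions can also be found numerically. The sketch below (illustrative Python, with the reaction function taken from the derivation above) iterates best responses starting from zero effort; the sequence converges to the equilibrium (1, 1):

```python
# Iterate the reaction functions e1 = (1 + e2)/2 and e2 = (1 + e1)/2.
# Their fixed point is the Nash equilibrium of the effort-choice game.
def reaction(e_other):
    return (1 + e_other) / 2

e1, e2 = 0.0, 0.0           # start from zero effort by both players
for _ in range(50):
    e1, e2 = reaction(e2), reaction(e1)

print(round(e1, 6), round(e2, 6))  # 1.0 1.0
```

The gap to the fixed point halves at every step, so fifty iterations are far more than enough for convergence to machine precision.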
[Figure 2: player 1's reaction function e1^R(ê2) and player 2's reaction function e2^R(ê1), which intersect at e1* = e2* = 1.]

4 Dynamic games

Thus far, we have considered static (one-shot) games in which players move only once and at the same time. Many relevant situations involve a sequence of moves, as the following example illustrates. Two firms A and B must each choose whether to enter a certain market with a large or a small scale of production. A chooses first, and B chooses after having observed A's choice. If both have a large scale, then each gains 0. If both have a small scale, then each gains 3. If the firms have different scales, then the small firm gains 2 and the large firm gains 4.

4.1 Extensive-form representation and backward induction

This two-stage game can be represented through the so-called extensive-form representation of the game, as in Figure 3:

[Figure 3: a game tree in which A first chooses Large or Small; after each choice, B chooses Large or Small. Payoffs (A, B) are (0, 0) after (Large, Large), (4, 2) after (Large, Small), (2, 4) after (Small, Large), and (3, 3) after (Small, Small).]

Such a representation is also called a game tree. Game trees are solved backwards: one starts from the bottom and works one's way up to the top. This backward-induction logic captures the idea that a rational player must foresee how his or her current actions will impact future play. The solution concept underlying this logic is called perfect Nash equilibrium.
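This backward-induction procedure can be sketched in a few lines of illustrative Python (hypothetical code, not part of the original notes; it anticipates the step-by-step reasoning traced in Figures 4 and 5):

```python
# Backward induction on the entry-scale game.
# Each entry maps (A's scale, B's scale) to (A's gain, B's gain).
payoffs = {
    ("Large", "Large"): (0, 0),
    ("Large", "Small"): (4, 2),
    ("Small", "Large"): (2, 4),
    ("Small", "Small"): (3, 3),
}
scales = ["Large", "Small"]

# Second stage: B's best response to each observed choice of A.
b_reply = {a: max(scales, key=lambda b: payoffs[(a, b)][1]) for a in scales}

# First stage: A chooses the scale that is best given B's anticipated reply.
a_choice = max(scales, key=lambda a: payoffs[(a, b_reply[a])][0])

print(a_choice, b_reply[a_choice], payoffs[(a_choice, b_reply[a_choice])])
# Large Small (4, 2)
```

B's anticipated replies (Small after Large, Large after Small) and A's resulting choice of a large scale reproduce the backward-induction outcome discussed next.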
Starting from the bottom (the game's second stage in the current case), one must identify the nodes where B makes a decision and find the optimal behavior at each node, as illustrated in Figure 4:

[Figure 4: the game tree of Figure 3 with B's optimal choices highlighted: B chooses Small after A chooses Large (earning 2 rather than 0), and Large after A chooses Small (earning 4 rather than 3).]

Next, one goes up the tree, taking into account how firm A anticipates that B will react to each of its choices, and finds the optimal behavior of A in the first stage, as Figure 5 illustrates:

[Figure 5: the game tree with A's optimal first-stage choice highlighted: anticipating B's reactions, A chooses Large and obtains 4 rather than the 2 it would obtain by choosing Small.]

If we changed the example and let firms choose simultaneously rather than sequentially, then B would not observe A's choice, which is represented through a "broken oval" as in Figure 6:

[Figure 6: the same game tree with B's two decision nodes joined by a broken oval, indicating that B does not observe A's choice when choosing its own scale.]

When firms move sequentially, the outcome of the unique perfect Nash equilibrium is that firm A chooses a large scale of entry and B chooses a small scale. Firms A and B earn 4 and 2, respectively. If both firms chose their scale of entry at the same time, then there would exist another Nash equilibrium in which firm B would choose a large scale and A would choose a small scale. Having firm A choose first allows it to eliminate the equilibrium that is worse for it. Despite what this example might suggest, it is worth noting that moving first is not always better than moving second.

4.2 Subgame-perfect Nash equilibrium

The example we have just analyzed is simple but instructive. We now introduce a richer one that requires a bit more sophistication to solve. To this end, consider a situation in which firm A must choose whether or not to enter a market in which B is already active. If A enters, then B observes it, and each firm simultaneously chooses whether to charge a high or a low price.
Figure 7 displays the extensive-form representation of this game for some given payoffs:

[Figure 7: A chooses No entry, yielding (0, 2), or Entry; after Entry, A and B simultaneously choose plow or phigh (A's two pricing nodes are joined by a broken oval). Payoffs (A, B) are (−15, −5) after (plow, plow), (5, −10) after (plow, phigh), (−10, −5) after (phigh, plow), and (15, 5) after (phigh, phigh).]

As is standard, the game tree is solved backwards, starting from below and identifying the game's so-called subgames (i.e., the circled nodes not contained in a broken oval, together with everything under them, ovals included). In the previous example there were three subgames, but there are just two in the current one: the one that starts the second time A moves (see Figure 8) and the entire game.

[Figure 8: the simultaneous pricing subgame that follows A's entry, with A and B each choosing plow or phigh and the payoffs given in Figure 7.]

When players' strategies induce a Nash equilibrium in every subgame of the game, the resulting Nash equilibrium is labeled "subgame-perfect." In the current case, the unique subgame-perfect Nash equilibrium outcome is that A enters and both firms charge a high price afterwards, as illustrated by Figure 9:

[Figure 9: the game tree of Figure 7 with the equilibrium choices highlighted: A enters, and both firms then charge phigh, yielding payoffs (15, 5).]
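The two-step logic just described can be verified numerically. The following Python sketch (illustrative code; the payoffs are read off Figure 7 as pairs giving A's and B's payoffs) first finds the Nash equilibrium of the simultaneous pricing subgame, and then checks A's entry decision against its outside payoff of 0:

```python
# Step 1: solve the simultaneous pricing subgame that follows entry.
subgame = {
    ("plow", "plow"): (-15, -5),
    ("plow", "phigh"): (5, -10),
    ("phigh", "plow"): (-10, -5),
    ("phigh", "phigh"): (15, 5),
}
prices = ["plow", "phigh"]

def is_nash(pa, pb):
    """No firm can gain by unilaterally switching its price."""
    ua, ub = subgame[(pa, pb)]
    return (all(subgame[(d, pb)][0] <= ua for d in prices)
            and all(subgame[(pa, d)][1] <= ub for d in prices))

eq = [(pa, pb) for pa in prices for pb in prices if is_nash(pa, pb)]

# Step 2: A enters only if the subgame's equilibrium payoff beats staying out.
ua_entry = subgame[eq[0]][0]                         # A's payoff upon entry
decision = "Entry" if ua_entry > 0 else "No entry"   # 0 is A's outside payoff
print(eq, decision)  # [('phigh', 'phigh')] Entry
```

The unique equilibrium of the subgame is (phigh, phigh) with payoffs (15, 5), and since 15 > 0, A enters, reproducing the subgame-perfect outcome of Figure 9.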