Introduction to Non-Cooperative Game Theory

Gastón Llanes
Francisco Ruiz-Aliseda

August 21, 2017

1 Introduction

Managing a company involves making decisions in situations of strategic interaction, as happens when deciding about teamwork, the provision of managerial incentives, or product market competition. Strategic interaction arises when an agent's payoff depends not only on her own actions but also on the actions of the other agents with whom she interacts. For example, the payoff that Coca-Cola makes when selling coke depends not only on the price it charges, but also on the price charged by Pepsi. Analogously, Pepsi's payoff depends on its own price and on the price charged by Coca-Cola. Each company needs to understand how its incentive to undertake some set of actions interacts with the incentives of the other one, which is in principle a rather complex situation when it comes to determining what to do.

2 Ingredients of a (non-cooperative) game

The most widely used toolkit for analyzing strategic interactions that may have a dynamic component is non-cooperative game theory (NCGT). In contrast to cooperative game theory, NCGT is explicit about the actions that firms take to create and capture value, emphasizing timing and informational aspects that are not considered in cooperative-game-theoretic contexts. Probably the most critical insight of NCGT is that it forces a decision maker to think beyond her own payoff by putting herself in other decision makers' shoes so as to anticipate how they may react to her actions.

A (non-cooperative) game consists of the following elements:

• Set of players: Who is involved in the situation at hand?

• "Rules" of the game (i.e., the sequencing of actions available to players at different points in time and their information structures at such points): Which player moves when? What do they know when they move? What actions can they undertake when they move?

• Set of outcomes: For each feasible set of actions that players can take, what is the outcome?
• Set of payoffs: What is the payoff to each player for any feasible outcome?

For instance, consider a situation in which two workers, 1 and 2, must each decide whether or not to put a certain amount of effort into a given activity. Effort is costly, but it yields some product that would not be produced otherwise. The wage of each worker is tied to production, so the utility of each worker is as given by the following payoff matrix:

                Effort      No effort
  Effort        2, 2        −1, 3
  No effort     3, −1       0, 0

Note that, by convention, the payoff of the row player (worker 1 in this case) is given by the first component of each pair. For example, worker 1 earns 3 when she puts no effort into the allocated task but the other worker does. Also, there are four outcomes in this game, (Effort, Effort), (No effort, Effort), (Effort, No effort) and (No effort, No effort), with an associated payoff to both workers in each of them. Finally, it is worth noting that a plan of action is called a "strategy," so each player has two strategies in this simple game. Actions and strategies do not coincide in games in which players can take a sequence of moves, but they do coincide in the absence of dynamics, as in the static situation currently analyzed.

Some relevant assumptions underlying this illustrative situation are worth highlighting:

• A worker chooses her action without observing the other worker's effort (for example, because actions are chosen simultaneously).

• Workers cannot commit, before engaging in this interaction, to undertaking some action.

• For each feasible set of actions, a worker knows her own payoff as well as the other worker's payoff.

• Workers interact only once.

3 Nash equilibrium

In order to solve a game, we need a notion of a stable outcome from which no player wants to deviate. Such a stable outcome is called an equilibrium. In the equilibrium of a game, each player needs to understand the incentives of the other players so as to anticipate what they will do.
Note that what these other players may be anticipated to do may in turn depend on what they anticipate the player to do, and so on, so there is a sense of circularity. The standard solution concept for games is Nash equilibrium, in which this circularity is resolved because such anticipations happen to be correct by definition of equilibrium; the other requirement of Nash equilibrium is that each decision maker does what is best for her given what she anticipates others to do. These two conditions, namely that each player best-responds to what she believes the others play and that such beliefs be correct, will be useful in numerous applications of game theory.

3.1 Alternative definition of Nash equilibrium

An alternative view of the Nash equilibrium requirements is that strategies form an equilibrium when no player has an incentive to deviate from her strategy keeping the other players' strategies fixed. The absence of unilateral incentives to deviate from a strategy by any player is therefore what characterizes a Nash equilibrium.

For example, the strategy profile (Effort, Effort) cannot constitute an equilibrium because either worker could earn 3 instead of 2 by exerting no effort, keeping the other worker's effort level fixed. The strategy profile (No effort, Effort) cannot constitute an equilibrium either, because worker 2 could earn 0 instead of −1 by exerting no effort, given that worker 1 is exerting no effort. For a similar reason, the strategy profile (Effort, No effort) is not an equilibrium, because worker 1 could profitably deviate in a unilateral manner. To conclude the search for an equilibrium, note that the strategy profile (No effort, No effort) is indeed an equilibrium, because no worker has a unilateral incentive to deviate from her strategy. In such an equilibrium, each worker earns 0.

In order to introduce a more systematic way to find the Nash equilibrium of a static game, consider the following fictitious game played by Apple and Microsoft.
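Before turning to that game, note that the unilateral-deviation checks performed above can also be automated. The following is a minimal Python sketch (illustrative code, not part of the original notes) that enumerates all four strategy profiles of the workers' game and keeps those from which no player can profitably deviate:

```python
# Brute-force search for pure-strategy Nash equilibria of the effort game.
# Each entry maps a strategy profile to (worker 1's payoff, worker 2's payoff).
payoffs = {
    ("Effort", "Effort"): (2, 2),
    ("Effort", "No effort"): (-1, 3),
    ("No effort", "Effort"): (3, -1),
    ("No effort", "No effort"): (0, 0),
}
strategies = ["Effort", "No effort"]

def is_nash(s1, s2):
    """A profile is a Nash equilibrium if no player gains by deviating alone."""
    u1, u2 = payoffs[(s1, s2)]
    no_dev_1 = all(payoffs[(d, s2)][0] <= u1 for d in strategies)
    no_dev_2 = all(payoffs[(s1, d)][1] <= u2 for d in strategies)
    return no_dev_1 and no_dev_2

equilibria = [(s1, s2) for s1 in strategies for s2 in strategies if is_nash(s1, s2)]
print(equilibria)  # [('No effort', 'No effort')]
```

The only profile that survives is (No effort, No effort), matching the analysis above.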
Suppose that Apple benefits from improving its iPhone device, regardless of what Microsoft does. However, Microsoft benefits from developing a new app if and only if Apple introduces the improvements. Improving the iPhone is more profitable for Apple if Microsoft develops the new app, though. The payoff matrix with Apple as the row player and Microsoft as the column player is as follows:

                     Development    No development
  Improvement        3, 2           2, 0
  No improvement     1, −1          1, 0

The unique Nash equilibrium is (Improvement, Development), with respective payoffs for Apple and Microsoft of 3 and 2. If Apple enhances the iPhone and Microsoft develops a new app, neither of them has a unilateral incentive to deviate.

Like any other tool, game theory has its limitations, but this simple example illustrates its power well. What Microsoft would like to do depends on what it believes Apple will do: Microsoft would not develop the app if it believed that Apple would not improve the iPhone. Analyzing Apple's incentives reveals that Apple wishes to enhance the iPhone regardless of what Microsoft does, so Microsoft has no reason to believe that Apple will not improve the iPhone. Based on the belief that Apple will improve the iPhone, Microsoft will be led to develop the new app. In fact, even though Apple cannot observe Microsoft's action, it can perfectly foresee it, just as Microsoft can foresee Apple's.

3.2 Reaction functions

This way of looking at the problem suggests a useful approach to finding an equilibrium, based on so-called best-response analysis. Following up on the original definition of Nash equilibrium, a best-response function for a player indicates the optimal strategy that such a player would follow given her beliefs about the strategies followed by the other players. Best-response functions are also called reaction functions.
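To see how reaction functions work in this discrete setting, the following Python sketch (hypothetical code, not part of the original notes) tabulates each firm's best response to every belief about its rival in the Apple/Microsoft game and then intersects the two tables:

```python
# Best-response tables for the Apple/Microsoft game.
# Each entry maps a strategy profile to (Apple's payoff, Microsoft's payoff).
payoffs = {
    ("Improvement", "Development"): (3, 2),
    ("Improvement", "No development"): (2, 0),
    ("No improvement", "Development"): (1, -1),
    ("No improvement", "No development"): (1, 0),
}
A = ["Improvement", "No improvement"]    # Apple's strategies (rows)
M = ["Development", "No development"]    # Microsoft's strategies (columns)

# Apple's best response to each possible action of Microsoft ("circles"),
# and Microsoft's best response to each possible action of Apple ("crosses").
best_A = {m: max(A, key=lambda a: payoffs[(a, m)][0]) for m in M}
best_M = {a: max(M, key=lambda m: payoffs[(a, m)][1]) for a in A}

# A Nash equilibrium is a profile where both best-response tables agree.
nash = [(a, m) for a in A for m in M if best_A[m] == a and best_M[a] == m]
print(nash)  # [('Improvement', 'Development')]
```

Note that best_A maps both of Microsoft's actions to Improvement: improving is optimal for Apple regardless of its belief, which is exactly why Microsoft can safely anticipate Apple's move.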
In a Nash equilibrium, each player must be best-responding to all the other players, so one simply needs to look for the intersection of the reaction functions to find a Nash equilibrium. At the intersection of the reaction functions, all players are best-responding to what they believe the others to be playing, and such beliefs happen to be correct, thus satisfying the Nash equilibrium requirements. This approach shows that a player typically needs to understand not only her own incentives (captured by her reaction function) but also the incentives of the other players (captured by their respective reaction functions).

In our last example, Apple best-responds to any belief it has about Microsoft's strategy by improving the iPhone, which is indicated in Figure 1 by the circles.

[Figure 1: the payoff matrix with Apple's strategies I (Improvement) and NI (No improvement) as rows and Microsoft's strategies D (Development) and ND (No development) as columns; Apple's best responses are marked with circles and Microsoft's with crosses.]

If Microsoft believes that Apple will improve the iPhone, then it will best-respond by developing the app; however, if Microsoft believes that Apple will not improve the iPhone, it will best-respond by not developing the app. Microsoft's reaction function is represented by means of crosses in Figure 1. Inspecting where the reaction functions overlap, one can easily see that they do so at the only Nash equilibrium of this game: (Improvement, Development).

3.3 Nash equilibrium in games with a continuum of actions

This way of proceeding is very useful when players have richer action sets, as the following example illustrates. Suppose that Francisco and Gastón (denoted by i = 1 and i = 2, respectively) are collaborating on a research project that needs to be finished within one week. Each of them has to decide how many weekly hours to devote to the project, which we denote by ei. Each team member can only observe his own action, not the action chosen by the other one. Research quality q depends on the total number of hours put into the project and on an interaction term that captures synergies between the two team members:

q(e1, e2) = e1 + e2 + e1 e2.
The cost of choosing an effort level ei ∈ [0, 168] equals ei^2, so the total utility of team member i ∈ {1, 2} is

ui(e1, e2) = e1 + e2 + e1 e2 − ei^2.

Suppose that the Nash equilibrium is denoted by (e1*, e2*). Then e1* must maximize

u1(e1, e2*) = e1 + e2* + e1 e2* − e1^2

with respect to e1. Since this function is strictly concave (∂²u1/∂e1² = −2 < 0), the following first-order condition must hold:

1 + e2* − 2 e1* = 0.   (1)

Analogously, player 2 must be best-responding to e1* by choosing e2*, which leads to the following condition:

1 + e1* − 2 e2* = 0.   (2)

Solving equations (1) and (2) yields e1* = e2* = 1, so the unique Nash equilibrium is (e1*, e2*) = (1, 1).

3.4 Reaction functions with a continuum of actions

Paralleling what we did in the previous example, we can use reaction functions to find the Nash equilibrium of the current game by means of graphical analysis. If player 2 is believed to choose ê2, then player 1's reaction is given by the choice of e1 that maximizes u1(e1, ê2) = e1 + ê2 + e1 ê2 − e1^2. Since this function is strictly concave (∂²u1/∂e1² = −2 < 0), the first-order condition gives player 1's reaction function:

1 + ê2 − 2 e1 = 0,

so that

e1^R(ê2) = (1 + ê2)/2,

where e1^R(ê2) denotes player 1's reaction function. For example, if player 2 is believed to exert no effort, then player 1 best-responds by choosing e1^R(0) = 1/2; if player 2 is believed to choose ê2 = 1, then player 1 best-responds by choosing e1^R(1) = 1. A similar analysis yields how player 2 best-responds to a belief that player 1 chooses ê1:

e2^R(ê1) = (1 + ê1)/2.

Plotting these functions as in Figure 2, it is easy to see that they intersect at (e1*, e2*) = (1, 1), since this is the only point at which each player best-responds to the other one.
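The fixed point of the two reaction functions can also be found numerically. The sketch below (illustrative Python, with the reaction function taken from the derivation above) iterates best responses starting from zero effort; the sequence converges to the equilibrium (1, 1):

```python
# Iterate the reaction functions e1 = (1 + e2)/2 and e2 = (1 + e1)/2.
# Their fixed point is the Nash equilibrium of the effort-choice game.
def reaction(e_other):
    return (1 + e_other) / 2

e1, e2 = 0.0, 0.0           # start from zero effort by both players
for _ in range(50):
    e1, e2 = reaction(e2), reaction(e1)

print(round(e1, 6), round(e2, 6))  # 1.0 1.0
```

The gap to the fixed point halves at every step, so fifty iterations are far more than enough for convergence to machine precision.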
[Figure 2: player 1's reaction function e1^R(ê2) and player 2's reaction function e2^R(ê1), which intersect at e1* = e2* = 1.]

4 Dynamic games

Thus far, we have considered static (one-shot) games in which players move only once and at the same time. Many relevant situations involve a sequence of moves, as the following example illustrates. Two firms A and B must each choose whether to enter a certain market with a large or a small scale of production. A chooses first, and B chooses after having observed A's choice. If both have a large scale, then each gains 0. If both have a small scale, then each gains 3. If the firms have different scales, then the small firm gains 2 and the large firm gains 4.

4.1 Extensive-form representation and backward induction

This two-stage game can be represented through the so-called extensive-form representation of the game, as in Figure 3:

[Figure 3: a game tree in which A first chooses Large or Small; after each choice, B chooses Large or Small. Payoffs (A, B) are (0, 0) after (Large, Large), (4, 2) after (Large, Small), (2, 4) after (Small, Large), and (3, 3) after (Small, Small).]

Such a representation is also called a game tree. Game trees are solved backwards: one starts from the bottom and works one's way up to the top. This backward-induction logic captures the idea that a rational player must foresee how his or her current actions will impact future play. The solution concept underlying this logic is called perfect Nash equilibrium.
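This backward-induction procedure can be sketched in a few lines of illustrative Python (hypothetical code, not part of the original notes; it anticipates the step-by-step reasoning traced in Figures 4 and 5):

```python
# Backward induction on the entry-scale game.
# Each entry maps (A's scale, B's scale) to (A's gain, B's gain).
payoffs = {
    ("Large", "Large"): (0, 0),
    ("Large", "Small"): (4, 2),
    ("Small", "Large"): (2, 4),
    ("Small", "Small"): (3, 3),
}
scales = ["Large", "Small"]

# Second stage: B's best response to each observed choice of A.
b_reply = {a: max(scales, key=lambda b: payoffs[(a, b)][1]) for a in scales}

# First stage: A chooses the scale that is best given B's anticipated reply.
a_choice = max(scales, key=lambda a: payoffs[(a, b_reply[a])][0])

print(a_choice, b_reply[a_choice], payoffs[(a_choice, b_reply[a_choice])])
# Large Small (4, 2)
```

B's anticipated replies (Small after Large, Large after Small) and A's resulting choice of a large scale reproduce the backward-induction outcome discussed next.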
Starting from the bottom (the game's second stage in the current case), one must identify the nodes where B makes a decision and find the optimal behavior at each node, as illustrated in Figure 4:

[Figure 4: the game tree of Figure 3 with B's optimal choices highlighted: B chooses Small after A chooses Large (earning 2 rather than 0), and Large after A chooses Small (earning 4 rather than 3).]

Next, one goes up the tree, taking into account how firm A anticipates that B will react to each of its choices, and finds the optimal behavior of A in the first stage, as Figure 5 illustrates:

[Figure 5: the game tree with A's optimal first-stage choice highlighted: anticipating B's reactions, A chooses Large and obtains 4 rather than the 2 it would obtain by choosing Small.]

If we changed the example and let firms choose simultaneously rather than sequentially, then B would not observe A's choice, which is represented through a "broken oval" as in Figure 6:

[Figure 6: the same game tree with B's two decision nodes joined by a broken oval, indicating that B does not observe A's choice when choosing its own scale.]

When firms move sequentially, the outcome of the unique perfect Nash equilibrium is that firm A chooses a large scale of entry and B chooses a small scale. Firms A and B earn 4 and 2, respectively. If both firms chose their scale of entry at the same time, then there would exist another Nash equilibrium in which firm B would choose a large scale and A would choose a small scale. Having firm A choose first allows it to eliminate the equilibrium that is worse for it. Despite what this example might suggest, it is worth noting that moving first is not always better than moving second.

4.2 Subgame-perfect Nash equilibrium

The example we have just analyzed is simple but instructive. We now introduce a richer one that requires a bit more sophistication to solve. To this end, consider a situation in which firm A must choose whether or not to enter a market in which B is already active. If A enters, then B observes it, and each firm simultaneously chooses whether to charge a high or a low price.
Figure 7 displays the extensive-form representation of this game for some given payoffs:

[Figure 7: A chooses No entry, yielding (0, 2), or Entry; after Entry, A and B simultaneously choose plow or phigh (A's two pricing nodes are joined by a broken oval). Payoffs (A, B) are (−15, −5) after (plow, plow), (5, −10) after (plow, phigh), (−10, −5) after (phigh, plow), and (15, 5) after (phigh, phigh).]

As is standard, the game tree is solved backwards, starting from below and identifying the game's so-called subgames (i.e., the circled nodes not contained in a broken oval, together with everything under them, ovals included). In the previous example there were three subgames, but there are just two in the current one: the one that starts the second time A moves (see Figure 8) and the entire game.

[Figure 8: the simultaneous pricing subgame that follows A's entry, with A and B each choosing plow or phigh and the payoffs given in Figure 7.]

When players' strategies induce a Nash equilibrium in every subgame of the game, the resulting Nash equilibrium is labeled "subgame-perfect." In the current case, the unique subgame-perfect Nash equilibrium outcome is that A enters and both firms charge a high price afterwards, as illustrated by Figure 9:

[Figure 9: the game tree of Figure 7 with the equilibrium choices highlighted: A enters, and both firms then charge phigh, yielding payoffs (15, 5).]
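The two-step logic just described can be verified numerically. The following Python sketch (illustrative code; the payoffs are read off Figure 7 as pairs giving A's and B's payoffs) first finds the Nash equilibrium of the simultaneous pricing subgame, and then checks A's entry decision against its outside payoff of 0:

```python
# Step 1: solve the simultaneous pricing subgame that follows entry.
subgame = {
    ("plow", "plow"): (-15, -5),
    ("plow", "phigh"): (5, -10),
    ("phigh", "plow"): (-10, -5),
    ("phigh", "phigh"): (15, 5),
}
prices = ["plow", "phigh"]

def is_nash(pa, pb):
    """No firm can gain by unilaterally switching its price."""
    ua, ub = subgame[(pa, pb)]
    return (all(subgame[(d, pb)][0] <= ua for d in prices)
            and all(subgame[(pa, d)][1] <= ub for d in prices))

eq = [(pa, pb) for pa in prices for pb in prices if is_nash(pa, pb)]

# Step 2: A enters only if the subgame's equilibrium payoff beats staying out.
ua_entry = subgame[eq[0]][0]                         # A's payoff upon entry
decision = "Entry" if ua_entry > 0 else "No entry"   # 0 is A's outside payoff
print(eq, decision)  # [('phigh', 'phigh')] Entry
```

The unique equilibrium of the subgame is (phigh, phigh) with payoffs (15, 5), and since 15 > 0, A enters, reproducing the subgame-perfect outcome of Figure 9.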