The Private Impact of Public Information:
Landsat Satellite Maps and Gold Exploration
Abhishek Nagaraj*
October 22, 2018
Abstract
The public sector provides many types of information, such as geographic and census maps, that firms
use when making decisions. However, the economic implications of such information infrastructure
remain unexamined. This study estimates the impact of information from Landsat, a NASA satellite
mapping program, on the discovery of new deposits by large and small firms in the gold exploration
industry. Using a simple theoretical framework, I argue that public sector information guides firms on
the viability of risky projects and increases the likelihood of project success. This effect is especially
relevant for smaller firms, who face higher project costs and are particularly deterred from engaging in
risky projects. I test the predictions of this framework by exploiting idiosyncratic timing variation in
Landsat coverage across regions. Landsat maps nearly doubled the rate of significant gold discoveries
after a region was mapped and increased the market share of smaller, junior firms from about 10% to 25%.
Public information infrastructure, including mapping efforts, seems to be an important, yet overlooked,
driver of private-sector productivity and small business performance.
*UC Berkeley-Haas. Email: nagaraj@berkeley.edu.
“Freely available data from the US Government is an important national resource, serving as
fuel for entrepreneurship, innovation, scientific discovery, and other public benefits.”
—US Government Open Data Directive (Martin et al., 2013)
1 Introduction
Public infrastructure is often seen as a key determinant of private-sector investment (Munnell and Cook,
1990), productivity (Aschauer, 1989), and economic growth (Barro, 1990). Research focusing specifically
on physical infrastructure, such as roads, airports, the internet and the postal system, has offered quantitative
estimates of the benefits of this infrastructure on productivity and innovation in the private sector (Agrawal
et al., 2014; Fernald, 1999; Bernstein et al., 2016; Giroud, 2013; Agrawal and Goldfarb, 2008; Forman et al.,
2012; Seamans et al., 2017).
While this literature has provided useful evidence on the effects of public infrastructure investment, less
attention has been paid to the impact of “information infrastructure,” which refers to the public sector’s
historically important role in providing basic data. Prominent examples include maps of a region’s geography
and economy collected by organizations such as the US Census Bureau and the US Geological Survey (David
and Wright, 1997; Anderson, 2015), administrative data such as tax and insurance records (Card et al., 2010),
as well as non-geographic “mapping” initiatives like the Human Genome Project or the Hubble Telescope
(Williams, 2013; Stephan, 2012). While the cost of public information can be quite significant, the diffuse
nature of information means that its value is often more in doubt, and harder to quantify, as compared to
physical infrastructure.
Further, public information could affect private-sector investment decisions by providing data on the viability
of risky projects. This channel might not only increase total investment, but it could also be particularly
valuable for smaller firms. These firms face higher costs of starting a new project and are consequently less
likely to undertake risky projects. Public data might help small firms evaluate uncertain projects, lowering
their risk and permitting entry despite higher project costs. For example, a retail entrepreneur might use
public census data to reduce uncertainty about a promising market, increase the expected value of her venture,
and make it viable, notwithstanding the higher cost of external capital that she might face. Despite this
possibility, we lack systematic evidence on the differential value of public information infrastructure for
smaller and larger firms.
Estimates on the value of public information infrastructure to the private sector, including differential es-
timates for smaller and larger firms, would inform important policy debates in this area. Lacking such
evidence, prominent (albeit costly) information infrastructure projects, including the census, weather data,
and satellite mapping, are mired in debate about the extent to which they should be funded (Borowitz, 2017;
Gabrynowicz, 2005). For example, as recently as May 2018, the US government cited cost considerations
and uncertain value as a reason to possibly curtail funding to the Landsat satellite imagery program, the case
I study to make progress on this topic (Popkin, 2018).
The NASA Landsat satellite mapping program provided the first maps of Earth from space and happened to
contain useful geological information relevant to firms in the $5 billion gold exploration industry (Rowan
et al., 1977; Schodde, 2011).1 I focus on the impact of Landsat maps on two related dimensions: its role in
shaping discoveries of new gold deposits at the aggregate level and the distribution of discoveries between
larger firms (“seniors”) and smaller firms (“juniors”) in this industry.2 While satellite imagery is just one
of many types of public information, a significant role for Landsat in affecting junior and senior discovery
would suggest that publicly-provided information could be an important type of government infrastructure
that needs further investigation.
I first develop a simple theoretical framework that clarifies how the public sector can increase downstream
investment by providing signals about the viability of risky projects. This framework also shows how public
information could benefit smaller firms in sectors where they face higher project costs. In the model, juniors
and seniors make risky investment decisions in gold exploration and I explore how the pattern of discoveries
changes after the arrival of public mapping. In line with this setting, I assume that juniors face higher project
(or exploration) costs, on average, as compared to larger firms. The arrival of a public map provides firms
with an imperfect, yet useful and unbiased signal on the likelihood of discoveries, and firms use this signal
to construct a posterior likelihood of discovery in a Bayesian framework. In this setup, public mapping leads
to (1) increased total discoveries, (2) increased market share for junior-led discoveries and (3) an inverted-U
shape relationship between the cost disadvantage of juniors and the increase in their market share. This third
point provides a useful empirical prediction that validates the differences in project cost mechanism as an
important driver of the heterogeneous effects of public mapping on juniors and seniors. Figure 1 provides
an overview of the theoretical framework.
To test the predictions of this model, I make use of the fact that the first phase of the Landsat program was
operational over two decades. Many blocks on earth (regions of 100 sq. miles) were mapped in the early stages
of the program, but there was a long tail of regions that were mapped significantly later, over the next decade.
Quantitative assessments (as shown in Figure 2) and qualitative interviews indicate that even though some
of this variation was driven by endogenous choices (prioritizing the USA), a large part of this variation was
unintentional and occurred due to technical failures and cloud-cover in imagery. I match this variation in the
timing of the mapping effort with data on significant gold discoveries obtained from a proprietary database
of major discoveries between 1950 and 1990. The quantitative estimates isolate the impact of the quasi-random
variation in the timing of the mapping effort on exploration outcomes in a differences-in-differences
framework, and a battery of robustness tests helps to confirm the validity of this specification.
The empirical results confirm the predictions of the theoretical model. In baseline estimates, mapped regions
were almost twice as likely to see a discovery when compared to unmapped regions after controlling for
region and time indicators. Further, public information from Landsat maps significantly increases the share
of junior-led discoveries: before Landsat, juniors made about one of every ten gold discoveries; after regions
were mapped with Landsat, this rate jumped to one in four. This translates to a 5.8-fold increase in the rate
of discoveries for junior firms, compared to a factor of only 1.7 for senior firms. Finally, using the quality of
local institutions in regions around the world as a proxy for the junior cost disadvantage, I show that the
junior benefit follows an inverted-U pattern. In line with the final prediction of the theory, the junior share is
magnified in regions with moderately high project costs, but decreases in regions with very poor quality of
institutions and extremely high project costs. Note that I do not observe costs: the results show that
juniors gain in terms of discoveries, but the corresponding implications for profitability are
beyond the scope of this study.
1Conceptually, the findings from this study could generalize to exploration for other natural resources such as oil, gas, copper,
and uranium, although global discovery data for these industries are harder to obtain.
2In keeping with industry convention, I use the terms seniors and juniors to indicate larger and smaller firms when I refer to
these firms in the context of gold exploration.
This work makes four contributions to our understanding of how public infrastructure impacts private investment.
First, I highlight public information infrastructure as an important, but poorly understood, topic of inquiry.
Second, I develop a theoretical framework that clarifies how public information infrastructure might affect
the private sector. By providing signals about the viability of risky projects, public information provision
encourages downstream investment. Third, this framework highlights how public infrastructure might have
differential effects across larger and smaller firms and how it could be used to encourage downstream in-
vestment by smaller firms who face higher project costs. And finally, I present a novel empirical framework
through which the value of information infrastructure projects might be evaluated. This exercise provides
quasi-experimental estimates of the value of information infrastructure, which can guide policy-making in
this area. This study also informs ongoing policy debates about the value of the Landsat and related programs.
The paper proceeds as follows. Section 2 discusses the related literature and provides a theoretical frame-
work and empirical predictions. Section 3 explains how the Landsat project was implemented and provides
an overview of gold exploration and the role of junior and senior firms. Section 4 describes the data and
research design and Section 5 highlights key estimates corresponding to the theoretical predictions. Section
6 concludes.
2 The Private Impact of Public Information: Theoretical Framework
2.1 Related Literature
The stock of US public capital amounts to over $11 trillion (IMF, 2017). Public capital usually includes tan-
gible capital stock owned by the public sector including so-called “core” infrastructure i.e., roads, railways,
airports, and utilities, such as sewerage and water facilities, hospitals, educational buildings, and other public
buildings (Bom and Ligthart, 2014). The literature has argued that spending on such infrastructure, such as
the transportation network, can be beneficial for productivity because it reduces the cost of operations for
regular business activities, such as transportation (Aschauer, 1989; Munnell, 1992). For example, massive
investments in the national highway network in the 1950s and 1960s were an important driver of productiv-
ity among vehicle-intensive industries (Fernald, 1999) and a 10% increase in a region’s stock of interstate
highways has been associated with a 1.7% increase in regional patenting (Agrawal et al., 2014). Similarly,
lower communication costs from internet infrastructure have increased the research productivity of universities
(Agrawal and Goldfarb, 2008) and boosted tradable services across borders (Arora and Gambardella, 2005).
Despite a large amount of prior work on physical infrastructure, the management and economics literatures
have largely overlooked the impact of public sector investments in non-physical “information infrastructure.”
Public information infrastructure can be broadly defined as “information, including information products and
services, generated, created, collected, processed, preserved, maintained, disseminated, or funded by or for a
government or public institution” (Ubaldi, 2013, p.5). While public information can take many forms, three
prominent types are worth mentioning. The first includes geographic information about a region such as
aerial and satellite imagery, geological and hydrological data, social data (on labor markets, health, demographics,
etc.) as well as meteorological information (including climate data and weather forecasts). Second,
administrative data produced by a variety of different agencies in the course of their daily operations can
be valuable (Card et al., 2010). This includes data from public insurance agencies, social security
payments, as well as data on patent and trademark applications (Ray, 1997; Graham et al., 2013; Marco
et al., 2015). Finally, data from publicly-funded scientific projects are also relevant. This includes landmark
scientific efforts such as the Human Genome Project, the Hubble Telescope and the NIH BRAIN mapping
initiative (Williams, 2013; Stephan, 2012; Insel et al., 2013; International Human Genome Sequencing Con-
sortium, 2001).
Proponents of public information suggest strongly that it can help “to make better decisions and improve
the quality of lives” making it an “important source of economic growth, new forms of entrepreneurship
and social innovation” (Ubaldi, 2013, p.4). In terms of estimates, the claimed social value of public data is
significant. According to a survey conducted by the European Commission, the overall market size of public
information in the EU is estimated to range from EUR 10 to EUR 40 billion (Data Policy and Innovation
Unit, EU, 2018). Numerous examples of the use of public information in the private sector also abound.
The Government Lab at NYU publishes a list called the Open Data 500 that lists hundreds of companies that
rely on publicly-provided data in different countries around the world (Martin et al., 2013). For example,
data from the UK’s national mapping agency, the Ordnance Survey, is used by IDC Consulting, an energy
consulting firm, to identify optimal locations and terrain for renewable energy sites (Verhulst and Young,
2016).
Smaller firms are especially likely to benefit from public information since they are resource constrained
and face higher project costs. Public information can help lower the risks associated with new projects and
increase the expected value of investment making previously unprofitable investments viable. For example,
public demographic information could help retail entrepreneurs derisk a business location decision, and
increase the expected value of their project. In fact, demographic information from New York City’s Business
Atlas database has been used by entrepreneurs to identify promising markets in the face of high costs of
external capital from banks (Verhulst and Young, 2016).
While public information could be valuable for the private sector and small businesses, it is also quite ex-
pensive to provide. In the US, public information projects with significant budgets include the Census Bu-
reau (budget $3.8 billion), the National Oceanic and Atmospheric Administration (NOAA), which provides
weather and disaster maps (budget $1.35 billion); and the US Geological Survey (USGS), which provides
topographic and geological maps (budget $859.7 million) (Office of Management and Budget, 2018; U.S.
Geological Survey, 2018). These costs have led to serious and ongoing debates in policy circles about
whether and how public information projects should be funded. For example, the Canadian government
canceled the long-form of their Census in 2010, before re-instating it in 2016 (Young, 2015). The Landsat
program has faced even stronger debate in Congress, with one account comparing it to “the heroine of an
episodic silent film serial” given that it “has been tied to the figurative railroad tracks a number of times in
its tumultuous history” (Gabrynowicz, 2005, p.45). Multiple acts of Congress since the mid-1980s have
tried either to transfer the Landsat program to the private sector or to bring it into the public domain, depending
on the political climate of the day (Johnston and Cordes, 2003; Borowitz, 2017), and similar efforts were
underway as recently as 2018 (Popkin, 2018).3
In addition to cost considerations, the debates on the value of public information are fueled by the lack of
empirical evidence on its value. While illuminating, past work mostly relies on surveys or qualitative case
studies by policy researchers and causal estimates of the value of public information remain elusive. This
work concedes that “the measurement of benefits ... is still imprecise” (Stott, 2014, p.14) owing to limited
empirical data and identification challenges. Work that specifically tries to value satellite-derived informa-
tion has found the “economic benefit ... to be smaller than conventional belief might suggest” (Macauley,
2006, p.274). Further, while the potential of public information to level the playing field between larger and
smaller firms has been suggested, the literature is also silent on the heterogeneous impact of public informa-
tion for smaller and larger firms. Therefore, despite accounts of the value of public information for small
businesses, information infrastructure has rarely been used as a policy tool to encourage entrepreneurship.
To make progress, I develop a theoretical and empirical case for the value of Landsat for larger and smaller
firms in the gold exploration industry. This question is guided by the insight from the literature that public
infrastructure matters for the private sector, although it largely ignores the role of information infrastructure.
Policy evidence, surveys and case studies that point to the possible importance of public information provide
additional motivation as do the ongoing policy debates on this topic.
2.2 A Simple Model of Gold Exploration
“A map is not something that tells you where to go, it’s a tool that lowers the risk involved in
your journey ... [it] is the ultimate tool of investment because it derisks a process.”
—Richard Jefferson, Skoll World Forum 2013 4
3In this paper, I study the first era of the Landsat program when these fluctuations were less meaningful.
4Full talk can be accessed at: https://www.youtube.com/watch?v=of8ai1HhqK4.
In order to develop a framework to assess the value of Landsat, I take inspiration from the theoretical lit-
erature in decision theory and economics to assess the “value of information” when making risky decisions
(Hirshleifer and Riley, 1979; Macauley, 2006; Iyer et al., 2007) as well as more recent theoretical frame-
works on the value of information in business experimentation (Ewens et al., 2018) and evaluation markets
(Luo et al., 2018). This general idea has been applied in environmental science to assess the value of geo-
graphic information (Bouma et al., 2009). I describe this simple framework and the ensuing predictions in
this section, expanding on some of the proofs in the appendix.
A. Setup: The purpose of this model is to assess the role of mapping information in influencing the number
of discoveries across junior and senior firms in the gold exploration industry. In the model, a continuum of
two types of firms, juniors and seniors, are independently engaged in gold exploration on a parallel set of
assigned tracts of land, which I call blocks. A block can belong to one of two states {G,NG}, where G
indicates that it contains a gold deposit of value V , or NG that indicates that no gold is present and ensuing
value is zero. Firms are not aware of the true state of their assigned block. Instead, all firms have a common
prior p0 of the likelihood of discovering gold. Each firm i bears a unique cost ci if it decides to explore,
and it incurs this cost before it realizes the value of its exploration.
In this setup, the costs for juniors and seniors are drawn from different distributions such that on average,
juniors have higher exploration costs for their project than seniors. The distribution of these project costs
and the expected value of gold discovery determines the equilibrium number of discoveries and the equi-
librium market share of juniors. The arrival of a public map is modeled as the arrival of an imperfect but
useful signal as to whether gold is likely to be present for each block where firms are exploring. In the post-
mapping period, all firms observe this signal for the block they are exploring and then update their priors
before making their costly investment decisions and realizing their final outcomes.
The map is modeled as providing a positive or negative signal s ∈ {s+, s−} depending on the “true” state G
or NG. The map is defined by a parameter m, which is equal to P (s+|G) (true positive) and also P (s−|NG)
(true negative). The map is informative but not fully reliable: a positive signal is more likely when the
block contains gold than when it does not, although a positive signal is not a guarantee of
making a discovery. In other words, I assume that m > 1−m, i.e. that 1/2 < m < 1. We also assume
that firms’ priors are accurate, i.e. P (G) = p0, and that the map is unbiased, i.e. it transmits signals in line
with P (G). These assumptions mimic other models of Bayesian updating in the literature (Luo et al., 2018;
Iyer et al., 2007). Based on these properties and Bayes’ rule, we can derive P (G|s+) and P (G|s−), which
are the posterior probabilities of discovering gold conditional on observing a positive or negative signal,
respectively. Note that P (G|s+) > p0 > P (G|s−), i.e. the posterior after a positive signal is greater than
the prior and the posterior after a negative signal is lower than the prior. Finally, we can also derive P (s+), which
is the unconditional probability that a map indicates a positive signal. See Figure 1, Panels A and B, which
provide an overview of the theoretical setup.
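As a sketch of the Bayesian updating step described above, the signal probability and the two posteriors can be written out explicitly. The expressions below use only the stated assumptions (accurate prior p0 and symmetric accuracy m); the typesetting of the signals as s^{+} and s^{-} is a notational choice made here.

```latex
\begin{align*}
P(s^{+}) &= m\,p_0 + (1-m)(1-p_0), \\
P(G \mid s^{+}) &= \frac{m\,p_0}{m\,p_0 + (1-m)(1-p_0)}, \\
P(G \mid s^{-}) &= \frac{(1-m)\,p_0}{(1-m)\,p_0 + m\,(1-p_0)}.
\end{align*}
```

Because m > 1/2 and p0 < 1, the denominator of the first posterior is smaller than m and that of the second is larger than 1 − m, so P(G|s+) exceeds p0 and P(G|s−) falls below it.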
Within this setup, consider a senior and junior firm exploring for resources. For the sake of simplicity, assume
that these firms do not compete directly but rather engage in independent exploration. This assumption,
while simplistic, is representative of the gold exploration process because firms often explore independently
in different regions and obtain monopoly exploration rights for those regions. The model captures the stage
of the exploration process when firms have chosen different regions to explore and are deciding on whether
to make any further exploration investments. In the appendix, I develop an extension of the model to the
case where multiple firms compete to explore in the same block (albeit without strategic interaction) and
show that the baseline results are robust to relaxing this assumption (see Appendix B, competitive case).
In the model, junior and senior firms diverge in terms of their fixed costs of exploration. Specifically, a senior
firm engaged in exploration has costs ci ∼ U [CS , C̄], where U denotes the uniform distribution and C̄ is a
sufficiently high integer. Similarly, a junior firm has costs ci ∼ U [CJ , C̄]. We assume a mass of firms spaced
evenly along this uniform distribution. To capture that on average, junior firms have higher fixed exploration
costs than seniors, we assume that CJ > CS. This assumption that junior firms have a higher lower bound of
fixed costs of exploration is a simplified representation of a number of different drivers of differential cost.
These include higher transaction costs and difficulty gaining exploration licenses, higher costs of capital, and
their inability to spread costs of equipment and personnel across different projects. While I do not model
the drivers of differences in cost, the difference between CJ and CS captures this difference in a reduced
form. In Section 3.2 I provide more background to back up this assumption. Note that this assumption does
not rule out the possibility that certain junior firms might have lower costs of exploration than certain senior
firms, given that there is a common and high upper bound on the costs of exploration for either junior or
senior firms. The assumption only implies that the cheapest junior firm has higher costs than the cheapest
senior firm. With this setup, we will derive Dpre and Dpost, which are the total numbers of discoveries before
and after mapping, and Spre, Spost, Jpre and Jpost, which are the numbers of senior and junior discoveries
pre- and post-mapping, respectively.
B. Simple example: Before deriving the predictions formally, it is helpful to run through a simple numerical
example that summarizes the framework. Assume the following values for the five external parameters
governing the model. Let the value from a discovery V equal $100 million and the prior probability of gold
discovery p0 equal 5%. Let CS = $0.5 million and CJ = $4.5 million, implying that senior costs are drawn from
a uniform distribution with lower bound at $0.5 million and junior costs from a uniform distribution with lower
bound $4.5 million. Finally, m = 0.7, meaning that there is a 70% chance of the map producing a positive
signal if the true state is G or a negative signal if the true state is NG.
With these parameters, note that the unconditional probability of a positive signal, P (s+) = P (G) ∗
P (s+|G) + (1 − P (G)) ∗ P (s+|NG) = (.05 ∗ 0.7) + (0.95 ∗ 0.3) = 32%. By Bayes rule, P (G|s+) =
P (s+|G) ∗P (G)/P (s+) = (0.05 ∗ 0.7)/0.32 = 10.94% and similarly, P (G|s−) = 2.21%. In other words,
after mapping, the posterior probability of discovering gold goes up from 5% to 10.94% after a positive
signal and goes down to 2.21% after a negative signal.
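The arithmetic above can be reproduced in a few lines. The following is a minimal sketch that assumes only the example values p0 = 0.05 and m = 0.7 from the text; the function name `posteriors` is a label chosen here for illustration.

```python
# Parameters from the worked example in the text.
p0 = 0.05   # common prior P(G)
m = 0.7     # map accuracy: P(s+|G) = P(s-|NG)

def posteriors(p0, m):
    """Return (P(s+), P(G|s+), P(G|s-)) computed with Bayes' rule."""
    p_pos = p0 * m + (1 - p0) * (1 - m)     # unconditional P(s+)
    post_pos = p0 * m / p_pos               # P(G|s+)
    post_neg = p0 * (1 - m) / (1 - p_pos)   # P(G|s-)
    return p_pos, post_pos, post_neg

p_pos, post_pos, post_neg = posteriors(p0, m)
print(f"P(s+)   = {p_pos:.2%}")
print(f"P(G|s+) = {post_pos:.2%}")
print(f"P(G|s-) = {post_neg:.2%}")
```

Changing `m` toward 1 pushes the two posteriors further apart, which is one way to see how a more accurate map makes the signal more valuable.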
Now, in the pre-mapping period, the expected value of investment is V p0. Therefore, all firms with costs below
V p0, i.e. $5 million will invest and in expectation, 5% of these firms will discover a deposit, leading to
E(Jpre) = 0.05 ∗ (5 − 4.5) = 0.025. Similarly, E(Spre) = 0.225 and E(Dpre) = 0.25. Juniors therefore
have a 10% market share. Now assume that a public map is available that provides a positive or negative
signal depending on the true state of the block. Conditional on a positive signal, the expected value of
investing is $10.94 million, so all firms with cost below this value will invest, and 10.94% of those firms
will make discoveries in expectation. By this logic, we get E(Spost|s+) = 1.142 and E(Jpost|s+) = .704.
Similarly after a negative signal the shares are E(Spost|s−) = 0.038 and E(Jpost|s−) = 0. Also note
that the unconditional expected share of discoveries is simply the weighted sum of the conditional shares,
i.e. E(Spost) = P (s+) · E(Spost|s+) + (1 − P (s+)) · E(Spost|s−) with a similar expression for juniors.
Therefore we get E(Spost) = 0.391, E(Jpost) = 0.225 and therefore, E(Dpost) = 0.616. The market share
of juniors is 36%.
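The numbers in this example can be checked with a short script. This is a sketch under two assumptions that the example's arithmetic implies but that are labeled here explicitly: firms sit at unit density along the cost line (so the mass of investing firms equals the length of the cost interval below the threshold), and outcomes are computed signal by signal and then averaged with P(s+) and P(s−). The helper names are hypothetical.

```python
# Illustrative parameters taken from the worked example ($ million).
V = 100.0             # value of a gold discovery
p0 = 0.05             # common prior probability of gold
m = 0.7               # map accuracy, P(s+|G) = P(s-|NG)
C_S, C_J = 0.5, 4.5   # lower bounds of senior and junior cost distributions

# Bayesian updating on the map signal.
p_pos = p0 * m + (1 - p0) * (1 - m)       # P(s+)
post_pos = p0 * m / p_pos                 # P(G|s+)
post_neg = p0 * (1 - m) / (1 - p_pos)     # P(G|s-)

def discoveries(prob, c_low):
    """Expected discoveries when firms believe gold is present with `prob`.

    Assumption: unit density of firms on the cost line, so the investing mass
    is the length of [c_low, V * prob] (zero if that interval is empty), and
    a share `prob` of the investing firms finds gold.
    """
    return prob * max(V * prob - c_low, 0.0)

def post_map_discoveries(c_low):
    """Average the signal-specific outcomes using P(s+) and P(s-)."""
    return (p_pos * discoveries(post_pos, c_low)
            + (1 - p_pos) * discoveries(post_neg, c_low))

S_pre, J_pre = discoveries(p0, C_S), discoveries(p0, C_J)
S_post, J_post = post_map_discoveries(C_S), post_map_discoveries(C_J)

share_pre = J_pre / (J_pre + S_pre)
share_post = J_post / (J_post + S_post)

print(f"pre-map : S={S_pre:.3f} J={J_pre:.3f} D={S_pre + J_pre:.3f} junior share={share_pre:.1%}")
print(f"post-map: S={S_post:.3f} J={J_post:.3f} D={S_post + J_post:.3f} junior share={share_post:.1%}")
```

Under these assumptions the script recovers the discovery counts quoted in the text, with the post-mapping junior share landing between 36% and 37%.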
Compared to the pre-mapping period, total discoveries jump from 0.250 to 0.616 and the market share of
juniors jumps from 10% to 36% after a public map is available. In other words, the arrival of a public
map increases the number of total discoveries substantially and this increase disproportionately benefits
junior firms, even though both juniors and seniors increase total discoveries. The only difference between
junior and senior firms in this model was the distribution of costs, which points to the possible role of this
channel in driving the heterogeneous impact of mapping between junior and senior firms. In what follows, I
develop formal predictions for the two main effects, but also an additional prediction that points to the role
of differences in cost as an important driver of the increased market share for juniors.
C. Theoretical Predictions: Expressions for pre- and post-mapping discoveries: I first describe ex-
pressions for the key outcomes (see Appendix B for derivation). In the pre-mapping period, E(Jpre) =
p0(V p0−CJ) and E(Spre) = p0(V p0−CS). Note that since CJ > CS, we have E(Jpre) < E(Spre), which implies
that juniors have a smaller share of the market as compared to seniors, even though they do have a positive
market share as long as CJ < V p0. This explains why juniors might have a smaller, albeit non-zero, market
share before the arrival of mapping information. Now, assume that p0 ·P (G|s+) + (1− p0)P (G|s−) = p̃,
which can be thought of as the sum of the posterior probabilities weighted by the common prior. In the post-
mapping period we have E(Jpost) = p0(V p̃−CJ) and analogously, E(Spost) = p0(V p̃−CS). Intuitively,
if V p0 is the cost threshold beyond which firms do not invest in the pre-mapping period, V p̃ is the analogous
threshold in the post-mapping period after taking into account the likelihood of both negative and positive
signals. Further, E(Dpre) = p0(2V p0 − CJ − CS) and E(Dpost) = p0(2V p̃ − CJ − CS). With these
expressions, we are now ready to make our key predictions.
Proposition 1. p̃ > p0. This proposition states that the sum of the posterior probabilities weighted by the
common prior is always greater than the prior. This is a standard result in Bayesian signalling models where
the signal is assumed to be unbiased, i.e. the signal returns positive and negative signals exactly in line with
the firms’ priors (Iyer et al., 2007). In cases where firms’ priors might be too optimistic or pessimistic about
the probability of making a discovery, this result might not hold. See Appendix B for the complete proof for
this proposition.
Armed with Proposition 1, we can now derive the key predictions that we will test in the empirical results.
Prediction 1 (Total discoveries increase). The number of total discoveries increases in a region after a map
is provided, i.e. E(Dpost) > E(Dpre)
One way to think of the model is in terms of cost thresholds. All firms with costs below the expected value
from discovery will invest and a constant proportion of those firms will make a discovery. Therefore, an
increase in the threshold will lead to greater entry from higher cost firms and increase discoveries. In the
model, the cost threshold beyond which firms do not invest goes from V p0 to V p̃, and since p̃ > p0, the
cost threshold under which firms invest goes up. Therefore, we get the result that total discoveries
should increase. See Appendix B for the complete proof.
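The step from Proposition 1 to Prediction 1 can be written in one line. This sketch takes the closed-form expressions for E(Dpre) and E(Dpost) stated above as given and is not a substitute for the proof in Appendix B.

```latex
\begin{align*}
E(D_{post}) - E(D_{pre})
  &= p_0\,(2V\tilde{p} - C_J - C_S) - p_0\,(2V p_0 - C_J - C_S) \\
  &= 2\,p_0\,V\,(\tilde{p} - p_0) \;>\; 0 ,
\end{align*}
```

where the inequality uses p̃ > p0 from Proposition 1 together with p0 > 0 and V > 0. The cost bounds CJ and CS cancel, so in these expressions the size of the increase in total discoveries does not depend on the cost gap.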
Prediction 2 (Junior market share increases). While both juniors and seniors will see an increase in the number
of discoveries, junior firms see an increase in market share after a map is provided, i.e. [E(Jpost)/(E(Jpost) +
E(Spost))] > [E(Jpre)/(E(Jpre) + E(Spre))]
A key feature of the model is that, irrespective of whether a public map is available, the threshold below
which firms invest is the same for both junior and senior firms. The main difference is that there are more
senior firms than junior firms with costs under the threshold. As outlined in Prediction 1, the common
threshold increases from $V p_0$ to $V \tilde{p}$, which leads an equal number of junior and senior firms, those
with costs in the interval $[V p_0, V \tilde{p}]$, to enter. Since a smaller number of junior firms are already
exploring, this increase is greater in percentage terms for juniors than for seniors. Therefore, juniors increase
their market share in the post-mapping period. See Appendix B for the complete proof.
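A small numerical sketch may help fix ideas. It assumes that junior and senior costs are spread with the same uniform density above group-specific minimum costs (`C_J` above `C_S`), which is one simple way to obtain "an equal number" of new entrants from each group; this distributional choice and all numbers are assumptions made for illustration, not the paper's specification.

```python
# Illustrative sketch of Prediction 2 under an assumed uniform cost density.

def active(threshold, min_cost, density=1.0):
    """Mass of firms with cost between min_cost and threshold (zero if none)."""
    return density * max(0.0, threshold - min_cost)

def junior_share(threshold, c_junior, c_senior):
    """Junior share of active firms at a given investment threshold."""
    j = active(threshold, c_junior)
    s = active(threshold, c_senior)
    return j / (j + s) if (j + s) > 0 else 0.0

C_S, C_J = 2.0, 6.0          # hypothetical lowest senior and junior costs
T_pre, T_post = 10.0, 15.0   # hypothetical thresholds V * p0 and V * p_tilde

share_pre = junior_share(T_pre, C_J, C_S)     # 4 / (4 + 8)
share_post = junior_share(T_post, C_J, C_S)   # 9 / (9 + 13)
print(round(share_pre, 3), round(share_post, 3))
```

Both groups gain the same number of entrants in this example, but the junior group starts from a smaller base, so its share rises.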
Prediction 3 (Inverted-U pattern of junior market share increase). Junior market share increases with
increasing cost gap, but this pattern reverses in regions where the cost gap is so high that pre-mapping
market share is zero, i.e., if
$\Delta = \frac{E(J_{post})}{E(J_{post}) + E(S_{post})} - \frac{E(J_{pre})}{E(J_{pre}) + E(S_{pre})}$ and
$X = C_J - C_S$, then $\frac{d\Delta}{dX} > 0$ for $C_S + X < V p_0$ and
$\frac{d\Delta}{dX} < 0$ for $C_S + X > V p_0$.
Finally, while in principle the heterogeneous impact of public mapping on juniors and seniors could be
driven by a number of different mechanisms, the model focuses on cost differences as one prominent driver.
Prediction 3 traces out the implications of this mechanism. Specifically, it states that the benefit to junior
firms of a publicly available map, as a function of the cost gap between juniors and seniors, has an
inverted-U shape. Intuitively, as the cost gap $X$ increases, $J_{post}$ decreases at a slower rate than
$J_{pre}$, thereby leading to a greater benefit to the junior sector. However, once the cost gap is so high that
$C_J > V p_0$, $J_{pre}$ is zero, and further increases in the cost gap only serve to decrease $J_{post}$ without
a corresponding decrease in $J_{pre}$. This translates into a decrease in $\Delta$ as the cost gap increases.
Figure 1, Panel C provides an illustration of this prediction. The proof of Prediction 3 is outlined in Appendix B.
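The inverted-U shape can be traced numerically in a minimal sketch. As before, it assumes a uniform cost density above each group's minimum cost, so that the mass of active firms is the distance between the threshold and the minimum cost; this assumption, the function names, and the numbers are illustrative only.

```python
# Illustrative sketch of Prediction 3: change in junior share as the cost gap grows.

def share(threshold, c_junior, c_senior):
    """Junior share of active firms under an assumed uniform cost density."""
    j = max(0.0, threshold - c_junior)
    s = max(0.0, threshold - c_senior)
    return j / (j + s) if (j + s) > 0 else 0.0

def delta(x, c_senior, t_pre, t_post):
    """Post-map minus pre-map junior share for a cost gap x = C_J - C_S."""
    c_junior = c_senior + x
    return share(t_post, c_junior, c_senior) - share(t_pre, c_junior, c_senior)

C_S = 2.0                    # hypothetical lowest senior cost
T_pre, T_post = 10.0, 15.0   # hypothetical thresholds V * p0 and V * p_tilde

gaps = [0.5 * k for k in range(27)]              # cost gaps from 0 to 13
deltas = [delta(x, C_S, T_pre, T_post) for x in gaps]
peak_gap = gaps[deltas.index(max(deltas))]
print(peak_gap, C_S + peak_gap == T_pre)
```

In this example the gain in junior share rises with the gap until the lowest junior cost reaches the pre-map threshold and falls afterwards, which is the pattern the prediction describes.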
Overall, predictions 1, 2 and 3 provide a clear picture on the possible mechanisms through which public
mapping affects exploration outcomes. I now turn to describing the empirical data and strategy and then test
the theoretical predictions.
3 Empirical Setting
3.1 Landsat program
Landsat is the first and longest-running program to provide images of the Earth from space. Launched in
1972, the Landsat program has overseen seven satellite launches that capture images of Earth with
multispectral cameras. These are not high-resolution images in which one can discern individual buildings or
structures, but they are useful for analyzing land use and geological features such as mountain ranges and
water bodies. Each image from the first-generation Landsat program covers an area of about 100 by 100 miles. In my
dataset, I include the 9493 satellite imaging locations that are required to cover all of Earth’s land masses (not
including Antarctica and Greenland). The unit of analysis in this paper is a block of land that corresponds to
a Landsat image.
The focus of this paper is the first generation of satellites in the Landsat series (Landsats 1, 2 and 3) that
operated between 1972 and 1983. It was not possible for NASA officials to significantly change the orbits
of these satellites; however, program operators usually controlled what locations were prioritized for data
collection through regular instructions issued to the satellites. The Landsat satellites passed over the
same location on Earth every 18 days, so in principle, it was possible to take repeated images of every location on Earth at
that frequency. However, as the Landsat literature notes, “The ill-founded but frequently-held assumption
that Landsat-type sensors are operated continuously as they orbit the Earth is not true” (Goward et al., 2006,
p.1155). In practice, as I will discuss in Section 4, many regions were left unmapped for almost a decade
after the launch of the program because of difficulties with collecting, storing, and relaying data back to
NASA.
The images taken by the Landsat satellites were relayed to the Earth Resources Observation and
Science (EROS) center in Sioux Falls, South Dakota. The center then distributed the data as tapes or physical
images, and the captured information was openly distributed at a reasonable cost and without intellectual
property considerations, as was required by law. The prices for these data ranged from about $10 for a
10-inch negative to about $50 for a 40-inch color photograph (Draeger et al., 1997). Because all Landsat imagery
was collected at EROS, by studying the archives of this institution I am able to collect information on the
location of each block, when it was imaged, and the quality of the images, including a measure of cloud
cover at the image level.5 According to one estimate, the cost of the program at launch was approximately
$125 million (Mack, 1990, p.83).
5My interview with an EROS center employee suggests that the data on the use of these images by firms were highly sensitive
and have since been destroyed (Personal Communication, March 24, 2015). As such, they are unavailable for use in this research.
3.2 The Gold Exploration Industry
Gold is the second most intensively explored natural resource after oil and gas, and gold mining is a capital-
and time-intensive process. Even though the Landsat program had implications for a number of different
natural resources, my focus is on the gold mining industry because of its relative size and importance in the
mining sector, as well as for reasons of data availability. Gold deposits tend to be relatively small and
dispersed across thousands of locations around the world, which reduces the entry barriers for smaller junior
companies (Dougherty, 2013). Exploring for gold is a risky endeavor with many points of failure. The process
can be broadly classified into planning, reconnaissance exploration, target appraisal and
drilling (Gandhi and Sarkar, 2016). The early steps of planning and exploration rely on a team of geologists
analyzing existing data to decide on a target region. Since the presence of gold is quite diffuse and target areas
are small, it is not uncommon for firms to be exploring in parallel blocks, even if they are within the same
broad region. The later stages of gold exploration involve collecting samples of rock and sediment and
testing these samples in the lab. Finally, the firm drills holes in the surface to confirm
the presence of ores and identify the economic potential of a target. Each stage of the process can be costly:
early exploration can cost up to several tens of millions of dollars, while advanced exploration could cost over
ten times this amount. In early exploration, the largest cost items are geochemical and geophysical surveys
and drilling. The payoff from this exploration could be as much as a billion dollars or more per discovery
(Hart, 2014), although there is wide variation in this number.
The gold mining industry is organized into large “senior” firms that both operate mines and invest in
exploration and small “junior” firms, mostly funded by risk capital, that are purely in the exploration business
(Humphreys, 2016). Government geological agencies are also involved in gold exploration and, given their
relatively large size, are treated as part of the “seniors” group. While I differentiate between juniors
and seniors in terms of their size (small vs. large), it is also appropriate to think of them as younger vs. older
firms. Juniors are usually market entrants looking to make their first major discovery. If successful,
they return the proceeds to their investors and founders through sale or acquisition of the assets, and so
it is rare for a junior to transition to a senior.
Juniors usually face higher project costs of exploration as compared to seniors. This difference can be
attributed to three related factors. First, seniors usually fund exploration projects through their cash flow
from other operations, while junior firms must raise capital externally, either through private placement
from specialized investors or equity financing from specialized stock markets (such as the Toronto Stock
Exchange, TSX) (Humphreys, 2016). While seniors do need to convince investors about the viability of
projects and have a tolerance for failure, junior firms spend substantial energy producing presentations and
chart books hoping to convey “good news” from early test results, in order to continually raise capital at
a reasonable cost (Gandhi and Sarkar, 2016). Second, seniors are usually able to redeploy assets such as
drilling equipment and specialized tools, which creates economies of scale and reduces exploration costs at
the project level. Finally, mining involves a number of transaction
costs such as obtaining property rights through exploration licenses and claims, as well as environmental
assessments and permissions from the local government (Schodde, 2015). Seniors typically have experience
with these activities which lowers their costs in this regard. Combined, higher cost of capital, inability to
spread fixed costs and higher transaction costs contribute to higher project costs faced by juniors as compared
to seniors.
Cost differences between juniors and seniors vary by geography. In regions with poor-quality institutions,
this cost difference is enlarged, given uncertain property rights, higher transaction costs, and greater
expropriation risk. Consider the case of Mundoro Mining, a junior company financed by American investors
operating in China. This firm had discovered promising prospects, but to continue exploring it needed a
business license and an exploration permit from local agencies. Neither was forthcoming from the
government, and ultimately the firm had to leave the country for a minimal profit after over five years of costly
exploration (Hart, 2014, p.135). Such incidents are common among junior firms, who must pay an additional
price in terms of transaction costs in dealing with local authorities, which increases project costs
considerably. Further, capital costs are also higher since investors seek additional return for investing in regions
where property rights are less assured.6 Higher capital costs and increased expropriation risk mean that the
cost difference between juniors and seniors is enlarged in regions with poor institutional quality.
In this setup, the gold mining industry was affected by the arrival of Landsat imagery which was useful
for understanding the Earth’s geology. A number of geologists and academics published papers in the
mid-1970s that demonstrated how satellite imagery could be used to guide exploration (Rowan, 1975; Vincent,
1975; Rowan et al., 1977; Ashley et al., 1979; Krohn et al., 1978). Landsat imagery allowed geologists to
spot geological features, such as faults and lineaments, that might otherwise have gone unnoticed. Accurate
knowledge of faults and lineaments is crucial for geologists because mineral resources often occur along
these features. Landsat was the only satellite imagery provider in this period, and while far from perfect, it
was an important tool for firms to reduce uncertainty in their exploration process.7 While it was possible
to use airplanes to collect aerial imagery (Spurr, 1954), this process was quite cost-intensive. In the context
of the theoretical framework, Landsat information can be thought of as providing positive signals
through the presence of certain markers associated with a gold deposit, and negative signals through
the absence of these markers or the presence of geological features that are usually negatively
correlated with a discovery. While the use of Landsat information is common and textbooks claim that
“providing basic geoscientific information by government can act as a catalyst for mineral exploration in
unexplored areas” (Gandhi and Sarkar, 2016, p.177), the empirical effects of Landsat information on gold
exploration remain unexamined.
6Note that in my theoretical framework, I do not assume that juniors have higher costs than seniors, on average. Rather, I make the
weaker assumption that the most cost-efficient senior firm has lower costs than the most cost-efficient junior firm. This assumption
allows for the possibility that there exist certain juniors who face lower project costs as compared to certain seniors, perhaps because
of an efficient and lean operation or greater reliance on technology.
7An alternate, commercial satellite imagery provider was launched in the late 1980s through the Satellite Pour l’Observation de
la Terre (“SPOT”) satellite system (Chevrel et al., 1981).
4 Data and Research Design
Conceptually, I am interested in three different kinds of data to help identify the relationship between new
maps and the discovery of new gold deposits. First, to quantify the timing and spatial variation in Landsat
coverage, data on the mapping date, location, and quality (cloud cover) are required. Second, a comprehensive
list of all major gold discoveries, along with discovery location and firm type (junior or senior), is needed.
Finally, in order to test Prediction 3, I need a covariate at the regional level that
captures the cost difference between juniors and seniors. All data need to be linked to a block, the area of
roughly 100 by 100 miles of the Earth’s surface covered by one Landsat image. This section describes these
sources in further detail.
4.1 Data
A. Landsat Coverage Data: I construct data on Landsat coverage from the EROS data center’s sensor meta-
data files.8 These data provide a list of all images collected by the Landsat sensors, including the location
being imaged, the date the image was collected, and information about the quality of the image, including an
assessment of cloud coverage in the image (Goward et al., 2006). I use these data to construct my main
independent variables at the block-year level. First, for each block, I record the first time that it was mapped by the
Landsat program to form the $Post\ Mapped_{it}$ indicator variable. The data were made available for follow-on
use immediately after they were available at the EROS data center, and there were no significant delays in
disseminating this information to downstream users. Similarly, I construct a variable $Post\ Low\text{-}Cloud_{it}$,
which is an indicator variable expressing whether a block has received a low-cloud image with less than 30
percent cloud cover. I choose the 30 percent cutoff because remote-sensing specialists indicate that images
with over 30 percent cloud cover are usually unusable in practice (Goward et al., 2006). The
results are not sensitive to the particular value of this cutoff (as shown in Table D.1).
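The construction of the two indicators can be sketched as follows. The record layout, block identifiers, and helper names are hypothetical and do not reflect the actual EROS metadata format; the sketch also assumes that an indicator switches on in the calendar year of the first qualifying image.

```python
# Hypothetical image records: (block_id, year_acquired, cloud_cover_percent).
images = [
    ("A", 1972, 80), ("A", 1975, 10),
    ("B", 1973, 20),
    ("C", 1981, 55),
]
CLOUD_CUTOFF = 30  # percent; only images with less cloud cover count as low-cloud

def first_year(records, max_cloud=None):
    """Earliest acquisition year per block, optionally only for images below max_cloud."""
    first = {}
    for block, year, cloud in records:
        if max_cloud is not None and cloud >= max_cloud:
            continue
        first[block] = min(year, first.get(block, year))
    return first

def post_indicator(first, block, year):
    """1 from the year of the block's first qualifying image onward, else 0."""
    return int(block in first and year >= first[block])

first_mapped = first_year(images)
first_low_cloud = first_year(images, max_cloud=CLOUD_CUTOFF)

# Block-year panel of (Post Mapped, Post Low-Cloud) indicators.
panel = {
    (b, t): (post_indicator(first_mapped, b, t), post_indicator(first_low_cloud, b, t))
    for b in ("A", "B", "C") for t in range(1970, 1984)
}
print(panel[("A", 1973)], panel[("A", 1975)])
```

Block C in this toy data never receives a low-cloud image, so its low-cloud indicator stays at zero throughout, mirroring how never-mapped blocks are handled in the text.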
B. Outcomes: It is a non-trivial exercise to collect all gold discoveries by junior and senior firms because
of the lack of an official database that tracks such discoveries. I worked with a private consulting firm
to create a database that provides the date, location, and additional details about economically significant
gold discoveries reported since 1950. These data have been collected over a period of many years using
press reports, disclosure documents, and other industry sources. To construct my dataset, I first match each
discovery to a specific block-year using geographic coordinates in my data. Having performed this matching,
I aggregate all discoveries within a given block-year and conduct my analysis at this level. In practice, in
all but 49 cases, a block-year experiences either one or zero discoveries; multiple discoveries in the same
block-year are rare. Accordingly, the main outcome variable for my analysis is $Any\ Discovery_{it}$, which is
an indicator variable for whether a discovery was made in a given block-year. In total, 460 unique blocks
have seen a total of about 740 significant discoveries in this period of forty years. A map of the blocks
with these discoveries is provided in Appendix Figure D.1. Further, for each discovery, the database lists
the names of one or more entities responsible for the discovery and a classification of whether these firms
are juniors or seniors. Appendix A provides more details on data construction, how discoveries are defined,
and how the date of discovery is coded.
8These data are available at http://landsat.usgs.gov/metadatalist.php
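The aggregation to the block-year level can be sketched as below. The sketch assumes that each discovery has already been matched to a block from its coordinates (that spatial step is omitted), and the records and field names are hypothetical.

```python
from collections import Counter

# Hypothetical discovery records, already matched to blocks: (block_id, year, firm_type).
discoveries = [
    ("A", 1976, "junior"),
    ("A", 1976, "senior"),
    ("B", 1980, "senior"),
]
blocks = ["A", "B", "C"]
years = range(1950, 1991)

counts = Counter((b, y) for b, y, _ in discoveries)
junior_counts = Counter((b, y) for b, y, kind in discoveries if kind == "junior")

# One row per block-year; each indicator equals 1 if at least one discovery occurred.
panel = {
    (b, y): {
        "any_discovery": int(counts[(b, y)] > 0),
        "any_junior_disc": int(junior_counts[(b, y)] > 0),
    }
    for b in blocks for y in years
}
print(panel[("A", 1976)], panel[("B", 1980)])
```

Collapsing counts to an indicator loses little information when multiple discoveries in one block-year are rare, as the text reports.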
It is important to acknowledge and clarify a few concerns with the discovery data. First, while reporting
practices improved in the mid-1990s following the Bre-X scandal in Canada (Brown and
Burdekin, 2000), the public reports of discoveries that I rely on might be quite limited for the 1970s and
1980s. Second, gaps in discovery reports are likely to be particularly salient for some regions of the world,
such as Russia, the former Soviet bloc, and China. Finally, what counts as a discovery is a matter of significant
debate among geologists and industry professionals: some sites that I count as discoveries could have
been known for centuries (even though not systematically explored), while some sites that I do not include as
meaningful could turn out to be so if the real options method is applied to valuing mineral resources (Slade,
2001). While these challenges are difficult to address completely, I do tackle them in a few different ways.
First, while my database is unlikely to have 100% coverage, it is reassuring to note that cross-validation of
these data with other databases by the data provider suggests that about 93–99% of all valuable discoveries
are included (Schodde, 2011). Second, I provide a robustness analysis in which I estimate the main effect
using only significantly large discoveries (over 2.2 million ounces of proven reserves), with the hope that
coverage challenges will be more limited for this set. I also provide numerous analyses that exclude certain
problematic regions (such as all blocks in the USSR and China) to address the concern that there is systematic
undercounting in some regions. It is worth noting that even if coverage is imperfect, there is no reason
to suspect that measurement error is correlated with the timing of the Landsat mapping effort. Third, I code
the discovery date as the date of the first economic drill intersection, in an attempt
to minimize over-counting of discoveries. This method of counting does miss discoveries that might
be valuable if a real-options-based method were applied to valuation, but these methods were not widely used
from 1950 to 1990, the period under study (Slade, 2001).9 Finally, I also purchased proprietary discovery data
from SNL Metals and Mining, a private data provider, and validated my results against this measure as
well. While the data source I use is the most comprehensive available for research that I know of, the SNL
database offers an alternative that is relatively credible for discoveries between 1980 and 1990. The SNL
data are too incomplete to estimate my baseline regressions, but I use them for some validation exercises and in
an alternate specification.
9Slade (2001) mentions that as of 2001, “most mining-industry analysts still use some version of the discounted-cash flow” and
that “the gap between theory and practice is large in the area of project evaluation” (page 195).
C. Measuring Cost Variation: Next, I collect data to explore the third theoretical prediction that links the
increase in junior market share to their cost disadvantage in the exploration process. First, I match each block
to the country in which it is located.10 Next, for each country, I rely on a survey of institutional conditions
in the mining industry to obtain a relatively good measure of the institutional barriers to operating in a given
region. As justified in Section 3.2, my assumption here is that these barriers increase project costs more
for junior firms than for senior firms because senior firms are able to overcome institutional and regulation-
related barriers at a lower cost. Specifically, I rely on the “Survey of Mining Companies,” conducted by
the Fraser Institute (McCahon and Fredricksen, 2014), which measures the costs of doing business in the
exploration industry. While the survey has been conducted annually since 1997, I use cross-sectional data
from the 2014 edition for all the years in my sample because it is the most comprehensive, with information on
over 122 different jurisdictions around the world, including provinces in major mining countries such as
Canada, Australia, and the United States. Institutional conditions are heavily correlated over time, so the
institutional conditions in 2014 are a good proxy for the measure of interest.11 Responses are collected from
surveys administered to more than 4,200 managers in the industry.12 The survey asks managers about the
relative project costs of exploration arising from institutional features such as environmental regulation, legal
institutions, and labor regulations. Fraser then assigns each jurisdiction a rank based on the responses. The
idea is that jurisdictions that have a low rank on this survey are likely to be places where juniors are especially
likely to face barriers in implementing their project, leading to larger cost differences between juniors and
seniors. This measure can then be used as a proxy to test prediction 3 of the theoretical framework.
D. Summary Statistics: Table 1 provides a list of key variables used in the quantitative analysis and sum-
mary statistics for the sample. Panel A provides summary statistics for key variables that vary at the block-
year level. The main outcome variable is Any Discovery, which is an indicator variable that is set to one
if a new gold discovery is reported in a block-year. This variable is scaled by a factor of 100 for legibility
throughout the analysis. The mean of this variable is 0.188, which means that there is a 0.188% chance that
a discovery is reported in a block-year. Any Junior Disc is set to one when Any Discovery is set to
one and at least one discovery was reported in a block-year by a junior firm. On average, 0.038 percent of
block-year observations report a junior-led discovery. Panel A also provides summary statistics for the key
independent variables, Post Mapped and Post Low −Cloud, which are indicator variables that are set to
one if a block has been mapped or mapped with a low-cloud image, respectively, by the Landsat program.
Note that there is a small percentage of blocks which were never mapped by the first generation of the Land-
sat program. For these blocks, the Post Mapped and Post Low − Cloud variables are always set to zero,
although the results are robust to excluding these blocks altogether (as demonstrated in Table D.5B). Panel
B provides summary statistics for block-level variables that do not vary over time. Similar to the stylized
example in the theoretical framework, these data indicate that about 4.8 percent of the blocks ever reported a
discovery between 1950 and 1990, and about 3.9 percent of blocks reported a discovery after 1972, the year
when Landsat was launched. These data also show that the median block is mapped by a low-cloud image
in 1972; however, there is a long tail of blocks that remain unmapped until 1990.
10For blocks that belong to multiple countries, I match them to the country in which most of their area lies.
11The results are robust to employing historical measures of institutional conditions derived from the Polity IV database as well.
12The survey received 485 responses (for a response rate of 11.5%). Firms in the survey reported exploration expenditures totaling
about $2.5 billion in 2014, of a total expenditure of about $4.5 billion by the industry in 2014 (Carlson, 2014), and represented
nearly all significant organizations in the exploration industry.
4.2 Research Design
In order to identify the impact of Landsat on the gold exploration industry, an ideal experiment would ran-
domly assign Landsat images to regions around the world while leaving other regions without images, and
then measure the impact of Landsat on exploration outcomes. In this study, I use a differences-in-differences
specification to approximate this ideal experiment. Specifically, I establish that there are significant varia-
tions in the timing of Landsat imagery in different regions of the world (see Appendix Figure D.2). I then
estimate the differential impact of the arrival of imagery at the block-level and include non-parametric block
and year-level fixed effects. Block effects control for time-invariant differences between blocks (such as the
potential to discover gold) and time fixed effects help control for location-invariant time trends, such as gold
prices.13 In addition to this baseline specification, I conduct a number of different robustness checks
and present estimates from an alternate instrumental variables specification.
The main concern that remains with the baseline specification is that blocks mapped early were on differential
trends in terms of discoveries as compared with blocks that were mapped at a later point in time. Here, I
present some qualitative and quantitative evidence to justify why this concern might not be significant.
A. Qualitative Evidence: Analysis of historical Landsat imagery reveals significant gaps in coverage and
Landsat experts have investigated the reasons for these omissions (Draeger et al., 1997). The overarching
conclusion is that the gaps are likely related to (a) administrative decisions to focus on complete coverage
of the continental United States and (b) technical failures in mission operations (Goward et al., 2006). As
Goward et al. (2006) note, this variation was both unexpected and unnoticed till quite recently. An interview
with a Landsat administrator confirmed, “What we had not expected to see in the coverage maps were the
variations in the geographic coverage achieved from year to year. ... As we investigated further, we found
that technical issues such as the on-board tape recorders on Landsats 1, 2, and 3, which typically failed early
in the missions, may have caused the annual or seasonal gaps in coverage” (Interview, 8th April 2015). The
Landsat administrators I interviewed also said that the Landsat planning team was deliberately insulated
from firms in the private sector (such as exploration companies) because NASA did not want to be seen
to be catering to the needs of a select few. They stated that the Landsat mission was primarily focused on
complete coverage of the United States, and while global coverage was desirable, the program administrators
acknowledged “that’s the one that ended up suffering the most” (Interview, 8th April 2015).
Variation in coverage was also due to satellite images that were rendered unusable due to significant cloud
cover. To this day, a central challenge in using satellite imagery is the presence of clouds between the satellite
sensor and the land surface being imaged. For example, one of my interviews makes the point that cloudiness
affects the timing of when regions were first mapped by Landsat: “Our ability to predict clouds [is limited] ...
everything comes in big fronts, especially around the equator, where there are convector, pop-up storms, and
no predicting when or where they are; after a few tries you might end up with only about one or two scenes
that are very clear” (Interview 22nd November, 2014). These facts suggest that the timing of the arrival of
cloud-free maps follows an even more random process than the timing of the mapping effort. I therefore use
the arrival of cloud-free imagery in my baseline specification and use average cloud-cover in a region in the
IV specification as a robustness analysis.
13There was a significant increase in gold prices in the early 1970s, as shown in http://www.macrotrends.net/1333/
historical-gold-prices-100-year-chart, which the year fixed effects help to control for.
B. Quantitative Evidence: I also quantitatively explore the possibility that the timing of Landsat maps is
correlated with gold exploration trends. Figure 2 provides a map showing the timing of the mapping effort
across blocks around the world (Panel A), as well as a histogram of the years in which blocks were first
mapped by the Landsat program (Panel B). This evidence makes clear that (a) there is significant variation
in terms of the geographic location of blocks that were mapped early or late (although there is significant
clustering, especially in the US) and (b) while a majority of the blocks were mapped for the first time in
the first two years of program operation, there is a long tail of blocks that were mapped considerably later
in the program. A simple time-series comparison of average gold discoveries between blocks that received
Landsat coverage early versus late confirms this finding (see Figure 3, Panel A). Blocks “Mapped Early”
are all blocks that received a Landsat image for the first time before 1974. If the potential for gold were a
determining factor for when a region was mapped by Landsat, we would expect blocks mapped early by
Landsat to have had a higher rate of gold discovery prior to Landsat, indicating that Landsat administrators
selected regions where gold had already been found and where more gold was thus likely to be present.
However, as the figure illustrates, discoveries in blocks mapped early and late had fairly flat and parallel
growth rates before 1973, when Landsat data were made available. That said, while there do not seem to be
any differences in trends between early and late blocks before the launch of the Landsat effort, there remain
some differences in levels that could be a concern for the analysis.
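The comparison behind this kind of pre-trend check can be sketched in a few lines. The block identifiers, mapping years, and discovery events below are hypothetical, and the sketch only computes group means by year; it is not the procedure used to produce Figure 3.

```python
def mean_rate_by_year(any_discovery, group_blocks, years):
    """Average of the discovery indicator across a group's blocks, per year."""
    return {
        y: sum(any_discovery.get((b, y), 0) for b in group_blocks) / len(group_blocks)
        for y in years
    }

# Hypothetical first-mapping years and discovery events keyed by (block, year).
first_mapped = {"A": 1972, "B": 1973, "C": 1979, "D": 1981}
any_discovery = {("A", 1960): 1, ("C", 1965): 1, ("A", 1976): 1, ("B", 1978): 1}

early = [b for b, y in first_mapped.items() if y < 1974]    # "Mapped Early"
late = [b for b, y in first_mapped.items() if y >= 1974]

pre_years = range(1950, 1973)
early_pre = mean_rate_by_year(any_discovery, early, pre_years)
late_pre = mean_rate_by_year(any_discovery, late, pre_years)

# Average pre-period gap between the two groups. Parallel trends requires this gap
# to be stable over time; a nonzero but constant gap is a difference in levels only.
gap = sum(early_pre[y] - late_pre[y] for y in pre_years) / len(pre_years)
print(gap)
```

Plotting `early_pre` and `late_pre` against the year would give the kind of visual comparison the text describes.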
Combined, the qualitative evidence and a preliminary analysis of the quantitative data provide confidence in
the validity of the baseline difference-in-difference specifications that exploit the differential timing of the
mapping effort.
5 Results
5.1 Did Landsat Boost Gold Discovery?
A. Baseline Regression Specification: I now turn to analyzing my first theoretical prediction, that Landsat
maps boosted the discovery of gold, using a regression model. I use OLS to estimate the following regression
specification using the block-year level panel: $Y_{it} = \alpha + \beta_1 \times Post_{it} + \gamma_i + \delta_t + \epsilon_{it}$,
where $\gamma_i$ and $\delta_t$ represent block and time fixed effects, respectively, for block $i$ and year $t$.
In the first specification, $Post_{it}$ represents $Post\ Mapped_{it}$ and equals one for all blocks after they have
been mapped; in the second specification, $Post_{it}$ represents $Post\ Low\text{-}Cloud_{it}$ and equals one for all
blocks after they have been mapped with
a low-cloud cover image. This specification compares the difference between blocks that have received
mapping information with blocks that have yet to receive maps, in a differences-in-differences framework.
In line with Prediction 1, if blocks that are mapped first by Landsat do indeed report more gold discoveries and
earlier discoveries than blocks mapped later, then we should find that the difference-in-difference estimate $\beta_1$
is positive. All my specifications cluster standard errors at the block level, given the concern that discoveries
within blocks are likely to be correlated over time. In additional robustness checks, I include more general
clustering (for example at the country and block-group levels) that takes seriously spatial proximity between
different blocks. I find that the results are generally robust to these additional restrictions (see Table D.2).
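The point estimate from a specification of this form can be computed in a minimal sketch, shown here on synthetic data. It assumes a balanced panel, for which demeaning by block and by year reproduces the two-way fixed effects OLS coefficient; it does not compute clustered standard errors, and the data, effect size, and function name are invented for illustration.

```python
def twfe_beta(y, x, blocks, years):
    """OLS slope on x with block and year fixed effects, for a balanced panel.

    Applies the within transformation z_it - mean_i - mean_t + grand mean to
    both variables and regresses the transformed y on the transformed x.
    """
    def demean(z):
        grand = sum(z.values()) / len(z)
        by_block = {i: sum(z[(i, t)] for t in years) / len(years) for i in blocks}
        by_year = {t: sum(z[(i, t)] for i in blocks) / len(blocks) for t in years}
        return {(i, t): z[(i, t)] - by_block[i] - by_year[t] + grand
                for i in blocks for t in years}

    yd, xd = demean(y), demean(x)
    num = sum(xd[k] * yd[k] for k in xd)
    den = sum(xd[k] ** 2 for k in xd)
    return num / den

# Synthetic balanced panel with staggered mapping dates and a true effect of 0.5.
blocks = ["A", "B", "C"]
years = list(range(1970, 1980))
first_mapped = {"A": 1972, "B": 1975, "C": 1978}
block_effect = {"A": 1.0, "B": 2.0, "C": 0.5}

post = {(i, t): float(t >= first_mapped[i]) for i in blocks for t in years}
y = {(i, t): block_effect[i] + 0.1 * (t - 1970) + 0.5 * post[(i, t)]
     for i in blocks for t in years}

beta_1 = twfe_beta(y, post, blocks, years)
print(round(beta_1, 6))
```

Because the synthetic outcome contains no noise, the estimator recovers the assumed effect up to floating-point error; identification comes entirely from the staggered timing of `post` across blocks.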
Table 2 presents estimates from this regression for both the Post Mappedit and Post Low − Cloudit
variables. Columns (1) and (2) do not include block fixed effects, while columns (3), (4), and (5) include
them. The coefficients generally reduce in size after controlling for block fixed effects, indicating their
importance in this setting. In line with Prediction 1, the results indicate that after controlling for block- and
year-level fixed effects, there is a positive impact of Landsat coverage on gold discovery. Specifically, the
estimate of $\beta_1$ indicates an average increase of between 0.152 and 0.164 percentage points in the likelihood
of a gold discovery after the Landsat mapping effort, a significant increase given that the baseline rate of
discovery is about 0.19 percent. This means the rate of discovery in treated regions is almost doubled.14
Finally, in Column (5) I present estimates with both $\text{Post Mapped}_{it}$ and $\text{Post Low-Cloud}_{it}$ variables in
the same specification. If the results are driven by information contained in the Landsat images
and not simply by the activation of the mapping program, we should expect the $\text{Post Low-Cloud}_{it}$ variable
to be positive and significant. As indicated in Column (5), the coefficients follow this pattern. The estimate
on the $\text{Post Low-Cloud}_{it}$ variable is 0.155 and significant (very similar to the estimate in column 4),
while the estimate on the $\text{Post Mapped}_{it}$ variable is small and not statistically different from zero. This
is reassuring because, in line with the model, it seems that the effect of the Landsat mapping depends on
the provision of improved information in the form of a low-cloud image containing information about the
geology of the pictured land, as described in the theoretical framework, as opposed to the arrival of an image
that may or may not be informative. As an additional robustness check, I estimate the above specification
using negative binomial models, given the skewed distribution of the outcome variable, and find the results to
be consistent with the OLS estimates (Table D.3). Further, since there is some concern about the possibility
of underreporting in the main dependent variable in some regions around the world, I also estimate another
specification where I include only major discoveries (those with proven reserves above 2.2 MOz. in
average size) as the main dependent variable. The results are robust to this restriction (Table D.4).
The baseline results and the robustness checks both confirm Prediction 1, that the arrival of positive and
negative signals via information infrastructure such as public maps tends to increase total discoveries at the
industry level.
14 While this estimate suggests a relatively large impact of Landsat information on discovery, it should be noted that the absolute
increase in the number of discoveries is likely to be more modest given the rare nature of this dependent variable.
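The "almost doubled" statement in the baseline results can be checked with simple arithmetic. The short sketch below uses only the figures reported in the text (estimates of 0.152 and 0.164 percentage points against a baseline of about 0.19 percent); the variable names are chosen for this example.

```python
# Relative size of the baseline estimates, using the figures reported in the text.
baseline_rate = 0.19    # baseline discovery rate, in percent
estimate_low = 0.152    # lower reported estimate, in percentage points
estimate_high = 0.164   # upper reported estimate, in percentage points

relative_low = estimate_low / baseline_rate
relative_high = estimate_high / baseline_rate

# Implied post-mapping rate as a multiple of the baseline rate.
multiple_low = 1 + relative_low
multiple_high = 1 + relative_high

print(f"relative increase: {relative_low:.0%} to {relative_high:.0%}")
print(f"multiple of baseline: {multiple_low:.2f} to {multiple_high:.2f}")
```

On these numbers the implied increase is roughly 80 to 86 percent of the baseline rate, which is what "almost doubled" summarizes.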
B. Time-varying Estimates: I then turn to estimating the time-varying impact of Landsat coverage on gold
discovery. Specifically, I estimate $Y_{it} = \alpha + \sum_z \beta_t \times \mathbf{1}(z) + \gamma_i + \delta_t + \epsilon_{it}$, where $\gamma_i$ and $\delta_t$ represent block
and time fixed effects, respectively, for block $i$ and year $t$, and $z$ represents the “lag,” or the number of
years that have elapsed since a block was first mapped with a low-cloud image.15 Figure 3 Panel B presents
estimates of $\beta_t$ from this regression, which measure the difference between treated and control blocks for
every lag year. This figure makes two points. First, there are no pre-existing differences in trends between
blocks mapped early and those mapped late. Second, there is a large and persistent increase in discoveries
that appears about seven to eight years after a low-cloud image. This delay accords well with my
interviews with gold exploration companies, who confirm that Landsat represents early-stage exploration and
is typically followed by many years of further exploration, as well as with reports of discovery timelines in
the gold exploration industry.
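To make the lag indicators in the time-varying specification concrete, the sketch below builds them for a single block-year. The function name, the lag window, and the choice to bin lags outside the window at its endpoints are assumptions made for this example and are not taken from the paper; the rule that never-imaged blocks get a lag of zero follows footnote 15.

```python
def lag_indicators(year, first_low_cloud_year, lags):
    """Return {lag: 0 or 1} for a single block-year observation."""
    if first_low_cloud_year is None:
        z = 0  # block never received a low-cloud image (footnote 15)
    else:
        z = year - first_low_cloud_year
    # Assumption: lags outside the window are binned at its endpoints.
    z = max(min(lags), min(max(lags), z))
    return {k: int(k == z) for k in lags}


window = range(-10, 16)  # hypothetical window of lags, -10 through 15

# A block first imaged in 1973, observed in 1980: seven years after mapping.
row = lag_indicators(1980, 1973, window)
# A block that never receives a low-cloud image.
never = lag_indicators(1980, None, window)

print([k for k, v in row.items() if v == 1])
print([k for k, v in never.items() if v == 1])
```

Each observation switches on exactly one indicator, so the coefficient on a given lag compares blocks at that distance from their first low-cloud image with the remaining blocks, conditional on the fixed effects.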
5.2 Additional Robustness Checks
A. Stress-testing the timing variation: Two concerns are important to note about the validity of the baseline
analysis. First, as shown in Figure 2, the timing of the mapping is such that approximately 75 percent of the
globe was mapped in the first two years of the Landsat program, after which there was a significant delay
before the remaining blocks were mapped. While this pattern is not a direct concern for the analysis in terms
of identification, it would be helpful to establish the robustness of the finding if only the 25 percent of blocks
mapped after 1974 are included, thereby exploiting the variation in the long tail of the mapping effort in the
later years of the Landsat program. Accordingly, Table D.7 estimates the baseline specification excluding
all blocks mapped in the first year (columns 1 and 2) and the first two years (columns 3 and 4) of the Landsat
program. These results, albeit smaller and less precise, remain positive and statistically significant.
As another way to address this issue, I implement a cross-sectional specification that does not rely on the
panel variation within blocks. This specification de-emphasizes small timing differences in the arrival
of maps and instead simply relates overall delays to the probability that any discovery is reported in the
20-year period following the launch of the Landsat program. The baseline specification is of the form
$Y_i = \alpha + \beta_1 \times Delay_i + \gamma_i + \epsilon_i$, where the main outcome variable, $Y_i$, is an indicator for whether any discovery was
made in a given block $i$ between 1972 and 1990, $Delay_i$ is the difference between the year in which a block
was mapped with a low-cloud image and 1972, and $\gamma_i$ represents spatial fixed effects, such as at the continent,
subregion, or block-group level. The estimates are presented in Appendix Table D.12. In particular, I replace
my main discovery measure with data from another data provider, SNL Metals and Mining, and show that
the results are robust to this alternate measure as well. The SNL data has limited coverage before 1982
and therefore cannot be used in the main specification, but can be employed here as an alternate measure
of gold discovery. Further, I also estimate another version of the cross-sectional specification where the
dependent variable is time from 1972 until the first discovery in a given block.16 Here the estimates are
positive and significant, indicating that greater delays in Landsat mapping are associated with a greater
delay in the discovery of deposits. The three sets of results support the conclusion that delays are associated
with a lower probability of discovery for Landsat blocks, even in the cross-sectional specification.
15 For the small percentage of blocks that never get a low-cloud image, z is always set to zero.
16 I thank a referee for this suggestion.
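The variables in the cross-sectional specification can be sketched for a single block as follows. The function name and the example years are hypothetical, and treating the 1972 to 1990 window as inclusive at both ends is an assumption of this example.

```python
def cross_section_row(first_low_cloud_year, discovery_years, start=1972, end=1990):
    """Build the cross-sectional variables for one block.

    Returns (delay, any_discovery, time_to_first_discovery).
    """
    # Delay: years between program launch and the first low-cloud image.
    delay = first_low_cloud_year - start
    # Discoveries inside the window (assumed inclusive at both ends).
    in_window = [y for y in discovery_years if start <= y <= end]
    any_discovery = int(bool(in_window))
    # Time from launch to the first discovery; None if no discovery in the window.
    time_to_first = min(in_window) - start if in_window else None
    return delay, any_discovery, time_to_first


# Hypothetical block: low-cloud image in 1975, discoveries in 1983 and 1988.
example = cross_section_row(1975, [1983, 1988])
# Hypothetical block: low-cloud image in 1980, no discoveries.
empty = cross_section_row(1980, [])
print(example, empty)
```

Regressing the discovery indicator (or the time to first discovery) on the delay, with spatial fixed effects, then uses only one observation per block, which is why this specification does not depend on within-block panel variation.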
Finally, another concern might be that Landsat mapping coincides with improvements in institutional quality
in a number of major regions (e.g., the former Soviet Union), and so confounds the direct effect on gold
exploration. While this explanation seems implausible because it would require local conditions to change
precisely around the timing of the Landsat mapping for a large number of blocks, I am able to directly test
it using region-specific time trends in the regression specification. Instead of including a common year-specific
indicator variable at the global level, this specification includes separate year-specific dummies for
different sets of regions around the world. Estimates from this specification, presented in Appendix Table
D.8, allow us to evaluate the robustness of the baseline results to region-specific time trends using three
different groupings of blocks. The most stringent of these specifications includes 861 indicator variables
representing separate fixed effects for each of the 21 subregions over the period 1950 to 1990. The results
remain positive and significant, establishing the robustness of the main estimates to regional time-trends.
Additional details are included with Table D.8.
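The count of 861 indicator variables in the most stringent specification follows from crossing the 21 subregions with the 41 calendar years from 1950 through 1990. The sketch below reproduces that count; the helper `region_year_cell` is a hypothetical illustration of how each observation maps to one subregion-by-year indicator.

```python
SUBREGIONS = 21
YEARS = range(1950, 1991)  # 1950 through 1990 inclusive: 41 years


def region_year_cell(subregion, year):
    """Label of the subregion-by-year indicator that equals one for an observation."""
    return (subregion, year)


# One indicator per (subregion, year) pair replaces the common year effect.
cells = {region_year_cell(r, t) for r in range(SUBREGIONS) for t in YEARS}
n_indicators = len(cells)
print(n_indicators)  # 861
```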
B. Excluding Potentially Problematic Blocks: As is evident from Figure 2 Panel A, the spatial variation in
early mapping is not completely random. In particular, the US seems well covered while parts of the USSR
seem to have been mapped late, although variation in other parts of the world seems more idiosyncratic. In
order to address concerns that regions with problematic variation are driving the effect, I repeat the analysis
for different subsamples of the data excluding certain regions. I run the baseline regression excluding just
the United States; excluding the United States, Canada, and Australia (the top three gold producers in the world); and
excluding the USSR and China (where measurement error is likely to exist in terms of gold discoveries). Apart
from excluding certain countries, it is also helpful to exclude subsamples of blocks that might be viewed
as problematic. I therefore exclude blocks where discoveries had already been reported prior to 1972 and
blocks that were never mapped by Landsat. As shown in Appendix Table D.5, the results remain largely
unaffected. Further, it is also important to examine geographic variation in the baseline results in order to
verify that no one region is driving the overall results. To this end, in Table D.6, I estimate the baseline
specification excluding blocks for each one of the six continents in my data, one at a time. These results
show that while the baseline effect does vary slightly depending on which continent has been dropped, the
treatment effect of the Landsat program remains large, positive and significant. Further, in Figure D.5, I
estimate the baseline specification where $\text{Post Low-Cloud}$ is interacted with an indicator variable for
each one of the six continents in the data. These continent-specific estimates indicate that the results are not
driven by one specific region, but rather by blocks in Oceania, North America and South America, although
the effect in Africa, Asia or Europe is minimal.
C. IV, Placebo and specification checks: In addition to the baseline specification, I also investigate a set
of specifications that use cloud-cover as an instrument for the timing of Landsat mapping. I describe this
strategy and present these results in Appendix C. Overall, the IV analysis confirms the baseline results,
although the IV estimates are considerably larger. While the IV provides a useful check on the baseline estimates,
I emphasize the baseline estimates, which are more conservative and where the identifying variation is more
transparent.
Next, my interviews revealed that while Landsat is useful for understanding the geology of a region for gold
exploration, its utility is severely diminished in regions where tree cover obscures details of the land surface
underneath. This fact can be exploited to conduct a placebo check: is the value of Landsat diminished in
regions with tree cover? Accordingly, I use a dataset of global tree cover to extract blocks that contain
significant tree cover and estimate the impact of Landsat on this limited group only. The results from this
analysis are presented in column 3 of Appendix Table D.5 Panel B and suggest that in places where Landsat
images are obscured by trees, there is no effect of mapping on discovery. Similarly, Appendix Table D.10
presents estimates from a placebo specification where I randomly assign each block a year between
1972 and 1990 in which the block was “mapped.” I run the analysis using this “fake” treatment year. The
estimates for these placebo checks are close to zero and not significant, which helps establish that the main
estimates do not seem to depend on the particulars of the specification. Finally, in addition to the battery
of tests presented above, I present a number of other robustness checks in Appendix D. Table D.1 presents
estimates for different definitions of “low-cloud” cover; Table D.9 presents estimates with different start and
end years for the panel; and Table D.2 presents estimates with standard errors clustered at different spatial
groups rather than the Landsat block. Together these robustness checks help to further bolster the validity of
the baseline estimates.
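The random-year placebo described above can be sketched as follows. The function names, the fixed seed, and the convention that the placebo indicator switches on in the assigned year itself are assumptions of this example; the paper does not report these implementation details.

```python
import random


def assign_placebo_years(block_ids, first=1972, last=1990, seed=0):
    """Assign each block a random 'fake' mapping year in [first, last]."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {b: rng.randint(first, last) for b in block_ids}


def placebo_post(year, fake_year):
    """Placebo treatment indicator: one from the fake mapping year onward."""
    return int(year >= fake_year)


fake_years = assign_placebo_years(range(100))
print(fake_years[0], placebo_post(1985, fake_years[0]))
```

Replacing the true post-mapping indicator with `placebo_post` in the baseline regression should, if the main result reflects actual mapping, yield estimates close to zero.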
5.3 Differential Impact of Landsat on Juniors and Seniors
The results from Sections 5.1 and 5.2 provide strong evidence that public investment in basic knowledge
through the Landsat program boosted the total number of gold discoveries. This evidence is consistent
with Prediction 1 from Section 2.2, which explains how the arrival of informative signals from the Landsat
mapping effort leads to an overall increase in the rate of gold discovery. In this section, I test Prediction
2, that public information from the Landsat program helps junior firms increase their market share because
of differences in the costs of exploration. I also test Prediction 3, which argues for an inverted-U shaped
relationship between the cost differences of juniors and seniors, and the increase in junior market share.
A. Testing Prediction 2—Impact of Landsat on Juniors and Seniors: As a preliminary exercise to ex-
amine whether Landsat benefits junior firms more than seniors, in Figure 4, Panel A, I compare the average
level of discoveries for juniors and seniors before and after the arrival of a low-cloud image. The first fact
to note is that, before public mapping, juniors are a very small part of the industry, making an average of
only 0.007 discoveries, whereas seniors make about 0.069 discoveries. However, these proportions change
drastically post-mapping. Both seniors and juniors increase their rate of discoveries significantly, but juniors
have a much greater increase. Specifically, juniors now make an average of 0.087 discoveries, compared to
seniors who make about 0.29. In other words, junior market share increases from about 9.2 percent to about
23 percent.
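The market-share figures in this comparison follow directly from the average discovery levels quoted above. The sketch below recomputes them; the helper name `junior_share` is chosen for this example.

```python
def junior_share(junior, senior):
    """Junior share of total discoveries."""
    return junior / (junior + senior)


# Average discoveries before and after a low-cloud image, as quoted in the text.
share_before = junior_share(0.007, 0.069)
share_after = junior_share(0.087, 0.29)

print(f"before: {share_before:.1%}, after: {share_after:.1%}")
```

The shares come out at about 9.2 percent before mapping and about 23 percent after, matching the figures in the text.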
Next I estimate regressions similar to the baseline specification to formally test Prediction 2. I first set the
dependent variable equal to 1 if junior firms made a discovery and 0 otherwise (columns 1 and 2), and then
set the dependent variable equal to 1 if senior firms made a discovery and 0 otherwise (columns 3 and 4).
The estimates of β1 from these regressions provide separate estimates of the boost to discovery for juniors
and seniors, facilitating a comparison of whether Landsat helped one group more than the other.
The results from these regressions are presented in Table 3. The estimates suggest that the impact of the
Landsat program on juniors is about 0.047 while the impact for seniors is about 0.12. In other words, the
total gain from the Landsat program (about 0.16 percent more discoveries) is split such that smaller firms
make 0.04 percent more discoveries at the block-year level while seniors capture the remaining 0.12 percent.
In terms of percentage points, it seems then that seniors benefit more from the Landsat mapping effort.
However, when the previous market share of juniors is taken into consideration, this interpretation changes
considerably. Specifically, before the Landsat program was launched, juniors made only about 0.008 percent
discoveries in a given block-year on average, while seniors made 0.0694 percent. This suggests that
seniors were almost entirely responsible for new gold discoveries prior to the Landsat program, but after the
arrival of Landsat images, juniors made one out of every four discoveries. Thus it seems that the Landsat
program improved the performance of junior firms in this industry in terms of making new discoveries. In
other words, juniors were 5.8 times more likely to report a discovery in mapped regions than in unmapped
regions, while seniors only benefited by a factor of 1.7. Therefore, the estimates suggest that even though
seniors made a significant portion of new discoveries in mapped regions, their market position eroded considerably,
and juniors were able to make considerable gains in performance.17 These results are exactly
in line with the mechanics of the model; both seniors and juniors increase their discoveries, but because the
additional discoveries due to mapping are more evenly shared between these two groups, the junior market
share increases. The data, therefore, support the idea in Prediction 2 that public information infrastructure
provides a disproportionate benefit to smaller firms who face higher project costs in terms of discovery.
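One reading consistent with the reported factors of 5.8 and 1.7 is the ratio of each group's estimated gain to its pre-Landsat baseline. The sketch below computes that ratio from the rounded figures quoted in the text; the interpretation is an assumption of this example, and small differences from the reported factors would be expected from rounding.

```python
# Estimated gains and pre-Landsat baselines, as quoted in the text (in percent).
junior_gain, junior_baseline = 0.047, 0.008
senior_gain, senior_baseline = 0.12, 0.0694

# Assumed interpretation: gain relative to the pre-Landsat baseline.
junior_factor = junior_gain / junior_baseline
senior_factor = senior_gain / senior_baseline

print(f"juniors: {junior_factor:.2f}, seniors: {senior_factor:.2f}")
```

On this reading, the proportional gain for juniors is more than three times that for seniors even though seniors gain more in percentage points.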
B. Prediction 3—Probing the Cost Mechanism: For prediction 3, I derived the result that as the project
costs for juniors increase, the increase in their market share initially increases, but beyond a certain thresh-
old decreases. I now test this prediction empirically. Finding results consistent with this pattern would be
evidence of the validity of the theoretical model, which identifies the difference in the project costs as one
important reason why public information infrastructure is particularly beneficial for junior firms.
17One concern with this interpretation is that seniors might be
