“Victorian Holocausts”: The Long-Term Consequences of Famine in British India

Adithya V. Raajkumar

Abstract: This paper examines whether famines occurring during the colonial period affect development outcomes in the present day. We compute district-level measures of economic development, social mobility, and infrastructure using cross-sectional satellite luminosity, census data, and household survey data. We then use a panel of recorded famine severity and rainfall data in colonial Indian districts to construct cross-sectional count measures of famine occurrence. Finally, we regress modern-day outcomes on the number of famines suffered by a district in the colonial era, with and without various controls. We then instrument for famine occurrence with climate data in the form of negative rainfall shocks to ensure exogeneity. We find that districts which suffered more famines during the colonial era have higher levels of economic development; however, high rates of famine occurrence are also associated with a larger percentage of the labor force working in agriculture, lower rural consumption, and higher rates of income inequality. We attempt to explain these findings by showing that famine occurrence is simultaneously related to urbanization rates and agricultural development. Overall, this suggests that the long-run effects of natural disasters which primarily afflict people and not infrastructure are not always straightforward to predict.

1. Introduction

What are the impacts of short-term natural disasters in the long run, and how do they affect economic development? Are these impacts different in the case of disasters which harm people but do not affect physical infrastructure? While there is ample theoretical and empirical literature on the impact of devastating natural disasters such as hurricanes and earthquakes, there are relatively few studies on the long-term consequences of short-term disasters such as famines. Furthermore, none of the literature focuses on society-wide development outcomes. The case of colonial India provides a well-recorded setting to examine such a question, with an unfortunate history of dozens of famines throughout the British Raj. Many regions were struck multiple times during this period, to the extent that historian Mike Davis characterizes them as “Victorian Holocausts” (Davis 2001 p.9). While the short-term impacts of famines are indisputable, their long-term effects on economic development, perhaps through human development patterns, are less widely understood.

The United Kingdom formally ruled India from 1857 to 1947, following an earlier period of indirect rule by the East India Company. The high tax rate imposed on peasants in rural and agricultural India was a principal characteristic of British governance. Appointed intermediaries, such as the landowning zamindar caste in Bengal, served to collect these taxes. Land taxes imposed on farmers often ranged from one-half to two-thirds of their produce, but could be as high as ninety to ninety-five percent. Many of the intermediaries coerced their tenants into farming only cash crops instead of a mix of cash crops and food crops (Dutt 2001). Aside from high taxation, a laissez-faire attitude to drought relief was another principal characteristic of British agricultural policy in India.
Most senior officials in the imperial administration believed that serious relief efforts would do more harm than good and, consequently, were reluctant to dispatch aid to afflicted areas (ibid.). The consequences of these two policies were some of the most severe and frequent famines in recorded history, such as the Great Indian Famine of 1893, during which an estimated 5.5 to 10.3 million peasants perished from starvation alone, and over 60 million are believed to have suffered hardship (Fieldhouse 1996).

Our paper focuses on three sets of outcomes in order to assess the long-term impact of famines. First, we consider macroeconomic measures of overall development, such as rural consumption per capita and the composition of the labor force. We also use nighttime luminosity gathered from satellite data as a proxy for GDP, whose measurement from survey data can be unreliable. Second, we look at measures of human development: inequality, social mobility, and education, constructed from the India Human Development Survey I and II. Finally, we examine infrastructure, computing effects on village-level electrification, numbers of medical centers, and bus service availability.

To examine impacts, we regress these outcomes on famine occurrence via ordinary least-squares (OLS). We use an instrumental-variables (IV) approach to ensure a causal interpretation via as-good-as-random assignment (1). We first estimate famine occurrence, the endogenous independent variable, as a function of rainfall shocks, a plausibly exogenous instrument, before regressing outcomes on predicted famine occurrence via two-stage least-squares (2SLS). Since the survey data are comparatively limited, we transform and aggregate panel data on rainfall and famines as counts in order to use them in a cross-section with the contemporary outcomes.
We find for many outcomes that there is indeed a marginal effect of famines in the long run, although where it is significant it is often quite small. Where famines do have a significant impact on contemporary outcomes, the results follow an interesting pattern: a higher rate of famine occurrence in a given district is associated with greater economic development yet worse rural outcomes and higher inequality. Specifically, famine occurrence has a small but positive impact on nighttime luminosity (our proxy for economic development) and smaller, negative impacts on rural consumption and the proportion of adults with a college education. At the same time, famine occurrence is also associated with a higher proportion of the labor force being employed in the agricultural sector as well as a higher level of inequality as measured by the Gini index (2). Moreover, we find limited evidence that famine occurrence has a slightly negative impact on infrastructure, as more famines are associated with reduced access to medical care and bus service. We do not find that famines have any significant impact on social mobility (specifically, intergenerational income mobility) or on infrastructure such as electrification in districts.

This finding contradicts much of the established literature on natural disasters, which has predominantly found large and wholly negative effects. We attempt to explain this disparity by analyzing the impact of famines on urbanization rates to show that famine occurrence may lead to a worsening urban-rural gap in long-run economic development. Thus, we make an important contribution to the existing literature and challenge past research with one of our key findings: short-term natural disasters which do not destroy physical infrastructure may have unexpectedly positive outcomes in the long run.
While the instrumental estimates are free of omitted variable bias provided the instrument is valid, the OLS standard errors allow for more precise judgments due to smaller confidence intervals. In around half of our specifications, the Hausman test for endogeneity fails to reject the null hypothesis of exogeneity, indicating that the ordinary least-squares and instrumental variables results are equally valid (3). However, the instrumental variables estimate helps address other problems, such as attenuation bias due to possible measurement error (4).

Section 2 presents a review of the literature and builds a theoretical framework for understanding the impacts of famines on modern-day outcomes. Section 3 describes our data, variable construction, and summary statistics. Sections 4 and 5 present our results using ordinary least-squares and instrumental two-stage least-squares approaches. Section 6 discusses and attempts to explain these results.

2. Review and Theoretical Framework

1. The Impact of Natural Disasters

Most of the current literature on natural disasters as a whole pertains to physically destructive phenomena such as severe weather or seismic events. Moreover, most empirical studies, such as Nguyen et al. (2020) and Sharma and Kolthoff (2020), focus on short-run aspects of natural disasters relating to various facets of proximate causes (Huff 2020) or pathways of short-term recovery (Sharma and Kolthoff 2020). Famines are a unique kind of natural disaster in that they greatly affect crops, people, and animals but leave physical infrastructure and habitation relatively unaffected. We attempt to take this element of famines into account when explaining our results.

Of the portion of the literature that focuses on famines, most results center on individual biological outcomes such as height and nutrition (Cheng and Hui Shui 2019) or disease (Hu et al. 2017). Some of the remaining studies focus on long-term socioeconomic effects at the individual level (Thompson et al. 2019). The handful of papers that do analyze broad long-term socioeconomic outcomes, such as Ambrus et al. (2015) and Cole et al. (2019), all deal with either long-term consequences of a single, especially severe natural disaster or the path dependency effects that may occur because of the particular historical circumstances of when a disaster occurs, such as in Dell (2013). On the other hand, our analysis spans several occurrences of the same type of phenomenon in a single, relatively stable sociohistorical setting, thereby utilizing a much larger and more reliable sample of natural disasters. Thus, our paper is the first to examine the long-term effects of a very specific type of natural disaster, famine, on the overall development of an entire region, by considering multiple occurrences thereof.
Prior econometric literature on India’s famine era has highlighted other areas of focus, such as Burgess and Donaldson (2012), which shows that trade openness helped mitigate the catastrophic effects of famine. There is also plenty of historical literature on the causes and consequences of the famines, most notably in academic analyses from British historians (contemporarily, Carlyle 1900 and Ewing 1919; more recently, Fieldhouse), which tend to focus on administrative measures, or more specifically, the lack thereof.

In terms of the actual effects of famine, the established literature asserts that natural disasters influence economic growth overwhelmingly through two main channels: the destruction of infrastructure and the resulting loss of human capital (Lima and Barbosa 2019, Nguyen et al. 2020, Cole et al. 2019), and sociopolitical consequences such as armed conflict (Dell 2013, Huff 2019). Famines pose an interesting question in this regard since they tend to result in severe loss of human capital through population loss due to starvation but generally result in smaller-scale infrastructure losses (Agbor and Price 2013). This is especially the case for rural India, which suffered acute famines while having little infrastructure in place (Roy 2006). We examine three types of potential outcomes: overall economic development, social mobility, and infrastructure, as outlined in section three. Our results present a novel finding in that famine occurrence seems to positively impact certain outcomes while negatively impacting most others, which we attempt to explain by considering the impact of famines on urbanization rates.

Famines can impact outcomes through various mechanisms; therefore, we leave the exact causal mechanism unspecified and instead treat famines as generic shocks with subsequent recovery of unknown speed. If famines strike repeatedly, their initially small long-term effects on outcomes can accumulate.
Our intuition for distinguishing a long-run effect of famines rests on a simple growth model in which flow variables such as growth quickly return to the long-run average after a shock, but stock variables such as GDP or consumption only return to the average asymptotically (5). Thus, over finite timespans, the differences in stock variables between districts that undergo famines and those that do not should be measurable even after multiple decades. As mentioned below, this is in line with more recent macroeconomic models of natural disasters such as Hochrainer (2009) and Bakkensen and Barrage (2018).

Assume colonial districts (indexed by i) suffer n_i famines over the time period (in our data, the years 1870 to 1930), approximated as average constant rates f_i. The occurrence of famine can then be modeled by a Poisson process with interval parameter f_i, which represents the expected time between famines, even though the exact time is random and thus unknown until it is realized (6). For simplicity, we assume that famines cause damage d to a district’s economy, for which time r_i is needed to recover to its assumed long-run, balanced growth path (7). We make no assumptions on the distributions of d and r_i except that r_i is dependent on d and that the average recovery time E[r_i] is similarly a function of E[d]. If the district had continued on the growth path directly without the famine, absent any confounding effects, it would counterfactually have more positive outcomes today by a factor dependent on n_i E[t_f] and thus n_i, the number of famines suffered.
We cannot observe the counterfactual (the outcome in the affected district had it not experienced a famine), so instead we use the unaffected districts in the sample as our comparison group. Controlling for factors such as population and existing infrastructure, each district should provide a reasonably plausible counterfactual for the other districts in terms of the number of famines suffered. Then, the differences in outcomes among districts measured today, y_i, can be modeled as a function of the differences in the number of famines, n_i. Finally, across the entire set of districts, this can be used to represent the average outcome E[y_i] as a function of the number of famines, which forms the basis of our ordinary least-squares approach in section four. This assumes that famine occurrence is uncorrelated with the unobserved determinants of the outcome. To account for the possibility that this correlation is non-zero, we also use rainfall shocks to isolate the as-good-as-random part of our independent variable. The use of rainfall shocks, in turn, forms the basis of our instrumental variables approach in section five.

The important question is the nature of the relationship between d and r_i. While f can be easily inferred from our data, d and especially r are much more difficult to estimate without detailed, high-level, and accurate data. Since the historical record is insufficiently detailed to allow precise estimation of the parameters of such a model, we constrain the effects of famine to be linear in our estimation in sections four and five.
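The famine arrival process assumed in the model above can be illustrated with a short simulation. This is only a sketch: the mean gap of ten years between famines, the function names, and the number of simulated districts are hypothetical choices for illustration, and the damage and recovery terms d and r_i are left out because the text places no distributional assumptions on them.

```python
import random


def simulate_famine_count(mean_gap_years, horizon_years, rng):
    """Count famines in one district over `horizon_years`.

    Famines arrive as a Poisson process, so the gaps between them are
    exponentially distributed with mean `mean_gap_years` (f_i in the text).
    """
    elapsed = 0.0
    count = 0
    while True:
        # random.expovariate takes the rate, i.e. 1 / (mean gap).
        elapsed += rng.expovariate(1.0 / mean_gap_years)
        if elapsed > horizon_years:
            return count
        count += 1


def average_famine_count(mean_gap_years, horizon_years=60, draws=2000, seed=0):
    """Average simulated famine count across many hypothetical districts."""
    rng = random.Random(seed)
    counts = [
        simulate_famine_count(mean_gap_years, horizon_years, rng)
        for _ in range(draws)
    ]
    return sum(counts) / draws


# Hypothetical district: one famine every 10 years on average, 1870 to 1930.
average = average_famine_count(mean_gap_years=10)
print(f"average simulated famines over 60 years: {average:.2f}")
```

With a mean gap of ten years, the expected count over the sixty-year window is 60/10 = 6, so the simulated average should land close to that value; individual districts vary around it, which is the cross-district variation in n_i that the estimation exploits.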

2. Estimation

Having constrained the hypothesized effects of famine to be linear, in section four we would prefer to estimate (1) below, where the coefficient on famine_i represents our estimate of the effect of famine severity (famine_i), measured as the number of famines undergone by the district, on the outcome variable y_i, and X_i is a vector of contemporary (present-day) covariates, such as mean elevation and soil quality. The constant term captures the mean outcomes across all districts, and the last term is a district-specific error term.
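Written out from the description above, equation (1) takes the following linear form. The Greek symbol names are assumptions made for this sketch, since only the roles of the terms (constant, famine coefficient, covariate vector, error term) are stated in the text:

```latex
\begin{equation}
y_i = \alpha + \beta\,\mathit{famine}_i + X_i'\gamma + \varepsilon_i \tag{1}
\end{equation}
```

Here \alpha is the constant term, \beta is the effect of famine occurrence on the outcome, \gamma collects the coefficients on the covariates X_i, and \varepsilon_i is the district-specific error term.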

Much of the research on famine occurrence in colonial India attributes the occurrence of famines and their consequences to poor policies and administration by the British Raj. If this is the case, and these same policies hurt the development of districts in other ways, such as by stunting industrialization directly, then the estimation of (1) will not show the correct effect of famines per se on comparative economic development. Additionally, our observations of famines, which are taken indirectly from district-level colonial gazetteers and reports, may be subject to “measurement” error that is non-random. For example, the reporting of famines in such gazetteers may be more accurate in well-developed districts that received preferential treatment from British administrators.

To solve this problem, we turn to the examples of Dell et al. (2012), Dell (2013), Hoyle (2010), and Donaldson and Burgess (2012), who use weather shocks as instruments for natural disaster severity. While Dell (2013) focuses on historical consequences arising from path dependency and Hoyle (2010) centers on productivity, the instrumental methodology itself is perfectly applicable to our work. Another contribution of our paper is to further the use of climate shocks as instruments, which fit the two main criteria for an instrumental variable. First, weather shocks are extremely short-term phenomena, so their occurrence is unlikely to be correlated with longer-term climate factors that may impact both historical and modern outcomes. Second, they are reasonably random and provide exogenous variation with which we can estimate the impact of famines in an unbiased manner. We first estimate equation (2) below before estimating (1) using the predicted occurrence of famine from (2):
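A first stage consistent with this description regresses the famine count on the rainfall-shock count and the same controls. The symbols \pi_0, \pi_1, \delta, and \nu_i are assumed names for this sketch; the text only establishes that the equation has a constant, a rainfall term, the control variables, and an error term:

```latex
\begin{equation}
\mathit{famine}_i = \pi_0 + \pi_1\,\mathit{rainfall}_i + X_i'\delta + \nu_i \tag{2}
\end{equation}
```

The fitted values from this first stage then replace the observed famine count when the outcome equation (1) is estimated in the second stage.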

We calculate famine as the number of reported events occurring in our panel for a district and rainfall as the number of years in which the deviation of rainfall from the mean falls below a certain threshold, nominally the fifteenth and tenth percentiles of all rainfall deviations for that district. As in (1), there is a constant term and an error term. As is standard practice, we include the control variables in the first stage even though they are quite plausibly unrelated to the rainfall variable. This allows us to estimate the impacts of famine with a reasonably causal interpretation; since the assignment of climate shocks is ostensibly random, using them to “proxy” for famines in this manner is akin to “as good as random” estimation. The only issue with this first-stage specification is that while we instrument counts of famine with counts of low-rainfall years, the specific years in which low rainfall occurs theoretically need not match up with years in which famine is recorded in a given district. Therefore, we would prefer to estimate (3) below instead, since it provides additional identification through a panel dataset. Any other climate factors should be demeaned out by the time effects. Other district characteristics that may influence agricultural productivity and therefore famine severity, such as soil quality, should be differenced out with the district effects.
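A panel first stage of the kind described, with district and time effects, could be written as follows. This is a hedged sketch rather than the specification itself: the subscript t for years and the symbols \mu_i (district effects), \lambda_t (time effects), and \nu_{it} are assumed names, and any further terms the specification may carry are not recoverable from the description:

```latex
\begin{equation}
\mathit{famine}_{it} = \pi_1\,\mathit{rainfall}_{it} + \mu_i + \lambda_t + \nu_{it} \tag{3}
\end{equation}
```

In this form, \lambda_t absorbs climate factors common to all districts in a given year, and \mu_i absorbs time-invariant district characteristics such as soil quality.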

Differences in administrative policy should be resolved with provincial fixed effects. Unfortunately, we would then be unable to implement the standard instrumental variables practice of including the control variables in both stages since our modern-day outcomes are cross-sectional (i.e., we only have one observation per district for those measures). Nevertheless, our specification in (2) should reasonably provide randomness that is unrelated to long-term climate factors, as mentioned above. Finally, we collapse the panel by counting the number of famines that occur in the district over time in order to compare famine severity with our cross-sectional modern-day outcomes and to get an exogenous count measure of famine that we can use de novo in (1).

To account for sampling variance in our modern-day estimates, we use error weights constructed from the current population of each district, meaning that our approach in section 5 is technically weighted least-squares, not ordinary. While this should account for heteroscedasticity in the modern observations, we use robust SM estimators in our estimations (McKean 2004, Barrera and Yohai 2006) to ensure that our standard errors on the historical famine and rainfall variables are correct (8). The results of these approaches are detailed in section six.
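The two-stage logic described in this section can be illustrated on synthetic data. The sketch below is a deliberately stripped-down version under stated assumptions: one instrument, no control variables, no population weights, and no robust estimators, so it is not the paper's estimation procedure. All data are simulated, and the variable names and the true effect of 0.5 are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from a simple regression of y on x with a constant."""
    return covariance(x, y) / covariance(x, x)


def two_stage_least_squares(instrument, treatment, outcome):
    """2SLS slope with one instrument and no controls."""
    # First stage: predict the treatment (famine count) from the instrument.
    slope = ols_slope(instrument, treatment)
    intercept = mean(treatment) - slope * mean(instrument)
    predicted = [intercept + slope * z for z in instrument]
    # Second stage: regress the outcome on the predicted treatment.
    return ols_slope(predicted, outcome)


# Synthetic data (not the paper's): an unobserved confounder raises famine
# counts and lowers the outcome, so plain OLS is biased downward.
rng = random.Random(0)
TRUE_EFFECT = 0.5
n_districts = 5000
rainfall_shocks = [rng.randint(0, 10) for _ in range(n_districts)]
confounder = [rng.gauss(0, 1) for _ in range(n_districts)]
famines = [z + u + rng.gauss(0, 1) for z, u in zip(rainfall_shocks, confounder)]
outcome = [
    TRUE_EFFECT * f - 2.0 * u + rng.gauss(0, 1)
    for f, u in zip(famines, confounder)
]

beta_ols = ols_slope(famines, outcome)
beta_iv = two_stage_least_squares(rainfall_shocks, famines, outcome)
print(f"OLS: {beta_ols:.3f}  2SLS: {beta_iv:.3f}  true: {TRUE_EFFECT}")
```

Because the simulated rainfall shocks are independent of the confounder, the 2SLS slope should sit near the true effect while the OLS slope is pulled away from it, which is the motivation for instrumenting famine counts with rainfall shocks.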

3. Data

1. Sources and Description

Our principal data source is a historical panel, compiled from a series of colonial district gazetteers by Srivastava (1968), which details famine severity at the district level over time in British India from 1870 to 1930. Donaldson and Burgess (2010) then code these into an ordinal scale by using the following methodology:

4 – District mentioned in Srivastava’s records as “intensely affected by famine”
3 – District mentioned as “severely affected”
2 – Mentioned as “affected”
1 – Mentioned as “lightly affected”
0 – Not mentioned
9 – Specifically mentioned as being affected by spillover effects from a neighboring district (there are only four such observations, so we exclude them)

In our own coding of the data, we categorize famines as codes 2, 3, and 4, with severe famines corresponding to codes 3 and 4. We compute further cross-sectional measures, chiefly the total number and proportion of famine-years that a district experienced over the sixty-year period. This is equivalent to tabulating the frequency of code occurrences and adding the resulting totals for codes 2 to 4 to obtain a single count measure of famine. Our results are robust to using “severe” (codes 3 and 4) famines instead of codes 2, 3, and 4. Across the entire panel, codes from 0 to 4 occurred with the following frequencies: 4256, 35, 207, 542, and 45 respectively.

We also supplemented this panel with panel data on rainfall over the same time period. Several thousand measuring stations across India collected daily rainfall data over the time period, which Donaldson (2012) annualizes and compares with crop data. The rainfall data in Donaldson (2012) represent the total rainfall in a given district over a year, categorized by growing seasons of various crops (for example, the amount of total rainfall in a district that fell during the wheat growing season). Since different districts likely had different shares of crops, we average over all crops to obtain an approximation of total rainfall over the entire year. We additionally convert this into a more relevant measure in the context of famine by considering only the rainfall that fell during the growing seasons of crops typically grown for consumption in the dataset: bajra, barley, gram (bengal), jowar (sorghum), maize, ragi (millet), rice, and wheat. Finally, to ensure additional precision over the growing season, we simply add rainfall totals during the growing seasons of the two most important food crops, rice and wheat, which make up over eighty percent of food crops in the country (World Bank, UN-FAOSTAT).
The two crops have nearly opposite growing seasons, so the distribution of rainfall over the combined growing seasons serves as an approximation of total annual rainfall. Our results are robust with regard to all three definitions; the pairwise correlations between the measures are never less than ninety percent. Moreover, the cross-sectional famine instruments constructed from these are almost identical, as the patterns in each type of rainfall (that is, their statistical distributions over time) turn out to be the same.

As expected, there appears to be significant variation in annual rainfall. The example of the Buldana district (historically located in the Bombay presidency, now in Maharashtra state) highlights this trend, as shown in Figure 1. In general, the trends for both measures of rainfall over time are virtually indistinguishable aside from magnitude. As anticipated, famine years are marked by severe and/or sustained periods of below-average rainfall, although the correlation is not perfect. There are a few districts which have years with low rainfall and no recorded famines, but this can mostly be explained by a lack of sufficient records, especially in earlier years. On the opposite end of the spectrum, there are a few districts that recorded famines despite above-average rainfall, which could possibly be the result of non-climatic factors such as colonial taxation policies, conflicts, or other natural disasters, such as insect plagues. However, the relationship between rainfall patterns and famine occurrence suggests that we can use the former as an instrument for the latter, especially since the correlation is not perfect and famine occurrence is plausibly non-random due to the impact of British land ownership policies.
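The famine coding described at the start of this subsection amounts to a simple tabulation. The sketch below uses the panel-wide code frequencies reported in the text; the function name and the short example series for a single district are hypothetical.

```python
# Frequencies of codes 0 to 4 across the whole panel, as reported in the text.
CODE_FREQUENCIES = {0: 4256, 1: 35, 2: 207, 3: 542, 4: 45}

FAMINE_CODES = {2, 3, 4}   # any famine
SEVERE_CODES = {3, 4}      # severe famine only


def count_famine_years(codes, famine_codes=FAMINE_CODES):
    """Number of years in a district's series whose code counts as a famine."""
    return sum(1 for code in codes if code in famine_codes)


total_district_years = sum(CODE_FREQUENCIES.values())
famine_years = sum(n for c, n in CODE_FREQUENCIES.items() if c in FAMINE_CODES)
severe_years = sum(n for c, n in CODE_FREQUENCIES.items() if c in SEVERE_CODES)

print(total_district_years, famine_years, severe_years)
print(f"famine share: {famine_years / total_district_years:.3f}")
print(f"severe share: {severe_years / total_district_years:.3f}")

# Hypothetical eight-year series of codes for one district.
example_codes = [0, 0, 2, 3, 0, 1, 4, 0]
example_famines = count_famine_years(example_codes)
example_severe = count_famine_years(example_codes, SEVERE_CODES)
```

From the reported frequencies, famine codes account for 794 of 5,085 district-years (roughly 15.6 percent) and severe codes for 587 (roughly 11.5 percent), shares that are close to the fifteenth-percentile and decile cutoffs used for the rainfall-based counts.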

Figure 1: Rainfall over time for Buldana from 1870 to 1920

Notes: The dashed line shows mean rainfall for all food crops; the solid line shows the total rainfall over the wheat and rice growing seasons. The blue and purple lines represent the historical means for these measures of rainfall. The red shading denotes years in which famines are recorded as having affected the district.
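As defined in the estimation section, the rainfall instrument counts the years in which a district's rainfall deviation from its historic mean falls below a percentile cutoff. A minimal sketch of that counting step follows. The rainfall figures and district names are made up, the nearest-rank percentile rule is one of several possible conventions, and taking the cutoff from deviations pooled across districts is an assumption of this sketch; the text can also be read as using a district-specific cutoff.

```python
def rainfall_deviations(annual_rainfall):
    """Deviation of each year's rainfall from the district's historic mean."""
    historic_mean = sum(annual_rainfall) / len(annual_rainfall)
    return [rain - historic_mean for rain in annual_rainfall]


def nearest_rank_percentile(values, q):
    """Value at the q-th percentile (integer q, 1 to 100), nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceiling of q*n/100
    return ordered[rank - 1]


def count_rainfall_shocks(annual_rainfall, cutoff):
    """Number of years whose rainfall deviation is at or below the cutoff."""
    return sum(1 for dev in rainfall_deviations(annual_rainfall) if dev <= cutoff)


# Hypothetical annual rainfall (arbitrary units) for two made-up districts.
rainfall = {
    "district_a": [100, 90, 110, 40, 105, 95, 100, 50, 110, 100],
    "district_b": [80, 85, 75, 80, 90, 70, 80, 85, 75, 80],
}

# Assumption: the cutoff is taken from deviations pooled across districts.
pooled = [dev for series in rainfall.values() for dev in rainfall_deviations(series)]
famine_cutoff = nearest_rank_percentile(pooled, 15)  # all famines
severe_cutoff = nearest_rank_percentile(pooled, 10)  # severe famines

shock_counts = {
    name: count_rainfall_shocks(series, famine_cutoff)
    for name, series in rainfall.items()
}
print(shock_counts)
```

The resulting per-district counts play the role of the rainfall variable in the first stage, with the lower cutoff giving the corresponding count for severe famines.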

We construct count instruments for famines by first computing the historic mean and annual deviation for rainfall in each district. We then count, as instrumented famines, the years in which the deviation fell in the bottom fifteen percent, in order to capture relatively severe and negative rainfall shocks as plausible famine causes. For severe famines, we use the bottom decile instead. The percentiles were chosen based on famine severity so that the counts obtained using this definition were as similar as possible to the actual counts constructed from recorded famines (see above) in the panel dataset.

For modern-day outcomes, we turn to survey data from the Indian census as well as the India Human Development Survey II, which details personal variables (e.g., consumption and education), infrastructure measures (such as access to roads), and access to public goods (e.g., hospital availability) at a very high level of geographical detail. An important metric constructed from the household development surveys is that of intergenerational mobility as measured by the expected income percentile of children whose parents belonged to a given income percentile, which we obtain from Novosad et al. (2019). Additionally, as survey data can often be unreliable, we supplement these with an analysis of satellite luminosity data, which provides measures of the (nighttime) luminosity of geographic cells and should serve as a more reliable proxy for economic development, following Henderson et al. (2011) and Pinkovskiy and Sala-i-Martin (2016). These data are mostly obtained from Novosad et al. (2018, 2019) and Iyer (2010), and we have aggregated them to the district level. The outcome variables are as follows:

1. Log absolute magnitude per capita. We intend this to serve as a proxy for a district’s economic development in lieu of reliable GDP data. This is the logarithm of the total luminosity observed in the district divided by the district’s population. These are taken from Henderson et al. (2011) by way of Novosad et al. (2018).
2. Log rural consumption per capita. This is taken from the Indian Household Survey II by way of Novosad et al. (2019).
3. Share of the workforce employed in the cultivation sector, intended as a measure of rural development and reliance on agriculture (especially subsistence agriculture). This is taken from Iyer et al. (2010).
4. Gini Index, from Iyer (2010), as a measure of inequality.
5. Intergenerational income mobility (father-son pairs), taken from Novosad et al. (2018). Specifically, we consider the expected income percentile of sons in 2012 whose fathers were located in the 25th percentile for household income (2004), using the upper bound for robustness (9).
6. The percentage of the population with a college degree, taken from census data.
7. Electrification, i.e. the percent of villages with all homes connected to the power grid (even if power is not available twenty-four hours per day).
8. Percent of villages with access to a medical center, taken from Iyer (2010), as a measure of rural development in the aspect of public goods.
9. Percent of villages with any bus service, further intended as a measurement of public goods provision and infrastructure development.

Broadly speaking, these can be classified into three categories, with 1-3 representing broad measures of economic development, 4-6 representing inequality and human capital, and 7-9 representing the development of infrastructure and the provision of public goods. As discussed in section two, our preliminary hypothesis is that the occurrence of famines has a negative effect on district development, which is consistent with most of the literature on disasters. Hence, we expect that districts suffering from more famines during the colonial period will be characterized by lower levels of development, being (1) less luminous at night, (2) poorer in terms of lower rural consumption, and (3) more agricultural, i.e., having a higher share of the labor force working in agriculture. Similarly, with regard to inequality and human capital, we expect that more famine-afflicted districts will have (4) higher inequality in terms of a higher Gini index, (5) lower upward social mobility in terms of a lower expected income percentile for sons whose fathers were at the 25th income percentile, and (6) a lower percentage of adults with a college education. Finally, by the same logic, these districts should be relatively underdeveloped in terms of infrastructure, and thus (7) lack access to power, (8) lack access to medical care, and (9) lack access to transportation services.

Finally, even though our independent variable when instrumented should be exogenous, we attempt to control for geographic and climatic factors affecting agriculture and rainfall in each district, namely:

- Soil type and quality (sandy, rocky or barren, etc.)
- Latitude (degree) and mean temperature (degrees Celsius)
- Coastal location (coded as a dummy variable)
- Area in square kilometers (it should be noted that district boundaries correspond well, but not perfectly, to their colonial-era counterparts)

As mentioned previously, research by Iyer and Banerjee (2008, 2014) suggests that the type of land-tenure system implemented during British rule has had a huge impact on development in the districts (10). We also argue that it may be related to famine occurrence directly (for example, in that districts with tenure systems favoring landlords may experience worse famines), in light of the emerging literature on agricultural land rights, development, and food security (Holden and Ghebru 2016, Maxwell and Wiebe 1998). Specifically, we consider specifications with and without the proportion of villages in the district favoring a landlord or non-landlord tenure system, obtained from Iyer (2010). In fact, the correlation between the two variables in our dataset is slightly above 0.23, which is not extremely high but enough to be of concern in terms of avoiding omitted variable bias.

We ultimately consider four specifications for each dependent variable based on the controls in X from equation (1): no controls, land tenure, geography, and land tenure with geography. Each set of controls addresses a different source of omitted variable bias. The first, land tenure, addresses the possibility of British land-tenure policies causing both famines and long-term development outcomes. The second, geography, addresses the possibility of factors such as mean elevation and temperature impacting crop growth while also influencing long-term development (for example, if hilly and rocky districts suffer from more famines because they are harder to grow crops in but also suffer from lower development because they are harder to build infrastructure in or access via transportation). We avoid using contemporary controls for the outcome variables (that is, including infrastructure variables, income per capita, or welfare variables on the right-hand side) because many of these could reasonably be the result of the historical effects (the impact of famines) we seek to study.
As such, including them as controls would artificially dilute the impact of our independent variable.

2. Summary statistics Table I presents summary statistics of our cross-sectional dataset on the following page. One cause for potential concern is that out of the over 400 districts in colonial India, we have only managed to capture 179 in our sample. This is due chiefly to a paucity of data regarding rainfall; there are only 191 districts captured in the original rainfall data from Donaldson (2012). In addition, the changing of district names and boundaries over time makes the matching of old colonial districts with modern-day administrative subdivisions more imprecise than we would like. Nevertheless, these districts cover a reasonable portion of modern India as well as most of the regions which underwent famines during imperial rule. The small number of districts may also pose a problem in terms of the standard errors on our coefficients, as the magnitude of the impacts of famines that occurred over a hundred years ago on outcomes today is likely to be quite small.

Table 1 – Summary Statistics

Source: Author calculations, from Iyer (2010), Iyer and Banerjee (2014), Novosad et al. (2018), Asher and Novosad (2019), Donaldson and Burgess (2012).

4. Ordinary Least Squares Although we suspect that estimates of famine occurrence and severity based on recorded historical observations may be nonrandom for several reasons (mentioned in sections two and three), we first consider direct estimation of (1) from section two. For convenience, equation (1) is reprinted below:

As in the previous section, famine refers to the number of years that are coded 2, 3, or 4 in famine severity as described in Srivastava (1968). X is the set of contemporary covariates, also described in section three. We estimate four separate specifications of (1) where X varies:

1. No controls, i.e., X is empty.
2. Historical land tenure, to capture any effects related to British land policy in causing both famines and long-term developmental outcomes.
3. Geographical controls relating to climatic and terrestrial factors, such as temperature, latitude, soil quality, etc.
4. Both (2) and (3).
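The four specifications listed above can be sketched in code as follows. This is a minimal illustration only: the data are synthetic placeholders rather than the paper's dataset, a single `latitude` variable stands in for the full set of geographic controls, and the names `solve`, `ols`, `specifications`, and `TRUE_FAMINE_EFFECT` are hypothetical. The outcome is generated without noise so that the specification containing every relevant control recovers the assumed coefficient.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(y, regressors):
    """OLS coefficients [intercept, b1, b2, ...] from the normal equations."""
    rows = [[1.0] + [col[i] for col in regressors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic placeholder data (NOT the paper's data): 179 hypothetical districts.
rng = random.Random(0)
n = 179
tenure = [rng.random() for _ in range(n)]              # share of tenant-favorable villages
latitude = [rng.uniform(8.0, 32.0) for _ in range(n)]  # stand-in for geographic controls
famine = [max(0, round(5 - 3 * t + 0.1 * lat + rng.gauss(0, 1.5)))
          for t, lat in zip(tenure, latitude)]

# Assumed, noise-free outcome, so specification (d) recovers the coefficient exactly.
TRUE_FAMINE_EFFECT = 0.03
outcome = [TRUE_FAMINE_EFFECT * f + 0.7 * t - 0.01 * lat
           for f, t, lat in zip(famine, tenure, latitude)]

# The four control sets; famine is always the first regressor after the intercept.
specifications = {
    "(a) no controls": [],
    "(b) land tenure": [tenure],
    "(c) geography": [latitude],
    "(d) tenure and geography": [tenure, latitude],
}
famine_coef = {name: ols(outcome, [famine] + controls)[1]
               for name, controls in specifications.items()}
for name, coef in famine_coef.items():
    print(f"{name}: coefficient on famine = {coef:.4f}")
```

In this toy setup, specifications (a) through (c) omit at least one variable that drives both famine and the outcome, so their famine coefficients generally differ from the assumed value; only (d) recovers it.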

Table II presents the estimates for the coefficients on famines and tenure for our nine dependent variables on the following page (we omit coefficients and confidence intervals for the geographic variables for reasons of brevity and relevance in terms of interpretation). In general, the inclusion or exclusion of controls does not greatly change the magnitudes of the estimates or their significance, except in a few cases. We discuss effects for each dependent variable below:

Log of total absolute magnitude in the district per capita: The values for famine suggest that, interestingly, each additional famine is associated with anywhere from 1.8 to 3.6 percent more total nighttime luminosity per person in the district. As mentioned in section three, newer literature shows that nighttime luminosity is a far more reliable gauge of development than reported survey measures such as GDP, so this result is not likely due to measurement error. Thus, as the coefficient on famine is positive, it seems that having suffered more famines is positively related to development. This is in fact confirmed by the instrumental variables (IV) estimates in Table IV (see section five). Curiously, the inclusion of tenure and geography controls separately does not change the significance, but including both of them together in the covariates generates far larger confidence intervals than expected and reduces the magnitude of the effect by an entire order of magnitude. This may be because each set of controls tackles a different source of omitted variable bias. As expected, however, land tenure plays a significant role in predicting a district’s development; even a single percent increase in the share of villages with a tenant-favorable system is associated with 73 to 80 percent additional nighttime luminosity per person.

Log rural consumption per capita: We find evidence that additional famines are associated with lower rural consumption, albeit on a minuscule scale. 
This suggests that the beneficial effect of famines on development may not be equal across urban and rural areas but instead concentrated in cities. For example, there might be a causal pathway that implies faster urbanization in districts that undergo more famines. Unlike with luminosity, historical land tenure does not seem to play a role in rural consumption.

Percent of the workforce employed in cultivation: As expected, additional famines seem to play a strongly significant but small role with regard to the labor patterns in the district. Districts with more famines seem to have nearly one percent more of the labor force working in cultivation for each additional famine, suggesting famines may inhibit development of industries other than agriculture and cultivation. Our instrumental variables estimates confirm this. Puzzlingly, land tenure does not seem to be related to this very much at all.

Gini Index: The coefficients for the number of famines are difficult to interpret, as those for the specification with no controls and for the specification with both sets of controls are statistically significant with similar magnitudes yet opposite signs. The confidence interval for the latter is slightly narrower. This is probably because the true estimate is zero or extremely close to zero, and the inclusion or exclusion of controls shifts the estimate just far enough to flip the sign of the coefficient. In order to clarify this, more data is needed, i.e., more of the districts in colonial India would need to be matched in our sample. At the very least, we can say that land tenure clearly has a large and significant positive association with inequality. Unfortunately, this association cannot be confirmed as causal due to the lack of an instrument for land tenure which covers enough districts of British India. 
However, as Iyer and Banerjee (2014) argue, the assignment of tenure systems itself was plausibly random (having been largely implemented on the whims of British administrators) so that one could potentially interpret the results as causal with some level of caution.

Intergenerational income mobility: Similarly, we do not find evidence of an association between the number of famines suffered by a district in the colonial era and social mobility in the present day, but we do find a strong impact of land tenure, which makes sense given the reported institutional benefits of tenant-favorable systems in encouraging development as well as the obvious benefits for the tenants and their descendants themselves. Each one-percent increase in the share of villages in a district that used a tenant-favorable system in the colonial era is associated with anywhere from ten to thirteen percent higher expected income percentile for sons whose fathers were at the 25th percentile in 1989, although the estimates presented in Table II are an upper bound.

College education: We find extremely limited evidence that famines in the colonial period are associated with less human capital in the present day, with a near-zero effect of additional famines on the share of adults in a district with a college degree (in fact, rounded to zero with five to six decimal places). Land tenure similarly has very little or no effect.

Electrification, access to medical care, bus service: All three of these infrastructure and public goods variables show a negligible effect of famines, but strong impacts of historical land tenure.

Ultimately, we find that famines themselves seem to have some positive impact on long-term development despite also being associated with many negative outcomes, such as a greater share of the workforce employed in agriculture (i.e., as opposed to more developed activities such as manufacturing or services). 
Another finding of note is that while famines do not seem to have strong associations with all of our measures, land tenure does. This suggests that the relationship between land tenure and famine is worth looking into.

The existence of bias in the recording of famines, as well as the potential for factors that both cause famines and affect long-term outcomes, presents a possible problem with these estimates. We have already attempted to account for one of those factors, namely historical land tenure systems. Indeed, in most of the specifications, including tenure in the regression induces a decrease in the magnitude of the coefficient on famine. As the effect of famine tends to be extremely small to begin with, the relationship is not always clear. Other errors are also possible. For example, it is possible that a given district experienced a famine in a given year, but insufficient records of its occurrence remained by 1968. Then, Srivastava (1968) would have assigned that district a code of 0 for that year, but the correct code should have been higher. Indeed, as described in section three, a code of 0 corresponds to a code of “not mentioned”, which encompasses both “not mentioned at all” and “not mentioned as being affected by famine” (Donaldson and Burgess 2010). While measurement error in the dependent variable is usually not a problem, error in the independent variable can lead to attenuation bias in the coefficients: ordinary least squares attributes all deviations from the fitted line to error in the dependent variable, so noise in a regressor weakens its estimated relationship with the outcome. The greater this error, the more the ordinary least-squares method will bias the estimated coefficients towards zero (Riggs et al. 1978). For these reasons, we turn to instrumental variables estimation in section five in an attempt to provide additional identification.
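The attenuation argument above can be made concrete under the textbook assumption of classical measurement error. This is a sketch and not a result from the paper: it assumes a single regressor and an error u_i that is uncorrelated with both the true famine count f_i and the regression error, and the symbols \tilde{f}_i, u_i, \sigma_f^2 and \sigma_u^2 are introduced here purely for illustration. The one-sided miscoding described above (a famine recorded as 0) need not satisfy these assumptions exactly.

```latex
\begin{aligned}
y_i &= \alpha + \beta f_i + \varepsilon_i, \qquad \tilde{f}_i = f_i + u_i, \\
\operatorname{plim}\ \hat{\beta}_{\mathrm{OLS}}
  &= \frac{\operatorname{Cov}(\tilde{f}_i, y_i)}{\operatorname{Var}(\tilde{f}_i)}
   = \beta \cdot \frac{\sigma_f^2}{\sigma_f^2 + \sigma_u^2}
\end{aligned}
```

Because the ratio on the right lies between zero and one, the estimate is shrunk towards zero, and more so as the error variance \sigma_u^2 grows relative to \sigma_f^2.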

Table 2 – Ordinary Least-Squares Estimates

Notes: Independent variable is number of years with recorded famines (famine code of 2 or above). Control specifications: (a) no controls, (b) land-tenure control (proportion of villages with tenant-ownership land tenure system), (c) geographic controls (see section three for enumeration), (d) both land-tenure and geographic controls.

Source: Author calculations. *** Significant at the 1 percent level or below (p ≤ 0.01). ** Significant at the 5 percent level (0.01 < p ≤ 0.05). * Significant at the 10 percent level (0.05 < p ≤ 0.1).

5. Weather Shocks as an Instrument for Famine Severity As explained in section two, there are many possible reasons why recorded famine data may not be exogenous. In any case, it would be desirable to have a truly exogenous measure of famine, for which we turn to climate data in the form of rainfall shocks. Rainfall is plausibly connected to the occurrence of famines, especially in light of the colonial government’s laissez-faire approach to famine relief (Bhatia 1968). For example, across all districts, mean rainfall averaged around 1.31 m in district-years without any famine and around 1.04 m in district-years which were at least somewhat affected by famine (code 1 or above). Figure 2 below shows that there is a very clear association between rainfall activity and famines in colonial India, although variability in climate data as well as in famine and agricultural policy means that there are some high-rainfall districts which do experience famines as well as low-rainfall districts which do not experience as many famines, as noted in section three.

Figure 2: Associations between famine occurrence and rainfall trends

It should be clear from the first three scatterplots above that there is a negative relationship between the amount of rainfall a district receives and the general prevalence of famine and, more importantly, a relationship between the total size of the rainfall shocks and the total occurrences of famine in that district. From the final plot we see that when we classify low-rainfall years by ranking the deviations from the mean, the number of years in which these deviations fall in the bottom fifteen percent corresponds well to the actual number of recorded famines for each district. In order to use this to measure famine exogenously, we first estimate (2) (see below, section two and section three), in which we predict the number of famines from the number of negative rainfall shocks, defined as deviations from the mean in the bottom fifteen percent of all deviations, before estimating (1) using this predicted value of famine in place of the recorded values. Our reduced-form estimates, where we first run (1) using the number of negative rainfall shocks directly, are presented on the following pages in Table III (11). The reduced-form equation is shown as (4) below as well:

Table 3 – Reduced form estimates for IV

Notes: Independent variable is number of years in which deviation of rainfall from the historic mean is in the bottom fifteenth-percentile. Control specifications: (a) no controls, (b) land-tenure control (proportion of villages with tenant-ownership land tenure system), (c) geographic controls (see section three for enumeration), (d) both land-tenure and geographic controls.
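The construction of the low-rainfall count described in the notes above might be sketched as follows. This is an illustration under stated assumptions, not the paper's code: it assumes deviations are taken from each district's own mean and then pooled across all district-years before ranking, and the function name `count_negative_shocks`, the parameter `bottom_percent`, and the two toy districts are hypothetical.

```python
def count_negative_shocks(rainfall_by_district, bottom_percent=15):
    """Count, per district, the years whose rainfall deviation from the district
    mean falls within the lowest `bottom_percent` of all pooled deviations."""
    deviations = {}
    for district, series in rainfall_by_district.items():
        mean = sum(series) / len(series)
        deviations[district] = [value - mean for value in series]

    # Rank every district-year deviation together (an assumption of this sketch).
    pooled = sorted(d for devs in deviations.values() for d in devs)
    cutoff_rank = len(pooled) * bottom_percent // 100  # integer arithmetic, no float rounding
    if cutoff_rank == 0:
        return {district: 0 for district in deviations}
    threshold = pooled[cutoff_rank - 1]

    # A year counts as a negative shock if its deviation is at or below the threshold.
    return {district: sum(1 for d in devs if d <= threshold)
            for district, devs in deviations.items()}

# Toy annual rainfall in millimetres for two hypothetical districts (ten years each).
rainfall_mm = {
    "A": [1000, 1100, 900, 1000, 700, 1300, 1000, 1050, 950, 1000],
    "B": [1500, 1400, 1600, 1200, 1500, 1800, 1500, 1100, 1900, 1500],
}
shocks = count_negative_shocks(rainfall_mm)
print(shocks)  # {'A': 1, 'B': 2}
```

With twenty district-years, the bottom fifteen percent corresponds to the three lowest deviations (here -400, -300 and -300 mm), which yields one shock year for district A and two for district B.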

From Table III, it would appear that negative rainfall shocks have similar effects on the outcome variables as do recorded famines in terms of the statistical significance of the coefficients on the independent variable. There is also the added benefit that we can confirm our very small and slightly negative effects of famines on the proportion of adults with a college education: for each additional year of exceptionally low rainfall in a district, the share of adults with a college education in 2011 decreases by 0.1%. In addition, whereas the coefficients in Table II were conflicting, Table III provides evidence in favor of the view that additional famines increase inequality in a district as measured by the Gini index. However, the magnitudes of the effects of famines or low-rainfall years are predominantly larger than their counterparts in Table II to a rather puzzling extent. While we stated earlier in section three that famines and rainfall are not perfectly correlated, it might be that variation in historical rainfall shocks can better explain variation in outcomes in the present day. In order to get a better understanding of the relationship between the two, it would first be wise to look at the coefficients presented in Table IV, which are the results of the two-stage least-squares estimation using low-rainfall years as an instrument for recorded famines. Table IV follows the patterns established in Table II and Table III with regard to the significance of the coefficients as well as their signs; famines have a statistically significant and positive impact on nighttime luminosity, a significant negative impact on rural consumption, and a positive impact on the percent of the labor force employed in agriculture. The results concerning the impact of famine on the proportion of adults with a college education are also very similar to those in Table II. 
Most other specifications do not show a significant effect of famine on the respective outcome with the exception of access to medical care. Unlike in Table II and Table III, each additional famine is associated with an additional 11.2 to 12.5 percent of villages in that district having some form of medical center or service readily accessible (according to the specifications with geographic controls, which we argue are more believable than the ones without). However, this relationship breaks down at the level of famines seen in some of our districts; a district having suffered nine or ten famines would see more than 100% of its villages having access to medical centers (which is clearly nonsensical), suggesting we may need to look for nonlinearity in the effects of famine in section six. Unfortunately, unlike in Table III, it seems that we cannot conclude much regarding the effect of famines on intergenerational mobility as the coefficients are contradictory and generally not statistically significant. For example, the coefficient on famine in the model without any controls is highly significant and positive, but the coefficient in the model with all controls is not significant and starkly negative. The same is true for the effect of famines on the Gini index. One possibility is that the positive coefficients on famine for both of these dependent variables are driven by outliers as our data was relatively limited due to factors mentioned in section 2. The magnitudes of the coefficients in Table IV are generally smaller than those presented in Table III but still significantly larger than the ones in Table II. 
For example, in Table II, the ordinary least-squares model suggests that each additional historical famine is associated with an additional 0.5 to 0.9 percent of the district’s workforce being employed in cultivation in 2011, but in Table IV, these numbers range from 1.5 to 4.3 percent for the same specifications, representing almost a tenfold increase in magnitude in some cases. One reason for this is the possibility of attenuation bias in the ordinary least-squares regression; the instrumental variables results should not suffer from attenuation bias, because an instrument that is uncorrelated with the measurement error in the recording of famines, as we assume ours is, removes that source of bias (Durbin 1954). On the other hand, the Hausman test for endogeneity (a standard econometric check on whether a regressor can be treated as exogenous) often fails to reject the null hypothesis that the recorded famine variable taken from Srivastava (1968) and Donaldson and Burgess (2012) is exogenous. To be precise, in one sense the test fails to reject the null hypothesis that the rainfall data add no new “information” beyond what is already captured in the reported famine data. It is possible that our rainfall instrument, as used in equation (2), is invalid due to endogeneity with the regression model specified in equation (1) despite being excluded from it. The only way to test this possibility is to conduct a Sargan-Hansen test on the model’s overidentifying restrictions; however, we are unable to conduct the test because we have a single instrument for a single endogenous regressor, so our model is exactly identified rather than overidentified (12).
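The relationship between the ordinary least-squares, reduced-form, and instrumental variables estimates discussed above can be illustrated with a small simulation. This is a sketch on synthetic data under assumptions chosen for illustration (one instrument, no controls, classical measurement error in recorded famines, and a true effect of 1.0); none of the numbers correspond to the paper's tables, and all variable names are hypothetical. With a single instrument, the two-stage least-squares estimate equals the reduced-form coefficient divided by the first-stage coefficient.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(42)
n = 20000
TRUE_EFFECT = 1.0  # assumed effect of true famine exposure on the outcome

shock = [rng.gauss(0, 1) for _ in range(n)]                    # instrument (rainfall shocks)
true_famine = [0.5 * z + rng.gauss(0, 1) for z in shock]       # unobserved true exposure
recorded_famine = [f + rng.gauss(0, 1) for f in true_famine]   # observed with error
outcome = [TRUE_EFFECT * f + rng.gauss(0, 1) for f in true_famine]

# OLS of the outcome on the mismeasured regressor: attenuated towards zero.
ols = cov(recorded_famine, outcome) / cov(recorded_famine, recorded_famine)

# First stage: recorded famine on the instrument. Reduced form: outcome on the instrument.
first_stage = cov(shock, recorded_famine) / cov(shock, shock)
reduced_form = cov(shock, outcome) / cov(shock, shock)

# With one instrument and no controls, 2SLS is the ratio of the two.
iv = reduced_form / first_stage

print(f"OLS: {ols:.3f}  first stage: {first_stage:.3f}  "
      f"reduced form: {reduced_form:.3f}  IV: {iv:.3f}")
```

In this setup the ordinary least-squares estimate settles well below the assumed effect while the instrumental variables estimate is close to it, which is the pattern one would expect if measurement error in recorded famines were the main source of the gap between the two sets of estimates.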

Table 4 – Instrumental Variables Estimates

Notes: Independent variable is number of years with recorded famines (famine code of 2 or above), instrumented with number of low-rainfall years (rainfall deviation from historic mean in bottom fifteenth percentile). Control specifications: (a) no controls, (b) land-tenure control (proportion of villages with tenant-ownership land tenure system), (c) geographic controls (see section three for enumeration), (d) both land-tenure and geographic controls.

Source: Author calculations.

*** Significant at the 1 percent level or below (p ≤ 0.01). ** Significant at the 5 percent level (0.01 < p ≤ 0.05). * Significant at the 10 percent level (0.05 < p ≤ 0.1).

We also need to consider the viability of our instrumental variables estimates. Table V on the following page offers mixed support. While the weak-instrument test always rejects the null hypothesis of instrument weakness, the first-stage F-values (the test statistics of interest) are relatively small for models with more controls, namely those with geographic controls. This is not encouraging, as a value of ten or more is generally recommended to be assured of instrument strength (Staiger and Stock 1997) (13). In Table IV, we show confidence intervals obtained by inverting the Anderson-Rubin test, which accounts for instrument strength in determining the statistical significance of the coefficients. These are wider in the models with more controls, although not usually wide enough to move coefficients from statistically significant to statistically insignificant. However, additional complications arise when considering the Hausman tests for endogeneity. The p-values in Table V suggest that around half of the regression specifications in Table IV do not suffer from a lack of exogeneity, meaning that the ordinary least-squares results are just as valid for those specifications. A more serious issue is that the Hausman test rejects the null hypothesis of exogeneity for four out of nine outcome variables. Combined with the fact that the first-stage F-statistics are concerningly low for the specifications with geographic controls, this means that not only are the ordinary least-squares results likely to be biased, but the instrumental variables estimates are also likely to be imprecise. This is most concerning for the results related to rural consumption and percent of the workforce in agriculture. Conversely, the results for nighttime luminosity are not affected, as the Hausman tests do not reject exogeneity for that outcome variable. 
While we might simply use the ordinary least-squares results to complement those obtained via two-stage least-squares, the latter are lacking in instrument strength. More importantly, the differences in magnitude between the coefficients presented in Table II and in Table IV are too large to allow this use without abandoning consistency in the interpretation of the coefficients. Ultimately, given that the Hausman tests show that instrumentation is at least somewhat necessary, and that the actual p-values for the weak-instrument test are still reasonably low (being less than 0.05 even in the worst case), we prefer to uphold the instrumental variables results, imperfect as some of them may be. We argue that it is better to have unbiased estimates from the instrumental variables (IV) procedure, even if they may be less precise, than to risk biased results due to endogeneity problems present in ordinary least squares (OLS).
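The rule of thumb on first-stage F-statistics discussed above can be illustrated with a minimal sketch. It assumes a first stage with one instrument and an intercept and no further controls, which is simpler than the specifications with geographic controls in the paper (those would require a partial F-statistic); the function name `first_stage_f` and the toy data are hypothetical.

```python
def first_stage_f(endogenous, instrument):
    """F-statistic for the slope in a regression of `endogenous` on one instrument
    plus an intercept (equal to the squared t-statistic in this case)."""
    n = len(endogenous)
    mean_x = sum(endogenous) / n
    mean_z = sum(instrument) / n
    s_zz = sum((z - mean_z) ** 2 for z in instrument)
    s_zx = sum((z - mean_z) * (x - mean_x) for z, x in zip(instrument, endogenous))
    s_xx = sum((x - mean_x) ** 2 for x in endogenous)
    explained = s_zx ** 2 / s_zz      # variation in the regressor explained by the instrument
    residual = s_xx - explained       # variation left unexplained
    return explained / (residual / (n - 2))

low_rainfall_years = [0, 1, 2, 3, 4, 5]
# Recorded famines that track the instrument closely: a strong first stage.
strong = first_stage_f([0.1, 0.9, 2.2, 2.8, 4.1, 4.9], low_rainfall_years)
# Recorded famines unrelated to the instrument: a weak first stage.
weak = first_stage_f([1, 3, 2, 2, 3, 1], low_rainfall_years)

print(f"strong instrument F = {strong:.1f}, weak instrument F = {weak:.1f}")
print("passes rule of thumb (F >= 10):", strong >= 10, weak >= 10)
```

The threshold of ten is the conventional benchmark cited in the text; an F-statistic below it does not by itself invalidate the instrument but signals that the instrumental variables estimates may be imprecise.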

Table 5 – Instrumental Variables Diagnostics

Notes: The weak-instrument test p-value is obtained from comparison of the first-stage F-statistic with the chi-square distribution with degrees of freedom corresponding to the model (number of data points minus number of estimands). Independent variable is number of years in which deviation of rainfall from the historic mean is in the bottom fifteenth-percentile. Control specifications: (a) no controls, (b) land-tenure control (proportion of villages with tenant-ownership land tenure system), (c) geographic controls (see section three for enumeration), (d) both land-tenure and geographic controls.

Source: Author calculations.

6. Discussion Our data suggest that there are long-run impacts of historical famines. Tables II, IV, and VII clearly show that the number of historical famines has a statistically significant, though small, impact on the following: average level of economic development as approximated by nighttime luminosity, the share of the population employed in cultivation, consumption, inequality, and the provision of medical services in contemporary Indian districts. There appear to be no discernible effects on intergenerational income mobility or basic infrastructure such as electrification. The effects are quite small and are generally overshadowed by other geographical factors such as climate (i.e., latitude and temperature). They are also small in comparison to the impact of other colonial-era policies such as land-tenure systems. Nevertheless, they are still interesting to observe given that the famines in question occurred nearly a hundred years prior to the measurement of the outcomes in question. We contend that they reveal lasting and significant consequences of British food policy in colonial India. Table IV suggests that a hypothetical district having suffered ten famines, which is not atypical in our data, may have developed as much as ninety-four percent more log absolute magnitude per capita, around forty percent less consumption per capita in rural areas, 150 percent more of the workforce employed in cultivation, and a Gini index nearly ten percent greater than a district which suffered no famines. As to the question of whether or not the famines were directly caused by British policy, the results suggest that, at the very least, British nineteenth-century laissez-faire attitudes to disaster management have had long-lasting consequences for India. 
Moreover, to the extent that the rainfall instrument is valid, these estimates can be interpreted as causal, as the use of rainfall shocks as instruments provides a source of variation which is “as good as random.” Under that assumption, we can state with some confidence that these effects are truly the result of having undergone the observed famines.

In considering whether to prefer our instrumental estimates or our least-squares estimates, we must mainly weigh the problems of a potentially weak instrument against the benefits of a causal interpretation. We argue that we should still trust the IV estimates even though the instrument is not always as strong as we would like. First of all, the instrumentation of the recorded famine data with the demeaned rainfall data provides plausible causal estimation because the rainfall measures are plausibly as good as random. Even if the recorded famine measure is itself reasonably exogenous, as suggested by the Hausman tests, we argue that it is better to be sure. Using instruments for a variable which is already exogenous will not introduce additional bias into the results and may even help reduce attenuation bias from any possible measurement error. The Hausman test, after all, cannot completely eliminate this possibility; it can only suggest how likely or unlikely it is. In this sense, the instrumental estimates allow us to be far more confident in our assessment of the presence or absence of the long-run impact of famines. Though the first-stage F-statistics are less than ten, they are still large enough to reject the null hypothesis of instrument weakness, as shown by the p-values for this test in Table V. We argue that it is better to be consistent than to pick and choose which set of estimates we want to accept for a given dependent variable and model. We made this choice because the differences in magnitude between the IV and OLS coefficients are too large to do otherwise.

A more interesting question raised by the reported coefficients in Table II, Table IV, and Table VII has to do with their sign. 
Why do districts more afflicted historically by famines seem to have more economic development yet worse outcomes in terms of rural consumption and inequality by our models? This could be due to redistributive preferences associated with or possibly even caused by famines; Gualtieri et al. pose this hypothesis in their paper on earthquakes in Italy. We note that districts suffering more famines in the colonial era are more “rural” today in that they tend to have a greater proportion of their labor force working in cultivation. This cannot be a case of mere association where more rural districts are more susceptible to famine, as our instrumental estimates in Table IV suggest otherwise. Rather, we explore the possibility that post-independence land reform in India was greater in relatively more agricultural districts. Much of the literature on land tenure suggests that redistributing land from large landowners to smaller farmers is associated with positive effects for productivity and therefore economic development (Iyer and Banerjee 2005, Varghese 2019). If the historical famines are causally associated with districts having less equal land tenure at independence, then this would explain their positive, though small, impact on economic development by way of inducing more land reform in those districts. On the other hand, if they are causally associated with districts remaining more agricultural in character at independence, and a district’s “agriculturalness” is only indirectly associated with land reform (in that such districts benefit more from the reform only because they have more agricultural land), this would indicate that famines have a small and positive impact on economic development through a process that is less directly causal. 
Although we are unable to observe land tenure and agricultural occupations immediately at independence, we are able to supplement our data with additional state-level observations of land-reform efforts in Indian states from 1957 to 1992 compiled in Besley and Burgess (2010) and aggregate the district-level observations of famines in our dataset by state (14). If our hypothesis above is correct, then we should see a positive association between the number of historical famines in a state’s districts and the amount of land-reform legislation passed by that state after independence, keeping in mind that provincial and state borders were almost completely reorganized after independence. Although this data is quite coarse, being on the state level, it is widely available. However, the plot below suggests completely the opposite relationship, as each additional famine across the state’s districts appears to be associated with nearly 0.73 fewer land-reform acts. Even after removing the outlier of West Bengal, which underwent far more numerous land reforms due to the ascendancy of the Communist Party of India in that state, the relationship is still quite apparent; every two additional famines are associated with almost one fewer piece of land-reform legislation post-independence.

Figure 3: Historical Famine Occurrence vs Post-independence land reforms

Figure 3 with West Bengal removed
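The state-level comparison plotted above can be sketched as a simple aggregation followed by a bivariate fit. The paper does not state how district famine counts are combined within a state, so the use of a within-state mean here is an assumption of this sketch, and the states, counts, and function names (`state_means`, `slope`) are hypothetical toy values rather than the Besley and Burgess data.

```python
from collections import defaultdict

def state_means(district_famines):
    """Average district-level famine counts within each state."""
    grouped = defaultdict(list)
    for state, famines in district_famines:
        grouped[state].append(famines)
    return {state: sum(values) / len(values) for state, values in grouped.items()}

def slope(x, y):
    """Least-squares slope of y on x with an intercept."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

# Toy (state, famine count) pairs for five hypothetical districts in three states.
district_famines = [("S1", 2), ("S1", 4), ("S2", 6), ("S2", 6), ("S3", 9)]
# Toy counts of post-independence land-reform acts per state.
reform_acts = {"S1": 8, "S2": 5, "S3": 2}

famines_by_state = state_means(district_famines)
states = sorted(famines_by_state)
fitted_slope = slope([famines_by_state[s] for s in states],
                     [reform_acts[s] for s in states])
print(fitted_slope)  # -1.0: one fewer reform act per additional famine in this toy data
```

Dropping an outlying state, as is done for West Bengal in the second plot, amounts to removing its entries from both inputs before calling `slope` again.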

Therefore, there seems to be little evidence that famines are associated with land reforms at all. This is quite puzzling because it is difficult to see how famine occurrence could lead to positive economic development while hurting outcomes such as inequality, consumption, and public goods provision. One potential explanation is that famines lead to higher urban development while hurting rural development, which would suggest a key impact of famine occurrence is the worsening of an urban-rural divide in economic development. This would explain how higher famine occurrence is linked with higher nighttime luminosity, which would itself be positively associated with urbanization, but is also linked with lower rural consumption, higher inequality (which may be the result of a stronger rural-urban divide), and a higher proportion of the workforce employed in the agricultural sector. For example, it is highly plausible that famines depopulate rural areas, leaving survivors to concentrate in urban centers, where famine relief is more likely to be available. Donaldson and Burgess (2012), who find that historical famine relief tended to be more effective in areas better served by rail networks, support this explanation. At the same time, the population collapse in rural areas would leave most of the workforce employed in subsistence agriculture going forward. Thus, if famines do lead to more people living in urban areas while simultaneously increasing the proportion of the remaining population employed in agriculture, then they would also exacerbate inequality and worsen rural economic outcomes. If the urbanization effect is of greater magnitude, this would also explain the slight increase in nighttime luminosity and electrification. This is somewhat supported by the plots in Figure 4, in which urbanization is defined as the proportion of a district’s population that lives in urban areas as labeled by the census. 
It appears that urbanization is weakly associated with famine occurrence (especially when using rainfall shocks) and positively associated with nighttime luminosity and inequality while negatively associated with rural consumption and agricultural employment as hypothesized above. However, instrumental estimates of urbanization as a result of famine detailed in Table VI only weakly support the idea that famine occurrence causally impacts urbanization as only the estimation without any controls is statistically significant.

Figure 4: Urbanization Rates vs. Famine Occurrence and Development Outcomes

Notes: The first two plots (in the top row) depict urbanization against famine occurrence and negative rainfall shocks. The rest of the plots depict various outcomes (discussed above) against the urbanization rate.

Table VI – Urbanization vs. Famine Occurrence

Notes: Dependent variable is the percent of a district’s population that is urban as defined in the 2011 Indian census. Control specifications: (a) no controls, (b) land-tenure control (proportion of villages with tenant-ownership land tenure system), (c) geographic controls (see section three for enumeration), (d) both land-tenure and geographic controls.

Source: Author calculations.

*** Significant at the 1 percent level or below (p ≤ 0.01). ** Significant at the 5 percent level (0.01 < p ≤ 0.05). * Significant at the 10 percent level (0.05 < p ≤ 0.1).

Nevertheless, this represents a far more likely explanation for our results than land reform, especially since the land reform mechanism implies that famine occurrence would be associated with better rural outcomes. In other words, if an association between famines and land reform at independence were the real explanation behind our results, then, because the literature on land reform suggests that it is linked with improved rural development, we would not expect to see such strongly negative rural impacts of famine. Therefore, the explanation of differential urban versus rural development as a result of famine occurrence is not only better supported by our data but also more plausible. While we do not have enough data to investigate exactly how famine occurrence seems to worsen urban-rural divides in economic development (for example, through rural population collapse as hypothesized above), such a question would certainly be a key area of future study.

Conclusion

In this paper, we have shown that famines occurring in British India have a statistically significant long-run impact on present-day outcomes, using both ordinary least-squares and instrumental-variable estimates in which famine is instrumented with climate shocks in the form of rainfall deviations. In particular, the occurrence of famine seems to exacerbate a rural-urban divide in economic development. Famines appear to cause a small increase in overall economic development, but lower consumption and welfare in rural areas while also worsening wealth inequality. This is supported by the finding that famines appear to lead to slightly higher rates of urbanization while simultaneously leading to a higher proportion of a district’s labor force remaining employed in the agricultural sector. Even though our ordinary least-squares measures are generally acceptable, we point to the similar instrumental variable estimates as stronger evidence of the causal impact of the famines. Ultimately, our results demonstrate that negative climate shocks combined with certain disaster management policies, such as British colonial laissez-faire approaches to famine in India, may have significant, though counter-intuitive, impacts on economic outcomes in the long run.

Endnotes

1 One can essentially understand this technique as manipulating the independent variable, which may not be randomly assigned, via a randomly assigned instrument.

2 The Gini index measures the distribution of wealth or income across individuals, with a score of zero corresponding to perfectly equal distribution and a score of one corresponding to a situation where one individual holds all of the wealth or earns all of the income in the group.
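As an illustration of the Gini index described in endnote 2, the following is a minimal sketch using the standard rank-based formula on a list of incomes. The function name gini and the sample incomes are hypothetical and chosen only for illustration; this is not necessarily the computation used for the inequality measures in this paper.

```python
def gini(values):
    """Gini index of a list of non-negative incomes (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum over incomes sorted in ascending order (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


print(gini([10, 10, 10, 10]))  # four equal incomes -> 0.0
print(gini([0, 0, 0, 40]))     # one person holds everything -> 0.75
```

In a finite group of n individuals, this formula reaches at most (n - 1)/n when one individual holds everything, which approaches one as the group grows.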

3 The Durbin-Wu-Hausman test essentially asks whether adding the instrument changes bias in the model. A rejection of the null hypothesis implies that differences in coefficients between OLS and IV are due to adding the instrument, whereas the null hypothesis assumes that the independent variable(s) are already exogenous and so adding an instrument contributes no new information to the model.

4 Attenuation bias occurs when there is measurement error in the independent variable, which biases estimates toward zero because the least-squares estimator minimizes squared error only along the axis of the dependent variable. See Durbin (1954) for a detailed discussion.
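The attenuation bias described in endnote 4 can be seen in a small simulation. The sketch below is illustrative only: the data are simulated, and the true slope of 2, the unit variances, and the function names are assumptions rather than anything taken from the paper. With classical measurement error of the same variance as the true regressor, the textbook result is that the OLS slope shrinks by roughly one half.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=20000, beta=2.0, noise_sd=1.0, seed=0):
    """Return (slope using the true regressor, slope using a noisy measurement)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * x + rng.gauss(0, 1) for x in x_true]
    # Observed regressor = true regressor + classical measurement error.
    x_obs = [x + rng.gauss(0, noise_sd) for x in x_true]
    return ols_slope(x_true, y), ols_slope(x_obs, y)


clean, noisy = simulate_attenuation()
print(clean, noisy)  # the second slope is pulled toward zero
```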

5 Classical growth theory, such as the Solow-Swan (1957) and Romer (1994) models, implies long-run convergence and therefore that districts would have similar outcomes today regardless of the number of famines they underwent. However, this is at odds with most of the empirical literature as discussed previously, in which there are often measurable long-term effects of natural disasters.

6 A Poisson process models count data via a random variable following a Poisson distribution.

7 Although we use the term damage, the impact to the economy need not be negative; indeed, we find that some impacts of famine occurrence are positive in sections four and five, which we attempt to explain in section seven.

8 Normally, OLS assumes that the variance of the error term does not depend on the independent variable(s), i.e., the errors are homoscedastic. If this is not true, i.e., the errors are heteroscedastic, then the conventional standard errors will be biased and often too small. Robust least-squares estimation calculates the OLS standard errors in a way that does not depend on the assumption that the errors are homoscedastic.
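To make endnote 8 concrete, the sketch below compares the conventional standard error of a regression slope with a heteroscedasticity-robust one on simulated data. It uses White's HC0 estimator, which is one common choice; the endnote does not say which robust variant is used in the paper, so that choice, the simulated data, and all function names are assumptions made for illustration.

```python
import math
import random


def slope_standard_errors(x, y):
    """Return (slope, conventional SE, HC0 robust SE) for a regression of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    dx = [a - mx for a in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (b - my) for d, b in zip(dx, y)) / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    # Conventional variance: one common error variance for every observation.
    classical_var = (sum(e * e for e in resid) / (n - 2)) / sxx
    # HC0 variance: each observation contributes its own squared residual.
    robust_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / (sxx * sxx)
    return slope, math.sqrt(classical_var), math.sqrt(robust_var)


def simulate_heteroscedastic(n=5000, seed=0):
    """Simulated data whose error spread grows with the square of x."""
    rng = random.Random(seed)
    x = [rng.uniform(-5, 5) for _ in range(n)]
    y = [1.0 + 0.5 * a + rng.gauss(0, a * a) for a in x]
    return x, y


slope, classical_se, robust_se = slope_standard_errors(*simulate_heteroscedastic())
print(slope, classical_se, robust_se)
```

In this simulated setting the error variance is largest where x is far from its mean, so the conventional standard error understates the uncertainty and the robust one is larger.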

9 So, for example, if this value is 25, then there is on average no mobility, as sons would be expected to remain in the same income percentile as their fathers. Similarly, if it is less than (greater than) 25, then there would be downward (upward) mobility. A value of 50 would indicate perfect mobility, i.e., no relationship between fathers’ income percentiles and those of their sons.

10 For a brief overview of the types of systems employed by the East India Company and Crown administrators, see Iyer and Banerjee (2008), or see Roy (2006) for a more detailed discussion.

11 While reduced form estimates (that is, estimates of the outcomes as direct functions of the exogenous variables rather than via a structural process) are often not directly interpretable, they can serve to confirm the underlying trends in the data (for example, via the sign of the coefficients), which is why we choose to include them here.
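The relationship between the reduced form in endnote 11 and the instrumental-variable estimate can be sketched on simulated data. With a single instrument, the IV estimate equals the reduced-form coefficient divided by the first-stage coefficient, so the sign of the reduced form is informative about the sign of the causal effect. Everything below is hypothetical: z stands in for a rainfall shock, x for famine occurrence, y for an outcome, and the coefficient values are chosen only for illustration.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n


def slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    return cov(x, y) / cov(x, x)


def simulate(n=20000, beta=0.5, seed=0):
    """Instrument z, endogenous regressor x, outcome y with true effect beta."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    c = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    x = [0.8 * zi + ci + rng.gauss(0, 1) for zi, ci in zip(z, c)]
    y = [beta * xi - 2.0 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]
    return z, x, y


def estimates(z, x, y):
    """OLS, first-stage, reduced-form, and single-instrument IV coefficients."""
    return {
        "ols": slope(x, y),            # biased by the confounder
        "first_stage": slope(z, x),    # effect of the instrument on x
        "reduced_form": slope(z, y),   # effect of the instrument on y
        "iv": cov(z, y) / cov(z, x),   # reduced form / first stage
    }


result = estimates(*simulate())
print(result)
```

In this simulation the confounder pushes the OLS slope below zero even though the true effect is positive, while the reduced form and the IV estimate both recover the correct sign.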

12 The Sargan-Hansen test works very similarly to the Durbin-Wu-Hausman test, but instead uses a quadratic form on the cross-product of the residuals and instruments.

13 To be precise, this heuristic is technically only valid with the use of a single instrument, which is of course satisfied in our case anyway.

14 To be clear, the value of famine for each state is technically the average number of famines in the historical districts that are presently part of the state, since subnational boundaries were drastically reorganized along linguistic lines after independence.

Bibliography

Agbor, Julius A., and Gregory N. Price. 2014. “Does Famine Matter for Aggregate Adolescent Human Capital Acquisition in Sub-Saharan Africa?” African Development Review/Revue Africaine de Développement 26 (3): 454–67.

Ambrus, Attila, Erica Field, and Robert Gonzalez. 2020. “Loss in the Time of Cholera: Long-Run Impact of a Disease Epidemic on the Urban Landscape.” American Economic Review, 110 (2): 475-525.

Anand, R., Coady, D., Mohommad, A., Thakoor, V. V., & Walsh, J. P. 2013. “The Fiscal and Welfare Impacts of Reforming Subsidies in India”. The International Monetary Fund, IMF Working Papers 13/128.

Anderson, T.W. and Rubin, H. 1949. Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics, 20, 46-63.

Asher, Sam, Tobias Lunt, Ryu Matsuura, and Paul Novosad. 2019. The Socioeconomic High-Resolution Rural-Urban Geographic Dataset on India.

Asher, Sam and Novosad, Paul. 2019. “Rural Roads and Local Economic Development”. American Economic Review (forthcoming). Web.

Bakkensen, Laura and Lint Barrage. 2018. “Do Disasters Affect Growth? A Macro Model-Based Perspective on the Empirical Debate”. IMF Workshop on Macroeconomic Policy and Income Inequality.

Banerjee, Abhijit and Lakshmi Iyer. 2005. “History, Institutions, and Economic Performance: The Legacy of Colonial Land Tenure Systems in India”. American Economic Review 95(4) pp. 1190-1213.

Besley, Timothy and Burgess, Robin. 2000. Land reform, poverty reduction and growth: evidence from India. Quarterly Journal of Economics, 115 (2). pp. 389-430.

Bhatia, B.M. 1968. Famines in India. A Study in Some Aspects of the Economic History of India (1860-1965) London: Asia Publishing House. Print.

Bose, Sugata and Ayesha Jalal. 2004. Modern South Asia: History, Culture, Political Economy (2nd ed.) Routledge.

Brekke, Thomas. 2015. “Entrepreneurship and Path Dependency in Regional Development.” Entrepreneurship and Regional Development 27 (3–4): 202–18.

Burgess, Robin and Dave Donaldson. 2010. “Can Openness Mitigate the Effects of Weather Shocks? Evidence from India’s Famine Era”. American Economic Review 100(2), Papers and Proceedings of the 122nd Annual Meeting of the American Economic Association pp. 449-453.

Carlyle, R. W. 1900. “Famine Administration in a Bengal District in 1896-7.” Economic Journal 10: 420–30.

Cheng, Wenli, and Hui Shi. 2019. “Surviving the Famine Unscathed? An Analysis of the Long-Term Health Effects of the Great Chinese Famine.” Southern Economic Journal 86 (2): 746–72.

Cohn, Bernard S. 1960. “The Initial British Impact on India: A case study of the Benares region.” The Journal of Asian Studies. Association for Asian Studies. 19 (4): 418–431.

Cole, Matthew A., Robert J. R. Elliott, Toshihiro Okubo, and Eric Strobl. 2019. “Natural Disasters and Spatial Heterogeneity in Damages: The Birth, Life and Death of Manufacturing Plants.” Journal of Economic Geography 19 (2): 373–408.

Davis, Mike. 2001. Late Victorian Holocausts: El Niño Famines and the Making of the Third World. London: Verso. Print.

Dell, Melissa, Benjamin F. Jones, and Benjamin A. Olken. 2012. “Temperature Shocks and Economic Growth: Evidence from the Last Half Century.” American Economic Journal: Macroeconomics, 4 (3): 66-95.

Dell, Melissa. 2013. “Path dependence in development: Evidence from the Mexican Revolution,” Harvard University Economics Department, Manuscript.

Donaldson, Dave. 2018. “Railroads of the Raj: Estimating the Impact of Transportation Infrastructure.” American Economic Review, 108 (4-5): 899-934.

Drèze, Jean. 1991. “Famine Prevention in India”, in Drèze, Jean; Sen, Amartya (eds.), The Political Economy of Hunger: Famine prevention Oxford University Press US, pp. 32–33.

Dutt, R. C. 1902, 1904, 2001. The Economic History of India Under Early British Rule. From the Rise of the British Power in 1757 to the Accession of Queen Victoria in 1837. London: Routledge.

Durbin, James. 1954. “Errors in Variables”. Revue de l’Institut International de Statistique / Review of the International Statistical Institute, 22(1) pp. 23-32.

Ewbank, R. B. 1919. “The Co-Operative Movement and the Present Famine in the Bombay Presidency.” Indian Journal of Economics 2 (November): 477–88.

FAOSTAT. 2018. FAOSTAT Data. Faostat.fao.org, Food and Agriculture Organization of the United Nations.

Fieldhouse, David. 1996. “For Richer, for Poorer?”, in Marshall, P. J. (ed.), The Cambridge Illustrated History of the British Empire, Cambridge: Cambridge University Press. Pp. 400, pp. 108–146.

Goldberger, Arthur S. 1964. “Classical Linear Regression”. Econometric Theory. New York: John Wiley & Sons. Pp. 164-194.

Gooch, Elizabeth. 2017. “Estimating the Long-Term Impact of the Great Chinese Famine (1959-61) on Modern China.” World Development 89 (January): 140–51.

Gualtieri, Giovanni, Marcella Nicolini, and Fabio Sabatini. 2019. “Repeated Shocks and Preferences for Redistribution.” Journal of Economic Behavior and Organization 167(11): 53–71.

Henderson, J. Vernon, Adam Storeygard, and David Weil. 2011. “A Bright Idea for Measuring Economic Growth.” American Economic Review.

Hochrainer, S. 2009. “Assessing the Macroeconomic Impacts of Natural Disasters: Are there Any?” World Bank Policy Research Working Paper 4968. Washington, DC, United States: The World Bank.

Holden, Stein T. and Hosaena Ghebru. 2016. “Land tenure reforms, tenure security and food security in poor agrarian economies: Causal linkages and research gaps.” Global Food Security 10: 21-28.

Hoyle, R. W. 2010. “Famine as Agricultural Catastrophe: The Crisis of 1622-4 in East Lancashire.” Economic History Review 63 (4): 974–1002.

Hu, Xue Feng, Gordon G. Liu, and Maoyong Fan. 2017. “Long-Term Effects of Famine on Chronic Diseases: Evidence from China’s Great Leap Forward Famine.” Health Economics 26 (7): 922–36.

Huff, Gregg. 2019. “Causes and Consequences of the Great Vietnam Famine, 1944-5.” Economic History Review 72 (1): 286–316.

Lima, Ricardo Carvalho de Andrade, and Antonio Vinicius Barros Barbosa. 2019. “Natural Disasters, Economic Growth and Spatial Spillovers: Evidence from a Flash Flood in Brazil.” Papers in Regional Science 98 (2): 905–24.

Maxwell, Daniel, and Keith Daniel Wiebe. 1998. Land tenure and food security: A review of concepts, evidence, and methods. Land Tenure Center, University of Wisconsin-Madison, 1998.

McKean, Joseph W. 2004. “Robust Analysis of Linear Models”. Statistical Science 19(4): 562–570.

Nguyen, Linh, and John O. S. Wilson. 2020. “How Does Credit Supply React to a Natural Disaster? Evidence from the Indian Ocean Tsunami.” European Journal of Finance 26 (7–8): 802–19.

Pinkovskiy, Maxim L. and Xavier Sala-i-Martin. 2016. “Lights, Camera, ... Income! Illuminating the National Accounts-Household Surveys Debate,” Quarterly Journal of Economics, 131(2): 579-631.

Li, Q. and J.S. Racine. 2004. “Cross-validated local linear nonparametric regression,” Statistica Sinica 14: 485-512.

Riggs, D. S.; Guarnieri, J. A.; et al. (1978). “Fitting straight lines when both variables are subject to error.” Life Sciences. 22: 1305–60.

Romer, P. M. 1994. “The Origins of Endogenous Growth”. The Journal of Economic Perspectives. 8 (1): 3–22.

Roy, Tirthankar. 2006. The Economic History of India, 1857–1947. Oxford U India. Print.

Ruppert, David, Wand, M.P. and Carroll, R.J. 2003. Semiparametric Regression. Cambridge University Press. Print.

Salibian-Barrera, M. and Yohai, V.J. 2006. A fast algorithm for S-regression estimates, Journal of Computational and Graphical Statistics 15(2): 414-427.

Scholberg, Henry. 1970. The district gazetteers of British India: A bibliography. University of California, Bibliotheca Asiatica 3(4).

Sharma, Ghanshyam, and Kurt W. Rotthoff. 2020. “The Impact of Unexpected Natural Disasters on Insurance Markets.” Applied Economics Letters 27(6): 494–97.

Solow, Robert M. 1957. “Technical change and the aggregate production function.” Review of Economics and Statistics. 39 (3): 312–320.

Srivastava, H.C. 1968. The History of Indian Famines from 1858–1918, Sri Ram Mehra and Co., Agra. Print.

Staiger, Douglas, and James H. Stock. 1997. “Instrumental Variables Regression with Weak Instruments.” Econometrica 65(3): 557-586.

Thompson, Kristina, Maarten Lindeboom, and France Portrait. 2019. “Adult Body Height as a Mediator between Early-Life Conditions and Socio-Economic Status: The Case of the Dutch Potato Famine, 1846-1847.” Economics and Human Biology 34 (August): 103–14.

Varghese, Ajay. 2019. “Colonialism, Landlords, and Public Goods Provision in India: A Controlled Comparative Analysis”. The Journal of Development Studies, 55(7), pp. 1345-1363.

Wang, Chunhua. 2019. “Did Natural Disasters Affect Population Density Growth in US Counties?” Annals of Regional Science 62 (1): 21–46.

World Bank. 2011. “India Country Overview.” Worldbank.org