Impact of Improved Maize Technology Package on Maize Yield Growth of Major Maize Growing Regions of Ethiopia

This study examines the differential impact of adopting an improved maize technology package (improved maize varieties, fertilizer of any kind, and row planting) on maize yield growth, using a sample of 645 farm households in three major maize-growing administrative regions of Ethiopia. The propensity score matching (PSM) technique was employed because it is an increasingly standard approach for evaluating impacts with observational data. The results show that adoption of the improved maize technology package does not have the desired positive and significant impact on yield growth in every administrative region considered. Moreover, the magnitude of the impact varies greatly among regions. The study therefore recommends that the country's agricultural research and extension system give further consideration to the differences among regions and areas so as to generate and disseminate appropriate and suitable improved agricultural technologies and information.
Keywords: Impact, Maize, Improved Technology Package, Ethiopia
DOI: 10.7176/ISDE/12-1-01
Publication date: January 31st, 2021

buyers of food and as sellers of labor and membership in these categories is affected not only by asset positions, but also by gender, ethnicity, and social status, as they imply differing abilities to use the same assets and resources in responding to opportunities. This pervasive heterogeneity in agriculture and rural society has deep implications for public policy in using agriculture for development. As a particular policy reform is likely to have gainers and losers, policies have to be differentiated according to the status and context of households, taking particular account of prevailing gender norms. Differentiated policies are designed not necessarily to favor one group over the other but to serve all households more cost effectively, tailoring policies to their conditions and needs, particularly to the poorest. Balancing attention to the favored and less favored subsectors, regions, and households is one of the toughest policy dilemmas facing poor countries with severe resource constraints (World Bank, 2007).
In Ethiopia, agricultural production is dominated by smallholder households, which produce more than 90% of agricultural output and cultivate more than 90% of the total cropped land (Bekabil, 2018). Smallholder production is dominated by five major cereal crops (teff, maize, wheat, sorghum, and barley), which account for almost three quarters of the total cultivated area and about 68 percent of total production (Dorosh and Rashid, 2012). Improving the productivity, profitability, and sustainability of smallholder farming is the main pathway out of poverty in using agriculture for development. In this regard, a broad array of policy instruments, many of which apply differently to commercial smallholders and to those in subsistence farming, can be used to achieve the following: improve price incentives and increase the quality and quantity of public investment; make product markets work better; improve access to financial services and reduce exposure to uninsured risks; enhance the performance of producer organizations; promote innovation through science and technology; and make agriculture more sustainable and a provider of environmental services (World Bank, 2007).
With regard to promoting innovation through science and technology, developing countries invest only a ninth of what industrial countries put into agricultural R&D as a share of agricultural GDP, counting both public and private sources. To narrow this divide, sharply increased investments in R&D must be at the top of the policy agenda. Many international and national investments in R&D have paid off handsomely. But global and national failures of markets and governance lead to serious underinvestment in R&D and in innovation systems more generally, particularly in the agriculture-based countries (World Bank, 2007).
In addition, African countries are disadvantaged by the fact that the specificity of their agro-ecological features leaves them less able than other regions to benefit from international technology transfers. Low investments in R&D and low international transfers of technology have gone hand in hand with stagnant cereal yields in Sub-Saharan Africa, resulting in a widening yield gap with the rest of the world. For these countries, sharply increased investment and regional cooperation in R&D are urgent. Moreover, many public research organizations face serious leadership, management, and financial constraints that require urgent attention. But higher-value markets open new opportunities for the private sector to foster innovation along the value chain, and grasping them often requires partnerships among the public sector, private sector, farmers, and civil society in financing, developing, and adapting innovation. With a wider range of institutional options now available, more evaluation is needed of what works well in what contexts (World Bank, 2007).
In response to this need, the objective of this study is to identify the impact of adoption of the improved maize technology package (improved maize varieties, fertilizer of any kind, and row planting) on maize yield growth in each of three administrative regions of Ethiopia, namely Oromia, Amhara, and Southern Nations, Nationalities and Peoples (SNNP), which are known to be the major maize-producing regions in the country.
Cochran (1965) defined an observational study to be an empirical investigation in which the "objective is to elucidate cause-and-effect relationships . . . [in settings in which] it is not feasible to use controlled experimentation, in the sense of being able to impose the procedures or treatments whose effects it is desired to discover, or to assign subjects at random to different procedures" (p. 234, as quoted in Austin, 2011).
By this definition, an observational study has the same intent as a randomized experiment: to estimate a causal effect. However, an observational study differs from a randomized experiment in one design issue: the use of randomization to allocate units to treatment and control groups. In observational studies, the treated subjects often differ systematically from untreated subjects. Thus, in general, we have that $E[Y(1) \mid D = 1] \neq E[Y(1)]$, where D = 1 indicates receipt of the treatment (and similarly for the control treatment), and an unbiased estimate of the average treatment effect cannot be obtained by directly comparing outcomes between the two treatment groups (Austin, 2011).

Analytical Framework for Evaluation
Matching deals with the selection process by constructing a comparison group of individuals with observable characteristics similar to those of the treated when no randomized control group is available (Blundell and Dias, 2002). As they explain, the matching method aims to construct the correct sample counterpart for the missing information on the outcomes the treated would have had if they had not been treated, by pairing each participant with members of the non-treated group; under the matching assumption, the only remaining difference between the two groups is program participation. As with all non-parametric methods, the dimensionality of the problem, as measured by X, may seriously limit the use of matching (Blundell and Dias, 2002). According to them, a more feasible alternative is to match on a function of X. Usually, this is carried out on the propensity to participate given the set of characteristics X, that is, on the propensity score $P(X_i) = P(D_i = 1 \mid X_i)$.
In the counterfactual framework, causal inference is approached by first stipulating the existence of two potential outcome random variables that are defined over all individuals in the population: $Y_i(1)$ is the potential outcome in the treatment state for individual i, and $Y_i(0)$ is the potential outcome in the control state for individual i (Morgan and Harding, 2006). The individual-level causal effect of the treatment is then defined as $Y_i(1) - Y_i(0)$ (Morgan and Harding, 2006; Austin, 2011). Because it is usually impossible to effectively estimate individual-level causal effects, we typically shift attention to aggregated causal effects (Morgan and Harding, 2006). With $E[\cdot]$ denoting the expectation operator from probability theory, the average treatment effect (ATE) is defined to be $E[Y_i(1) - Y_i(0)]$ (Austin, 2011, citing Imbens, 2004; Morgan and Harding, 2006). It is the average effect, at the population level, of moving an entire population from untreated to treated. A related measure of treatment effect is the average treatment effect for the treated (ATT), defined as $E[Y_i(1) - Y_i(0) \mid D_i = 1]$ (Austin, 2011, citing Imbens, 2004). The ATT is the average effect of treatment on those subjects who ultimately received the treatment. In an RCT these two measures of treatment effects coincide because, due to randomization, the treated population will not, on average, differ systematically from the overall population (Austin, 2011).
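To make the two estimands concrete, the following minimal sketch computes the ATE, the ATT, and the naive difference in observed means on a small table of potential outcomes. All numbers are hypothetical and chosen only for illustration; they are not drawn from the ESS data, and in real data only one of the two potential outcomes is observed for each household.

```python
# Hypothetical potential outcomes (illustrative only, not ESS data).
# Each tuple is (y1, y0, d): outcome if treated, outcome if untreated,
# and actual treatment status.
units = [
    (30.0, 20.0, 1),
    (28.0, 22.0, 1),
    (18.0, 16.0, 0),
    (15.0, 15.0, 0),
]

def ate(rows):
    """Average treatment effect: E[Y(1) - Y(0)] over all units."""
    effects = [y1 - y0 for y1, y0, _ in rows]
    return sum(effects) / len(effects)

def att(rows):
    """Average treatment effect on the treated: E[Y(1) - Y(0) | D = 1]."""
    effects = [y1 - y0 for y1, y0, d in rows if d == 1]
    return sum(effects) / len(effects)

def naive_difference(rows):
    """Difference in observed mean outcomes, treated minus untreated."""
    treated = [y1 for y1, _, d in rows if d == 1]
    untreated = [y0 for _, y0, d in rows if d == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

print(ate(units), att(units), naive_difference(units))
```

In this constructed example the treated units differ systematically from the untreated ones, so the naive comparison (13.5) overstates both the ATE (4.5) and the ATT (8.0).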
Operationally, propensity score methods begin by estimating a model for the fitted probability, or propensity, of being in the treatment group rather than the comparison group. Observations that have a similar estimated propensity to be in either the treatment group or the comparison group will tend to have similar observed covariate distributions. It should be clear, however, that matching is no 'magic bullet' that will solve the evaluation problem in every case. It should only be applied if the underlying identifying assumption can be credibly invoked based on the informational richness of the data and a detailed understanding of the institutional set-up by which selection into treatment takes place (Caliendo and Kopeinig, 2008). The underlying identifying assumption, known as unconfoundedness, selection on observables, the conditional independence assumption (CIA), or exogeneity, postulates that the covariate information in the data is rich enough to control for characteristics jointly affecting the treatment and the outcome. A further assumption, known as common support, requires that for any empirically feasible combination of observed covariates, both treated and non-treated subjects can be observed, which rules out that the covariates perfectly predict participation. Finally, the covariates must in general not be affected by the treatment, and therefore must not contain post-treatment characteristics, so as not to condition away part of the treatment effect of interest (Huber, 2019). According to Huber (2019), denoting by X the vector of observed covariates and by X(1), X(0) the potential covariate values with and without treatment, the assumptions can formally be stated as $\{Y(1), Y(0)\} \perp D \mid X$, $0 < p(X) < 1$, and $X(1) = X(0) = X$, where $p(X) = \Pr(D = 1 \mid X)$ is the conditional treatment probability, also known as the propensity score.
The challenge of matching is to ensure that the 'correct' set of observables X is being used so that the observations of non-participants are what the observations of the treated would be had they not participated, forming the right counterfactual and satisfying the CIA (Blundell and Dias, 2002). There is a lack of consensus in the applied literature as to which variables to include in the propensity score model (Austin, 2011). According to Austin (2011), given that the propensity score is defined to be the probability of treatment assignment, there are theoretical arguments in favor of including only those variables that affect treatment assignment. In practical terms, however, the more detailed the information is, the harder it is to find a similar control and the more restricted the common support becomes. That is, the appropriate trade-off between the quantity of information in use and the share of the support covered may be difficult to achieve. If, however, the right amount of information is used, matching deals well with potential bias (Blundell and Dias, 2002).
Once the propensity scores have been estimated, the propensity scores of the treatment group can be matched to propensity scores of subjects in a comparison group and this allows one to estimate the ATT. The most common implementation of propensity score matching is one-to-one or pair matching, in which pairs of treated and untreated subjects are formed, such that matched subjects have similar values of the propensity score. However, other approaches can also be used (Austin, 2011).
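The pair-matching idea described above can be sketched as follows. This is a minimal illustration that assumes matching with replacement and no caliper, which the text does not specify; the function name and the (score, outcome) pairs are hypothetical.

```python
def nearest_neighbor_att(treated, controls):
    """ATT from one-to-one nearest-neighbor matching with replacement.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Matching with replacement and without a caliper is an assumption
    of this sketch.
    """
    if not treated or not controls:
        raise ValueError("both groups must be non-empty")
    gaps = []
    for score_t, outcome_t in treated:
        # Pick the control whose propensity score is closest to this treated unit.
        _, outcome_c = min(controls, key=lambda c: abs(c[0] - score_t))
        gaps.append(outcome_t - outcome_c)
    return sum(gaps) / len(gaps)

# Hypothetical (propensity score, log yield) pairs, for illustration only.
adopters = [(0.80, 3.2), (0.55, 2.9), (0.30, 2.6)]
non_adopters = [(0.78, 2.8), (0.50, 2.7), (0.25, 2.5), (0.10, 2.0)]

att_nn = nearest_neighbor_att(adopters, non_adopters)
print(round(att_nn, 4))
```

Here the non-adopter with a score of 0.10 is never used as a match, which illustrates how matching discards comparison units that resemble no treated unit.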
The true propensity score is a balancing score. Therefore, in strata of subjects that have the same propensity score, the distribution of measured baseline covariates will be the same between treated and untreated subjects. Appropriate methods for assessing whether the propensity score model has been adequately specified involve examining whether the distribution of measured baseline covariates is similar between treated and untreated subjects with the same estimated propensity score (Austin, 2011). One approach uses a two-sample t-test to check whether there are significant differences in covariate means between the two groups (Caliendo and Kopeinig, 2008, citing Rosenbaum and Rubin, 1985). Before matching, differences are expected, but after matching the covariates should be balanced in both groups and hence no significant differences should be found (Caliendo and Kopeinig, 2008). If, after conditioning on the propensity score, there remain systematic differences in baseline covariates between treated and untreated subjects, this can be an indication that the propensity score model has not been correctly specified (Austin, 2011).
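As an illustration of the t-test balance check, the sketch below computes a two-sample t statistic for one covariate before and after matching. It uses the Welch (unequal-variance) form, which is an assumption of this sketch rather than a detail given in the text, and the distance-to-market values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch two-sample t statistic for a difference in means."""
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# Hypothetical distance to the nearest market (km), for illustration only.
adopters = [4, 6, 8, 10]
all_non_adopters = [12, 14, 16, 18]      # comparison group before matching
matched_non_adopters = [5, 6, 9, 10]     # comparison group after matching

t_before = welch_t(adopters, all_non_adopters)
t_after = welch_t(adopters, matched_non_adopters)
print(round(t_before, 2), round(t_after, 2))
```

A large absolute t statistic before matching and a small one afterwards is the pattern the balance test looks for; a formal decision would compare the statistic with the appropriate critical value for the sample at hand.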

Data and Variables
The data used in this study come from the third wave of the Ethiopia Socioeconomic Survey (ESS), 2015-2016. The ESS is a collaborative long-term project between the Central Statistical Agency of Ethiopia (CSA) and the World Bank Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) team to collect panel data. The ESS collects information on household agricultural activities along with other information on the households, such as human capital, other economic activities, and access to services and resources. The ESS uses a nationally representative sample of over 5,000 households living in rural and urban areas. The urban areas include both small and large towns. The sample is a two-stage probability sample. The first stage of sampling entailed selecting primary sampling units, which are a sample of the CSA enumeration areas (EAs). The second stage of sampling was the selection of households to be interviewed in each EA. A total of 433 EAs were selected based on probability proportional to size of the total EAs in each region, of which 290 were rural, 43 were small town EAs from ESS1, and 100 were EAs from major urban areas. In order to ensure sufficient sample size in the most populous regions (Amhara, Oromia, SNNP, and Tigray) and Addis Ababa, quotas were set for the number of EAs in each region. The sample is not representative for each of the small regions, including Afar, Benishangul-Gumuz, Dire Dawa, Gambella, Harari, and Somali. However, estimates can be produced for a combination of all smaller regions as one "other region" category. During wave 3, 1,255 households were re-interviewed, yielding a response rate of 85 percent. Attrition in urban areas is 15% due to consent refusal and inability to trace the whereabouts of sample households.
The variables used in the analysis are defined as follows.
Yield: maize yield per unit of cropped land, measured in quintals per hectare.
LnYield: natural logarithm of Yield.
HHAGE: age of the household head in years.
HHSEX: dummy for the sex of the household head (1 if the head is male, 0 otherwise).
HHEDU: dummy for literacy of the household head (1 if the head is able to read and write in any language, 0 otherwise).
HHRELIGION: dummy indicating the main religion of the household head.
FAMILY_SIZE: size of the household.
CREDIT: dummy for the household's access to credit (1 if anyone in the household has borrowed more than 150 birr from someone outside the household or from an institution for business or farming purposes over the past 12 months, 0 otherwise).
LANDHOLDING_SIZE: size of the household's land holding, measured in square meters.
OVERALLPLOTOWN: dummy for plot ownership (1 if the household has some plot under its ownership, acquired through inheritance or a grant from local leaders, 0 otherwise).
AVERPLOTSLOPE: average slope of the household's plots, measured in percent.
OVERALLFERTILEPLOT: dummy for overall plot soil quality (1 if the household has some plot with fair or good soil quality, 0 otherwise).
DSTNEARMKT: distance from the residence to the nearest market, measured in kilometers.
DSTMAJROAD: distance from the residence to the nearest major road, measured in kilometers.
DSTNEARPOPCENTER: distance from the residence to the nearest population center with more than 20,000 people, measured in kilometers.
OXEN: total number of oxen owned by the household.
HHTLU: total livestock units currently owned and kept by the household.
EXCONTACT: dummy for participation in the extension program (1 if the household participated, 0 otherwise).
NONAGRIBUSIN: dummy for non-agricultural activity (1 if the household owned a non-agricultural business or provided a non-agricultural service from home over the past 12 months, 0 otherwise).
COMIRRIGSCH: dummy for the presence of an irrigation scheme (1 if the community in which the household resides has an irrigation scheme, 0 otherwise).
AMTOFRAIN: dummy indicating the amount of rain received in the last season.

Propensity Scores Estimation using Probit Model
The descriptive statistics point to a tentative positive impact of improved maize technology package adoption on yield growth in all of the regions. Nevertheless, a mere comparison of yield growth has no causal meaning, since improved maize technology package adoption is endogenous. It is therefore difficult to attribute the change to adoption of the package, because the difference in yield growth might be due to other determinants. To this end, a rigorous impact evaluation method, namely propensity score matching, has to be employed to control for observed characteristics and determine the impact actually attributable to improved maize technology package adoption on yield growth in the different maize-producing regions of Ethiopia. Propensity scores for adopters and non-adopters were estimated using a probit model so that the treatment group could be compared with the control group. In this regard, only the statistically significant variables were used in estimating the propensity scores for each region. The 'overlap condition' across the treatment and control groups was checked through visual inspection of the propensity score distributions of both groups. The result, shown in Figure 1, indicates that the overlap condition is satisfied for all three regions considered, as there is substantial overlap in the distributions of the propensity scores of adopters and non-adopters.
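The probit step can be sketched as follows. This is a minimal illustration using only the Python standard library, with a single hypothetical covariate (distance to market) and invented adoption data; the study's actual specification uses several covariates per region and presumably dedicated statistical software, neither of which is reproduced here. The function names are assumptions of the sketch.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def fit_probit(x, d, iterations=50, tol=1e-10):
    """Fit P(D = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score: gradient of the log-likelihood
        i00 = i01 = i11 = 0.0    # entries of the expected information matrix
        for xi, di in zip(x, d):
            eta = b0 + b1 * xi
            # Clip the probability to keep the divisions below well defined.
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            phi = STD_NORMAL.pdf(eta)
            z = phi * (di - p) / (p * (1 - p))
            w = phi * phi / (p * (1 - p))
            g0 += z
            g1 += z * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        # Solve the 2x2 system (information matrix) * step = score.
        det = i00 * i11 - i01 * i01
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

def propensity_scores(x, b0, b1):
    """Fitted probability of adoption for each covariate value."""
    return [STD_NORMAL.cdf(b0 + b1 * xi) for xi in x]

# Hypothetical data: distance to market (km) and adoption status (1 = adopter).
distance = [1, 2, 3, 4, 5, 6, 7, 8]
adopted = [1, 1, 1, 0, 1, 0, 0, 0]

b0, b1 = fit_probit(distance, adopted)
scores = propensity_scores(distance, b0, b1)
print([round(s, 3) for s in scores])
```

In this invented example households closer to the market are more likely to adopt, so the fitted slope is negative and the estimated propensity score falls with distance.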
For the Oromia region, the propensity score ranges between 0.0470692 and 0.9843553 for adopters and between 0.0044759 and 0.9068076 for non-adopters; the region of common support for the estimated propensity scores of adopters and non-adopters ranges between 0.04706915 and 0.98435532. For the Amhara region, the propensity score ranges between 0.1147719 and 0.9999913 for adopters and between 1.12e-17 and 0.9289967 for non-adopters; the region of common support ranges between 0.11477187 and 0.99999131. For the SNNP region, the propensity score ranges between 0.0383172 and 0.9948128 for adopters and between 0.0001237 and 0.8994336 for non-adopters; the region of common support ranges between 0.03831718 and 0.99481279. When matching techniques are employed, observations whose propensity scores lie outside the region of common support are discarded.
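The trimming step can be illustrated with a short sketch. The reported common-support bounds coincide with the range of the adopters' scores, so the sketch assumes that rule; other definitions of common support exist (for example, the overlap of the two groups' ranges). Only the endpoint values below come from the Oromia figures in the text; the interior scores and function names are hypothetical.

```python
def common_support_bounds(treated_scores):
    """Bounds of the common-support region.

    Assumption of this sketch: the region is the range of the treated
    (adopter) scores, which matches the bounds reported in the text.
    """
    return min(treated_scores), max(treated_scores)

def on_support(score, lower, upper):
    """True if a propensity score lies inside the common-support region."""
    return lower <= score <= upper

# Endpoints are the Oromia values from the text; interior values are hypothetical.
adopter_scores = [0.0470692, 0.35, 0.62, 0.9843553]
non_adopter_scores = [0.0044759, 0.12, 0.48, 0.9068076]

lower, upper = common_support_bounds(adopter_scores)
kept_non_adopters = [s for s in non_adopter_scores if on_support(s, lower, upper)]
print(lower, upper, kept_non_adopters)
```

Under this rule the non-adopter with a score of 0.0044759 falls below the lower bound and is dropped before matching.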

Assessing Matching Quality
To check whether the matching procedure balances the distribution of the relevant variables in the control and treatment groups, covariate balancing tests were run before and after matching. The results, presented in Table 1, suggest that the proposed specification of the propensity score is fairly successful in balancing the distribution of covariates between the two groups, as indicated by the decrease in pseudo-$R^2$ and mean standardized bias after matching in most or all of the regions under consideration.
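For readers unfamiliar with the mean standardized bias, the sketch below shows one common way to compute it: the difference in covariate means as a percentage of the square root of the average of the two group variances, averaged in absolute value across covariates. Computing the variances from whichever samples are passed in is an assumption of this sketch (some implementations fix the denominator at its pre-matching value), the covariate values are hypothetical, and the pseudo-$R^2$ is not computed here.

```python
from math import sqrt
from statistics import mean, variance

def standardized_bias(treated, control):
    """Percentage standardized difference in means for one covariate."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return 100 * (mean(treated) - mean(control)) / pooled_sd

def mean_standardized_bias(covariates):
    """Average absolute standardized bias across covariates.

    covariates: dict mapping a covariate name to (treated_values, control_values).
    """
    return mean(abs(standardized_bias(t, c)) for t, c in covariates.values())

# Hypothetical covariate values, for illustration only.
before = {
    "HHAGE": ([40, 42, 44, 46], [50, 52, 54, 56]),
    "FAMILY_SIZE": ([5, 6, 7, 8], [4, 5, 6, 7]),
}
after = {
    "HHAGE": ([40, 42, 44, 46], [41, 42, 45, 46]),
    "FAMILY_SIZE": ([5, 6, 7, 8], [5, 6, 7, 9]),
}

msb_before = mean_standardized_bias(before)
msb_after = mean_standardized_bias(after)
print(round(msb_before, 1), round(msb_after, 1))
```

A fall in the mean standardized bias after matching, as in this constructed example, is the pattern that indicates improved covariate balance.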

Results
Among the different matching algorithms available for propensity score matching, nearest neighbor matching and kernel matching are the most commonly applied (Kikulwe et al., 2012, citing Caliendo and Kopeinig, 2008). Nearest neighbor matching pairs adopters with the non-adopters that have the nearest propensity score, while controlling for differences between adopters and non-adopters, whereas kernel matching computes treatment effects by deducting from each outcome observation in the treatment group a weighted average of outcomes in the control group. Table 2 depicts the average impact of improved maize technology package adoption on maize yield growth using nearest neighbor matching with one and five neighbors (NN=1 and NN=5) as well as Epanechnikov kernel matching with two bandwidths (BW=0.03 and BW=0.06). All or most of the matching algorithms employed support the hypothesis that improved maize technology package adoption has a positive and significant impact on yield growth in only two of the three regions considered, namely Oromia and Amhara. Moreover, the impact on yield growth is higher in the Amhara region, where it ranges from 96 to 119%, than in the Oromia region, where it ranges from 44 to 51%.
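The kernel matching estimator described above can be sketched as follows, using the Epanechnikov kernel and the larger of the two bandwidths mentioned in the text (0.06). The (score, outcome) pairs and function names are hypothetical, and dropping treated units that have no control within the bandwidth is an assumption of this sketch.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, zero elsewhere."""
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def kernel_matching_att(treated, controls, bandwidth):
    """ATT from kernel matching on the propensity score.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Treated units with no control inside the bandwidth are skipped
    (an assumption of this sketch).
    """
    gaps = []
    for score_t, outcome_t in treated:
        weights = [epanechnikov((score_c - score_t) / bandwidth)
                   for score_c, _ in controls]
        total = sum(weights)
        if total == 0:
            continue  # no comparable control within the bandwidth
        # Weighted average of control outcomes serves as the counterfactual.
        counterfactual = sum(w * y for w, (_, y) in zip(weights, controls)) / total
        gaps.append(outcome_t - counterfactual)
    if not gaps:
        raise ValueError("no treated unit has a control within the bandwidth")
    return sum(gaps) / len(gaps)

# Hypothetical (propensity score, log yield) pairs, for illustration only.
adopters = [(0.50, 3.0), (0.90, 3.4)]
non_adopters = [(0.48, 2.6), (0.52, 2.8), (0.70, 2.9)]

att_kernel = kernel_matching_att(adopters, non_adopters, bandwidth=0.06)
print(round(att_kernel, 4))
```

In this example the adopter with a score of 0.90 has no non-adopter within the bandwidth and therefore does not contribute to the estimate, which shows why the choice of bandwidth trades off bias against the number of usable observations.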

Conclusion and Recommendation
This study was undertaken to shed light on the differential impact of adopting the improved maize technology package (improved maize varieties, fertilizer of any kind, and row planting) on maize yield growth across the major maize-producing administrative regions of Ethiopia. It uses the propensity score matching technique, a robust impact evaluation technique that identifies the impact attributable to adoption of the package. The study also employed and compared different matching algorithms to ensure the robustness of the impact estimates. The study concludes that improved maize technology package adoption does not have the desired positive and significant impact on maize yield growth in every administrative region considered. Moreover, among the regions showing a positive and significant impact, the magnitude of the impact varies greatly. Therefore, this study recommends that the country's agricultural research and extension system be strengthened to take further account of the differences among regions and areas (such as zones, woredas, and "kebeles"/villages) that vary widely in landscape position, agro-ecology, soil characteristics, and farming systems, in order to generate and scale up appropriate improved agricultural technologies and information suited to the specific conditions of each maize-producing pocket of the country.