Differential Impact of Organic and/or Inorganic Fertilizer Application with Row Planting on Maize Yield Growth of Major Maize Growing Regions of Ethiopia

This study examines the differential impact of adoption of fertilizer (organic, inorganic or both) with row planting on maize yield growth using 673 sample farm households in four major maize growing administrative regions of Ethiopia. Propensity score matching (PSM) technique was employed since it is an increasingly utilized standard approach for evaluating impacts using observational data. It is found that adoption of fertilizer with row planting doesn't have the desired positive and significant impact on yield growth in all of the administrative regions considered except one region called Amhara. Thus, the study recommends that the agricultural research and extension system of the country should further consider the various differences that exist among different regions and areas of the country so as to generate and disseminate appropriate and suitable improved agricultural technologies and information.

makers still recognize that there is extensive room for improvement (Dorosh and Rashid, 2012).
In this regard, key constraints to agricultural productivity in Ethiopia include low availability of improved or hybrid seed, lack of seed multiplication capacity, low and declining soil fertility as a result of soil erosion and desertification in certain agro-ecological zones due to over-cultivation and limited investment in land improvement, low profitability and efficiency of fertilizer use due to the rising cost of chemical fertilizer as well as the limited use of complementary improved farming practices and seed, and lack of irrigation and water constraints. In addition, lack of transport infrastructure and market access decreases the profitability of adopting improved practices (Bekabil, 2018 citing AfDB, 2012; Kate & Leigh, 2010; and Reimund et al., 2007).
Moreover, Ethiopia is endowed with diverse terrain and agro-ecological climate ranging from temperate in the highlands to tropical in the lowlands (Bekabil, 2018). The rugged terrain in much of the highlands makes transport and communication difficult. Rainfall also varies significantly between mountains and valleys, even across short distances (Dorosh and Rashid, 2012). Accordingly, the opportunities and constraints facing Ethiopian agriculture are strongly influenced by geographical location. Hence, identifying the right technological package for the various ecologies and crops has been a considerable challenge to researchers and extension systems (Bekabil, 2018 citing Mulat et al., 2004). All of this calls for further growth in agricultural productivity and quality, with minimum adverse impact on the environment, mainly through the supply, duplication and diffusion of continuously improving, location-specific technology and information. Appropriate evaluation of the impact of the efforts of the past few decades, and of recent years in particular, is believed to be useful in creating more fertile ground for achieving this goal faster and better. However, past studies in Ethiopia assessing the contribution of improved inputs and crop management practices to productivity growth and other outcomes of interest for important and widely cultivated cereals like maize were not only few but also restricted to a piecemeal or location-specific approach. As a result, conclusions drawn from studies that did not use nationally or regionally representative data have a low probability of influencing national and regional policies.
Thus, the objective of this study is to identify the impact of the use of fertilizer of any kind (organic, inorganic or both) with row planting on maize yield growth in each of four administrative regions of Ethiopia (namely Oromia, Amhara, Southern Nations, Nationalities and Peoples (SNNP), and Benishangul-Gumuz), which are also known to be the major maize producing regions in the country.

Analytical Framework for Evaluation
An impact evaluation must establish what has caused the observed changes (in this case 'impacts'), which is referred to as causal attribution (or causal inference). One broad strategy for causal attribution in impact evaluations is estimating the counterfactual (i.e., what would have happened in the absence of the intervention, compared to the observed situation). Among design options that address causal attribution, quasi-experimental designs construct a comparison group through matching, regression discontinuity, propensity scores or another means, unlike experimental designs, which construct a control group through random assignment. Random assignment is used to assure that participation in the intervention is the only differentiating factor between units subject to the intervention and those excluded from it, so that the control group can be used to assess what would have happened to participants in the absence of the intervention (Heinrich et al., 2010). However, treatment assignment is often not random because of the following factors: (a) purposive program placement and (b) self-selection into the program. That is, programs are placed according to the need of the communities and individuals, who in turn self-select given program design and placement (Khandker et al., 2010). Accordingly, self-selection could be based on observed characteristics, unobserved factors, or both.
In absence of an experimental design, assignment to treatment is frequently nonrandom, and thus, units receiving treatment and those excluded from treatment may differ not only in their treatment status but also in other characteristics that affect both participation and the outcome of interest. To avoid the biases that this may generate, matching methods find a non-treated unit that is "similar" to a participating unit, allowing an estimate of the intervention's impact as the difference between a participant and the matched comparison case. Averaging across all participants, the method provides an estimate of the mean program impact for the participants (Heinrich et al., 2010).
In the potential outcomes framework, there are two possible treatments (e.g., active treatment vs. control treatment) and an outcome. Given a sample of subjects and a treatment, each subject has a pair of potential outcomes: Yi(0) and Yi(1), the outcomes under the control treatment and the active treatment, respectively (Austin, 2011). However, each subject receives only one of the control treatment or the active treatment. Let D be an indicator variable denoting the treatment received (D = 0 for control treatment vs. D = 1 for active treatment). Thus, only one outcome is observed for each subject, the outcome under the actual treatment received:

$$Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0)$$

For each subject, the effect of treatment is defined to be Yi(1) - Yi(0) (Austin, 2011).

Journal of Natural Sciences Research www.iiste.org ISSN 2224-3186 (Paper) ISSN 2225-0921 (Online) Vol.11, No.21, 2020

In general, an evaluation seeks to estimate the mean impact of an intervention, which might be a small project, a large program, a collection of activities, or a policy, obtained by averaging the impact across all the individuals in the population (Heinrich et al., 2010). This parameter is known as the Average Treatment Effect or ATE:

$$ATE = E(Y(1) - Y(0))$$

where E(.) represents the average (or expected value). Another quantity of interest is the Average Treatment Effect on the Treated, or ATT, which measures the impact of an intervention on those individuals who participated:

$$ATT = E(Y(1) - Y(0) \mid D = 1)$$

Finally, the Average Treatment Effect on the Untreated (ATU) measures the impact that the intervention would have had on those who did not participate:

$$ATU = E(Y(1) - Y(0) \mid D = 0)$$

The problem is that none of these parameters is observable, since they depend on counterfactual outcomes. For instance, using the fact that the average of a difference is the difference of the averages, the ATT can be rewritten as:

$$ATT = E(Y(1) \mid D = 1) - E(Y(0) \mid D = 1)$$

The second term, E(Y(0) | D = 1), is the average outcome that the treated individuals would have obtained in the absence of treatment, which is not observed.
However, it is possible to observe the term E(Y(0) | D = 0), that is, the value of Y(0) for the untreated individuals. Thus, it is possible to calculate:

$$\Delta = E(Y(1) \mid D = 1) - E(Y(0) \mid D = 0) = ATT + SB, \qquad SB = E(Y(0) \mid D = 1) - E(Y(0) \mid D = 0)$$

The second term, SB, is the selection bias: the difference between the counterfactual for treated individuals and the observed outcome for the untreated individuals. If this term is equal to 0, then the ATT can be estimated by the difference between the mean observed outcomes for treated and untreated:

$$ATT = E(Y \mid D = 1) - E(Y \mid D = 0)$$

However, in many cases the selection bias term is not equal to 0. In these cases, the difference in means will be a biased estimator of the ATT. The main goal of an evaluation is thus to ensure that the selection bias is equal to 0 in order to correctly estimate the parameter of interest (Heinrich et al., 2010).
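The decomposition of the naive difference in means into ATT plus selection bias can be checked numerically. The following is a minimal sketch on simulated data, not the survey data: the data-generating process, the `soil` confounder, and the fixed effect of 2 are all assumptions chosen purely for illustration. Because both potential outcomes are known in a simulation, ATT and SB can be computed directly, which is never possible with observational data.

```python
import random

random.seed(0)

# Hypothetical data-generating process (an assumption for illustration):
# better soil raises both the chance of adoption and the untreated yield,
# so adopters would out-yield non-adopters even without adopting.
n = 5000
rows = []
for _ in range(n):
    soil = random.gauss(0.0, 1.0)
    y0 = 20.0 + 3.0 * soil + random.gauss(0.0, 1.0)    # Y(0): yield without adoption
    y1 = y0 + 2.0                                       # Y(1): true effect fixed at 2
    d = 1 if soil + random.gauss(0.0, 1.0) > 0 else 0   # self-selection on soil
    rows.append((d, y0, y1))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# ATT = E(Y(1) - Y(0) | D = 1): computable here only because Y(0) is simulated.
att = mean(y1 - y0 for d, y0, y1 in rows if d == 1)

# Naive estimate = E(Y(1) | D = 1) - E(Y(0) | D = 0): what the data reveal.
naive = mean(y1 for d, y0, y1 in rows if d == 1) - mean(y0 for d, y0, y1 in rows if d == 0)

# SB = E(Y(0) | D = 1) - E(Y(0) | D = 0): the selection bias term.
sb = mean(y0 for d, y0, y1 in rows if d == 1) - mean(y0 for d, y0, y1 in rows if d == 0)

print(f"ATT = {att:.3f}, naive difference = {naive:.3f}, selection bias = {sb:.3f}")
```

In this setup the naive difference overstates the true effect, because adopters start from better soil; the gap between the two is exactly the selection bias term.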
The matching approach is one possible solution to the selection problem (Caliendo and Kopeinig, 2008). Matching methods are designed to ensure that impact estimates are based on outcome differences between comparable individuals (Heinrich et al., 2010). Accordingly, the simplest form of matching pairs each participant to a comparison group member with the same values on observed characteristics (collected in a vector X). If the number of variables in X is large, such an approach may not be feasible. Propensity-score matching (PSM), one of the most important innovations in developing workable matching methods, allows this matching problem to be reduced to a single dimension (Heinrich et al., 2010). The propensity score was defined by Rosenbaum and Rubin (1983a) to be the probability of treatment assignment conditional on observed baseline covariates: Pr(Di = 1|Xi) (Austin, 2011). According to Austin (2011), the propensity score is a balancing score: conditional on the propensity score, the distribution of measured baseline covariates is similar between treated and untreated subjects. Thus, in a set of subjects all of whom have the same propensity score, the distribution of observed baseline covariates will be the same between the treated and untreated subjects.
However, the matching estimator will not necessarily work in all circumstances; specific conditions (namely, the theoretical assumptions underlying the matching estimator and the data requirements for implementing it) have to be met to produce valid impact estimates (Heinrich et al., 2010). First, PSM requires selection on observables; the inability of the researcher to measure one or more relevant characteristics that determine the selection process results in biased estimates of the impact of the intervention. Second, in order to assign a comparison unit to each treated unit, the probability of finding an untreated unit for each value of X must be positive (Heinrich et al., 2010).
On the other hand, the data (variables) available for matching are critical to justifying the assumption that, once all relevant observed characteristics are controlled, comparison units have, on average, the same outcomes that treated units would have had in the absence of the intervention. Since in many cases the researcher does not know precisely the criteria that determine participation, it is common to control for all the variables that are suspected to influence selection into treatment (although controlling for too many variables could generate problems with the common support). As a result, the researcher should have access to a large number of variables to be able to correctly characterize the propensity score. It is important for data for both the treatment and comparison units to be drawn from the same sources, so that the measures used (for control and outcome variables) are identical or similarly constructed. Any missing data should also be handled similarly for treated and untreated units. Although data errors are always a potential issue, the bias in impact estimates may be relatively small if data errors have the same structure for treated and comparison units. Finally, to obtain impact estimates that are generalizable to the population of interest, it is necessary for the pool of comparison units to have a sufficient number of observations with characteristics corresponding to those of the treated units (Heinrich et al., 2010).

Data and Variables
The data used for this study were acquired from the third wave of the Ethiopia Socioeconomic Survey (ESS) 2015-2016. The ESS is a collaborative long-term project between the Central Statistics Agency of Ethiopia (CSA) and the World Bank Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) team to collect panel data. The ESS collects information on household agricultural activities along with other information on the households such as human capital, other economic activities, and access to services and resources. The ESS uses a nationally representative sample of over 5,000 households living in rural and urban areas. The urban areas include both small and large towns. The sample is a two-stage probability sample. The first stage of sampling entailed selecting primary sampling units, which are a sample of the CSA enumeration areas (EAs). The second stage of sampling was the selection of households to be interviewed in each EA. A total of 433 EAs were selected based on probability proportional to size of the total EAs in each region, out of which 290 were rural, 43 were small town EAs from ESS1, and 100 were EAs from major urban areas. In order to ensure sufficient sample size in the most populous regions (Amhara, Oromiya, SNNP, and Tigray) and Addis Ababa, quotas were set for the number of EAs in each region. The sample is not representative for each of the small regions including Afar, Benishangul-Gumuz, Dire Dawa, Gambella, Harari, and Somali regions. However, estimates can be produced for a combination of all smaller regions as one "other region" category. During wave 3, 1255 households were re-interviewed, yielding a response rate of 85 percent. Attrition in urban areas is 15% due to consent refusal and inability to trace the whereabouts of sample households.
Yield stands for the yield of maize per unit of land cropped measured in quintals per hectare.
LnYield stands for the natural logarithmic transformation of Yield. HHAGE stands for the age of a household head in years. HHSEX is a dummy variable indicating the sex of a household head where HHSEX = 1 if the head is male and 0 otherwise. HHEDU is a dummy variable indicating whether a household head is literate where HHEDU = 1 if the head is literate (able to read and write in any language) and 0 otherwise. HHRELIGION is a dummy variable indicating the main religion of a household head. FAMILY_SIZE stands for the size of a household. CREDIT is a dummy variable indicating a household's access to credit where CREDIT = 1 if anyone in the household has borrowed more than 150 birr from someone outside the household or from an institution for business or farming purposes over the past 12 months and 0 otherwise. LANDHOLDING_SIZE stands for the size of the land holding of a household measured in square meters. OVERALLPLOTOWN is a dummy variable indicating a household's plot ownership where OVERALLPLOTOWN = 1 if the household has some plot under its ownership (acquired through inheritance or local leaders' grant) and 0 otherwise. AVERPLOTSLOPE stands for the average slope of a household's overall plot measured in percent. OVERALLFERTILEPLOT is a dummy variable indicating a household's overall plot soil quality where OVERALLFERTILEPLOT = 1 if the household has some plot with fair or good soil quality and 0 otherwise. DSTNEARMKT stands for the distance to the nearest market from the residence measured in kilometers. DSTMAJROAD stands for the distance to the nearest major road from the residence measured in kilometers. DSTNEARPOPCENTER stands for the distance to the nearest population center with more than 20,000 people from the residence measured in kilometers. OXEN stands for the total number of oxen owned by a household. HHTLU stands for the total livestock units currently owned and kept by a household.
EXCONTACT is a dummy variable indicating whether a household had participated in the extension program where EXCONTACT = 1 if the household had participated in the extension program and 0 otherwise. NONAGRIBUSIN is a dummy variable indicating whether a household owned a non-agricultural business or provided a non-agricultural service from home over the past 12 months where NONAGRIBUSIN = 1 if it did and 0 otherwise. COMIRRIGSCH is a dummy variable indicating the presence of an irrigation scheme in the community in which a household resides where COMIRRIGSCH = 1 if the community has an irrigation scheme and 0 otherwise. AMTOFRAIN is a dummy variable indicating the amount of rain received in the last season.

Descriptive Statistics
The variables included in the propensity score matching model, which describe the major observed characteristics of the sample respondents, are presented in Table 1. In all regions, the yield growth of fertilizer and row planting adopters is significantly greater than that of non-adopters. Thus, it tentatively shows that there is a significant difference in yield growth level in all the regions between those households that adopt fertilizer of any kind with row planting and those that do not adopt both. All the important variables used in the probit model except age and sex of a household head, household size, as well as the household's participation in the extension program have different effects in the different administrative regions considered.

Propensity Scores Estimation using Probit Model
The descriptive statistics have shown a tentative impact of fertilizer and row planting adoption on increasing yield growth in all of the regions. Nevertheless, a mere comparison of yield growth has no causal meaning since fertilizer and row planting adoption is endogenous. It is also difficult to attribute the change to adoption of fertilizer and row planting since the difference in yield growth might be owing to other determinants. To this end, a rigorous impact evaluation method, namely Propensity Score Matching, has to be employed to control for observed characteristics and determine the actual attributable impact of fertilizer and row planting adoption on yield growth in different maize producing regions of Ethiopia. Propensity scores for adopters and non-adopters were estimated using a probit model to compare the treatment group with the control group. In this regard, only the significant variables were used in estimating the propensity scores for each region. The 'overlap condition' across the treatment and control groups was checked through visual inspection of the propensity score distributions for both groups, and the result showed that the overlap condition is satisfied for all four regions considered, as there is substantial overlap in the distribution of the propensity scores of adopters and non-adopters.
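To make the estimation step concrete, the following is a minimal sketch of a probit propensity score model with a single covariate, fitted by Fisher scoring using only the Python standard library. It is an illustration under stated assumptions, not the study's specification: the covariate `x` is hypothetical (it could stand for a standardized household characteristic), the data are simulated, and the true coefficients (-0.5 and 1.0) are arbitrary. The study itself used several covariates per region.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is the probit link, pdf its derivative


def fit_probit(x, d, iterations=25):
    """Fit Pr(D = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # expected information matrix entries
        for xi, di in zip(x, d):
            z = b0 + b1 * xi
            p = min(max(STD_NORMAL.cdf(z), 1e-10), 1.0 - 1e-10)
            phi = STD_NORMAL.pdf(z)
            s = phi * (di - p) / (p * (1.0 - p))   # score contribution
            w = phi * phi / (p * (1.0 - p))        # information weight
            g0 += s
            g1 += s * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * step = g and update the coefficients.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def propensity(b0, b1, xi):
    """Estimated probability of adoption for covariate value xi."""
    return STD_NORMAL.cdf(b0 + b1 * xi)


# Simulated example (hypothetical data, not the ESS sample).
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
d = [1 if -0.5 + 1.0 * xi + random.gauss(0.0, 1.0) > 0 else 0 for xi in x]

b0_hat, b1_hat = fit_probit(x, d)
scores = [propensity(b0_hat, b1_hat, xi) for xi in x]
print(f"b0 = {b0_hat:.3f}, b1 = {b1_hat:.3f}")
```

The fitted `scores` play the role of the estimated propensity scores; in applied work a statistical package would normally be used instead of a hand-written optimizer.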
For the Oromia region, the propensity score for adopters ranges between 0.0734954 and 0.9633425 while it ranges between 0.0183441 and 0.9171392 for non-adopters, and the region of common support for the distribution of estimated propensity scores of adopters and non-adopters ranges between 0.07349536 and 0.96334252. For the Amhara region, the propensity score for adopters ranges between 0.182649 and 0.9984117 while it ranges between 0.005937 and 0.9937181 for non-adopters, and the region of common support ranges between 0.18264896 and 0.99841165. For the SNNP region, the propensity score for adopters ranges between 0.0679312 and 0.9906759 while it ranges between 1.67e-13 and 0.8710673 for non-adopters, and the region of common support ranges between 0.0679312 and 0.99067586. For the Benishangul-Gumuz region, the propensity score for adopters ranges between 0.2303756 and 1 while it ranges between 8.81e-67 and 0.8148187 for non-adopters, and the region of common support ranges between 0.23037558 and 1. When the matching techniques were employed, observations whose propensity scores lie outside these ranges were discarded.
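The trimming step can be sketched as follows. In each region the reported common-support bounds coincide with the minimum and maximum of the adopters' scores, so this sketch assumes that rule; the text does not state the rule explicitly, and a stricter alternative would intersect the ranges of both groups. The scores below are toy values, not the survey data.

```python
def common_support(scores, treated):
    """Return (low, high): the range of propensity scores among treated units.

    Assumption: the support region is defined by the treated units' scores,
    which is what the reported bounds in each region appear to follow.
    """
    treated_scores = [p for p, d in zip(scores, treated) if d == 1]
    return min(treated_scores), max(treated_scores)


def trim(scores, treated):
    """Indices of observations whose score lies inside the support region."""
    low, high = common_support(scores, treated)
    return [i for i, p in enumerate(scores) if low <= p <= high]


# Toy example (hypothetical scores): 1 = adopter, 0 = non-adopter.
scores = [0.02, 0.10, 0.35, 0.60, 0.08, 0.45, 0.90, 0.95]
treated = [0, 0, 0, 0, 1, 1, 1, 0]

low, high = common_support(scores, treated)
kept = trim(scores, treated)
print(low, high, kept)
```

Here the two non-adopters with scores 0.02 and 0.95 fall outside the adopters' range and are dropped before matching.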

Assessing Matching Quality
To check whether the matching procedure is able to balance the distribution of the relevant variables in the control and treatment groups, covariate balancing tests were run before and after matching (Table 2). The results suggest that the proposed specification of the propensity score is fairly successful in balancing the distribution of covariates between the two groups, as indicated by the decreasing pseudo-R² and mean standardized bias for all regions.
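The mean standardized bias used in this balance check can be sketched as below. The sketch assumes the commonly used definition, in which the difference in covariate means between the two groups is expressed as a percentage of the square root of the average of the two group variances; the text does not spell out its formula, and the covariate values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(x_treated, x_control):
    """Standardized bias (in percent) of one covariate between two groups."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2.0)
    return 100.0 * (mean(x_treated) - mean(x_control)) / pooled_sd


def mean_absolute_bias(covariates_treated, covariates_control):
    """Average absolute standardized bias over a list of covariates."""
    biases = [
        abs(standardized_bias(t, c))
        for t, c in zip(covariates_treated, covariates_control)
    ]
    return sum(biases) / len(biases)


# Toy example (hypothetical ages of household heads).
age_treated = [40, 50, 60]
age_control_before = [30, 35, 40]   # all non-adopters, before matching
age_control_after = [42, 50, 58]    # matched non-adopters only

bias_before = standardized_bias(age_treated, age_control_before)
bias_after = standardized_bias(age_treated, age_control_after)
print(f"before: {bias_before:.1f}%, after: {bias_after:.1f}%")
```

A drop in this statistic after matching, as in the toy numbers, is the pattern the balancing tests look for.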

Results
Among the different matching algorithms available for Propensity Score Matching, nearest neighbor matching and kernel matching are the most commonly applied ones (Kikulwe et al., 2012 citing Caliendo and Kopeinig, 2008). Nearest neighbor matching matches adopters with the non-adopters that have the nearest propensity scores, whereas kernel matching computes treatment effects by deducting from each outcome observation in the treatment group a weighted average of outcomes in the control group. Table 3 depicts the average impact of fertilizer and row planting adoption on maize yield growth using nearest neighbor matching with one and five neighbors (NN=1 and NN=5) as well as Epanechnikov kernel matching with two bandwidths (BW=0.03 and BW=0.06). All or most of the matching algorithms employed support the hypothesis that fertilizer and row planting adoption has a positive and significant impact on yield growth in only one of the four regions considered, Amhara. Moreover, adoption has an impact ranging from 55-75% in the region.
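The two estimators described above can be sketched in a few lines. This is a simplified illustration, assuming matching with replacement and ignoring standard errors and the common-support restriction; the function names and the toy (propensity score, outcome) pairs are hypothetical and are not taken from the study.

```python
def att_nearest_neighbor(treated, controls, k=1):
    """ATT by k-nearest-neighbor matching on the propensity score.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Controls are reused across treated units (matching with replacement).
    """
    effects = []
    for p_t, y_t in treated:
        nearest = sorted(controls, key=lambda c: abs(c[0] - p_t))[:k]
        counterfactual = sum(y for _, y in nearest) / len(nearest)
        effects.append(y_t - counterfactual)
    return sum(effects) / len(effects)


def epanechnikov(u):
    """Epanechnikov kernel: positive weight only when |u| < 1."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def att_kernel(treated, controls, bandwidth=0.06):
    """ATT by kernel matching: each treated outcome minus a weighted
    average of control outcomes, weighted by score distance."""
    effects = []
    for p_t, y_t in treated:
        weights = [epanechnikov((p_c - p_t) / bandwidth) for p_c, _ in controls]
        total = sum(weights)
        if total == 0.0:
            continue  # no control within the bandwidth: unit stays unmatched
        counterfactual = sum(w * y for w, (_, y) in zip(weights, controls)) / total
        effects.append(y_t - counterfactual)
    return sum(effects) / len(effects)


# Toy example: (propensity score, log yield) pairs, invented for illustration.
treated = [(0.30, 3.0), (0.60, 3.5)]
controls = [(0.28, 2.5), (0.33, 2.7), (0.58, 3.1), (0.90, 3.4)]

att_nn1 = att_nearest_neighbor(treated, controls, k=1)
att_k06 = att_kernel(treated, controls, bandwidth=0.06)
print(f"NN(1): {att_nn1:.3f}, kernel(0.06): {att_k06:.3f}")
```

A narrower bandwidth or fewer neighbors uses closer matches at the cost of higher variance, which is one reason for reporting several specifications side by side.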

Conclusion and Recommendation
This study was undertaken to shed light on the differential impact of adoption of fertilizer and row planting on maize yield growth among the major maize producing administrative regions of Ethiopia using the propensity score matching technique, an impact evaluation technique that identifies the impact which can be attributed to the adoption of fertilizer and row planting. The study also employed and compared different matching algorithms to ensure robustness of the impact estimates. The study concludes that adoption of fertilizer of any kind (organic, inorganic or both) with row planting does not have the desired positive and significant impact on yield growth in most of the major maize producing administrative regions of the country. Therefore, this study recommends that the agricultural research and extension system of the country should be strengthened to further take into account the differences among regions and areas (like zones, woredas and "kebeles"/villages) having high variability in landscape positions, agro-ecologies, soil characteristics and farming systems, in order to generate and scale up appropriate improved agricultural technologies and information that suit the specific conditions of each maize producing land pocket of the country.
Table 1: Descriptive statistics of important variables used in the probit model-Propensity Score Matching. ***, **, * indicate significance at the 1%, 5% & 10% level respectively. Source: Own computation, 2020