Discuss the Selection of Research Variables in Applying Gravity Model to Analyze Factors Affecting Specific Sector

The gravity model is an important tool for researchers interested in the impact of trade-related policies. It provides relatively accurate results when testing and evaluating the trade impact of various policies. In recent years, applying the model has become a trend in research in many countries around the world, including Vietnam. Although it is considered an attractive platform for international trade researchers, its use is not without potential pitfalls. The most important issue in applying the model is the choice of exactly which model to estimate (Shepherd, 2019). This paper aims to systematize the development of the model, analyze its nature, and examine its applicability to the study of trade flows in a specific sector. On that basis, it comments on and evaluates the selection of research variables in a number of recent studies that apply the model to analyze factors affecting exports of a specific sector in Vietnam. Finally, it draws some conclusions and recommendations for the application of the model in research.


Overview of the process of formation, development and nature of the model
The gravity model draws on Newton's law of universal gravitation (1687). According to Newton, every particle in the universe attracts every other particle with a force that is directly proportional to the product of their masses and inversely proportional to the square of the distance between their centers. In that spirit, Ravenstein's 1885 publication, The Laws of Migration, sought to explain migration 'flows' driven by "the attractiveness of the commercial and industrial centers but the growth is inversely proportional to the respective distance." A group of Dutch economists led by Tinbergen was the first to formulate a gravity-type mathematical equation and apply it empirically. Tinbergen pioneered the gravity equation in his famous work, Shaping the World Economy (1962), and supervised Linnemann's doctoral thesis (1966). In the literature, Tinbergen is credited as the first author to construct the model in econometric form, which today has become the standard traditional gravity model for the study of international trade flows. In 1969 he shared the first Nobel Memorial Prize in Economic Sciences. His work has become the standard reference for the first version of the traditional gravity equation (Anderson, 2010), which is as follows:

$$\mathrm{Trade}_{ij} = \alpha \cdot \frac{\mathrm{GDP}_i \cdot \mathrm{GDP}_j}{\mathrm{Distance}_{ij}} \qquad (1)$$

where $\mathrm{Trade}_{ij}$ is the value of bilateral trade between country i and country j, $\mathrm{GDP}_i$ and $\mathrm{GDP}_j$ are the national incomes of country i and country j, respectively, $\mathrm{Distance}_{ij}$ is a measure of the bilateral distance between the two countries, and $\alpha$ is a constant of proportionality.
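As a rough numerical illustration of the traditional gravity equation (1), the short Python sketch below evaluates it for made-up inputs. The function name `intuitive_gravity`, the value of alpha and all figures are assumptions chosen purely for illustration; they are not taken from the studies cited above.

```python
import math


def intuitive_gravity(gdp_i, gdp_j, distance_ij, alpha=1.0):
    """Traditional gravity equation (1): alpha * GDP_i * GDP_j / Distance_ij."""
    if distance_ij <= 0:
        raise ValueError("distance must be positive")
    return alpha * gdp_i * gdp_j / distance_ij


# Hypothetical inputs in arbitrary units, for illustration only.
base = intuitive_gravity(gdp_i=200.0, gdp_j=50.0, distance_ij=1000.0)
farther = intuitive_gravity(gdp_i=200.0, gdp_j=50.0, distance_ij=2000.0)
bigger = intuitive_gravity(gdp_i=400.0, gdp_j=50.0, distance_ij=1000.0)

# Taking logs gives the linear form that is usually estimated:
# ln Trade = ln(alpha) + ln(GDP_i) + ln(GDP_j) - ln(Distance)
log_base = math.log(1.0) + math.log(200.0) + math.log(50.0) - math.log(1000.0)

print(base, farther, bigger)
```

In this sketch, doubling the distance halves predicted trade and doubling one country's GDP doubles it, which is all that equation (1) asserts.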
Anderson was the first economist to provide a theoretical economic foundation for the gravity equation, under the assumptions of product differentiation by place of origin and Constant Elasticity of Substitution (CES) expenditures.
Anderson's groundbreaking study was based on Armington's (1969) assumption, which incorporates the product differentiation approach. Anderson derived a gravity equation that explains the presence of income variables in the model. Other early contributions to gravity theory include the prominent papers of Bergstrand (1985, 1989, 1990), who was the second author to provide microeconomic foundations for the gravity model. He developed a relationship between trade theory and bilateral trade, and included the supply side of the economy explicitly. During this period, several authors contributed significantly to the trade theory underlying the model, such as Brakman and Garretsen (2009), Helpman and Krugman (1985), Krugman and Obstfeld (2002), Helpman (1984), Helpman, Melitz, and Rubinstein (2008), etc.

ISSN 2222-1905 (Paper) ISSN 2222-2839 (Online) Vol.13, No.10, 2021

Although these theories received a great deal of support from researchers around the world, many have questioned how convincing the model is since McCallum (1995) raised the issue of the "border effect". McCallum (1995) showed that, other things being equal, interprovincial trade within Canada was estimated to be 20 times larger than trade between Canadian provinces and US states. Since the first publications of McCallum (1995) and Helliwell (1997), economists have wondered how borders could make such a difference in consumption (the first problem of the intuitive gravity model). In addition, "…the traditional gravity model is not free of difficulties once more advanced concepts from the trade literature are introduced. As one example, consider the impact on trade between countries i and j of a change in trade costs between countries i and m. An example of such a change might be that countries i and m enter into a preferential trade agreement that lowers tariffs on their respective goods. Basic economic theory suggests that such a move may well impact the trade of country j, even though it is not party to the agreement. The well-known concepts of trade creation and trade diversion are examples of such effects. However, the original gravity model (intuitive gravity model) does not account for this issue at all…" (Shepherd, 2019). Formula (1) of the original intuitive gravity model implies that reducing trade costs on one bilateral route does not affect trade on other routes, which is contrary to economic theory (the second problem of the intuitive gravity model). These issues have attracted the attention of many researchers. Among their contributions, the study of Anderson and Van Wincoop (2003) has convinced many researchers worldwide and has been applied increasingly widely.
Basically, the gravity model of Anderson and Van Wincoop is a demand function. Its starting point is the constant elasticity of substitution structure chosen for consumer preferences. Consumers have "love of variety" preferences, which means that their utility increases either from consuming more of a given product variety or from consuming a wider range of varieties without consuming more of any one. On the production side, the model makes the standard assumptions following Krugman (1979). Each company produces a single, unique product variety under increasing returns to scale. With a large number of companies, competitive interactions disappear and companies engage in constant-markup pricing: in equilibrium, the difference between price and marginal cost is just enough to cover the fixed cost of market entry. A manufacturer in one country can sell goods in any country, either the one where it is located or an overseas country. To simplify the model, selling goods locally is assumed to involve no transport costs. Consumers therefore consume varieties produced in all countries, but the prices of non-domestically produced varieties are higher because of the cost of transporting goods between countries. Overall trade between two countries is then given by:

$$X_{ij} = \frac{Y_i E_j}{Y} \left( \frac{t_{ij}}{\Pi_i P_j} \right)^{1-\sigma} \qquad (2)$$

$$\Pi_i^{1-\sigma} = \sum_j \left( \frac{t_{ij}}{P_j} \right)^{1-\sigma} \frac{E_j}{Y} \qquad (3)$$

$$P_j^{1-\sigma} = \sum_i \left( \frac{t_{ij}}{\Pi_i} \right)^{1-\sigma} \frac{Y_i}{Y} \qquad (4)$$

where $X_{ij}$ denotes total trade flows (exports) from country i to country j, $Y_i$ is the size of country i's economy, indicating its production capacity, $E_j$ is the size of country j's economy, indicating its ability to spend on goods, $Y$ is the size of the world economy, $\sigma$ is the elasticity of substitution (between varieties), and $t_{ij}$ represents the bilateral trade barrier.

The model has two notable terms. $\Pi_i$ is the outward multilateral resistance of country i. At its most basic, it captures the fact that exports from country i to country j depend on trade costs across all possible export markets. $P_j$ is the inward multilateral resistance of country j. It captures the dependence of imports into country j from country i on trade costs across all possible suppliers. Together, these terms are the key to the model, and they resolve both problems of the intuitive gravity model identified in the previous section. In particular, it is immediately apparent that because the multilateral resistance terms involve trade costs across all bilateral routes… In other words, the model shows that a change in trade costs on one bilateral route can affect trade flows on all other routes because of relative price effects (the response to the second problem of the intuitive gravity model) (Shepherd, 2019). Furthermore, because the intuitive model omits the two multilateral resistance variables even though they are, by construction, correlated with trade costs, it suffers from a classic case of omitted variable bias. Correcting for this problem is the main thrust of the estimation approaches discussed in Shepherd's (2019) user guide (which addresses the first problem, the McCallum border puzzle).
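The role of the two multilateral resistance terms can be seen in a small numerical experiment. The sketch below solves the system formed by the trade-flow equation and the two resistance equations using the standard Anderson and Van Wincoop form, $\Pi_i^{1-\sigma} = \sum_j (t_{ij}/P_j)^{1-\sigma} E_j/Y$ and $P_j^{1-\sigma} = \sum_i (t_{ij}/\Pi_i)^{1-\sigma} Y_i/Y$, by simple fixed-point iteration. The three-country world, the cost values, sigma = 5 and the function name `solve_structural_gravity` are all assumptions made for illustration, and the iteration is just one convenient way to solve the system, not the procedure of the cited authors.

```python
def solve_structural_gravity(Y, E, t, sigma, iterations=500):
    """Return trade flows X[i][j] implied by the structural gravity system.

    Y[i]: output of country i; E[j]: expenditure of country j
    (sum(Y) must equal sum(E)); t[i][j]: trade cost factor, 1.0 = no cost.
    """
    n = len(Y)
    world = sum(Y)
    # tau[i][j] = t_ij ** (1 - sigma)
    tau = [[t[i][j] ** (1.0 - sigma) for j in range(n)] for i in range(n)]
    # a[i] stands for Pi_i ** (1 - sigma), b[j] for P_j ** (1 - sigma).
    a = [1.0] * n
    b = [1.0] * n
    for _ in range(iterations):
        # Outward resistance given the current inward resistance.
        a = [sum(tau[i][j] * E[j] / world / b[j] for j in range(n))
             for i in range(n)]
        # Inward resistance given the updated outward resistance.
        b = [sum(tau[i][j] * Y[i] / world / a[i] for i in range(n))
             for j in range(n)]
    return [[Y[i] * E[j] / world * tau[i][j] / (a[i] * b[j])
             for j in range(n)] for i in range(n)]


# Hypothetical three-country world with identical sizes and symmetric costs.
Y = [100.0, 100.0, 100.0]
E = [100.0, 100.0, 100.0]
t_before = [[1.0, 1.5, 1.5],
            [1.5, 1.0, 1.5],
            [1.5, 1.5, 1.0]]

# Countries 0 and 2 lower their bilateral trade cost; country 1 is not involved.
t_after = [row[:] for row in t_before]
t_after[0][2] = t_after[2][0] = 1.2

X_before = solve_structural_gravity(Y, E, t_before, sigma=5.0)
X_after = solve_structural_gravity(Y, E, t_after, sigma=5.0)

print("exports 0 -> 2:", round(X_before[0][2], 2), "->", round(X_after[0][2], 2))
print("exports 0 -> 1:", round(X_before[0][1], 2), "->", round(X_after[0][1], 2))
```

Under these assumptions, exports from country 0 to country 2 rise while exports from country 0 to country 1 fall, even though the cost between countries 0 and 1 is unchanged. The intuitive equation (1) cannot produce this trade diversion effect, because each flow there depends only on its own bilateral cost.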
Later, by developing the model of Anderson and Van Wincoop (2003) in structural form, Yotov et al. (2016) showed how its application differs between overall trade and sectoral trade by writing it separately for each sector. Specifically, the value of trade in sector k between countries i and j is given in both works (Yotov et al., 2016; Shepherd, 2019) by the following equation:

$$X_{ij}^{k} = \frac{Y_i^{k} E_j^{k}}{Y^{k}} \left( \frac{t_{ij}^{k}}{\Pi_i^{k} P_j^{k}} \right)^{1-\sigma_k} \qquad (5)$$

where $X_{ij}^{k}$ is exports from country i to country j in sector k, $Y_i^{k}$ is the size of sector k in country i, $E_j^{k}$ is country j's ability to spend on sector k, $Y^{k}$ is the total world production capacity of sector k, $\sigma_k$ is the elasticity of substitution (between varieties) in sector k, $t_{ij}^{k}$ is the bilateral trade cost, $\Pi_i^{k}$ is the multilateral resistance (unobservable cost) of country i, and $P_j^{k}$ is the multilateral resistance (unobservable cost) of country j.
(This paper only summarizes the final results of the model; for details of its derivation, see Anderson & Van Wincoop, 2003; Yotov et al., 2016; Shepherd, 2019.)
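Taking logarithms of the sectoral equation makes its terms additive, which may help in seeing which part of the model each explanatory variable is meant to represent. The lines below are only a worked restatement in the notation defined above, assuming the standard multiplicative form of the sectoral structural gravity equation; they are not an additional result from the cited works.

```latex
\begin{aligned}
X_{ij}^{k} &= \frac{Y_i^{k} E_j^{k}}{Y^{k}}
              \left( \frac{t_{ij}^{k}}{\Pi_i^{k} P_j^{k}} \right)^{1-\sigma_k} \\
\ln X_{ij}^{k} &= \ln Y_i^{k} + \ln E_j^{k} - \ln Y^{k}
                + (1-\sigma_k) \ln t_{ij}^{k}
                - (1-\sigma_k) \ln \Pi_i^{k}
                - (1-\sigma_k) \ln P_j^{k}
\end{aligned}
```

Since $\sigma_k > 1$ in this class of models, the coefficient on the bilateral trade cost is negative, while higher multilateral resistance on either side raises bilateral trade for given sizes and bilateral cost.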

Analyzing the applicability of gravity models of Anderson and Van Wincoop in analyzing factors affecting exports of specific sectors
In fact, equations (2) and (5) show that the model can be applied for many different purposes, covering total trade, sectoral trade and intra-sectoral trade. The aspects usually examined include the analysis of factors affecting exports or imports, the assessment of trade potential, and the forecasting of trade potential. Although the gravity model is an attractive platform for applied international trade researchers, its use does not come without some potential pitfalls (Shepherd, 2019). Three issues are particularly important for the accuracy and reliability of the model: variable selection, data selection and the estimation method. The most important of these is the choice of research variables for the model (Shepherd, 2019). This study focuses on that issue.
Many studies around the world still apply the gravity model mainly on the basis of intuitive ideas about which variables are likely to affect trade. The user guides of Shepherd (2019) and Yotov et al. (2016) both argue for using one of the "structural" gravity models (i.e. one with a theoretical basis) in order to obtain consistent and unbiased parameter estimates, as well as a suitable foundation for counterfactual simulations. There are different versions of the model; however, as noted above, the version derived by Anderson and Van Wincoop has won broad acceptance among individual researchers and reputable organizations around the world. The analysis and discussion below are therefore based on Anderson and Van Wincoop's structural gravity model, as further developed by Shepherd (2019) and Yotov et al. (2016). Accordingly, for the choice of variables we start from equation (5), from which at least six groups of factors affecting the export value of product k from one country to another can be identified. The five discussed here, compared with observed reality, are as follows:
(1) Group of factors affecting the production scale of sector k in country i ($Y_i^k$). A large supply creates an important premise for trade to take place. As the global economy becomes increasingly specialized and commercialization is continually shaped by science and technology, production in excess of domestic demand increases the need to export. Evidently, the greater the supply relative to domestic demand, the stronger the incentive to trade. However, it should be noted that in today's international trade not all goods are easily traded and exchanged; tradability also depends on quality, standards and design. These in turn appear to be determined by the scientific and technical level of the producing country, its infrastructure, its macroeconomic stability, and the openness of its economy. It is therefore reasonable to include factors such as technology, infrastructure, agricultural land area, the inflation index, and the openness of the economy. They are considered factors contributing to the size of the exporting country's supply of goods k.
(2) Group of factors affecting the size of demand for products of sector k in country j ($E_j^k$). No trade in goods k will take place if there is no demand for them in the market of country j. The size of the demand for goods k in country j is measured by its spending on domestic consumption and re-export. This will depend significantly on the population of country j, its own capacity to produce goods k, and its need for re-export. The variables commonly used to represent this factor are the population, GDP and GDP per capita of country j.
(3) Group of factors affecting the bilateral trade cost ($t_{ij}^k$). This needs little argument, as costs always play a significant role in trade. Distance, the exchange rate, a shared border, whether a country is landlocked, a common language, and tariffs all have an observable effect on trade. Distance is constant over time, and the closer two countries are, the greater their trade potential, and vice versa. It is interpreted as a proxy for transport costs. These costs are lower, however, if the two countries share a border or can easily ship by sea; for this reason, variables such as a common border and access to the sea are often included in studies. Tariffs and import duties in particular are a repeatedly discussed factor in international trade, and many studies around the world have shown that including them is necessary. Non-tariff barriers are also considered a factor that changes bilateral trade costs. The higher the tariff applied, the less the goods are traded. Trade liberalization lays a solid foundation for smoother trade flows as countries sign more bilateral and multilateral free trade agreements, because these agreements contribute markedly to tariff adjustments on the goods of both sides. Consequently, dummy variables such as the existence of a free trade agreement (FTA) between two countries are often included. They capture increases or decreases in tariffs, which are one kind of bilateral trade cost.
(4) Group of factors affecting the multilateral resistance on the importer side ($P_j^k$). The variables commonly used to represent this group are the openness index and the level of international and regional economic integration (membership of WTO, ASEAN, NAFTA, APEC ...). However, in the literature we have been able to access, no study considers the bilateral or multilateral agreements signed between the importing country and its other suppliers.
(5) Group of factors affecting the multilateral resistance on the exporter side ($\Pi_i^k$). This is similar to group (4), but the selected variables now relate mainly to the exporter. In addition, there may be other groups of factors and significant variables in the model that we have not fully explored.
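The grouping above can be turned into a simple checklist for a proposed specification. The sketch below is only an illustration of that idea: the variable names (for example `importer_gdp` or `fta_dummy`), the dictionary `STRUCTURAL_TERMS` and the function `uncovered_terms` are hypothetical, and the proxies listed under each term are the ones mentioned in the discussion above, not an exhaustive or authoritative list.

```python
# Structural terms of the sectoral gravity equation mapped to the proxy
# variables mentioned in groups (1)-(5). All names are hypothetical labels.
STRUCTURAL_TERMS = {
    "Y_i^k": {"technology", "infrastructure", "agricultural_land",
              "inflation", "exporter_openness"},
    "E_j^k": {"importer_population", "importer_gdp", "importer_gdp_per_capita"},
    "t_ij^k": {"distance", "exchange_rate", "common_border", "sea_access",
               "common_language", "tariff", "non_tariff_barrier", "fta_dummy"},
    "P_j^k": {"importer_openness", "importer_wto_member", "importer_rta_member"},
    "Pi_i^k": {"exporter_openness", "exporter_wto_member", "exporter_rta_member"},
}


def uncovered_terms(variables):
    """Return the structural terms that none of the chosen variables represents."""
    chosen = set(variables)
    return [term for term, proxies in STRUCTURAL_TERMS.items()
            if not chosen & proxies]


# A hypothetical "intuitive" specification with importer size and bilateral costs only.
spec = ["importer_gdp", "importer_population", "distance", "fta_dummy"]
print(uncovered_terms(spec))
```

For this hypothetical specification the checklist flags the exporter's sector size and both multilateral resistance terms as unrepresented, which is the kind of gap discussed in the next section.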

Discuss the application of gravity models in studies of trade flows
In recent years, the gravity model has been used in thousands of research papers on various fields of trade (Andrea, 2013). In particular, a large number of studies around the world have used the gravity model to examine factors affecting exports of agricultural products. In a study of factors affecting agricultural exports of the Czech Republic, Andrea (2013) compiled 65 different studies applying the model, and the number has kept growing since. Many variables have been added to the research models. Sapa and Droždz (2019) argued that adding variables improves the basic methodology of the selected gravity equation. On the other hand, according to Shepherd (2019), "Traditionally, gravity models have been based largely on intuitive ideas as to which variables are likely to influence trade", without focusing on the structure of the model. To clarify this, we base our discussion on the structure of the gravity model in the version of Anderson and Van Wincoop (2003). There are three notable points in using the structural gravity model that national studies often violate.

First, there is a significant difference between the sectoral version and the overall version of the structural gravity model. For the sectoral version, Shepherd (2019) specifies in his user guide that "… ideally we would like to include data on sectoral expenditure and output rather than GDP as such…" From 2017 onwards, in the literature accessible to us, almost no studies use sectoral expenditure and output data. The factor representing sector-specific transaction size is often ignored, and other variables representing the size of the sector, such as total sectoral production capacity, sectoral production value or sectoral production volume, are not mentioned either. It was not until the study of Adiqa et al. (2018), which examined the rice and cotton trade potential of Pakistan, that production capacity was mentioned as a sectoral factor for the first time. It was, however, simply a variable added to the intuitive model rather than one derived from the model's reasoning. Later, Sokvibol Kea et al. (2019) added the total value of rice imports of the importing country to represent the size of its demand for rice when analyzing the main determinants of Cambodia's rice export performance in the international market. Moreover, Sapa and Droždz (2019) added the agricultural value added of the exporting country and showed that it is significant in the model.

This also reveals a certain inadequacy. Suppose we estimate the size of tea exports from the US to France under frictionless trade on the basis of the two countries' GDP. It would be difficult to obtain an accurate and reasonable result, because the exporting country does not grow tea and its tea export value is only about 0.007% of its GDP. Thus, no matter how large a country's GDP is, if the country does not produce, process, or take part in the export of sector k, its GDP will not affect the exports of that sector. We therefore believe that using GDP alone to represent the production capacity of a given sector may lead to misleading results. Some studies, such as Yen & Thao (2017) and Xiaohua et al. (2020), added the agricultural land area (or the area under rice or wheat cultivation) of the exporting country, which is also reasonable because it can determine supply. The choice may depend on the characteristics of the sector studied and the research objectives of the authors. Nonetheless, one difficulty of the gravity model is that it can give rise to endogeneity problems (Yotov et al., 2016), so authors should choose carefully, preferring a few highly representative variables to many variables.
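The tea example can be made concrete with the frictionless benchmark of the sectoral equation, in which all trade costs and multilateral resistance terms equal one, so that $X_{ij}^k = Y_i^k E_j^k / Y^k$. The sketch below compares the prediction obtained when GDP stands in for the exporter's sector size with the one obtained from sectoral output. Every number is hypothetical and in arbitrary units, and the function name `frictionless_trade` is an assumption made for illustration.

```python
def frictionless_trade(output_i, expenditure_j, world_output):
    """Frictionless benchmark: with t = Pi = P = 1, X = Y_i * E_j / Y."""
    return output_i * expenditure_j / world_output


# All figures are hypothetical, chosen so that the exporter is a large
# economy with a very small tea sector.
gdp_exporter, world_gdp = 20_000.0, 90_000.0
tea_output_exporter, world_tea_output = 1.4, 90.0
tea_spending_importer = 3.0

# Exporter weighted by its share of world GDP (GDP used as the size proxy).
gdp_proxy = frictionless_trade(gdp_exporter, tea_spending_importer, world_gdp)

# Exporter weighted by its share of world tea output (sectoral size).
sector_based = frictionless_trade(tea_output_exporter, tea_spending_importer,
                                  world_tea_output)

print(round(gdp_proxy, 3), round(sector_based, 3))
```

With these made-up numbers the GDP proxy predicts roughly fourteen times more tea trade than the sectoral measure, simply because the exporter's share of world GDP is far larger than its share of world tea output.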
As for the groups of factors relating to bilateral trade costs and multilateral resistance on the two sides, these should also reflect the effects of sector-specific trade policy. Depending on the question of interest, trade policy variables in a gravity estimation may be sector-specific or may be constrained to be common across sectors. For example, for exports of tea from Vietnam to country j, tariffs and non-tariff barriers should be included in addition to distance costs. The Korean market, for instance, is attractive and not too distant, but its tariff on tea is up to 145%. In the EU market, tariff barriers were removed after Vietnam joined the WTO and signed bilateral agreements with this market, yet non-tariff barriers make access even more difficult. Although sector-specific factors are ignored, influences common to all sectors, such as WTO and APEC accession, have been included in a number of studies (USA, 2015; Kien & My, 2015; Thu et al., 2019). Multilateral resistance on the importer side, however, is regularly ignored. For example, the bilateral agreements signed by Vietnam and Korea are not mentioned in research on exports of Chinese tea to Korea (Wei et al., 2012; Zhang, 2019; Martin, 2020), when in fact they have clearly affected tariff-related costs. The tariff rate that Korea has applied to Chinese tea from 2015 to now is 349%, while it has set a rate of 276% on Vietnamese tea because Vietnam and Korea have up to 3 free trade agreements (CECA, ASEAN-Korea, FTA, Korea-Viet Nam, GSTP). Many other multilateral barriers on the importer side have likewise been ignored in many studies, such as regional trade agreements (RTA) like NAFTA, MERCOSUR, ASEAN, APEC, etc.

Finally, we agree that the gravity model in general, and the structural gravity model in particular, is a success in world economics. Furthermore, we are grateful for the pioneering work of the many studies that apply the model to trade, especially those of recent years on trade in specific sectors. We have learned a lot from these studies. However, we also recognize that discussion of and comment on the application of the model in research are necessary to make sure that we are all on the right track. No matter how good the estimation results appear, a model that lacks a theoretical and practical basis is difficult to apply in practice.

Conclusion
The purpose of this paper is to trace the historical development of the gravity model of trade and to clarify its theoretical nature. We then analyze the applicability of the structural gravity model to the study of factors affecting exports of a sector. Finally, the study discusses the selection of research variables when the model is applied to trade in a particular sector. The results show that the application of the model in Vietnam still has a number of shortcomings. The gravity model is highly regarded in the history of world economics, and researchers need to examine seriously whether it is being applied correctly before using it in practice.