Using static and dynamic panel data techniques, this paper analyses the impact of economic, structural, institutional and social factors on tax revenue across 34 countries of the Organisation for Economic Co-operation and Development over the period 2001–2011. The results show that gross domestic product per capita, the industrial sector and civil liberties have a positive impact on the dependent variable, while the agricultural sector and the share of foreign direct investment in gross fixed capital formation have a negative impact. The lagged value of the dependent variable enters the equation positively and its effect is larger in high-income countries. We also estimate tax effort and the tax gap and find that they are stable over time but diverse across countries, regardless of the level of development of the economies.
Differences in tax revenue across countries have been widely debated in the relevant literature. The main factors identified as causes of variation in tax pressure are: the level of development, usually represented by gross domestic product (GDP) per capita (Gupta, 2007; Pessino and Fenochietto, 2010); productive specialisation, or the structure of the economy, which can be explored through the sectoral composition of GDP (Piancastelli, 2001; Karagöz, 2013); and external factors such as the level of foreign direct investment (FDI) and trade (Cassou, 1997; Gupta, 2007; Bird et al., 2008). Other factors comprise the level of public debt (Teera and Hudson, 2004) and public policies, including exchange rate, inflation control and financial policies (Tanzi, 1988). Government efficiency and institutional factors such as political stability, voice and accountability, and civil and political rights are also considered determinants of tax revenue (Bird et al., 2008; Martin-Mayoral and Uribe, 2010). Some studies have explored the effect of social variables such as educational level, measured by public expenditure on education or the illiteracy rate (Pessino and Fenochietto, 2010; Piancastelli, 2001), and population growth (Bahl and Wallace, 2005).
The methodology used to analyse the determinants of tax revenue across countries has been diverse. For instance, some authors have applied dynamic general equilibrium models (Feltenstein and Cyan, 2012), while others have applied a range of econometric techniques. One of the first econometric studies of international tax ratios employed cross-section methods (Lotz and Morss, 1967). Pessino and Fenochietto (2010) developed a panel version of a stochastic tax frontier model. Other panel data studies (Gupta, 2007; Martin-Mayoral and Uribe, 2010) focused on static fixed and random effect models and on dynamic panel data techniques that use the generalised method of moments (GMM). It is worth noting that previous dynamic studies have not discussed the effect of lagged values of the tax revenue variable.
Studies such as those by Teera and Hudson (2004) and Pessino and Fenochietto (2010), cited above, used large samples of low-income, middle-income and high-income countries. However, they found that the results have a low level of significance when the whole panel is employed, and that the results improve when the sample is similar in terms of geographical location or income level.
In this paper, we study the effect of economic, structural (productive specialisation), social and institutional factors on tax revenue through static and dynamic panel data techniques, and we also comment on the role of lagged values of the dependent variable. The sample includes 34 countries of the Organisation for Economic Co-operation and Development (OECD) over the period 2001–2011. This sample comprises middle-income and high-income countries and therefore provides a certain degree of homogeneity. Nevertheless, after conducting aggregated regressions we also split the sample into middle-income and high-income countries, in order to explore whether two more homogeneous samples yield different results from the aggregated sample, and to analyse whether the determinants of tax revenue in middle-income countries differ from those in high-income countries.
Both tax gap and tax effort are calculated so as to analyse how they evolve across countries and over time.
The structure of the paper is as follows. Section 2 describes the behaviour of the tax revenue variable across the sample and presents the explanatory variables included in the model. Section 3 explains the econometric methods used in the study. Section 4 provides the results from the most consistent specification and calculates the tax gap and tax effort by country. Finally, Section 5 gives the conclusions.
Description of explanatory variables and the evolution of tax revenue
Following previous work in the field, in this paper we incorporate four sets of factors (economic, productive specialisation or structural, social and institutional) as determinants of tax revenue. The economic factors comprise GDP per capita, expressed in constant United States (US) dollars with 2000 as the reference year, which also represents the level of development of a country; trade volume, measured as the sum of exports and imports of goods and services as a percentage of GDP; and foreign direct investment relative to gross fixed capital formation. The productive specialisation factors include two variables: agriculture value added as a percentage of GDP and industry value added as a percentage of GDP. The social factors comprise three variables: gross tertiary school enrolment, life expectancy and child mortality rate. The source is the World Development Indicators (World Bank, 2013). The institutional factors are composed of two indicators: political rights, which essentially measures the level of democracy, and civil liberties, which considers freedom of expression, assembly and thought, and legal security. Both are measured on a one-to-seven scale, with one representing the highest degree of freedom and seven the lowest. The source is Freedom House (2013).
Our dependent variable is total tax revenue as a percentage of GDP, obtained from the OECD (2013). Column 1 of Table 1 presents the tax revenue of the countries in descending order. The highest figures, above 40, are found in Nordic countries (Denmark, Sweden, Norway and Finland) and Western European countries (Belgium, Italy, France and Austria). The lowest figures belong to middle-income countries (Mexico, Chile and Turkey) and Asian countries (Korea and Japan), as well as the United States and Australia. On average, tax revenue in the OECD is 33.77. Column 2 shows, in descending order, the variation between the first and last observation available in the dataset. Middle-income countries (Estonia, Mexico and Chile) and Korea have increased their potential to collect taxes the most, but the increase is not substantial. It is interesting to note that countries that already had low levels of tax revenue, such as the Slovak Republic, Australia, Canada and the United States, have also experienced the most drastic falls in the indicator over the period analysed. Sweden had the biggest reduction but still maintains a high level of tax revenue.
Table 1. Tax revenue in OECD countries
Rank | Country | Tax revenue (1) | Country | Difference in tax revenue (2) |
---|---|---|---|---|
1 | Denmark | 47.60 | Korea | 2.87 |
2 | Sweden | 45.52 | Estonia | 2.25 |
3 | Belgium | 43.51 | Mexico | 1.73 |
4 | Italy | 42.92 | Chile | 1.70 |
5 | Norway | 42.90 | Germany | 0.98 |
6 | France | 42.86 | Italy | 0.93 |
7 | Finland | 42.49 | Netherlands | 0.56 |
8 | Austria | 42.01 | Israel | 0.56 |
9 | Netherlands | 38.74 | Norway | 0.38 |
10 | Hungary | 37.91 | Japan | 0.37 |
11 | Slovenia | 37.49 | France | 0.18 |
12 | Luxemburg | 37.13 | Czech Republic | −0.35 |
13 | Germany | 36.05 | Denmark | −0.38 |
14 | Iceland | 35.22 | United Kingdom | −0.60 |
15 | United Kingdom | 34.86 | Poland | −0.85 |
16 | Czech Republic | 34.18 | Belgium | −0.89 |
17 | Estonia | 34.17 | Switzerland | −1.05 |
18 | Israel | 32.43 | Turkey | −1.13 |
19 | Spain | 32.26 | Finland | −1.17 |
20 | Poland | 31.71 | Slovenia | −1.21 |
21 | New Zealand | 31.53 | New Zealand | −1.33 |
22 | Portugal | 31.26 | Ireland | −1.46 |
23 | Canada | 31.03 | Greece | −1.67 |
24 | Greece | 30.88 | Spain | −2.19 |
25 | Slovak Republic | 28.33 | Hungary | −2.32 |
26 | Switzerland | 28.05 | Portugal | −2.50 |
27 | Ireland | 27.64 | Luxemburg | −2.62 |
28 | Japan | 27.63 | Austria | −3.17 |
29 | Turkey | 25.72 | Israel | −3.67 |
30 | Australia | 25.63 | United States | −3.75 |
31 | Korea | 25.06 | Canada | −3.84 |
32 | United States | 24.85 | Australia | −3.99 |
33 | Chile | 19.64 | Slovak Republic | −4.34 |
34 | Mexico | 18.85 | Sweden | −5.38 |
Average | | 33.77 | | −1.10 |
Note: The difference is obtained from the first and last observation available in the dataset for every country.
Only 11 out of 34 countries increased tax revenue over the period; on average, the OECD club reduced its potential to collect taxes by 1.10 points. This reduction may be the result of the 2009 international economic crisis, in which case the countries with positive figures are the economies that recovered their capacity to collect taxes faster after the crisis.
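The two summary figures in this paragraph can be rechecked from column 2 of Table 1. The short Python sketch below does so; the list transcribes the table values, and the variable names are illustrative.

```python
# Column 2 of Table 1: change in tax revenue (as a percentage of GDP)
# between the first and last observation available for each country.
differences = [
    2.87, 2.25, 1.73, 1.70, 0.98, 0.93, 0.56, 0.56, 0.38, 0.37, 0.18,
    -0.35, -0.38, -0.60, -0.85, -0.89, -1.05, -1.13, -1.17, -1.21,
    -1.33, -1.46, -1.67, -2.19, -2.32, -2.50, -2.62, -3.17, -3.67,
    -3.75, -3.84, -3.99, -4.34, -5.38,
]

n_countries = len(differences)
n_increased = sum(1 for d in differences if d > 0)
mean_change = sum(differences) / n_countries

print(f"{n_increased} of {n_countries} countries increased tax revenue")
print(f"average change: {mean_change:.2f} points")
```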
The econometric approach
Original model
The first model applied is a general regression equation as follows:
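Using the symbols defined in the next paragraph, and assuming the usual linear panel layout (the exact arrangement of terms is an assumption made here for illustration), equation 1 can be sketched as:

```latex
% Sketch of equation (1); the arrangement of terms is assumed, not quoted.
\begin{equation}
TAXREV_{it} \equiv Y_{it} = \alpha_i + X_{it}'\beta + \eta_i + u_{it},
\qquad u_{it} \sim \mathrm{i.i.d.}(0, \sigma^2)
\end{equation}
```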
where TAXREV, the dependent variable Y, is tax revenue; X is a vector of economic, productive specialisation, social and institutional factors; β is a vector of coefficients to be estimated; and η are the unobservable individual effects, specific to every country. The error term u is assumed to satisfy white-noise assumptions, that is, to be independently and identically distributed with zero mean and constant variance σ², and serially uncorrelated, which is denoted as u ∼ i.i.d.(0, σ²). The parameter αi lets the intercept vary across countries and captures country-specific differences. Finally, the subscripts i and t indicate country and year respectively.
Preliminary effect of explanatory variables
As for the explanatory variables, GDP per capita (GDPpc) is expected to have a positive sign because, as a country's level of development rises, the formal sector of the economy grows in relative terms. Trade volume (TRA/GDP) can have a positive effect owing to the taxes applied on imports; in addition, as trade expands, the formalisation and competitiveness of the economy increase and there are therefore more possibilities to collect taxes. On the other hand, an open economy reduces tariffs and trade barriers, which can have negative effects on tax collection (Baunsgaard and Keen, 2010). Foreign direct investment relative to gross fixed capital formation (FDI/GFCF) is expected to have a negative effect, since countries can create fiscal incentives in order to attract more foreign investment (Cassou, 1997; UNCTAD, 2000; Martin-Mayoral and Uribe, 2010); from another perspective, this variable can have positive effects, as the flow of FDI boosts competitiveness and the formalisation of the economy (Gugler and Brunner, 2007).
The variable on specialisation in agriculture as a percentage of the economy (AGR/GDP) is expected to have a negative sign, because economic activities in this sector are more difficult to tax, especially in middle-income countries, where production tends to be organised on a small-scale basis. In contrast, specialisation in industry as a percentage of the economy (IND/GDP) can have positive effects on taxation, as industrial enterprises are typically easier to tax and manufacturing can generate larger taxable surpluses than agriculture (Eltony, 2002).
The civil liberties (CIVLIB) and political rights (POLRIG) indicators are expected to have a positive effect on tax revenue1 because, in countries with a high level of democracy and liberties, taxpayers may have a better perception of their governments and therefore be more willing to comply with tax regulations; in other words, there is more compliance and less tax evasion. In addition, political stability and social confidence foster a better environment for the operation of the economy and the creation of businesses.
A higher level of education in a country creates further specialisation and hence more sophisticated production methods or economic activities that can increase tax revenue. Moreover, a consolidated educational system fosters social commitment, which makes the population more aware of the benefits of taxes. Consequently, we should expect a positive relationship between tax revenue and the proxy of education (SCHTER). The other two social variables, life expectancy (LIFEEXP) and infant mortality rates (INFMOR), are also expected to have a positive relationship with the tax variable, since they are associated with levels of development and social security. Furthermore, people with higher levels of social security and more access to medical services are likely to raise their productivity and economic activity. A contrary argument is that life expectancy can have an adverse effect, because the higher the average age of the population, the higher the proportion of retired people and hence the lower the proportion of the population paying taxes (Svejnar, 2002).
The lagged dependent variable can have two main interpretations. i) A positive sign indicates a Keynesian approach in which high levels of tax collection encourage public expenditure, economic growth and further tax revenue, but the effect reverses when tax collection is low. ii) A negative sign indicates a neoclassical approach in which high levels of tax collection discourage economic activity and eventually reduce tax revenue; that is, low tax rates are associated with a better performance of the economy (Cooley and Ohanian, 1997). A coefficient close to one is evidence that the dependent variable changes more slowly and is less sensitive to variations in the explanatory variables, but depends more on its own lagged value (Ángeles-Castro, 2006).
Dimension of the panel
The dataset is an unbalanced panel consisting of 34 countries and up to 11 annual observations between 2001 and 2011. The number of time periods available may vary from country to country, but the number of variables included in each year is the same. In total there are 273 observations in the panel.
Static methods
The estimations start with the standard ordinary least squares (OLS) method, pooling all the observations and assuming that αi = α. The output obtained from the OLS specification is presented in table 2, column 1. The traditional OLS approach has two major drawbacks: it assumes that the intercept is the same for all countries and it does not control for country-specific factors. To test whether these assumptions are implausible, two panel estimation methods that take into account the specific nature of the countries are performed.
Table 2. Determinants of tax revenue in OECD countries
Variables | OLS (Pooled) (1) | Fixed Effect (2) | Random Effect (3) | GMM (4) | GMM system (5) |
---|---|---|---|---|---|
TAXREVt−1 | | | | 0.510 * (0.000) | 0.785 * (0.000) |
GDPpc | 1.25E-04 ** (0.013) | 9.45E-05 (0.357) | 1.63E-04 ** (0.036) | 2.11E-04 * (0.000) | 8.95E-05 ** (0.011) |
TRA/GDP | 0.023 ** (0.036) | −0.020 ✦ (0.086) | −0.015 (0.130) | −0.031 * (0.000) | −0.008 (0.286) |
FDI/GFCF | 0.005 (0.712) | −0.004 (0.131) | −0.004 (0.117) | −0.007 * (0.000) | −0.008 * (0.000) |
AGR/GDP | −0.006 (0.981) | −0.582 * (0.008) | −0.547 * (0.008) | −0.036 (0.841) | −0.316 * (0.000) |
IND/GDP | −0.229 * (0.002) | 0.238 * (0.000) | 0.200 * (0.000) | 0.168 * (0.000) | 0.160 * (0.000) |
CIVLIB | −0.416 (0.677) | 0.450 (0.106) | 0.63 ✦ (0.099) | −0.443 (0.320) | −0.722 ✦ (0.076) |
POLRIG | 1.849 (0.303) | −0.922 (0.123) | −1.155 ** (0.053) | −0.783 ✦ (0.083) | −0.612 (0.271) |
SCHTER | 0.022 (0.474) | −0.035 ✦ (0.078) | −0.033 ✦ (0.083) | 0.002 (0.883) | 0.000 (0.981) |
LIFEEXP | −0.622 ** (0.018) | −0.039 (0.849) | −0.214 (0.264) | −0.404 ✦ (0.066) | −0.400 * (0.000) |
INFMOR | −1.217 * (0.000) | 0.019 (0.848) | −0.064 (0.514) | −0.291 (0.408) | −0.018 (0.933) |
Constant | 89.397 * (0.000) | 35.097 ** (0.029) | 48.552 * (0.001) | 45.225 ** (0.014) | 35.694 (0.000) |
Tests | | | | | |
F | | (0.000) * | | | |
BPLM | | | (0.000) * | | |
Hausman | | | (0.000) * | | |
MODDW | | 0.857 | | | |
BWLBI | | 1.128 | | | |
AB AR1 | | | | (0.020) ** | (0.013) ** |
AB AR2 | | | | (0.145) | (0.275) |
Sargan | | | | (0.964) | (0.986) |
Notes: Dependent variable is TAXREV; p-values in parentheses. As used in the discussion of results, *, ** and ✦ denote significance at the 1, 5 and 10 per cent levels respectively.
The fixed effect model (FE) lets the intercept vary across countries by adding dummy variables that control for country-specific effects. In order to explore whether the dummies belong in the model, an F test is conducted. The null hypothesis is that the additional coefficients equal zero, that is, αi is a constant intercept α for all countries (H0: αi = α). In this case the test rejects the null hypothesis; hence the FE is more appropriate.2 Results are presented in table 2, column 2.
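The contrast between pooled OLS and the fixed effect estimator, and the F test on the country intercepts, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the panel is simulated, balanced and has a single regressor, and the function and variable names are chosen for illustration. It is not the estimation reported in table 2.

```python
import random

def ols_intercept_slope(x, y):
    """One-regressor OLS: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical balanced panel: N countries, T years, true slope beta.
rng = random.Random(0)
N, T, beta = 10, 11, 0.5
countries = []
for i in range(N):
    alpha_i = 2.0 * i  # country intercept, correlated with the regressor
    xs = [i + rng.gauss(0, 1) for _ in range(T)]
    ys = [alpha_i + beta * x + rng.gauss(0, 0.1) for x in xs]
    countries.append((xs, ys))

# Pooled OLS: a single common intercept (alpha_i = alpha).
x_all = [x for xs, _ in countries for x in xs]
y_all = [y for _, ys in countries for y in ys]
a_pooled, b_pooled = ols_intercept_slope(x_all, y_all)
rss_pooled = sum((y - a_pooled - b_pooled * x) ** 2 for x, y in zip(x_all, y_all))

# Fixed effects via the within transformation (equivalent to country dummies).
x_dm, y_dm = [], []
for xs, ys in countries:
    mx, my = sum(xs) / T, sum(ys) / T
    x_dm += [x - mx for x in xs]
    y_dm += [y - my for y in ys]
b_fe = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)
rss_fe = sum((b - b_fe * a) ** 2 for a, b in zip(x_dm, y_dm))

# F statistic for H0: alpha_i = alpha (N - 1 restrictions, K = 1 regressor).
K = 1
f_stat = ((rss_pooled - rss_fe) / (N - 1)) / (rss_fe / (N * T - N - K))
print(f"pooled slope {b_pooled:.3f}, FE slope {b_fe:.3f}, F = {f_stat:.1f}")
```

Because the simulated intercepts are correlated with the regressor, the pooled slope is pushed away from the true value while the within estimator stays close to it, and the F statistic is large, which is the pattern that leads to rejecting H0: αi = α.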
In the random effect model (RE), differences across countries are captured through a disturbance term ωit, which follows ωit = εi + uit, where εi is an unobservable term that represents the individual-specific error component, and uit is the combined time series and cross-section error component. The RE assumes that εi is not correlated with any explanatory variable Xkit in the equation. The Breusch and Pagan (1980) Lagrange Multiplier (BPLM) test is designed to test for random effects. The null hypothesis is that the variance of the individual-specific error component is zero, that is, H0: σε² = 0. In the present analysis the BPLM test rejects the null hypothesis; hence, the RE is appropriate.3 The results are presented in table 2, column 3.
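For reference, the textbook form of the BPLM statistic for a balanced panel is given below, for the null that the variance of the country-specific component εi is zero. It is computed from the pooled OLS residuals, written here as \hat{\omega}_{it} (a notational assumption). The panel in this paper is unbalanced, so the statistic actually used would need the corresponding adjustment; the formula is shown only to make the logic of the test explicit.

```latex
\begin{equation}
LM = \frac{NT}{2(T-1)}
\left[
\frac{\sum_{i=1}^{N}\left(\sum_{t=1}^{T}\hat{\omega}_{it}\right)^{2}}
     {\sum_{i=1}^{N}\sum_{t=1}^{T}\hat{\omega}_{it}^{2}} - 1
\right]^{2}
\;\sim\; \chi^{2}_{1} \quad \mathrm{under}\ H_0
\end{equation}
```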
Both FE and RE are more appropriate than the OLS specification. In order to choose between the FE and the RE we apply the Hausman (1978) specification test. The null hypothesis underlying the test is that the regressors Xkit and the unobservable individual-specific random error εi are uncorrelated, that is, H0: Corr(Xkit, εi) = 0. If the test statistic, based on an asymptotic χ² distribution, rejects the null, then the random effect estimators are biased and the fixed effect model is preferred. The results from the Hausman test, presented in table 2, column 3, indicate that the RE estimates are inconsistent and the FE would be more appropriate.4
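In its standard form, the Hausman statistic compares the two sets of slope estimates directly. The expression below is the usual textbook version, where K is the number of time-varying regressors and the hat notation is an assumption of this sketch:

```latex
\begin{equation}
H = \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)'
\left[\widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE})\right]^{-1}
\left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)
\;\sim\; \chi^{2}_{K} \quad \mathrm{under}\ H_0
\end{equation}
```

A large value of H means the two estimators differ by more than sampling error would explain, which is evidence against the assumption that εi is uncorrelated with the regressors.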
Before adopting the FE as the final estimation, it is important to conduct an additional test. It has already been stated that the error term uit is assumed to satisfy white-noise assumptions, that is, zero mean, constant variance σ² and no serial correlation; accordingly, a test for a first-order autoregressive process, AR(1), in the errors should be carried out. In the presence of autocorrelation, both σ² and the standard errors are likely to be underestimated and biased, which leads to misleading conclusions about the statistical significance of the estimated regression coefficients. In order to test for autocorrelation in the FE specification, we obtain the modified Bhargava et al. (1982) Durbin-Watson statistic (MODDW) and the Baltagi-Wu (1999) LBI statistic (BWLBI); the results are presented in column 2 of table 2. Both test statistics reject the null hypothesis H0: no first-order serial correlation.5
Dynamic methods
This econometric approach is relevant to our study for two main reasons. Firstly, the incorporation of a lagged dependent variable in the model is essential to explore the effect of past tax revenue values on current values and, in this way, to test whether the variable can be explained by its own past.
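The logic of a panel Durbin-Watson statistic can be sketched in a few lines of Python. The function below implements the basic Bhargava et al. (1982) ratio for within-country residuals; the "modified" version reported in table 2 additionally handles unbalanced panels, which this sketch does not attempt. The residuals are made-up numbers chosen only to show how the statistic behaves.

```python
def panel_durbin_watson(residuals_by_country):
    """Sum of squared first differences of the residuals (within each
    country) divided by the sum of squared residuals."""
    num = 0.0
    den = 0.0
    for resid in residuals_by_country.values():
        num += sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
        den += sum(e ** 2 for e in resid)
    return num / den

# Illustrative residuals, not actual FE residuals from the paper.
persistent = {"A": [1.0, 0.8, 0.6, 0.4], "B": [-1.0, -0.8, -0.6, -0.4]}
alternating = {"A": [1.0, -1.0, 1.0, -1.0], "B": [-1.0, 1.0, -1.0, 1.0]}

d_persistent = panel_durbin_watson(persistent)
d_alternating = panel_durbin_watson(alternating)
print(d_persistent, d_alternating)
```

Values near 2 are what one expects without first-order serial correlation; values well below 2, such as the MODDW figure of 0.857 in table 2, point to positive serial correlation, and values above 2 to negative serial correlation.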
Secondly, to deal with autocorrelation, it is necessary to explore the possibility that the problem may arise due to model misspecification; to be precise, because of an omitted lagged dependent variable.
In this context, equation 1 is extended and transformed into a dynamic panel data model (DPDM) by adding a lagged dependent variable as follows:
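Keeping the notation of equation 1 and writing γ for the coefficient on the lagged dependent variable (the symbol used later in the paper for the adjustment coefficient 1 – γ), the dynamic model can be sketched as follows. The arrangement of terms is an assumption made for illustration:

```latex
% Sketch of equation (2); the arrangement of terms is assumed, not quoted.
\begin{equation}
Y_{it} = \gamma\, Y_{i,t-1} + X_{it}'\beta + \eta_i + u_{it}
\end{equation}
```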
However, the inclusion of a lagged dependent variable introduces a source of persistence over time: correlation between the right-hand regressor yit−1 and the error term uit. Furthermore, DPDMs are characterised by individual effects ηi caused by heterogeneity among the individuals.6 As a consequence, it is necessary to adopt different estimation and testing procedures for model 2.
The generalised method of moments (GMM) estimation
In order to estimate equation 2, we use the GMM for DPDMs initially proposed by Arellano and Bond (1991) and Arellano and Bover (1995). Firstly, the estimation method eliminates the country effects ηi by expressing the dynamic equation in first differences as follows:
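Assuming the dynamic model has the linear form Y_it = γ Y_i,t−1 + X_it'β + η_i + u_it (an assumption about the layout of equation 2), first differencing removes η_i because it does not vary over time:

```latex
% Sketch of the first-differenced equation under the assumed form of (2).
\begin{equation}
\Delta Y_{it} = \gamma\, \Delta Y_{i,t-1} + \Delta X_{it}'\beta + \Delta u_{it},
\qquad \Delta Y_{it} = Y_{it} - Y_{i,t-1}
\end{equation}
```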
On the basis of the following standard moment conditions:
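In the usual Arellano and Bond (1991) formulation, and writing Y for TAXREV, these conditions take the form below. The indexing shown is the standard one and is assumed here rather than quoted from the source:

```latex
\begin{equation}
E\left[\, Y_{i,t-s}\, \Delta u_{it} \,\right] = 0
\qquad \mathrm{for}\ s \ge 2,\quad t = 3, \dots, T
\end{equation}
```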
that is, lagged levels of TAXREVit are uncorrelated with the error term in first differences. The method uses lagged endogenous variables as instruments to control for the likely endogeneity of the lagged dependent variable, reflected in the correlation between this variable and the error term in the transformed equation. The GMM estimation obtained is known as the difference estimator and the results are reported in column 4 of table 2.
Blundell and Bond (1998) contended that the GMM estimator obtained after first differencing has large finite sample bias and poor precision. They attribute the limitations of the estimator to the problem of weak instruments, as they assert that lagged levels of the series provide weak instruments for the first differences.
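The idea of instrumenting the differenced lagged variable with a lagged level can be illustrated with a small simulation. The Python sketch below is deliberately simpler than the estimator used in the paper: it is a just-identified Anderson-Hsiao style instrumental variable estimate with a single instrument, not the full Arellano-Bond GMM, it has no other regressors, and all data and parameter values are hypothetical.

```python
import random

rng = random.Random(42)
N, T, gamma, burn_in = 5000, 6, 0.5, 20

num_iv = den_iv = 0.0    # IV moments using the level y_{i,t-2} as instrument
num_ols = den_ols = 0.0  # naive OLS on the differenced equation

for _ in range(N):
    eta = rng.gauss(0, 0.5)        # country-specific effect
    y = eta / (1 - gamma)          # start at the country's long-run mean
    path = []
    for t in range(burn_in + T):
        y = gamma * y + eta + rng.gauss(0, 1)
        if t >= burn_in:           # keep only the last T periods
            path.append(y)
    for t in range(2, T):
        dy_t = path[t] - path[t - 1]        # differencing removes eta
        dy_lag = path[t - 1] - path[t - 2]
        z = path[t - 2]                     # lagged level as instrument
        num_iv += z * dy_t
        den_iv += z * dy_lag
        num_ols += dy_lag * dy_t
        den_ols += dy_lag * dy_lag

gamma_iv = num_iv / den_iv
gamma_ols = num_ols / den_ols
print(f"true gamma {gamma}, IV {gamma_iv:.3f}, OLS on differences {gamma_ols:.3f}")
```

OLS on the differenced equation is biased because the differenced lag is correlated with the differenced error, whereas the estimate that uses the twice-lagged level as an instrument stays close to the true coefficient in this large simulated sample. With small samples or very persistent series the instrument becomes weak, which is the concern raised by Blundell and Bond (1998).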
So as to improve the properties of the standard first-differenced GMM estimator, they justified the use of an extended GMM estimator, on the basis of the following moment condition:
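In the usual Blundell and Bond (1998) formulation, and writing Y for TAXREV, the additional condition for the equations in levels is as follows. The indexing is the standard one and is assumed here rather than quoted from the source:

```latex
\begin{equation}
E\left[\, \Delta Y_{i,t-1}\, (\eta_i + u_{it}) \,\right] = 0
\qquad \mathrm{for}\ t = 3, \dots, T
\end{equation}
```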
that is, there is no correlation between lagged differences of TAXREVit and the country-specific effects. The method therefore uses lagged differences of Yit as instruments for the equations in levels, in addition to lagged levels of Yit as instruments for the equation in first differences. The extended method is known as the system GMM estimation (sys-GMM). It encompasses a regression equation in both differences and levels, each with its specific set of instrumental variables. The sys-GMM estimation not only improves precision but also reduces finite sample bias. Results are reported in column 5 of table 2.
The GMM estimations, both difference and system, assume that the disturbances uit are not serially correlated. If this is the case, there should be evidence of first-order serial correlation in the differenced residuals uit – uit−1 and no evidence of second-order serial correlation in the differenced residuals (Doornik et al., 2002). This is an important assumption because the consistency of the GMM estimator hinges upon the fact that E[Δuit Δuit−2] = 0. Accordingly, tests of autocorrelation up to order 2 in the first-differenced residuals should be carried out.
The tests of serial correlation in the first-differenced residuals are in both models consistent with the maintained assumption of no serial correlation in uit. The AR(2) tests fail to reject the null hypothesis that the first-differenced residual error term is not second-order serially correlated, whereas by construction, the AR(1) tests reject the null (at 5 per cent level of significance in both models) that the process does not exhibit first-order serial correlation. The results for the difference and system models are reported in columns 4 and 5 of table 2 respectively.7
In order to assess the validity of the instruments, a Sargan test of overidentifying restrictions, proposed by Arellano and Bond (1991), is also performed. Under the null hypothesis that the instruments are not correlated with the error process, the Sargan test statistic is asymptotically distributed as a chi-square with as many degrees of freedom as overidentifying restrictions. In both models the test is unable to reject the validity of the instruments. The results for the difference and system specifications are reported in columns 4 and 5 respectively.7
Moreover, we split the overall sample into two subsamples, middle-income and high-income countries; the former contains 11 countries and the latter 23. The criterion used to separate the sample is the World Bank income classification (2014), which divides economies according to 2012 GNI per capita, using the Atlas method. This exercise yields more homogeneous samples and also makes it possible to observe whether the determinants of tax revenue vary between the two income groups. The sys-GMM specification is applied to the two subsamples. Both regressions satisfy the tests of serial correlation in the first-differenced residuals and the Sargan test of overidentifying restrictions. The results are presented in table 3.
Table 3. The sys-GMM specification for middle and high income countries
Variables | sys-GMM middle-income | sys-GMM high-income |
---|---|---|
TAXREVt−1 | 0.562 * (0.000) | 0.936 * (0.000) |
GDPpc | −5.65E-05 (0.848) | −4.42E-06 (0.961) |
TRA/GDP | 0.005 (0.768) | 0.009 (0.541) |
FDI/GFCF | −0.005 (0.776) | −0.010 * (0.000) |
AGR/GDP | 0.610 (0.137) | −0.429 (0.232) |
IND/GDP | −0.013 (0.907) | 0.105 ✦ (0.072) |
CIVLIB | −1.843 ✦ (0.078) | −1.344 ✦ (0.094) |
POLRIG | −0.219 (0.667) | −5.506 (0.294) |
SCHTER | −0.021 (0.559) | 0.023 (0.451) |
LIFEEXP | −0.454 ✦ (0.086) | −0.569 ** (0.023) |
INFMOR | −0.346 ** (0.036) | 0.506 (0.444) |
Constant | 44.561 ✦ (0.099) | 46.621 ✦ (0.075) |
Tests | | |
AB AR1 | (0.067) ✦ | (0.079) ✦ |
AB AR2 | (0.174) | (0.113) |
Sargan | (0.823) | (0.890) |
Notes: Dependent variable is TAXREV; p-values in parentheses. As used in the discussion of results, *, ** and ✦ denote significance at the 1, 5 and 10 per cent levels respectively.
In the following section we comment on the outcome from the sys-GMM specifications.
Estimation results and the tax gap
Overall sample
In the case of the economic factors, the coefficient on GDP per capita has a positive sign and is statistically significant at the 5 per cent level; that is to say, the development of the economy increases tax revenue. Trade volume is not statistically significant; in other words, its effect is undetermined or it can have opposing effects, as suggested before. A plausible explanation is that OECD economies are open economies that have gradually reduced import taxes, while at the same time the expansion of exports improves the performance of the economy, so the net direction of the effect is not clear. FDI relative to gross fixed capital formation enters negatively at the 1 per cent level in the sys-GMM equation. This result is in keeping with the argument that the creation of government incentives to attract foreign investment reduces the potential to collect taxes.
The productive specialisation factors, the shares of agriculture and industry in the economy, have a negative and a positive sign respectively, and both are statistically significant at the 1 per cent level. This confirms that more industrialised countries increase their potential to collect taxes, while the agricultural sector offers fewer possibilities to collect them.8
Both institutional variables, civil liberties and political rights, have a negative sign (which, because lower scores denote greater freedom, implies a positive effect), but only the former is statistically significant, at the 10 per cent level. The results suggest that a democratic system is not a robust determinant of tax revenue, but that the social and economic stability associated with civil liberties can have a more robust effect on tax collection.
Two of the social variables, child mortality rates and the proxy of education are not statistically significant. The other social variable, life expectancy, is statistically significant and shows a negative relationship with the dependent variable; this outcome is consistent with the argument that higher average age represents a higher proportion of retired people. As two of the social variables are not statistically significant, it can be contended that the social factors are not robust determinants of tax revenue. It has been argued previously that social factors can also be interpreted as level of development. In this respect, there are other proxies of development that can be better determinants of tax revenue, GDP per capita for instance, and therefore, the incorporation of robust development proxies in the model can weaken the effect of social factors.9
Previous research has shown that a mix of education subsidies and progressive income taxation that maximises social welfare leads to an endogenous process in which further educational attainment fosters economic growth and tax revenue (Krueger and Ludwig, 2012). Gomme (2005) showed that higher taxes tend to retard growth and reduce welfare, but when used to finance educational expenditure, taxes promote human capital accumulation and hence growth, which is tax and welfare-enhancing. In this respect, the empirical evidence in the relevant literature suggests a bidirectional effect between education and tax revenue. In order to test the direction of the effect in our variables, we conducted Granger causality tests and confirmed a bidirectional effect between tax revenue and the proxy of education. Hence, both variables follow an endogenous process, but the effect of education on tax revenue can be transmitted through economic growth, as documented in previous work. This explains why the coefficient of the education proxy is less significant when the variable on economic growth is introduced in the model.
The coefficient on the lagged dependent variable is positive and statistically significant at the 1 per cent level, which is consistent with Keynesian theory. This finding shows that previous values of tax revenue determine present values. This is an endogenous process that can create a virtuous circle in the economy when countries gradually increase tax revenue or, in contrast, can keep the economy at low growth rates when countries do not improve tax collection.
Subsamples
The coefficient of the lagged dependent variable remains positive and statistically significant in the two subsamples. It is worth noting that the coefficient is, on average, smaller in the middle-income countries (0.562) than in the high-income countries (0.936). In high-income countries, variations in tax revenue in the previous period are carried over to the current period almost one to one, whereas in middle-income countries the proportion carried over from one period to the next is slightly higher than 50 per cent. This result illustrates that tax revenue in high-income countries depends mainly on its lagged values, while the level of tax collection in middle-income countries depends substantially on changes in the determinant variables, because the effect of past tax revenue decays more quickly over time. In other words, the adjustment coefficient (1 – γ) is larger in middle-income countries, and these countries are therefore more sensitive to the determinant variables, since their tax revenue as a proportion of GDP adjusts faster to its long-term level than in countries with a higher income level.
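The difference in persistence between the two groups can be made concrete with a short Python sketch. It treats the lagged coefficient γ as the only source of dynamics, y_t = γ·y_{t−1} + ..., and derives three standard summary measures from the point estimates in table 3. The function name and the choice of measures are illustrative assumptions and do not appear in the paper.

```python
import math

def persistence_summary(gamma):
    """Summary measures for a partial-adjustment process with 0 < gamma < 1."""
    return {
        # (1 - gamma): share of the distance to the long-run level closed per year
        "adjustment_speed": 1.0 - gamma,
        # years needed for the effect of a one-off shock to fall by half
        "half_life_years": math.log(0.5) / math.log(gamma),
        # long-run effect of a permanent unit change in a regressor, per unit
        # of its short-run coefficient
        "long_run_multiplier": 1.0 / (1.0 - gamma),
    }

middle = persistence_summary(0.562)  # table 3, middle-income countries
high = persistence_summary(0.936)    # table 3, high-income countries
for label, summary in (("middle income", middle), ("high income", high)):
    print(label, {k: round(v, 2) for k, v in summary.items()})
```

Under these assumptions the half-life of a shock is a little over one year in the middle-income group and more than ten years in the high-income group, which is another way of expressing the faster adjustment described above.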
Although the regressions with the subsamples satisfy the tests already described, they are expected to have less significant coefficients, since they use fewer observations than the aggregated regression. In this sense, the variables that are statistically significant in the subsample equations can be deemed the most robust determinants. In the case of middle income countries, institutional factors (civil liberties) and social factors (life expectancy) are the most robust determinants. As for high income countries, economic factors (foreign direct investment relative to gross fixed capital formation), productive specialisation factors (industry value added as a percentage of GDP), institutional factors (civil liberties) and social factors (life expectancy) are the most robust determinants.
In any case, the sign of the significant coefficients does not change compared with the aggregated regression. This outcome validates the results from the overall sample and provides further evidence that the variables are robust.
Tax capacity, tax gap and tax effort
The following definitions are useful for the discussion in this section. Tax revenue is the amount of income collected by a government through taxation, expressed as a percentage of GDP. Tax capacity represents the maximum tax revenue that could be collected in a country, given its economic, social, institutional and structural characteristics. The tax gap is the difference between actual tax revenue and tax capacity, so that a negative gap indicates revenue below capacity. Tax effort is the ratio of actual tax revenue to tax capacity.
To calculate tax capacity, we take the coefficients obtained from the aggregated sys-GMM equation and substitute in the most recent values of the explanatory variables for every country. This process is usually known in the relevant literature as the stochastic approach (Martín-Mayoral and Uribe, 2010) and applies linear regression models to estimate the coefficients.
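The substitution step can be sketched as a fitted-value calculation. The snippet below is only a minimal illustration: the variable names, coefficients and country values are made-up placeholders chosen to show the arithmetic, and they are not the estimates or data of this study.

```python
def predicted_capacity(coefficients, latest_values, intercept=0.0):
    """Fitted tax capacity: intercept plus the sum of each estimated
    coefficient times the most recent value of its explanatory variable."""
    missing = set(coefficients) - set(latest_values)
    if missing:
        raise KeyError(f"missing values for: {sorted(missing)}")
    return intercept + sum(
        coefficients[name] * latest_values[name] for name in coefficients
    )

# Hypothetical coefficients (NOT the paper's estimates).
coefficients = {
    "gdp_per_capita_log": 0.5,
    "agriculture_share": -0.1,
    "industry_share": 0.8,
}

# Hypothetical most recent values for one country (NOT real data).
latest_values = {
    "gdp_per_capita_log": 10.0,
    "agriculture_share": 2.0,
    "industry_share": 30.0,
}

capacity = predicted_capacity(coefficients, latest_values, intercept=1.0)
print(f"Illustrative tax capacity: {capacity:.2f} per cent of GDP")
```

Repeating the calculation with each country's own values would yield one capacity figure per country, which can then be compared with observed tax revenue.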
A country with a negative tax gap and a tax effort lower than one collects less than its capacity or potential (Piancastelli, 2001), in this case given by the characteristics outlined earlier. If the country wants to reach its tax capacity, it needs to change tax regulations or taxation procedures. On the other hand, if the country wants to increase its tax capacity, it needs to go through a more complex process that involves structural, economic, social and institutional adjustments.
A negative tax gap or a tax effort of less than one can occur mainly for two reasons. The first is that the tax collection systems or the taxation procedures of the country are not efficient. The second is that the country sets relatively low tax rates or a low tax burden and chooses to provide a low level of public goods and services; in other words, it chooses to have a relatively small government. In this study, a low tax effort or a negative tax gap simply means that the country does not take advantage of its full tax capacity, and we do not argue which of the two reasons, an inefficient tax system or a small government, is more convincing.
Column 2 of Table 4 presents the estimated tax capacity and column 3 shows the tax gap in descending order. The countries with the largest tax gaps are mainly from Western Europe (Iceland, Belgium, France and Italy). It is interesting to note that countries like Belgium, France, Italy and Denmark already have high tax capacity, above 39 per cent, and even so their tax revenue is at least 2.2 points above it, which suggests that these countries have either large governments or efficient tax collection systems.
Tax gap in OECD countries
Country | Rank | Tax revenue (1) | Tax capacity (2) | Tax gap (3) |
---|---|---|---|---|
Iceland | 1 | 35.22 | 31.09 | 4.12 |
Belgium | 2 | 43.51 | 39.96 | 3.54 |
France | 3 | 42.86 | 39.32 | 3.54 |
Italy | 4 | 42.92 | 39.41 | 3.51 |
Greece | 5 | 30.88 | 27.72 | 3.15 |
Turkey | 6 | 25.72 | 22.65 | 3.07 |
Israel | 7 | 32.43 | 29.50 | 2.92 |
Luxembourg | 8 | 37.13 | 34.51 | 2.62 |
Denmark | 9 | 47.60 | 45.38 | 2.22 |
Spain | 10 | 32.26 | 30.13 | 2.12 |
Netherlands | 11 | 38.74 | 36.80 | 1.93 |
New Zealand | 12 | 31.53 | 29.96 | 1.57 |
Slovenia | 13 | 37.49 | 36.01 | 1.48 |
Sweden | 14 | 45.52 | 44.29 | 1.22 |
Portugal | 15 | 31.26 | 30.04 | 1.22 |
Finland | 16 | 42.49 | 41.62 | 0.87 |
United Kingdom | 17 | 34.86 | 34.14 | 0.72 |
Austria | 18 | 42.01 | 41.30 | 0.72 |
Czech Republic | 19 | 34.18 | 34.62 | −0.44 |
Estonia | 20 | 34.17 | 34.66 | −0.49 |
Mexico | 21 | 18.85 | 19.42 | −0.57 |
Australia | 22 | 25.63 | 26.23 | −0.60 |
Poland | 23 | 31.71 | 32.48 | −0.77 |
Hungary | 24 | 37.91 | 38.76 | −0.85 |
Norway | 25 | 42.90 | 44.23 | −1.33 |
Japan | 26 | 27.63 | 29.00 | −1.37 |
Germany | 27 | 36.05 | 37.43 | −1.38 |
Ireland | 28 | 27.64 | 29.10 | −1.46 |
Switzerland | 29 | 28.05 | 29.60 | −1.54 |
Chile | 30 | 19.64 | 21.31 | −1.67 |
Korea | 31 | 25.06 | 27.39 | −2.33 |
Canada | 32 | 31.03 | 33.52 | −2.49 |
Slovak Republic | 33 | 28.33 | 30.83 | −2.49 |
United States | 34 | 24.85 | 27.93 | −3.08 |
Average | | 33.77 | 33.25 | 0.52 |
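The relationship between the columns of the table can be checked with a short sketch. It follows the sign convention of the table (tax revenue minus tax capacity) and uses the figures reported for Iceland and the United States; the function names are illustrative. Small differences in the last decimal with respect to column 3 are to be expected, since the published figures are rounded.

```python
def tax_gap(revenue: float, capacity: float) -> float:
    """Tax gap in percentage points of GDP: actual revenue minus capacity."""
    return revenue - capacity

def tax_effort(revenue: float, capacity: float) -> float:
    """Tax effort: ratio of actual revenue to capacity (above 1 means the
    country collects more than its estimated capacity)."""
    return revenue / capacity

# (tax revenue, tax capacity) as reported in the table, per cent of GDP.
countries = {
    "Iceland": (35.22, 31.09),
    "United States": (24.85, 27.93),
}

for name, (revenue, capacity) in countries.items():
    print(
        f"{name}: gap = {tax_gap(revenue, capacity):+.2f}, "
        f"effort = {tax_effort(revenue, capacity):.3f}"
    )
```

A positive gap always goes together with an effort above one, and a negative gap with an effort below one, so either indicator leads to the same classification of countries.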
There is another group of countries, such as Greece, Turkey, Israel and Spain, that have middle tax capacity, between 22 and 30 per cent, and a substantial tax gap, above 2.1 points. In this case the results suggest that these countries partially compensate for the taxes that they cannot collect, due to their specific characteristics, with efficient tax collection systems or, alternatively, that they have large governments in relation to their tax capacity.
On the other hand, Korea, Canada, the Slovak Republic and the US are the countries with the lowest (most negative) tax gaps. Although they have middle tax capacity, between 27 and 33 per cent, they do not seem to have efficient taxation systems or a heavier tax burden that would allow them to partially compensate for the taxes that their specific characteristics do not allow them to collect. Another explanation for the negative gap is that these countries may have chosen to have small governments and to give a major role to the operation of markets.
It should be added that some middle-income countries, like Chile and Mexico, have low tax capacity, below 21.3, and negative tax gaps. In this case, these countries need to adjust economic, social, structural and institutional factors if they want to increase tax capacity. Moreover, they also need to change tax regulations and tax collection systems, by increasing the tax burden, if they want to improve the tax gap.
We also compute the tax gap over the period 2002–2010 for all the countries; the exercise is restricted to these years because they contain most of the observations for every variable and thus the calculation can be more accurate. This exercise is aimed at exploring whether the tax gap changes drastically over the years or remains stable. According to the results, 19 out of 34 countries keep a tax gap of the same sign, and 90.07 per cent of the observations do not change sign over the analysed period. The countries that present variation in the sign of the tax gap show one of the following patterns: i) gradual changes from positive to negative signs with the turning point in 2009, the year of the global economic crisis, as is the case for Spain, Iceland and Slovenia; ii) slight oscillations around zero, for instance Poland and Luxembourg; iii) a change of sign in only one year of the period, as is the case for most of this group of countries. Consequently, the countries tend to keep the same sign in the tax gap, and the variations that do occur are the result of trends, slight oscillations or sporadic changes, but they are not outliers. This result suggests that the estimation of the tax gap is stable and robust and does not present inexplicable or drastic variations.
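The country-level part of this stability check can be sketched as follows. The panel below is made up purely for illustration (hypothetical country labels and gap values, not the data of this study), and the rule for treating exact zeros is an assumption of the sketch.

```python
def keeps_sign(gaps):
    """True if every non-zero tax gap in the series has the same sign.
    Exact zeros are ignored (an assumption of this sketch)."""
    signs = {gap > 0 for gap in gaps if gap != 0}
    return len(signs) <= 1

def share_of_stable_countries(panel):
    """Share of countries whose tax gap never changes sign."""
    stable = sum(1 for gaps in panel.values() if keeps_sign(gaps))
    return stable / len(panel)

# Hypothetical yearly tax gaps for four fictitious countries.
panel = {
    "Country A": [1.2, 0.8, 0.5],     # always positive
    "Country B": [-0.4, -0.9, -1.1],  # always negative
    "Country C": [0.3, -0.2, 0.1],    # oscillates around zero
    "Country D": [0.6, 0.2, -0.3],    # gradual change of sign
}

stable_share = share_of_stable_countries(panel)
print(f"Share of countries keeping the same sign: {stable_share:.2f}")
```

Applied to the estimated gaps for 2002–2010, a count of this kind would correspond to the 19 out of 34 countries mentioned in the text.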
Conclusions
According to this study, a country with high GDP per capita, a low share of FDI in relation to GFCF, a robust industrial sector and protection of civil liberties is more likely to have high tax revenue. In addition, the use of a dynamic econometric model reveals that the lagged dependent variable is a strong determinant of current tax revenue, as it enters the equation with a positive and statistically significant coefficient.
Tax revenue in middle-income countries depends less on its lagged values than in high-income countries, which indicates that the role of economic, institutional, social and structural factors is more important in determining current values of tax revenue in middle-income economies.
According to the magnitude of the lagged dependent variable coefficients, middle-income countries need to improve the analysed factors continuously if they want, at least, to keep their tax revenue level, since the lag effect decreases faster over time. On the other hand, high-income countries, in general, can manage to keep their tax revenue level largely through the effect of past values, without conducting substantial transformations of the determinants.
The implication is that structural reforms are more relevant for improving tax collection in middle-income countries, where, in addition, the dependent variable is more vulnerable to variations in the explanatory variables.
Some of the coefficients that are statistically significant in the overall sample become less significant in the regressions conducted with the subsamples, but they keep their sign, which suggests that the direction of the effect is robust.
The tax gap across countries tends to remain stable over time, which suggests that the characteristics of the taxation systems of the economies included in the sample have not changed drastically over the period. In addition, the tax gap does not seem to follow a pattern in relation to the income level of the countries; hence, it may depend on other factors, such as tax collection efficiency or the tax regime.
In this case a positive effect is represented by a negative sign because low levels of the indicators are associated with high levels of democracy and liberties.
The value of the F test is 209.09; it far exceeds the critical values at conventional levels of significance and the p-value is 0.000, so the null hypothesis is rejected.
We obtain a BPLM test statistic of 904.04, which far exceeds the one per cent critical value of the χ² distribution with one degree of freedom, 6.63; the p-value is 0.000. Since the null hypothesis is rejected, it is concluded that there are individual effects.
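The critical value quoted above can be verified with the Python standard library alone. The sketch relies on the fact that a χ² variable with one degree of freedom is the square of a standard normal variable; the function names are illustrative.

```python
from statistics import NormalDist

def chi2_1df_critical(alpha: float) -> float:
    """Upper-tail critical value of the chi-square distribution with one
    degree of freedom, using chi2(1) = Z**2 for standard normal Z."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z

def chi2_1df_p_value(statistic: float) -> float:
    """Upper-tail p-value: P(Z**2 > s) = 2 * (1 - Phi(sqrt(s)))."""
    return 2.0 * (1.0 - NormalDist().cdf(statistic ** 0.5))

critical = chi2_1df_critical(0.01)   # one per cent level
bplm_statistic = 904.04              # value reported in the text

print(f"1% critical value: {critical:.2f}")
print(f"Reject null of no individual effects: {bplm_statistic > critical}")
print(f"p-value: {chi2_1df_p_value(bplm_statistic):.3f}")
```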
The value of the Hausman test statistic is 34.41 and the p-value is 0.000; hence the test rejects the null hypothesis. In this case, the key assumption of the REM, “the unobservable individual specific error is not correlated with any explanatory variable”, is violated; thus the FE is preferred.
For both test statistics, which can take values between 0 and 4, a value close to 2 indicates no autocorrelation. The lower and upper bounds for the Bhargava et al. (1982) statistic, considering N=250 and K=11, are 1.9255 and 1.9445; since the test statistic in this case is 0.857, below the lower critical value, the null is rejected. The value of the Baltagi-Wu statistic is 1.128, which is far from 2; hence the null is rejected in this case as well.
The statistics of the Sargan test in the difference and system GMM specifications are 21.547 and 20.760 respectively; neither figure exceeds the critical value of the test for 47 and 49 overidentifying restrictions, respectively; thus, the test is unable to reject the null.
Since the two variables, the agriculture value added as a percentage of GDP and the industry value added as a percentage of GDP, are likely to be correlated, we obtained the correlation matrix and found that the correlation between the two variables is low (0.170); hence we include both variables in the model because they are not collinear.
We also estimated the sys-GMM specification dropping the GDP per capita variable, and found that variables such as political rights and the proxy of education, gross tertiary school enrolment, become statistically significant with the expected sign. We select the specification with GDP per capita to report results and derive comments, because its Wald χ² test shows a larger statistic than the specification without GDP per capita. This test approximately follows an F distribution; by the same token, larger values of the statistic correspond to more appropriate specifications.