Mar 20, 2010

Some Words about Nonparametrics

If m is believed to be smooth, then the observations at Xi near x should contain information about the value of m at x. Thus it should be possible to use something like a local average of the data near x to construct an estimator of m(x).   --R. Eubank (1988, p. 7)


Parametric models are fully determined up to a parameter (vector). The fitted models can easily be interpreted and estimated accurately if the underlying assumptions are correct. If, however, they are violated, then parametric estimates may be inconsistent and give a misleading picture of the regression relationship.
Nonparametric models avoid restrictive assumptions about the functional form of the regression function m. However, they may be difficult to interpret and yield inaccurate estimates if the number of regressors is large. This has been appropriately termed The Curse of Dimensionality. Semiparametric models combine components of parametric and nonparametric models, keeping the easy interpretability of the former and retaining some of the flexibility of the latter.


Note: Nonparametric regression estimators are very flexible but their statistical precision decreases greatly if we include several explanatory variables in the model. The latter caveat has been appropriately termed the curse of dimensionality. Consequently, researchers have tried to develop models and estimators which offer more flexibility than standard parametric regression but overcome the curse of dimensionality by employing some form of dimension reduction. Such methods usually combine features of parametric and nonparametric techniques. As a consequence, they are usually referred to as semiparametric methods. Further advantages of semiparametric methods are the possible inclusion of categorical variables (which can often only be included in a parametric way), an easy (economic) interpretation of the results, and the possibility of a partial specification of a model.          --Wolfgang Hardle (2004)
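
To make the local-average idea in Eubank's quote concrete, here is a minimal SAS sketch using local (loess) regression; the dataset mydata, the variables y and x, and the smoothing parameters are all hypothetical choices, not part of the quotes above:

    /* Local polynomial (loess) smoothing: m(x) is estimated from a    */
    /* weighted fit to the observations in a neighborhood of each x.   */
    /* SMOOTH= gives the fraction of the data used in each local fit.  */
    PROC LOESS DATA=mydata;
        MODEL y = x / SMOOTH=0.3 0.5 0.7;
    RUN;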

Mar 19, 2010

Integrated, Unit Roots and Box-Jenkins Approach

The Box-Jenkins approach is only valid if the variable being modeled is stationary. Although there are many different ways in which data can be nonstationary, Box and Jenkins assumed that the nature of economic time series data is such that any nonstationarity can be removed by differencing. This explains why the Box-Jenkins approach deals mainly with differenced data.
A key ingredient of their methodology, an ingredient adopted by econometricians (without any justification based on economic theory), is their assumption that the nonstationarity is such that differencing will create stationarity. This concept is what is meant by the term Integrated: a variable is said to be integrated of order d, written I(d), if it must be differenced d times to be made stationary. Thus a stationary variable is integrated of order zero, written I(0), a variable which must be differenced once to become stationary is said to be I(1), integrated of order one, and so on. Economic variables are seldom integrated of order greater than two, and if nonstationary are usually I(1). The classic illustrative example of an I(1) variable, given by Peter Kennedy, is a random walk.
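
As a hedged sketch (the seed, series length, and dataset name are arbitrary), a random walk y_t = y_{t-1} + ε_t can be simulated in a SAS DATA step; the level series is I(1), while its first difference is just the stationary error:

    DATA randwalk;
        y = 0;
        DO t = 1 TO 200;
            e  = RANNOR(20100319);   /* white-noise error term           */
            y  = y + e;              /* I(1) level: y(t) = y(t-1) + e(t) */
            dy = e;                  /* first difference, which is I(0)  */
            OUTPUT;
        END;
    RUN;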

Mar 18, 2010

Robust Estimation and Outliers

Estimators designed to be the "best" for a particular estimating problem owe their attractive properties to the fact that their derivation has exploited special features of the process generating the data, features that are assumed known by the econometrician. Knowledge that the classical linear regression model assumptions hold, for example, allows derivation of the OLS estimator as one possessing several desirable properties. Unfortunately, because these best estimators have been designed to exploit these assumptions, violations of the assumptions affect them much more than they do other, sub-optimal estimators. Because researchers are not in a position to know with certainty that the assumptions used to justify their choice of estimator are met, it is tempting to protect oneself against violations of these assumptions by using an estimator whose properties, while not quite "best", are not sensitive to violations of those assumptions. Such estimators are referred to as Robust Estimators.
In the presence of fat-tailed error distributions, although the OLS estimator is BLUE, it is markedly inferior to some nonlinear unbiased estimators. These nonlinear estimators, namely robust estimators, are preferred to the OLS estimator whenever there may be reason to believe that the error distribution is fat-tailed.

So the implication here is that we should treat outliers more carefully than simply kicking them out of the sample for the sake of a better goodness of fit when running OLS. Often influential observations (outliers) are the most valuable observations in a dataset; they may reflect some unusual fact that could lead to an improvement in the model's specification.
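
As a hedged sketch of the alternative to simply deleting outliers (dataset and variable names are hypothetical), an M-type robust estimator can be fit with PROC ROBUSTREG, and influence diagnostics for the OLS fit can be requested from PROC REG:

    /* M-estimation downweights observations with large residuals      */
    PROC ROBUSTREG DATA=mydata METHOD=M;
        MODEL y = x1 x2;
    RUN;

    /* leverage, Cook's D and related influence diagnostics for OLS    */
    PROC REG DATA=mydata;
        MODEL y = x1 x2 / INFLUENCE R;
    RUN;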

Mar 17, 2010

Cointegration

Recall that the levels variables in the ECM entered the estimating equation in a special way: they entered combined into a single entity that captured the extent to which the system is out of equilibrium. It could be that even though these levels variables are individually I(1) (a variable which must be differenced once to become stationary), this special combination of them is I(0). If this is the case, their entry into the estimating equation will not create spurious (false, although seeming to be genuine) results.

This possibility does not seem unreasonable. A nonstationary variable will tend to wander extensively (that is what makes it nonstationary), but some pairs of nonstationary variables can be expected to wander in such a way that they do not drift too far apart, thanks to dis-equilibrium forces that tend to keep them together. Some examples are short- and long-term interest rates, prices and wages, household income and expenditures, imports and exports, spot and futures prices of a commodity, and exchange rates determined in different markets. Such variables are said to be Cointegrated: although individually they are I(1), a particular linear combination of them is I(0). The cointegrating combination is interpreted as an equilibrium relationship, since it can be shown that variables in the error-correction term in an ECM must be cointegrated, and vice versa, that cointegrated variables must have an ECM representation. This is why economists have shown such interest in the concept of cointegration - it provides a formal framework for testing for and estimating long-run (equilibrium) relationships among economic variables.

One important implication of all this is that differencing is not the only means of eliminating unit roots. Consequently, if the data are found to have unit roots, before differencing (and thereby losing all the long-run information in the data) a researcher should test for cointegration; if a cointegrating relationship can be found, this should be exploited by undertaking estimation in an ECM framework.
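
As a hedged sketch of the two-step (Engle-Granger style) check (the dataset ts and variables y and x are hypothetical), one can estimate the candidate long-run relationship by OLS and then test the residuals for a unit root; note that this residual-based test requires its own critical values rather than the standard ADF tables:

    /* step 1: candidate cointegrating (long-run) regression            */
    PROC REG DATA=ts;
        MODEL y = x;
        OUTPUT OUT=resid R=uhat;
    RUN;

    /* step 2: unit-root test on the residuals; stationary residuals    */
    /* are evidence that y and x are cointegrated                       */
    PROC ARIMA DATA=resid;
        IDENTIFY VAR=uhat STATIONARITY=(ADF=(0,1,2));
    RUN;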

Error-Correction Model, Differenced and Levels Variables

The error-correction model, or ECM for short, is a very popular benchmark model in time series econometrics. An error-correction model is a dynamic model in which "the movement of the variables in any period is related to the previous period's gap from long-run equilibrium." As a simple example of this consider the relationship:

    y_t = β1 + β2 x_t + β3 y_{t-1} + β4 x_{t-1} + ε_t

where y and x are measured in logarithms, with economic theory suggesting that in the long run y and x will grow at the same rate, so that in equilibrium (y - x) will be a constant, save for the error. (For (y - x) to be constant along the equilibrium growth path, the coefficients must satisfy β2 + β3 + β4 = 1, which is what allows the manipulation below.) This relationship can be manipulated to produce:

    Δy_t = β1 + β2 Δx_t + (β3 - 1)(y_{t-1} - x_{t-1}) + ε_t

This is the ECM representation of the original specification; the last term is the error-correction term, interpreted as reflecting dis-equilibrium responses. The terminology can be explained as follows: if in error y grows too quickly, the last term becomes bigger, and since its coefficient (β3 - 1) is negative (β3 < 1 for stationarity), Δy_t is reduced, correcting this error. In actual applications, more explanatory variables will appear, with many more lags. Notice that this ECM equation turns out to be in terms of Differenced Variables, with the error-correction component measured in terms of Levels Variables.
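
As a hedged sketch (a hypothetical dataset ts with log series ly and lx), the ECM can be estimated by building the differences and the lagged disequilibrium term in a DATA step and then running OLS; the estimated coefficient on the error-correction term should come out negative:

    DATA ecm;
        SET ts;
        dly = DIF(ly);          /* dependent variable: change in log y    */
        dlx = DIF(lx);          /* change in log x                        */
        ect = LAG(ly - lx);     /* lagged error-correction term, (y - x)  */
    RUN;

    PROC REG DATA=ecm;
        MODEL dly = dlx ect;    /* coefficient on ect = (beta3 - 1) < 0   */
    RUN;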

Mar 16, 2010

Dummy Variable, Fixed and Random Effects

Dummy variables are sometimes used in the context of panel, or longitudinal, data - observations on a cross-section of individuals or firms, say, over time. In this context it is often assumed that the intercept varies across the N cross-sectional units and/or across the T time periods. In the general case (N-1)+(T-1) dummies can be used for this, with computational short-cuts available to avoid having to run a regression with all these extra variables. This way of analyzing panel data is called the Fixed Effects Model. The dummy variable coefficients reflect ignorance - they are inserted merely for the purpose of measuring shifts in the regression line arising from unknown variables. Some researchers feel that this type of ignorance should be treated in a fashion similar to the general ignorance represented by the error term, and have accordingly proposed the Random Effects, Variance Components, or Error Components model.
Which of the fixed effects and the random effects models is better? This depends on the context of the data and for what the results are to be used. If the data exhaust the population (say observations on all firms producing automobiles), then the fixed effects approach, which produces results conditional on the units in the dataset, is reasonable. If the data are a drawing of observations from a large population (say a thousand individuals in a city many times that size), and we wish to draw inferences regarding other members of that population, the fixed effects model is no longer reasonable; in this context, use of the random effects model has the advantage that it saves a lot of degrees of freedom.

The random effects model has a major drawback, however: it assumes that the random error associated with each cross-section unit is uncorrelated with the other regressors, something that is not likely to be the case. Suppose, for example, that wages are being regressed on schooling for a large set of individuals, and that a missing variable, ability, is thought to affect the intercept; since schooling and ability are likely to be correlated, modeling this as a random effect will create correlation between the error and the regressor schooling (whereas modeling it as a fixed effect will not). The result is bias in the coefficient estimates from the random effects model. This may explain why the slope estimates from the fixed and random effects models are often so different.
A Hausman test for correlation between the error and the regressors can be used to check whether the random effects model is appropriate. Under the null hypothesis of no correlation between the error and the regressors, the random effects model is applicable and its GLS estimator is consistent and efficient. The fixed effects estimator is consistent under both the null and the alternative.
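
As a hedged sketch (a hypothetical wage panel with cross-section identifier id and time variable year), SAS/ETS PROC PANEL can fit both one-way estimators; the FIXONE and RANONE options request the fixed and random effects models, and the random-effects output typically reports a Hausman specification test:

    PROC PANEL DATA=wagepanel;
        ID id year;                                     /* cross-section and time indices   */
        MODEL lwage = schooling exper / FIXONE RANONE;  /* one-way fixed and random effects */
    RUN;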

Qualitative vs Quantitative Variables

These two terms sound pretty similar, and I have been confused about them for a while, so here is a clarification.

Variables can be quantitative or qualitative. (Qualitative variables are sometimes called "categorical variables.") Quantitative variables are measured on an ordinal, interval, or ratio scale; qualitative variables are measured on a nominal scale. If five-year-old subjects were asked to name their favorite color, the variable would be qualitative. If the time it took them to respond were measured, the variable would be quantitative. In brief, a qualitative variable is not measurable with numerical instruments but is described with categories that do not imply ranking or scale (e.g., gender, color, taste), while a quantitative variable is measurable with numerical instruments and can be ordered in a quantifiable ranking (e.g., the height of a person). The following comparison makes the distinction easy to see:

    Qualitative (categorical): nominal scale; unordered categories; examples: gender, color, taste.
    Quantitative: ordinal, interval, or ratio scale; ordered numerical values; examples: height, response time.

Mar 15, 2010

The Bayesian Approach

The essence of the debate between the frequentists (who interpret probability as the long-run relative frequency of an event in repeated samples) and the Bayesians rests on the acceptability of the subjectivist notion of probability. Once one is willing to view probability in this way, the advantages of the Bayesian approach are compelling. But most practitioners, even though they have no strong aversion to the subjectivist notion of probability, do not choose to adopt the Bayesian Approach. The reasons are practical in nature:
1. Formalizing prior beliefs into a prior distribution is not an easy task;
2. The mechanics of finding the posterior distribution are formidable (they can seem dauntingly difficult);
3. Convincing others of the validity of Bayesian results is difficult because they view those results as being "contaminated" by personal beliefs.


Following the subjective notion of probability, it is easy to imagine that before looking at the data the researcher could have a "prior" density function for β, reflecting the odds that he or she would give if asked to take bets on the true value of β. This prior distribution, when combined with the data via Bayes' theorem, produces the posterior distribution referred to above. This posterior density function is in essence a Weighted Average of the prior density and the likelihood (or "conditional density", conditional on the data).
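
For the textbook normal case this weighted-average form is explicit. If the prior for β is N(μ0, τ0²) and the data yield an estimate β̂ distributed N(β, σ²), then Bayes' theorem gives a normal posterior whose mean is a precision-weighted average of the prior mean and the data-based estimate:

    E(\beta \mid \text{data}) = \frac{\mu_0/\tau_0^2 + \hat{\beta}/\sigma^2}{1/\tau_0^2 + 1/\sigma^2},
    \qquad
    \operatorname{Var}(\beta \mid \text{data}) = \left( \frac{1}{\tau_0^2} + \frac{1}{\sigma^2} \right)^{-1}.
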
Generally, the Bayesian Approach consists of three steps:
1. A prior distribution is formalized, reflecting the researcher's beliefs about the parameters in question before looking at the data.
2. This prior is combined with the data, via Bayes' theorem, to produce the posterior distribution, the main output of a Bayesian analysis.
3. This posterior is combined with a loss or utility function to allow a decision to be made on the basis of minimizing expected loss or maximizing expected utility; this third step is optional.

The Bayesian approach claims several advantages over the classical approach:
1. The Bayesian approach is concerned with how information in data modifies a researcher's beliefs about parameter values and allows computation of probabilities associated with alternative hypotheses or models; this corresponds directly to the approach to these problems taken by most researchers.
2. Extraneous information is routinely incorporated in a consistent fashion in the Bayesian method through the formulation of the prior; in the classical approach such information is more likely to be ignored, and when incorporated is usually done so in ad hoc (arranged or happening when necessary and not planned in advance) ways.
3. The Bayesian approach can tailor the estimate to the purpose of the study, through selection of the loss function; in general, its compatibility with decision analysis is a decided advantage.
4. There is no need to justify the estimating procedure in terms of the awkward concept of the performance of the estimator in hypothetical (based on situations or ideas which are possible and imagined rather than real and true) repeated samples; the Bayesian approach is justified solely on the basis of the prior and the sample data.

Mar 10, 2010

Condition Index and Multicollinearity

A less common, but more satisfactory, way of detecting multicollinearity is through the condition index (or condition number) of the data: the square root of the ratio of the largest to the smallest characteristic root (eigenvalue) of X'X. A high condition index reflects the presence of collinearity.
When there is no collinearity at all, the eigenvalues, condition indices and condition number will all equal one. As collinearity increases, eigenvalues will be both greater and smaller than 1, and the condition indices and the condition number will increase. An informal rule of thumb is that if the condition number exceeds 15, multicollinearity is a concern; if it is greater than 30, multicollinearity is a very serious concern. (But again, these are just informal rules of thumb.) In SPSS, you get these values by adding the COLLIN parameter to the Regression command; in Stata you can use the collin command. In SAS, you can use the COLLIN option in the MODEL statement of PROC REG.
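
For example, a minimal sketch of the SAS syntax (dataset and variable names are hypothetical): COLLIN requests the eigenvalues and condition indices, COLLINOINT repeats the diagnostics with the intercept adjusted out, and VIF adds variance inflation factors:

    PROC REG DATA=mydata;
        MODEL y = x1 x2 x3 / COLLIN COLLINOINT VIF;
    RUN;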

Here are two more rules of thumb in dealing with multicollinearity:
Don't worry about multicollinearity if the R^2 from the regression exceeds the R^2 of any independent variable regressed on the other independent variables.
Don't worry about multicollinearity if the t statistics are all greater than 2.

Mar 9, 2010

Consistency and Convergence

A consistent sequence of estimators is a sequence of estimators that converges in probability to the quantity being estimated as the index (usually the sample size) grows without bound. In other words, increasing the sample size increases the probability of the estimator being close to the population parameter. Mathematically, a sequence of estimators \{t_n; n \ge 0\} is a consistent estimator for parameter θ if and only if, for all ε > 0, no matter how small, we have

    \lim_{n\to\infty} \Pr\left\{ \left| t_n - \theta \right| < \epsilon \right\} = 1.

The consistency defined above may be called Weak Consistency. The sequence is Strongly Consistent if it Converges Almost Surely to the true value. To say that the sequence X_n converges almost surely (or almost everywhere, or with probability 1, or strongly) towards X means that

    \Pr\left( \lim_{n\to\infty} X_n = X \right) = 1.

This means that the values of X_n approach the value of X, in the sense (see almost surely) that events for which X_n does not converge to X have probability 0. Using the probability space (\Omega, \mathcal{F}, P) and the concept of the random variable as a function from Ω to R, this is equivalent to the statement

    \Pr\Big( \omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = X(\omega) \Big) = 1.
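
A small simulation makes the definition concrete. The following hedged sketch (seed, tolerance, and sample sizes are arbitrary) estimates Pr{|x̄_n - μ| < ε} for the sample mean of standard normal draws; the estimated probability should approach 1 as n grows:

    DATA consist;
        eps = 0.1;                            /* tolerance epsilon                */
        DO n = 10, 100, 1000, 10000;
            hits = 0;
            DO rep = 1 TO 1000;               /* Monte Carlo replications         */
                xbar = 0;
                DO i = 1 TO n;
                    xbar = xbar + RANNOR(20100309);
                END;
                xbar = xbar / n;              /* sample mean; true mean is 0      */
                IF ABS(xbar) < eps THEN hits = hits + 1;
            END;
            prob = hits / 1000;               /* estimate of Pr{|xbar - 0| < eps} */
            OUTPUT;
        END;
        KEEP n prob;
    RUN;

    PROC PRINT DATA=consist NOOBS;
    RUN;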

Mar 8, 2010

Indirect Least Squares Method (ILS)

Suppose we wish to estimate a structural equation containing, say, three endogenous variables. The first step of the ILS technique is to estimate the reduced-form equations for these three endogenous variables. If the structural equation in question is just identified, there will be only one way of calculating the desired estimates of the structural equation parameters from the reduced-form parameter estimates. The structural parameters are expressed in terms of the reduced-form parameters, and the OLS estimates of the reduced-form parameters are plugged into these expressions to produce estimates of the structural parameters. Because these expressions are nonlinear, however, unbiased estimates of the reduced-form parameters produce Only Consistent estimates of the structural parameters, not unbiased estimates.

If an equation is over-identified, the extra identifying restrictions provide additional ways of calculating the structural parameters from the reduced-form parameters, all of which are supposed to lead to the same values of the structural parameters. But because the estimates of the reduced-form parameters do not embody these extra restrictions, these different ways of calculating the structural parameters create different estimates of those parameters. (This is because unrestricted estimates rather than actual values of the parameters are being used for these calculations.) Because there is no way of determining which of these different estimates is the most appropriate, ILS is not used for over-identified equations. The other simultaneous equation estimating techniques have been designed to estimate structural parameters in the over-identified case; many of these can be shown to be equivalent to ILS in the context of a just-identified equation, and to be weighted averages of the different estimates produced by ILS in the context of over-identified equations.

Here is a basic procedure to implement ILS:
1. Rearrange the structural-form equations into their reduced form;
2. Estimate the reduced-form parameters by OLS;
3. Solve for the structural-form parameters in terms of the reduced-form parameters, and substitute in the estimates of the reduced-form parameters to get estimates of the structural ones.
Note: If the structural equation is exactly identified, there will be a unique way to calculate the parameters. Estimates of the reduced-form parameters are unbiased, but estimates of the structural parameters will not be. Both are consistent.
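
As a hedged sketch of these steps for a just-identified case (all dataset and variable names are hypothetical): in a two-equation supply and demand model where income y appears only in the demand equation, the supply equation q = b0 + b1*p is just identified, so one can estimate the two reduced forms by OLS and solve for the structural supply parameters:

    /* reduced forms: regress each endogenous variable on the exogenous income y */
    PROC REG DATA=market OUTEST=rf_p NOPRINT;
        MODEL p = y;                        /* p = pi10 + pi11*y                  */
    RUN;
    PROC REG DATA=market OUTEST=rf_q NOPRINT;
        MODEL q = y;                        /* q = pi20 + pi21*y                  */
    RUN;

    /* indirect least squares: solve for the structural supply parameters        */
    DATA ils;
        MERGE rf_p(RENAME=(Intercept=pi10 y=pi11))
              rf_q(RENAME=(Intercept=pi20 y=pi21));
        b1 = pi21 / pi11;                   /* supply slope on price              */
        b0 = pi20 - b1*pi10;                /* supply intercept                   */
        KEEP b0 b1;
    RUN;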

Mar 7, 2010

Order & Rank Conditions of Identification

The identification problem is a mathematical (as opposed to statistical) problem associated with simultaneous equation systems. It is concerned with the question of the possibility or impossibility of obtaining meaningful estimates of the structural parameters. The identification problem can be solved if economic theory and extraneous information can be used to place restrictions on the set of simultaneous equations. These restrictions can take a variety of forms (such as use of extraneous estimates of parameters, knowledge of exact relationships among parameters, knowledge of the relative variances of disturbances, knowledge of zero correlation between disturbances in different equations, etc.), but the restrictions usually employed, called Zero Restrictions, take the form of specifying that certain structural parameters are zero, i.e., that certain endogenous variables and certain exogenous variables do not appear in certain equations. Mathematical investigation has shown that in the case of Zero Restrictions on structural parameters each equation can be checked for identification by using a rule called the Rank Condition. It turns out, however, that this rule is quite awkward to employ, and as a result a simpler rule, called the Order Condition, is used in its stead. This rule only requires counting included and excluded variables in each equation.
Here is a brief illustration of order and rank conditions of identification in simultaneous equation system:





M = number of endogenous variables in the model
K = number of exogenous variables in the model
m = number of endogenous variables in a given equation
k = number of exogenous variables in a given equation
The rank condition is based on the rank of a matrix A, which must equal (M-1), where M is the number of endogenous variables in the model. This matrix is formed from the coefficients of the variables (both endogenous and exogenous) excluded from the particular equation under consideration but included in the other equations in the model.
The rank condition tells us whether the equation under consideration is identified or not, whereas the order condition tells us if it is exactly identified or overidentified.
1. If K-k > m-1 and the rank of A, ρ(A), is M-1, then the equation is overidentified.
2. If K-k = m-1 and ρ(A) is M-1, then the equation is exactly identified.
3. If K-k >= m-1 and ρ(A) is less than M-1, then the equation is underidentified.
4. If K-k < m-1, the structural equation is unidentified; ρ(A) will be less than M-1 in this case.

From these rules, we can tell that the order condition is only a necessary condition, not a sufficient one. So, technically speaking, the rank condition must also be checked. Many econometricians do not bother doing this, however, gambling that the rank condition will be satisfied (as it usually is) if the order condition is satisfied. This procedure is hence not recommended.
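
To actually check the rank condition, form the matrix A of coefficients on the variables excluded from the equation in question but appearing in the other equations, and verify that its rank equals M-1. A hedged sketch in SAS/IML (the matrix shown is only a placeholder for an actual system's coefficients):

    PROC IML;
        /* placeholder coefficient matrix A for the excluded variables */
        A = {1 0,
             0 1};
        rankA = ROUND(TRACE(GINV(A)*A));   /* rank(A) = trace of ginv(A)*A  */
        PRINT rankA;                       /* identified if rankA = M-1     */
    QUIT;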

Mar 4, 2010

Direct PC SAS Output to a File

When running SAS programs interactively through the display manager, the output from any procedure is written to the Output window and notes, warnings and errors are written to the Log Window. Contents of these windows are temporary. They can be saved to a file using the File Save pulldown menus from the Output Window and from the Log Window. But if you want to make sure that the output of these windows is saved to a file every time, you can use Proc Printto to automatically route output to a file.

For example, the following statements use Proc Printto to route output directly to a file named auto.lst: whatever would have gone to the Output Window is redirected to the file c:\auto.lst, and the NEW option tells SAS to create a new file even if one currently exists. If NEW were omitted, SAS would append to the file if it existed.

    PROC PRINTTO PRINT='c:\auto.lst' NEW;
    RUN;

Note: (1) Sometimes a SAS program can crash (terminate unexpectedly) before it executes all of its statements properly, and you then lose all of the results already produced. (In this situation you usually have to end SAS through the Windows Task Manager, because the program will generally stop responding.) By using Proc Printto, you can save all of the temporary results obtained before the program terminated unexpectedly.

(2) Generally you need to put the Proc Printto statement at the very beginning of the SAS code. You can also release the output file and restore the default destination by adding another simple statement at the very end of the code:
    PROC PRINTTO;
    RUN;

(3) To capture the log in the same way, you can use similar SAS code:

    PROC PRINTTO LOG='c:\auto.log' NEW;
    RUN;

Mar 3, 2010

Observations and Thoughts on Haiti and Chile

Here are some observations from a blogger:
"The recent earthquakes in Haiti and Chile present an interesting contrast between the deleterious effects of a major earthquake in one of the richest countries in the western hemisphere and in the poorest.  It may surprise you that Chile is (by relative standards) quite an advanced and relatively wealthy country as many Americans, I think, have a tendency to view all of Latin America as a poor region.  According to the CIA, the per-capita GDP in Chile in 2009 was $14,700 while Haiti was $1,300 - so while Chile is far from US or Western European standards of living, it is a much wealthier country than Haiti.  In both cases the earthquake (and subsequent tsunami in Chile) were devastating disasters, but the scope of the tragedy in Haiti was, it appears, much, much worse."

These observations prompt two serious thoughts for me:
1. Beyond physical needs, people's level of immaterial demand can also be determined by income or wealth; and much of the time, safety is not among the basic levels of human needs.
2. When facing the same danger and potential loss, the opportunity cost is lower for the poor than for the rich. Who can stand more risk and insecurity, the poor or the rich? This is a two-way argument.

So the practical question is: can we validate these observations via some statistical or econometric methods?

A good Illustration of Weighted Regression by Peter Kennedy

Measurement Error

In parametrics, the assumption of fixed regressors is made mainly for mathematical convenience: if the regressors can be considered to be fixed in repeated samples, the desirable properties of the OLS estimator can be derived quite straightforwardly. The essence of this assumption is that, if the regressors are nonstochastic, they are distributed independently of the disturbances. If this assumption is weakened to allow the explanatory variables to be stochastic but distributed independently of the error term, all the desirable properties of the OLS estimator are maintained; their algebraic derivation is more complicated, however, and their interpretation in some instances must be changed (for example, in this circumstance, βOLS is not, strictly speaking, a linear estimator).

If the regressors are only contemporaneously uncorrelated with the disturbance vector, the OLS estimator is biased but retains its desirable asymptotic properties, at the expense of the small-sample properties of βOLS. If the regressors are contemporaneously correlated with the error term, the OLS estimator is biased even asymptotically.

When there exists contemporaneous correlation between the disturbance and a regressor, alternative estimators with desirable small-sample properties cannot in general be found; as a consequence, the search for alternative estimators is conducted on the basis of their asymptotic properties. The most common estimator used in this context is the instrumental variable (IV) estimator.
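
As a hedged sketch of the IV idea (the dataset, the error-ridden regressor x, and the instrument z are all hypothetical), SAS/ETS PROC SYSLIN can compute the two-stage least squares estimator:

    /* 2SLS: instrument the regressor x, which is correlated with the error */
    PROC SYSLIN DATA=mydata 2SLS;
        ENDOGENOUS x;
        INSTRUMENTS z;
        MODEL y = x;
    RUN;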

Mar 1, 2010

The Nature of Agricultural Economics - My Thought

The strength of agricultural economics is not that it can compete with general economics research. People may feel that general economics research is more respectable, and this is true in the sense that it produces neat and polished work with the help of mathematical notation. Mathematics is important, that should be admitted; it is the logical language of this world. So general economics research has been devoted to the effort of expressing the world, while agricultural economics research should be dedicated to staying closer to the real world, to caring more about people and the entire world's basic needs.

There is a movie that has been playing for a while now, Food, Inc. (2008), an American documentary film directed by Emmy Award-winning filmmaker Robert Kenner. The film examines large-scale agricultural food production in the United States, concluding that the meat and vegetables produced by this type of economic enterprise have many hidden costs and are unhealthy and environmentally harmful. The documentary generated extensive controversy in that it was heavily criticized by large American corporations engaged in industrial food production. This is just an example, and the question is: after all the popular cost-benefit analysis and its derivative forms and combinations, who really cares about problems like this one? Much of the time, optimization, maximization, equilibrium and so forth are too perfect to be practical in application; at other times, human activity and interaction are so diverse that it is not enough, and perhaps not even necessary, to follow a cost-benefit logic, especially when you have a hard time identifying who the beneficiaries are and who the victims are.

Here is a word I want to share with everyone: we can only and will only win the world by love and responsibility, not by proving; because essentially everything can be proved while nothing cannot be proved eventually.
                                              -- Haoying Wang, 2010

The Nature of Agricultural Economics (1)

The nature, foundation, structure and future of agricultural economics have been of concern for a long time. Even though agricultural economics and its education have noticeably been experiencing a downturn since the 1990s, the field still holds the frontier of applied econometrics and environmental economics, which are areas pointing toward the future. When we look back twenty years, we can see where many of these concerns and thoughts piled up.

Agricultural Economics is Applied: Agricultural economics is by its very nature an applied discipline - a discipline that focuses on the application of economic principles taken from general economics to practical, applied problems based on keen observation of the behavior of individuals, groups and institutions within an economic setting. Some agricultural economists argue that despite its reliance on economic theory, nearly all the research being conducted by agricultural economists is applied - in that the research has as its core basis observable economic phenomena based upon human behavior. Like theoretical physics related to the origins of the universe, much of the most advanced economic research being conducted in what are regarded as the best economics graduate schools has little grounding in observable economic phenomena, and consists of abstract mathematical proofs of economic theories that are seldom verifiable based on data gathered from the real world.
                                                        --David L. Debertin, 1999.


There is decreasing diversity among economics departments with respect to what is taught among the top-ten schools - because of the inter-hiring only within the small group of schools thought to be in the peer group, there is little diversity in what is taught or in methodological approaches to research considered acceptable. As I look at the agricultural economics top-ten list, however, I see considerably greater diversity in the kinds of graduate education that would be obtained. An agricultural economics Ph.D. from Purdue would be very different from one obtained from UC-Berkeley, and no one would characterize a North Carolina State ag. econ. Ph.D. as being a clone of one produced by UW-Madison! In my view, the diversity of these graduate programs - along with the additional diversity contained in lower-ranked schools - is a source of great strength in agricultural economics, not a weakness.
                                                        --David L. Debertin, 1999.
 