Apr 28, 2010

Two Rivers, Culture and Life

Here is an excerpt from Michael Steinhardt's book 'No Bull: My Life In and Out of Markets', which is inspiring for understanding American culture and the immigrant experience.

"In looking back on my career and my life, I can see that my values, and the goals I continue to strive for, represent the confluence of two great rivers: the age-old river of Judaism, the people and the tradition, and the river of secularized (societies are no longer under the control or influence of religion) America. From the Eastern European Jewish river flows a religion, and, more importantly, a culture, while from the other river flows twentieth and twenty-first century American life with its openness, social mobility, and material prosperity. I believe my generation of Jews, in particular, is the product of these same two rivers, and the contents of both are strong within us. But, over time, the American river has grown stronger, becoming dominant in our lives, while the Eastern European river has been subsumed (to include something in a particular group and not consider it separately). For the first 50-plus years of my life, I too traveled, almost exclusively, along the secular river of American culture. Now I work, almost exclusively, on strengthening the flow of the river of my heritage."

       --'No Bull: My Life In and Out of Markets', Chapter 17, p. 263.

Apr 19, 2010

On Property

The word property is not easy to define precisely.
According to Merriam-Webster, 'property' can be interpreted as:

(a) A quality or trait belonging and especially peculiar to an individual or thing;
(b) An effect that an object has on another object or on the senses;
(c) An attribute common to all members of a class.

More simply, according to Google Dictionary:
(a) A thing or things that are owned by somebody; a possession or possessions;
(b) A quality or characteristic that something has.

Mathematically, however, we shall not hesitate to use it in the usual (informal) fashion.
If P denotes a property that is meaningful for a collection of elements, then we agree to write {x : P(x)} for the set of all elements x for which the property P holds. We usually read this as "the set of all x such that P(x)". It is often worthwhile to specify which elements we are testing for the property P. Hence we shall often write:

{x \in S : P(x)} for the subset of S for which the property P holds.
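For instance (a small example of my own, not from the definitions quoted above): if S = {1, 2, 3, 4, 5, 6} and P(x) is the property "x is even", then

    {x \in S : P(x)} = {2, 4, 6}.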

Mar 20, 2010

Some Words about Nonparametrics

If m is believed to be smooth, then the observations at Xi near x should contain information about the value of m at x. Thus it should be possible to use something like a local average of the data near x to construct an estimator of m(x).   --R. Eubank (1988, p. 7)


Parametric models are fully determined up to a parameter (vector). The fitted models can easily be interpreted and estimated accurately if the underlying assumptions are correct. If, however, they are violated then parametric estimates may be inconsistent and give a misleading picture of the regression relationship.
Nonparametric models avoid restrictive assumptions of the functional form of the regression function m. However, they may be difficult to interpret and yield inaccurate estimates if the number of regressors is large. This has been appropriately  termed The Curse of Dimensionality. Semiparametric models combine components of parametric and nonparametric models, keeping the easy interpretability of the former and retaining some  of the flexibility of the latter.


Note: Nonparametric regression estimators are very flexible, but their statistical precision decreases greatly if we include several explanatory variables in the model. This caveat has been appropriately termed the curse of dimensionality. Consequently, researchers have tried to develop models and estimators which offer more flexibility than standard parametric regression but overcome the curse of dimensionality by employing some form of dimension reduction. Such methods usually combine features of parametric and nonparametric techniques. As a consequence, they are usually referred to as semiparametric methods. Further advantages of semiparametric methods are the possible inclusion of categorical variables (which can often only be included in a parametric way), an easy (economic) interpretation of the results, and the possibility of a partial specification of a model.          --Wolfgang Hardle (2004)
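To make the "local average" idea in the Eubank quote concrete, here is a minimal Nadaraya-Watson kernel regression sketch in Python (my own illustration; the simulated data and the bandwidth are assumptions, not taken from the sources quoted above):

    import numpy as np

    def local_average(x0, x, y, h):
        # Nadaraya-Watson estimate of m(x0): a kernel-weighted average
        # of the observations y_i whose x_i lie near x0.
        w = np.exp(-0.5 * ((x - x0) / h) ** 2)   # Gaussian kernel weights
        return np.sum(w * y) / np.sum(w)

    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0, 1, 200))
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 200)   # true m(x) = sin(2*pi*x)

    grid = np.linspace(0, 1, 11)
    m_hat = [local_average(x0, x, y, h=0.05) for x0 in grid]
    print(np.round(m_hat, 2))   # rough estimate of m on a grid of x values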

Mar 19, 2010

Integrated, Unit Roots and Box-Jenkins Approach

The Box-Jenkins approach is only valid if the variable being modeled is stationary. Although there are many different ways in which data can be nonstationary, Box and Jenkins assumed that the nature of economic time series data is such that any nonstationarity can be removed by differencing. This explains why the Box-Jenkins approach deals mainly with differenced data.
A key ingredient of their methodology, an ingredient adopted by econometricians (without any justification based on economic theory), is their assumption that the nonstationarity is such that differencing will create stationarity. This concept is what is meant by the term Integrated: a variable is said to be integrated of order d, written I(d), if it must be differenced d times to be made stationary. Thus a stationary variable is integrated of order zero, written I(0); a variable which must be differenced once to become stationary is said to be I(1), integrated of order one; and so on. Economic variables are seldom integrated of order greater than two, and if nonstationary are usually I(1). Peter Kennedy gives an illustrative I(1) random walk example; a small simulation sketch in the same spirit follows.
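(The sketch below is my own Python stand-in, not Kennedy's original graph; it simulates a pure random walk and shows that differencing it once removes the nonstationarity.)

    import numpy as np

    rng = np.random.default_rng(42)
    e = rng.normal(0, 1, 500)       # white-noise innovations
    y = np.cumsum(e)                # random walk: y_t = y_{t-1} + e_t, an I(1) series
    dy = np.diff(y)                 # first difference: dy_t = e_t, an I(0) series

    # The level series wanders (its variability grows over the sample);
    # the differenced series does not.
    print("variance, first vs second half of y :", round(y[:250].var(), 1), round(y[250:].var(), 1))
    print("variance, first vs second half of dy:", round(dy[:250].var(), 2), round(dy[250:].var(), 2))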

Mar 18, 2010

Robust Estimation and Outliers

Estimators designed to be the "best" estimator for a particular estimating problem owe their attractive properties to the fact that their derivation has exploited special features of the process generating the data, features that are assumed known by the econometrician. Knowledge that the classical linear regression model assumptions hold, for example, allows derivation of the OLS estimator as one possessing several desirable properties. Unfortunately, because these best estimators have been designed to exploit these assumptions, violations of the assumptions affect them much more than they do other, sub-optimal estimators. Because researchers are not in a position of knowing with certainty that the assumptions used to justify their choice of estimator are met, it is tempting to protect oneself against violations of these assumptions by using an estimator whose properties, while not quite "best", are not sensitive to violations of those assumptions. Such estimators are referred to as Robust Estimators.
In the presence of fat-tailed error distributions, although the OLS estimator is BLUE, it is markedly inferior to some nonlinear unbiased estimators. These nonlinear estimators, namely robust estimators, are preferred to the OLS estimator whenever there may be reason to believe that the error distribution is fat-tailed.

So the implication here is that we should treat outliers more carefully than simply kicking them out of the sample for the sake of a better goodness-of-fit when running OLS. Often influential observations (outliers) are the most valuable observations in a dataset: an outlier may reflect some unusual fact that could lead to an improvement in the model's specification.
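A minimal Python sketch of the fat-tails point (my own illustration; the data-generating process, Student-t errors with 2 degrees of freedom, is an assumption): OLS and a Huber-type robust M-estimator are compared on the same sample.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 200
    x = rng.normal(0, 1, n)
    e = rng.standard_t(df=2, size=n)        # fat-tailed errors
    y = 1.0 + 2.0 * x + e                   # true intercept 1, true slope 2
    X = sm.add_constant(x)

    ols = sm.OLS(y, X).fit()                                  # BLUE, but sensitive to fat tails
    huber = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()    # robust M-estimator

    print("OLS   slope:", round(ols.params[1], 3))
    print("Huber slope:", round(huber.params[1], 3))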

Mar 17, 2010

Cointegration

Recall that the levels variables in the ECM entered the estimating equation in a special way: they entered combined into a single entity that captured the extent to which the system is out of equilibrium. It could be that even though these levels variables are individually I(1) (a variable which must be differenced once to become stationary), this special combination of them is I(0). If this is the case, their entry into the estimating equation will not create spurious (false, although seeming to be genuine) results.

This possibility does not seem unreasonable. A nonstationary variable will tend to wander extensively (that is what makes it nonstationary), but some pairs of nonstationary variables can be expected to wander in such a way that they do not drift too far apart, thanks to disequilibrium forces that tend to keep them together. Some examples are short- and long-term interest rates, prices and wages, household income and expenditures, imports and exports, spot and futures prices of a commodity, and exchange rates determined in different markets. Such variables are said to be Cointegrated: although individually they are I(1), a particular linear combination of them is I(0). The cointegrating combination is interpreted as an equilibrium relationship, since it can be shown that variables in the error-correction term in an ECM must be cointegrated, and vice versa, that cointegrated variables must have an ECM representation. This is why economists have shown such interest in the concept of cointegration - it provides a formal framework for testing for, and estimating, long-run (equilibrium) relationships among economic variables.

One important implication of all this is that differencing is not the only means of eliminating unit roots. Consequently, if the data are found to have unit roots, before differencing (and thereby losing all the long-run information in the data) a researcher should test for cointegration; if a cointegrating relationship can be found, this should be exploited by undertaking estimation in an ECM framework.
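A small Python sketch (my own; the simulated series and the choice of the Engle-Granger test are assumptions) of two I(1) series that are cointegrated:

    import numpy as np
    from statsmodels.tsa.stattools import adfuller, coint

    rng = np.random.default_rng(7)
    n = 500
    x = np.cumsum(rng.normal(0, 1, n))          # x is a random walk, so I(1)
    y = 2.0 + 1.5 * x + rng.normal(0, 1, n)     # y shares x's stochastic trend

    # Each series has a unit root in levels, but the combination y - 1.5*x is stationary.
    print("ADF p-value for y in levels      :", round(adfuller(y)[1], 3))   # large -> unit root
    print("Engle-Granger cointegration p-val:", round(coint(y, x)[1], 3))   # small -> cointegrated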

Error-Correction Model, Differenced and Levels Variable

An error-correction model, or ECM for short, is a very popular benchmark model in time series econometrics. An error-correction model is a dynamic model in which "the movement of the variables in any period is related to the previous period's gap from long-run equilibrium." As a simple example of this, consider the relationship:
where y and x are measured in logarithms, with economic theory suggesting that in the long run y and x will grow at the same rate, so that in equilibrium (y-x) will be a constant, save for the error. This relationship can be manipulated to produce:
This is the ECM representation of the original specification; the last term is the error-correction term, interpreted as reflecting disequilibrium responses. The terminology can be explained as follows: if in error y grows too quickly, the last term becomes bigger, and since its coefficient is negative (β3 < 1 for stationarity), Δyt is reduced, correcting this error. In actual applications, more explanatory variables will appear, with many more lags. Notice that this ECM equation turns out to be in terms of Differenced Variables, with the error-correction component measured in terms of Levels Variables.
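(The two equations referred to above appeared as images in the original post and are missing here. A standard textbook form consistent with the surrounding description - my reconstruction, not the original - is:)

    y_t = \beta_0 + \beta_1 x_t + \beta_2 x_{t-1} + \beta_3 y_{t-1} + \varepsilon_t

which can be rearranged into the ECM form

    \Delta y_t = \beta_0 + \beta_1 \Delta x_t - (1 - \beta_3)( y_{t-1} - \theta x_{t-1} ) + \varepsilon_t,   where   \theta = (\beta_1 + \beta_2)/(1 - \beta_3).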

Mar 16, 2010

Dummy Variable, Fixed and Random Effects

Dummy variables are sometimes used in the context of panel, or longitudinal, data - observations on a cross-section of individuals or firms, say, over time. In this context it is often assumed that the intercept varies across the N cross-sectional units and/or across the T time periods. In the general case (N-1)+(T-1) dummies can be used for this, with computational short-cuts available to avoid having to run a regression with all these extra variables. This way of analyzing panel data is called the Fixed Effects Model. The dummy variable coefficients reflect ignorance - they are inserted merely for the purpose of measuring shifts in the regression line arising from unknown variables. Some researchers feel that this type of ignorance should be treated in a fashion similar to the general ignorance represented by the error term, and have accordingly proposed the Random Effects, Variance Components, or Error Components model.
Which of the fixed effects and the random effects models is better? This depends on the context of the data and for what the results are to be used. If the data exhaust the population (say observations on all firms producing automobiles), then the fixed effects approach, which produces results conditional on the units in the dataset, is reasonable. If the data are a drawing of observations from a large population (say a thousand individuals in a city many times that size), and we wish to draw inferences regarding other members of that population, the fixed effects model is no longer reasonable; in this context, use of the random effects model has the advantage that it saves a lot of degrees of freedom.

The random effects model has a major drawback, however: it assumes that the random error associated with each cross-section unit is uncorrelated with the other regressors, something that is not likely to be the case. Suppose, for example, that wages are being regressed on schooling for a large set of individuals, and that a missing variable, ability, is thought to affect the intercept; since schooling and ability are likely to be correlated, modeling this as a random effect will create correlation between the error and the regressor schooling (whereas modeling it as a fixed effect will not). The result is bias in the coefficient estimates from the random effects model. This may explain why the slope estimates from the fixed and random effects models are often so different.
A Hausman test for correlation between the error and the regressors can be used to check whether the random effects model is appropriate. Under the null hypothesis of no correlation between the error and the regressors, the random effects model is applicable and its (feasible) GLS estimator is consistent and efficient. The fixed effects estimator is consistent under both the null and the alternative.
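A minimal sketch (mine, in Python) of how the Hausman statistic is computed from the two sets of estimates; it assumes you already have fitted fixed-effects and random-effects coefficient vectors and their covariance matrices for the time-varying regressors, and the numbers below are purely hypothetical:

    import numpy as np
    from scipy import stats

    def hausman(b_fe, b_re, V_fe, V_re):
        # H = (b_FE - b_RE)' [V_FE - V_RE]^{-1} (b_FE - b_RE),
        # asymptotically chi-square(k) under the null that the random
        # effects are uncorrelated with the regressors.
        d = b_fe - b_re
        H = float(d @ np.linalg.inv(V_fe - V_re) @ d)
        p = 1 - stats.chi2.cdf(H, df=len(d))
        return H, p

    b_fe = np.array([0.80, -0.30])                      # hypothetical fixed-effects estimates
    b_re = np.array([0.95, -0.25])                      # hypothetical random-effects estimates
    V_fe = np.array([[0.010, 0.001], [0.001, 0.020]])
    V_re = np.array([[0.006, 0.000], [0.000, 0.015]])
    print(hausman(b_fe, b_re, V_fe, V_re))              # large H / small p -> prefer fixed effects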

Qualitative vs Quantitative Variables

These two terms sound pretty similar, and I was confused about them for a little while; here is a clarification.

Variables can be quantitative or qualitative. (Qualitative variables are sometimes called "categorical variables.") Quantitative variables are measured on an ordinal, interval, or ratio scale; qualitative variables are measured on a nominal scale. If five-year-old subjects were asked to name their favorite color, then the variable would be qualitative. If the time it took them to respond were measured, then the variable would be quantitative. In brief, a qualitative variable is not measurable with numerical instruments but with adjectives that do not imply ranking or scales (e.g., gender, color, taste, etc.), while a quantitative variable is measurable with numerical instruments and can be ordered in a quantifiable ranking (e.g., the height of a person). Through the following table (not reproduced here) you will be able to differentiate them very easily:

Mar 15, 2010

The Bayesian Approach

The essence of the debate between the Frequentists (a statistical approach for assessing the likelihood that a hypothesis is correct by assessing the strength of the data that support the hypothesis and the number of hypotheses that are tested) and the Bayesians rests on the acceptability of the subjectivist notion of probability. Once one is willing to view probability in this way, the advantages of the Bayesian approach are compelling. But most practitioners, even though they have no strong aversion to the subjectivist notion of probability, do not choose to adopt the Bayesian Approach. The reasons are practical in nature.
1. Formalizing prior beliefs into a prior distribution is not an easy task;
2. The mechanics of finding the posterior distribution are formidable (feel fear and/or respect for something, because they are impressive or powerful, or because they seem very difficult).
3. Convincing others of the validity of Bayesian results is difficult because they view those results as being "contaminated" by personal beliefs.


Following the subjectivist notion of probability, it is easy to imagine that, before looking at the data, the researcher could have a "prior" density function for β, reflecting the odds that he or she would give if asked to take bets on the true value of β. This prior distribution, when combined with the data via Bayes' Theorem, produces the posterior distribution referred to above. This posterior density function is in essence a Weighted Average of the prior density and the likelihood (or "conditional density", conditional on the data).
Generally, the Bayesian Approach consists of three steps:
1. A prior distribution is formalized, reflecting the researcher's beliefs about the parameters in question before looking at the data.
2. This prior is combined with the data, via Bayes' theorem, to produce the posterior distribution, the main output of a Bayesian analysis.
3. This posterior is combined with a loss or utility function to allow a decision to be made on the basis of minimizing expected loss or maximizing expected utility; this third step is optional.
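A minimal numerical sketch of steps 1 and 2 for the simplest conjugate case, a normal prior for a mean combined with normal data with known variance (my own Python illustration; all numbers are assumptions). It shows the posterior mean as a precision-weighted average of the prior mean and the sample mean:

    import numpy as np

    # Step 1: prior beliefs about beta ~ N(prior_mean, prior_var)
    prior_mean, prior_var = 1.0, 0.5 ** 2

    # The data: n observations with known error variance sigma2
    rng = np.random.default_rng(3)
    sigma2, n = 1.0, 25
    data = rng.normal(2.0, np.sqrt(sigma2), n)    # true beta = 2.0
    ybar = data.mean()

    # Step 2: Bayes' theorem (conjugate normal-normal update)
    post_prec = 1 / prior_var + n / sigma2
    post_mean = (prior_mean / prior_var + n * ybar / sigma2) / post_prec
    post_sd = np.sqrt(1 / post_prec)

    print("posterior mean:", round(post_mean, 3), " posterior sd:", round(post_sd, 3))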

The Bayesian approach claims several advantages over the classical approach:
1. The Bayesian approach is concerned with how information in data modifies a researcher's beliefs about parameter values and allows computation of probabilities associated with alternative hypotheses or models; this corresponds directly to the approach to these problems taken by most researchers.
2. Extraneous information is routinely incorporated in a consistent fashion in the Bayesian method through the formulation of the prior; in the classical approach such information is more likely to be ignored, and when incorporated is usually done so in ad hoc (arranged or happening when necessary and not planned in advance) ways.
3. The Bayesian approach can tailor the estimate to the purpose of the study, through selection of the loss function; in general, its compatibility with decision analysis is a decided advantage.
4. There is no need to justify the estimating procedure in terms of the awkward concept of the performance of the estimator in hypothetical (based on situations or ideas which are possible and imagined rather than real and true) repeated samples; the Bayesian approach is justified solely on the basis of the prior and the sample data.

Mar 10, 2010

Condition Index and Multicollinearity

A less common, but more satisfactory, way of detecting multicollinearity is through the condition index (or condition number) of the data: the square root of the ratio of the largest to the smallest characteristic root (eigenvalue) of X'X. A high condition index reflects the presence of collinearity.
When there is no collinearity at all, the eigenvalues, condition indices and condition number will all equal one. As collinearity increases, the eigenvalues become both greater and smaller than 1, and the condition indices and the condition number increase. An informal rule of thumb is that if the condition number exceeds 15, multicollinearity is a concern; if it is greater than 30, multicollinearity is a very serious concern. (But again, these are just informal rules of thumb.) In SPSS, you get these values by adding the COLLIN parameter to the Regression command; in Stata you can use collin. In SAS, you can use the COLLIN option in the MODEL statement of PROC REG.
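A small Python sketch of the computation (my own illustration; the near-collinear data are simulated, and the columns are scaled to unit length, which is the usual convention in these diagnostics):

    import numpy as np

    rng = np.random.default_rng(5)
    n = 100
    x1 = rng.normal(0, 1, n)
    x2 = x1 + rng.normal(0, 0.05, n)          # nearly collinear with x1
    X = np.column_stack([np.ones(n), x1, x2])

    Xs = X / np.linalg.norm(X, axis=0)        # scale each column to unit length
    eigvals = np.linalg.eigvalsh(Xs.T @ Xs)   # eigenvalues (characteristic roots) of X'X

    cond_indices = np.sqrt(eigvals.max() / eigvals)
    print("condition indices:", np.round(np.sort(cond_indices), 1))
    print("condition number :", round(float(np.sqrt(eigvals.max() / eigvals.min())), 1))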

Here are two more rules of thumb in dealing with multicollinearity:
Don't worry about multicollinearity if the R^2 from the regression exceeds the R^2 of any independent variable regressed on the other independent variables.
Don't worry about multicollinearity if the t statistics are all greater than 2.

Mar 9, 2010

Consistency and Convergence

A consistent sequence of estimators is a sequence of estimators that converge in probability to the quantity being estimated as the index (usually the sample size) grows without bound. In other words, increasing the sample size increases the probability of the estimator being close to the population parameter. Mathematically, a sequence of estimators \{t_n; n \ge 0\} is a consistent estimator for parameter θ if and only if, for all ε > 0, no matter how small, we have

 
    \lim_{n\to\infty} \Pr\left\{ \left| t_n - \theta \right| < \epsilon \right\} = 1.

The consistency defined above may be called Weak Consistency. The sequence is Strongly Consistent if it Converges Almost Surely to the true value. To say that the sequence Xn converges almost surely or almost everywhere or with probability 1 or strongly towards X means that


    \Pr\left( \lim_{n\to\infty} X_n = X \right) = 1.
This means that the values of Xn approach the value of X, in the sense (see almost surely) that events for which Xn does not converge to X have probability 0. Using the probability space (\Omega, \mathcal{F}, P) and the concept of the random variable as a function from Ω to R, this is equivalent to the statement

    \Pr\Big( \omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = X(\omega) \Big) = 1.
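A small simulation sketch (mine, in Python) of weak consistency for the sample mean: the probability that it lies within epsilon of the true value rises toward one as n grows.

    import numpy as np

    rng = np.random.default_rng(11)
    theta, eps, reps = 5.0, 0.1, 500

    for n in (10, 100, 1000, 10000):
        xbar = rng.normal(theta, 2.0, size=(reps, n)).mean(axis=1)   # t_n = sample mean, reps times
        prob_close = np.mean(np.abs(xbar - theta) < eps)             # estimate of Pr(|t_n - theta| < eps)
        print("n =", n, "  Pr(|t_n - theta| < 0.1) is about", round(float(prob_close), 3))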

Mar 8, 2010

Indirect Least Squares Method (ILS)

Suppose we wish to estimate a structural equation containing, say, three endogenous variables. The first step of the ILS technique is to estimate the reduced-form equations for these three endogenous variables. If the structural equation in question is just identified, there will be only one way of calculating the desired estimates of the structural equation parameters from the reduced-form parameter estimates. The structural parameters are expressed in terms of the reduced-form parameters, and the OLS estimates of the reduced-form parameters are plugged into these expressions to produce estimates of the structural parameters. Because these expressions are nonlinear, however, unbiased estimates of the reduced-form parameters produce only consistent estimates of the structural parameters, not unbiased estimates.

If an equation is over-identified, the extra identifying restrictions provide additional ways of calculating the structural parameters from the reduced-form parameters, all of which are supposed to lead to the same values of the structural parameters. But because the estimates of the reduced-form parameters do not embody these extra restrictions, these different ways of calculating the structural parameters create different estimates of these parameters. (This is because unrestricted estimates, rather than actual values of the parameters, are being used for these calculations.) Because there is no way of determining which of these different estimates is the most appropriate, ILS is not used for over-identified equations. The other simultaneous equation estimating techniques have been designed to estimate structural parameters in the over-identified case; many of these can be shown to be equivalent to ILS in the context of a just-identified equation, and to be weighted averages of the different estimates produced by ILS in the context of over-identified equations.

Here is a basic procedure to implement ILS:
1. Rearrange the structural-form equations into their reduced form;
2. Estimate the reduced form parameters;
3. Solve for the structural form parameters in terms of the reduced form parameters, and substitute in the estimates of the reduced form parameters to get estimates for the structural ones.
Note: If the structural equation is exactly identified, there will be a unique way to calculate the parameters. Estimates of the reduced-form parameters are unbiased, but estimates of the structural parameters will not be. Both are consistent.
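A small Python sketch of the procedure for a just-identified case (my own illustration; the two-equation supply-and-demand system, the parameter values, and the exogenous income variable are all assumptions):

    import numpy as np

    # Structural system (q and p endogenous, income y exogenous):
    #   demand: q = a0 + a1*p + a2*y + u_d
    #   supply: q = b0 + b1*p + u_s        (y excluded -> supply is just identified)
    rng = np.random.default_rng(8)
    n = 5000
    a0, a1, a2 = 10.0, -1.0, 0.5
    b0, b1 = 2.0, 1.0

    y_inc = rng.normal(20, 5, n)
    u_d, u_s = rng.normal(0, 1, n), rng.normal(0, 1, n)

    # Solve the structural equations for the endogenous variables
    p = (a0 - b0 + a2 * y_inc + u_d - u_s) / (b1 - a1)
    q = b0 + b1 * p + u_s

    # Steps 1-2: OLS on the reduced forms  p = pi10 + pi11*y  and  q = pi20 + pi21*y
    Z = np.column_stack([np.ones(n), y_inc])
    pi1 = np.linalg.lstsq(Z, p, rcond=None)[0]
    pi2 = np.linalg.lstsq(Z, q, rcond=None)[0]

    # Step 3: solve back for the supply parameters: b1 = pi21/pi11, b0 = pi20 - b1*pi10
    b1_ils = pi2[1] / pi1[1]
    b0_ils = pi2[0] - b1_ils * pi1[0]
    print("ILS estimates of the supply equation:", round(b0_ils, 2), round(b1_ils, 2))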

Mar 7, 2010

Order & Rank Conditions of Identification

The identification problem is a mathematical (as opposed to statistical) problem associated with simultaneous equation systems. It is concerned with the question of the possibility or impossibility of obtaining meaningful estimates of the structural parameters. The identification problem can be solved if economic theory and extraneous information can be used to place restrictions on the set of simultaneous equations. These restrictions can take a variety of forms (such as use of extraneous estimates of parameters, knowledge of exact relationships among parameters, knowledge of the relative variances of disturbances, knowledge of zero correlation between disturbances in different equations, etc.), but the restrictions usually employed, called Zero Restrictions, take the form of specifying that certain structural parameters are zero, i.e., that certain endogenous variables and certain exogenous variables do not appear in certain equations. Mathematical investigation has shown that in the case of Zero Restrictions on structural parameters each equation can be checked for identification by using a rule called the Rank Condition. It turns out, however, that this rule is quite awkward to employ, and as a result a simpler rule, called the Order Condition, is used in its stead. This rule only requires counting included and excluded variables in each equation.
Here is a brief illustration of order and rank conditions of identification in simultaneous equation system:





M = number of endogenous variables in the model
K = number of exogenous variables in the model
m = number of endogenous variables in a given equation
k = number of exogenous variables in a given equation
The rank condition is checked via the rank of a matrix A, which must equal M-1 for the equation to be identified. This matrix is formed from the coefficients of the variables (both endogenous and exogenous) excluded from that particular equation but included in the other equations in the model.
The rank condition tells us whether the equation under consideration is identified or not, whereas the order condition tells us if it is exactly identified or overidentified.
1. If K-k > m-1 and the rank ρ(A) of the matrix A is M-1, then the equation is overidentified.
2. If K-k = m-1 and the rank ρ(A) is M-1, then the equation is exactly identified.
3. If K-k >= m-1 and the rank ρ(A) is less than M-1, then the equation is underidentified.
4. If K-k < m-1, the structural equation is unidentified; the rank ρ(A) is necessarily less than M-1 in this case.

From these rules we can tell that the order condition is only a necessary condition, not a sufficient one. So, technically speaking, the rank condition must also be checked. Many econometricians do not bother doing this, however, gambling that the rank condition will be satisfied (as it usually is) if the order condition is satisfied. Such a procedure is not recommended.
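A tiny helper function (my own sketch, in Python) that applies only the order-condition counting rule; the rank condition on the matrix A described above would still have to be checked separately:

    def order_condition(K, k, m):
        # K: exogenous variables in the model, k: exogenous variables in the equation,
        # m: endogenous variables in the equation.
        excluded, needed = K - k, m - 1
        if excluded > needed:
            return "overidentified (if the rank condition also holds)"
        if excluded == needed:
            return "exactly identified (if the rank condition also holds)"
        return "unidentified"

    print(order_condition(K=3, k=1, m=2))   # excludes 2 exogenous variables, needs 1 -> overidentified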

Mar 4, 2010

Direct PC SAS Output to a File

When running SAS programs interactively through the display manager, the output from any procedure is written to the Output window and notes, warnings and errors are written to the Log Window. Contents of these windows are temporary. They can be saved to a file using the File Save pulldown menus from the Output Window and from the Log Window. But if you want to make sure that the output of these windows is saved to a file every time, you can use Proc Printto to automatically route output to a file.

For example, the following statements route subsequent procedure output directly to a file named auto.lst: what would have gone to the Output Window is redirected to the file c:\auto.lst. The NEW option tells SAS to create a new file, overwriting the file if it already exists; if NEW were omitted, SAS would append to an existing file.

    PROC PRINTTO PRINT='c:\auto.lst' NEW;
    RUN;

Note: (1) Sometimes a SAS program can crash (terminate unexpectedly) before it executes all of the statements properly, and then you lose all of the results you had produced so far. (In this situation you may have to end SAS through the Windows Task Manager, because the SAS session generally stops responding.) By using Proc Printto, you can save all of the temporary results already produced before the program is unexpectedly terminated.

(2) Generally you need to put the Proc Printto statement at the very beginning of the SAS code. You can also release the print output file by using another simple statement at the very end of the SAS code:
    PROC PRINTTO;
    RUN;

(3) To route the log, you can use similar SAS code:

    PROC PRINTTO LOG='c:\auto.log' NEW;
    RUN;

Mar 3, 2010

Observations and Thoughts on Haiti and Chile

Here are some observations from a blogger:
"The recent earthquakes in Haiti and Chile present an interesting contrast between the deleterious effects of a major earthquake in one of the richest countries in the western hemisphere and in the poorest.  It may surprise you that Chile is (by relative standards) quite an advanced and relatively wealthy country as many Americans, I think, have a tendency to view all of Latin America as a poor region.  According to the CIA, the per-capita GDP in Chile in 2009 was $14,700 while Haiti was $1,300 - so while Chile is far from US or Western European standards of living, it is a much wealthier country than Haiti.  In both cases the earthquake (and subsequent tsunami in Chile) were devastating disasters, but the scope of the tragedy in Haiti was, it appears, much, much worse."

These observations prompt two serious thoughts for me:
1. Beyond physical needs, people's level of immaterial demand can also be determined by income or wealth; and much of the time, safety is not among the basic levels of human needs.
2. The opportunity cost for the poor is lower than for the rich when they face the same danger and the same potential loss. Who can bear more risk and insecurity, the poor or the rich? This is a two-way argument.

So the practical question is: can we validate these observations with statistical or econometric methods?

A good Illustration of Weighted Regression by Peter Kennedy

Measurement Error

In parametric regression, the assumption of fixed regressors is made mainly for mathematical convenience: if the regressors can be considered fixed in repeated samples, the desirable properties of the OLS estimator can be derived quite straightforwardly. The essence of this assumption is that, if the regressors are nonstochastic, they are distributed independently of the disturbances. If this assumption is weakened to allow the explanatory variables to be stochastic but distributed independently of the error term, all the desirable properties of the OLS estimator are maintained; their algebraic derivation is more complicated, however, and their interpretation in some instances must be changed (for example, in this circumstance, βOLS is not, strictly speaking, a linear estimator).

If the regressors are only contemporaneously uncorrelated with the disturbance vector, the OLS estimator is biased but retains its desirable asymptotic properties; the small-sample properties of βOLS are lost. If the regressors are contemporaneously correlated with the error term, the OLS estimator is biased even asymptotically.

When there exists contemporaneous correlation between the disturbance and a regressor, alternative estimators with desirable small-sample properties cannot in general be found; as a consequence, the search for alternative estimators is conducted on the basis of their asymptotic properties. The most common estimator used in this context is the instrumental variable (IV) estimator.
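A small simulation sketch of the last point (my own, in Python): with a regressor measured with error, hence contemporaneously correlated with the disturbance, OLS stays biased even in a large sample, while a simple instrumental variable estimator is consistent. The instrument z is assumed to be correlated with the true regressor but independent of both errors.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100_000
    x_true = rng.normal(0, 1, n)
    z = x_true + rng.normal(0, 1, n)        # instrument: related to x_true, unrelated to the errors
    x_obs = x_true + rng.normal(0, 1, n)    # regressor observed with measurement error
    y = 1.0 + 2.0 * x_true + rng.normal(0, 1, n)

    X = np.column_stack([np.ones(n), x_obs])
    Z = np.column_stack([np.ones(n), z])

    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)       # simple IV estimator: (Z'X)^{-1} Z'y

    print("OLS slope (attenuated):", round(b_ols[1], 3))   # roughly 1.0 here, not the true 2.0
    print("IV  slope             :", round(b_iv[1], 3))    # close to 2.0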

Mar 1, 2010

The Nature of Agricultural Economics - My Thought

The strength of agricultural economics is not that it can compete with general economics research. People may feel that general economics research is more elegant, and this is true in the sense that it produces neat and tidy work with the help of mathematical notation. Mathematics is important, admittedly; it is the logical language of this world. So general economics research has been employed in the effort of expressing the world, while agricultural economics research should be dedicated to staying closer to the real world, to caring more about people and the entire world's basic needs.

There is a movie that has been on screens for a while, Food, Inc. (2008), an American documentary film directed by Emmy Award-winning filmmaker Robert Kenner. The film examines large-scale agricultural food production in the United States, concluding that the meat and vegetables produced by this type of economic enterprise have many hidden costs and are unhealthy and environmentally harmful. The documentary generated extensive controversy in that it was heavily criticized by large American corporations engaged in industrial food production. This is just an example, and the question is: after popular cost-benefit analysis and its derivative forms and combinations, who really cares about problems like the ones above? Much of the time, optimization, maximization, equilibrium and so forth are too perfect to be practical in applications; at other times, human activity and interaction are so diverse that it is not sufficient, or even necessary, to follow a cost-benefit logic, especially when you have a hard time identifying who the beneficiaries are and who the victims are.

Here is a word I want to share with everyone: we can only and will only win the world by love and responsibility, not by proving; because essentially everything can be proved while nothing cannot be proved eventually.
                                              -- Haoying Wang, 2010

The Nature of Agricultural Economics (1)

The nature, foundation, structure and future of agricultural economics have been of concern for a long time. Even though agricultural economics and its education have noticeably been experiencing a downturn since the 1990s, the field still holds the frontier of applied econometrics and environmental economics, which are areas heading into the future. Looking back twenty years shows where many of these concerns and thoughts accumulated.

Agricultural Economics is Applied: Agricultural economics is by its very nature an applied discipline - a discipline that focuses on the application of economic principles taken from general economics to practical, applied problems based on keen observation of the behavior of individuals, groups and institutions within an economic setting. Some agricultural economists argue that despite its reliance on economic theory, nearly all the research being conducted by agricultural economists is applied - in that the research has as its core basis observable economic phenomena based upon human behavior. Like theoretical physics related to the origins of the universe, much of the most advanced economic research being conducted in what are regarded as the best economics graduate schools has little grounding in observable economic phenomena, and consists of abstract mathematical proofs of economic theories that are seldom verifiable based on data gathered from the real world.
                                                        --David L. Debertin, 1999.


There is decreasing diversity among economics departments with respect to what is taught among the top-ten schools - because of inter-hiring only within the small group of schools thought to be in the peer group, there is little diversity in what is taught or in the methodological approaches to research considered acceptable. As I look at the agricultural economics top-ten list, however, I see considerably greater diversity in the kinds of graduate education that would be obtained. An agricultural economics Ph.D. from Purdue would be very different from one obtained from UC-Berkeley, and no one would characterize a North Carolina State ag. econ. Ph.D. as being a clone of one produced by UW-Madison! In my view, the diversity of these graduate programs - along with the additional diversity contained in lower-ranked schools - is a source of great strength in agricultural economics, not a weakness.
                                                        --David L. Debertin, 1999.

Feb 28, 2010

Why Generalized Least Square Estimator?

It is known that heteroskedasticity affects the properties of the OLS estimator (it is still unbiased, but less efficient, i.e., it has a larger variance). When you draw a scatter plot of the raw data, higher absolute values of the residuals to the right of the graph indicate a positive relationship between the error variance and the independent variable. With this kind of error pattern, a few additional large positive errors near the right of the graph would tilt (make something move into a position with one side or end higher than the other) the OLS regression line considerably. A few additional large negative errors would tilt it considerably in the opposite direction. In repeated sampling these unusual cases would average out, leaving the OLS estimator unbiased, but the variation of the OLS regression line around its mean will be greater - i.e., the variance of βOLS will be greater. The Generalized Least Squares (GLS) technique pays less attention to the residuals associated with high-variance observations (by assigning them a low weight in the weighted sum of squared residuals it minimizes), since these observations give a less precise indication of where the true regression line lies. This avoids these large tilts, making the variance of βGLS smaller than that of βOLS.
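A minimal Python sketch of this idea (my own; the error pattern, with standard deviation proportional to x, is an assumption). Weighted least squares with weights proportional to 1/variance is the GLS estimator for this known form of heteroskedasticity:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 500
    x = rng.uniform(1, 10, n)
    sigma = 0.5 * x                              # error standard deviation rises with x
    y = 1.0 + 2.0 * x + rng.normal(0, sigma)
    X = sm.add_constant(x)

    ols = sm.OLS(y, X).fit()
    wls = sm.WLS(y, X, weights=1.0 / sigma**2).fit()   # downweights the high-variance observations

    print("OLS slope:", round(ols.params[1], 3), " se:", round(ols.bse[1], 4))
    print("WLS slope:", round(wls.params[1], 3), " se:", round(wls.bse[1], 4))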

When the Durbin-Watson test indicates autocorrelated errors, it is typically concluded that estimation via feasible GLS is called for. This is not always appropriate, however: the significant value of the Durbin-Watson statistic could result from an omitted explanatory variable, an incorrect functional form, or a dynamic misspecification. Only if a researcher is satisfied that none of these phenomena are responsible for the significant Durbin-Watson statistic value should estimation via feasible GLS proceed.

Feb 27, 2010

Two Nonparametrics

In the world of econometrics, the term nonparametric basically refers to the flexible functional form of the regression curve. However, there are other notions of "nonparametric statistics" which refer mostly to distribution-free methods. In the econometric context, generally, neither the error distribution nor the functional form of the mean function is prespecified.
The question of whether a parametric or a nonparametric approach should be taken in data analysis was a key issue in a bitter fight between Pearson and Fisher in the twenties. Fisher pointed out that the nonparametric approach gave generally poor efficiency, whereas Pearson was more concerned about the specification question. Both viewpoints are interesting in their own right. Pearson pointed out that the price we have to pay for pure parametric fitting is the possibility of gross misspecification resulting in too high a model bias. On the other hand, Fisher was concerned that too exclusive a reliance on parameter-free models may result in more variable estimates, especially for small sample size n.

Orthogonality in Econometrics

In mathematics, two vectors are orthogonal if they are perpendicular, i.e., they form a right angle.

In linear algebra, an orthogonal matrix is a square matrix with real entries whose columns (or rows) are orthogonal unit vectors (i.e., orthonormal). Because the columns are unit vectors in addition to being orthogonal, some people use the term orthonormal to describe such matrices.
Equivalently, a matrix Q is orthogonal if its transpose is equal to its inverse:

Q^T Q = Q Q^T = I,   or equivalently,   Q^T = Q^{-1}.

The concept of orthogonality tends to be very important in econometrics, since we have been building almost all of the methods and rules based on the matrix platform. For example, if it happens that a relevant independent variable is omitted, in general, the OLS estimator of the coefficients of the remaining variables is biased. If the omitted variable is orthogonal to the included variables, the slope coefficient estimator will be unbiased; the intercept estimator will retain its bias unless the mean of the observations on the omitted variable is zero.
In the case of inclusion of an irrelevant variable, unless the irrelevant variable is orthogonal to the other independent variables, the variance-covariance matrix of βOLS becomes larger; the OLS estimator is not as efficient. Thus in this case the MSE of the estimator is unequivocally raised.
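A small Python check of the omitted-variable point (my own sketch; the data-generating process is assumed): when the omitted regressor is orthogonal to the included one, the included slope estimate is essentially unaffected, but when they are correlated it is biased.

    import numpy as np

    rng = np.random.default_rng(6)
    n = 200_000
    x1 = rng.normal(0, 1, n)
    x2_orth = rng.normal(0, 1, n)                # orthogonal to x1
    x2_corr = 0.8 * x1 + rng.normal(0, 0.6, n)   # correlated with x1
    e = rng.normal(0, 1, n)

    def short_regression_slope(x_incl, x_omit):
        # Slope on x_incl when y = 1 + 2*x_incl + 1*x_omit + e but x_omit is omitted.
        y = 1.0 + 2.0 * x_incl + 1.0 * x_omit + e
        X = np.column_stack([np.ones(n), x_incl])
        return np.linalg.lstsq(X, y, rcond=None)[0][1]

    print("omitted variable orthogonal :", round(short_regression_slope(x1, x2_orth), 3))   # about 2.0
    print("omitted variable correlated :", round(short_regression_slope(x1, x2_corr), 3))   # about 2.8, biased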

Feb 26, 2010

Borrow 500 Years of Life from the Heaven

Lyrics:     Junyi Zhang, Xiaobin Fan
Composition: Ke Fu
Translation: Haoying Wang

♣ Along the gentle waviness of rising and subsiding territory


♣ Galloping on the beloved land, beloved plateau and Yangtze South


♣ In the face of ice blade and sword, accompanied by attaching wind and rain


♣ Being cherished of my golden life from heaven


♣ And full of fraternity all along


♣ Being afraid of nothing


♣ And full of lofty sentiments all along


♣ Life is always of half pain and half enjoyment


♣ But with distinct cut between good and evil


♣ All come true in the dream for future


♣ Clanking iron heel, Never stops on the vast beloved land


♣ Standing on the top of surge, and holding


♣ The movement of universe


♣ Praying for the world of mortals


♣ Full of peace and bliss


♣ And another 500 Years from the Heaven for me


♣ Another 500 Years from the Heaven for me

Feb 25, 2010

Specification Problems and Empirical Study

Peter Kennedy wrote: Econometric textbooks are mainly devoted to the exposition of econometrics for estimation and inference in the context of a given model for the data-generating process. The more important problem of specification of this model is not given much attention, for three main reasons: (1) specification is not easy; (2) most econometricians would agree that specification is an innovative/imaginative process that cannot be taught; (3) there is no accepted "best" way of going about finding a correct specification. (Of course, this is why we can always contribute something here; it is too hard to find a best and perfect way of specification.)

So the issue becomes how much trust we should place in econometrics; different people express it in different ways:
All models are wrong, but some are useful. - George Box
Models are to be used, but not to be believed. -Theil, H.


Here is what Edward E. Leamer contributed into the discussion:
When an inference is suspected to depend crucially on a doubtful assumption, two kinds of actions can be taken to alleviate the consequent doubt about the inferences. Both require a list of alternative assumptions. The first approach is statistical estimation, which uses the data to select from the list of alternative assumptions and then makes suitable adjustments to the inferences to allow for doubt about the assumptions. The second approach is a sensitivity analysis that uses the alternative assumptions one at a time, thereby demonstrating either that all the alternatives lead to essentially the same inferences or that minor changes in the assumptions make major changes in the inferences. For example, a doubtful variable can simply be included in the equation (estimation), or two different equations can be estimated, one with and one without the doubtful variable (sensitivity analysis).
Simplification is a third approach. The intent of simplification is to find a simple model that works well for a class of decisions. A specification search can be used for simplification, as well as for estimation and sensitivity analysis. The very prevalent confusion among these three kinds of searches ought to be eliminated, since the rules for a search and the measures of its success will properly depend on its intent.

Again, Peter Kennedy gave the following summary:
♣ Models whose residuals test as significantly different from white noise (random errors) should initially be viewed as containing a misspecification, not as needing a special estimation procedure.
♣ "Testing down" is more suitable than "Testing up"; one should begin with a general, unrestricted model and then systematically simplify it in light of the sample evidence.
♣ Tests of misspecification are better undertaken by testing simultaneously for several misspecifications rather than testing for them one by one.

Likelihood Ratio, Wald, Lagrange Multiplier Tests

The F test is applicable whenever we are testing linear restrictions in the classical normal linear regression model. However, if (1) the restrictions are nonlinear, (2) the model is nonlinear in the parameters, or (3) the errors are distributed non-normally, then we need other, asymptotically equivalent tests.

Suppose the restriction being tested is written as g(β) = 0, satisfied at the value βMLE-R where the function g(β) cuts the horizontal axis (please refer to the graph at the bottom). Then we have three asymptotically equivalent tests available for the job, all of them distributed asymptotically as chi-square with degrees of freedom equal to the number of restrictions being tested.

(1) The Likelihood Ratio Test: if the restriction is true, then ln(LR), the maximized value of ln(L) imposing the restriction, should not be significantly less than ln(Lmax), the unrestricted maximum value of ln(L). The Likelihood Ratio test tests whether [ln(LR)-ln(Lmax)] is significantly different from zero.

(2) Wald Test: if the restriction g(β)=0 is true, then g(βMLE) should not be significantly different from zero. The Wald test tests whether βMLE (the unrestricted estimate of β) violates the restriction by a significant amount.

(3) Lagrange Multiplier Test: the log-likelihood function ln(L) is maximized at point A, where the slope of ln(L) with respect to β is zero. If the restriction is true, then the slope of ln(L) at point B (the restricted estimate) should not be significantly different from zero. The Lagrange Multiplier test tests whether the slope of ln(L), evaluated at the restricted estimate, is significantly different from zero.
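As a concrete numerical sketch of the likelihood ratio idea (my own Python illustration for a single linear restriction in a regression; the data are simulated): fit the model with and without the restriction imposed and compare twice the gap in maximized log-likelihoods with a chi-square critical value.

    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(9)
    n = 300
    x1, x2 = rng.normal(0, 1, n), rng.normal(0, 1, n)
    y = 1.0 + 2.0 * x1 + 0.0 * x2 + rng.normal(0, 1, n)   # the restriction beta2 = 0 is true here

    X_u = sm.add_constant(np.column_stack([x1, x2]))      # unrestricted model
    X_r = sm.add_constant(x1)                             # restricted model (beta2 = 0 imposed)

    ll_u = sm.OLS(y, X_u).fit().llf     # maximized log-likelihood, unrestricted
    ll_r = sm.OLS(y, X_r).fit().llf     # maximized log-likelihood, restricted

    LR = 2 * (ll_u - ll_r)              # asymptotically chi-square(1) under the restriction
    print("LR statistic:", round(LR, 3), " p-value:", round(1 - stats.chi2.cdf(LR, df=1), 3))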

Graph for reference:

Feb 24, 2010

Future and Complexity

--For the Understanding of Environmental Economics and Studies Concerned


I believe that man has the power, the intelligence, and the imagination to extricate himself from the serious predicament that now confronts him. The necessary first step toward wise action in the future is to obtain an understanding of the problems that exist. This in turn necessitates an understanding of the relationships between man, his natural environment, and his technology.
                                                            -Ocho Rios, Jamaica, April 1953.

In principle, the vast knowledge we have accumulated during the last 150 years makes it possible for us to look into the future with considerably more accuracy than could Malthus. But in actual fact we are dealing with an extremely complex problem which cuts across all of our major fields of inquiry and which, because of this, is difficult to unravel (to explain something that is difficult to understand or is mysterious) in all of its interlocking aspects. The complexity of the problem, our confusion, and our prejudices, have combined to form a dense fog that has obscured the most important features of the problem from our view - a fog which is in certain respects even more dense than that which existed in Malthus’ time. As a result, the basic factors that are determining the future are not generally known or appreciated.

In spite of the complexity of the problem which confronts us, its overwhelming importance, both to ourselves and to our descendants, warrants our dissecting it as objectively as possible. In doing so we must put aside our hatreds, desires, and prejudices, and look calmly upon the past and present. If we are successful in lifting ourselves from the morass (an unpleasant and complicated situation that is difficult to escape from) of irrelevant fact and opinion and in divorcing ourselves from our preconceived ideas, we will be able to see mankind both in perspective and in relation to his environment. In turn we will be able to appreciate something of the fundamental physical limitations to man’s future development and of the hazards which will confront him in the years and centuries ahead.

Feb 23, 2010

Rejection From Yale

2/23/2010

Dear Mr. Wang:

Thank you very much for applying to the Graduate School of Arts and Sciences at Yale University. I regret to inform you that we are unable to offer you admission. As you know, the very high number of extraordinary candidates among our 10,400 applicants far exceeds the number of places we have in each program, and we are not able to admit many excellent candidates.

We are using this system of electronic notification to communicate with you five to ten days more rapidly than we could by letter and, therefore, help applicants plan their futures quickly and effectively. We wish you every success in all your endeavors.

Sincerely,

Jon Butler
Dean of the Graduate School

Why Student's T-test? (Part 2)

An approximate answer to the right question is worth a great deal more than a precise answer to the wrong question. 
--The first golden rule of mathematics, sometimes attributed to John Tukey

With many calculations, one can win; with few one cannot. How much less chance of victory has one who makes none at all! 
--Sun Tzu 'Art of War'

The T-test may be used to compare the means of a criterion variable for two independent samples or for two dependent samples (ex., before-after studies, matched-pairs studies), or between a sample mean and a known mean (one-sample t-test). In regression analysis, a T-test can be used to test any single linear constraint. Nonlinear constraints are usually tested using a W, LR or LM test, but sometimes an "asymptotic" T-test is encountered: the nonlinear constraint is written with its right-hand side equal to zero, the left-hand side is estimated and then divided by the square root of an estimate of its asymptotic variance to produce the asymptotic T statistic.

For example, here is the formula to test mean difference for the case of equal sample sizes, n, in both groups:

Let E be the experimental condition and let C be the control condition. Let m be the means, s the standard deviations, and n be the sample size. Then
t = (m_E - m_C) / sqrt[ (s_E^2 + s_C^2) / n ]

Three Different Types of T-test:

(1) One-sample T-tests test whether the mean of one variable differs from a constant (ex., does the mean grade of 72 for a sample of students differ significantly from the passing grade of 70?). When p<.05 the researcher concludes the group mean is significantly different from the constant.

(2) Independent-sample T-tests are used to compare the means of two independently sampled groups (ex., do those working in high noise differ on a performance variable compared to those working in low noise, where individuals are randomly assigned to the high-noise or low-noise groups?). When p<.05 the researcher concludes the two groups are significantly different in their means. This test is often used to compare the means of two groups in the same sample (ex., men vs. women) even though individuals are not (in the case of gender, cannot be) assigned randomly to the two groups (to "men" and to "women"). Random assignment would have controlled for unmeasured variables. This opens up the possibility that other variables either mask or enhance any apparent significant difference in means. That is, the independent-sample t-test tests the uncontrolled difference in means between two groups. If a significant difference is found, it may be due not just to gender; control variables may be at work. The researcher will wish to introduce control variables, as in any multivariate analysis.

(3) Paired sample T-tests compare means where the two groups are correlated, as in before-after, repeated measures, matched-pairs, or case-control studies (ex., mean candidate evaluations before and after hearing a speech by the candidate). The algorithm applied to the data is different from the independent sample t-test, but interpretation of output is otherwise the same.
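A quick Python sketch of the three variants using scipy (my own illustration; all the data are simulated):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(10)
    grades = rng.normal(72, 8, 30)           # one sample of student grades
    high_noise = rng.normal(50, 10, 25)      # performance scores under high noise
    low_noise = rng.normal(55, 10, 25)       # performance scores under low noise
    before = rng.normal(60, 10, 20)
    after = before + rng.normal(3, 5, 20)    # paired: same subjects measured twice

    print(stats.ttest_1samp(grades, popmean=70))     # (1) one-sample t-test
    print(stats.ttest_ind(high_noise, low_noise))    # (2) independent-sample t-test
    print(stats.ttest_rel(before, after))            # (3) paired-sample t-test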

Associated Assumptions:

(1) Approximately Normal Distribution of the measure in the two groups is assumed. There are tests for normality. The t-test may be unreliable when the two samples come from widely different shaped distributions (see Gardner, 1975). Moore (1995) suggests that data for t-tests should be normally distributed for sample sizes less than 15, and should be approximately normal and without outliers for samples between 15 and 40; the data may be markedly skewed when the sample size is greater than 40.

(2) Roughly Similar Variances: There is a test for homogeneity of variance, also called a test of homoscedasticity. In SPSS homogeneity of variances is tested by "Levene's Test for Equality of Variances", with F value and corresponding significance. There are also other tests for homogeneity of variances. The T-test may be unreliable when the two samples are unequal in size and also have unequal variances (see Gardner, 1975). 

(3) Dependent/Independent Samples. The samples may be independent or dependent (ex., before-after, matched pairs). However, the calculation of T differs accordingly. In the one-sample test, it is assumed that the observations are independent. 

One last note: don't confuse a T-test with the analysis of a contingency table (Fisher's exact or chi-square test). Use a T-test to compare a continuous variable (e.g., blood pressure or weight). Use a contingency table to compare a categorical variable (e.g., pass vs. fail, viable vs. not viable).

Reference:
Gardner, P. L. (1975). Scales and statistics. Review of Educational Research, 45, 43-57. (Discusses assumptions of the t-test.)
Moore, D. S. (1995). The Basic Practice of Statistics. New York: Freeman and Co.

Feb 22, 2010

Why Student's T-test? (Part 1)

Here I am trying to answer two questions for myself:

1. What is the difference between Z-test and T-test?
2. Why we need student's T-test?

First, let's be clear on Z-test vs. T-test. A rule of thumb is that the Z-test is used when the sample size is more than 30, while the T-test is used for sample sizes less than 30. Now let's get back to the story:

Sometimes, measuring every single item is just not practical. That is why we developed and use statistical methods to solve problems. The most practical way to do it is to measure just a sample of the population. Some methods test hypotheses by comparison. Two of the better-known statistical hypothesis tests are the T-test and the Z-test. Let's try to break down the two.

Strictly speaking, the Z-test is a test for populations rather than samples. In the real world, though, either test will give you a pretty close answer. Using the T-test is more accurate because the sample deviation is specific and tailored to the sample you are studying, so the answer will be more accurate. When using a T-test of significance, it is assumed that the observations come from a population which follows a Normal distribution. This is often true for data that are influenced by random fluctuations in environmental conditions or random measurement errors. The T-distribution is essentially a corrected version of the normal distribution for the case in which the population variance is unknown and hence is estimated by the sample standard deviation.

There are various T-tests, and two of the most commonly applied are the one-sample and paired-sample T-tests. One-sample T-tests are used to compare a sample mean with the known population mean. Two-sample T-tests, on the other hand, are used to compare either independent samples or dependent samples.

As mentioned above, the T-test is best applied, at least in theory, if you have a limited sample size (n < 30), as long as the variables are approximately normally distributed and the variation of values in the two groups is not reliably different. It is also useful if you do not know the population's standard deviation. If the standard deviation is known, then it would be best to use another type of statistical test, the Z-test. The Z-test is also applied to compare sample and population means to see if there is a significant difference between them. Z-tests always use the normal distribution and are ideally applied when the standard deviation is known. Z-tests are often applied if certain conditions are met; otherwise, other statistical tests like T-tests are applied in substitute. Z-tests are often applied in large samples (n > 30). When the T-test is used in large samples, the T-test becomes very similar to the Z-test. There are fluctuations that may occur in T-test sample variances that do not exist in Z-tests, and because of this there are differences in the test results.

Summary:


1. The Z-test is a statistical hypothesis test based on the normal distribution, while the T-test is based on Student's T-distribution.
2. A T-test is appropriate when you are handling small samples (n < 30), while a Z-test is appropriate for moderate to large samples (n > 30).
3. The T-test is more adaptable than the Z-test, since the Z-test often requires certain conditions to be met in order to be reliable; the T-test also comes in many variants to suit different needs.
4. T-tests are more commonly used than Z-tests.
5. Z-tests are preferred to T-tests when the population standard deviation is known.
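As a small numerical illustration of points 1, 2 and 5, here is a minimal Python sketch (assuming NumPy and SciPy are installed; the simulated data and the "known" sigma are made up purely for illustration) that runs a one-sample T-test and the corresponding Z-test on the same small sample:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.2, scale=1.0, size=20)   # small sample, n < 30
mu0 = 5.0                                     # hypothesized population mean

# T-test: population standard deviation unknown, estimated from the sample
t_res = stats.ttest_1samp(x, mu0)

# Z-test: pretend the population standard deviation (sigma = 1.0) is known
sigma = 1.0
z_stat = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))
z_pval = 2 * stats.norm.sf(abs(z_stat))

print("t =", round(t_res.statistic, 3), "p =", round(t_res.pvalue, 3))
print("z =", round(z_stat, 3), "p =", round(z_pval, 3))

With only 20 observations the two statistics are close but not identical; as n grows, the T result converges to the Z result.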

Feb 21, 2010

Maybe We Just Need a New Word: Gadget

"Samsung has just announced at Barcelona a new cell phone, the Beam, that they expect to have on the market this summer. Its special feature is a built-in pico projector, making it a combination cell phone and (very wimpy) video projector. A cute gadget, although not one that I am likely to have much use for. I do, however, have one suggestion for improving it."

Reading through this news, I happened to become interested in the word gadget. Two similar definitions can easily be found in the dictionary:
(1) an often small mechanical or electronic device with a practical use but often thought of as a novelty;
(2) any object that is interesting for its ingenuity or novelty rather than for its practical use.

So a question occurs to me: have we been proposing and tinkering with gadgets in econometrics and economics? This may be a 'gadget' question, but it is definitely not a 'gadget' issue. Too many people are publishing papers whose author(s) will probably be their only careful readers. So why do we spend one, two, or even three years inventing such a "gadget"? For tenure, for promotion, or just for fun (our own understanding of the subject)? Maybe it is just the popular social demand of vanity, maybe it is just an indispensable part of the system; who knows?



Monte Carlo Studies

Monte Carlo methods have been used for centuries, but only in the past several decades has the technique gained the status of a full-fledged numerical method capable of addressing the most complex applications. The Monte Carlo method may be thought of as similar to a political poll, where a carefully selected statistical sample is used to predict the behavior or characteristics of a large group.

In the 1930s, Enrico Fermi used Monte Carlo methods in the calculation of neutron diffusion, and later designed the Fermiac, a mechanical Monte Carlo device used to calculate criticality (the point at which a nuclear reaction is self-sustaining) in nuclear reactors.

In the 1940s, a formal foundation for the Monte Carlo method was developed by von Neumann, who established the mathematical basis for probability density functions (PDFs), inverse cumulative distribution functions (inverse CDFs), and pseudorandom number generators. The work was done in collaboration with Stanislaw Ulam, who realized the importance of the digital computer in the implementation of the approach.

Before digital computers were available to the labs, "computer" was a job title. Parallel computing was done by rows and columns of mathematicians. The applications, which arose mostly from the Manhattan Project, included design of shielding for reactors.

Uses of Monte Carlo methods have been many and varied since that time. In the late 1950's and 1960's, the method was tested in a variety of engineering fields. At that time, even simple problems were compute-bound. Many complex problems remained intractable through the seventies. With the advent of high-speed supercomputers, the field has received increased attention, particularly with parallel algorithms which have much higher execution rates.

In econometrics, the general idea behind a Monte Carlo study is to (1) model the data-generating process, (2) generate several sets of artificial data, (3) employ these data and an estimator to create several estimates, and (4) use these estimates to gauge the sampling distribution properties of that estimator.
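To make this recipe concrete, here is a minimal sketch (NumPy only; the linear data-generating process, sample size, and number of replications are my own illustrative choices) that follows the four steps for the OLS estimator:

import numpy as np

rng = np.random.default_rng(42)
beta_true = np.array([1.0, 2.0])        # (1) intercept and slope of the assumed DGP
n, n_replications = 50, 5000

estimates = np.empty((n_replications, 2))
for r in range(n_replications):
    x = rng.uniform(0, 10, size=n)
    X = np.column_stack([np.ones(n), x])
    y = X @ beta_true + rng.normal(scale=2.0, size=n)    # (2) generate artificial data
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]  # (3) OLS estimate on this sample

print("mean of estimates:", estimates.mean(axis=0))      # (4) close to beta_true -> unbiased
print("std of estimates :", estimates.std(axis=0))       # Monte Carlo sampling variability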

A useful reference is the paper:

Kleijnen, J. P. C. (Tilburg University, Center for Economic Research), 'Design and Analysis of Monte Carlo Experiments'.

Mean Square Error (MSE) and Variance

The difference between the variance of an estimator and its MSE is that the variance measures the dispersion of the estimator around its mean, whereas the MSE measures its dispersion around the true value of the parameter being estimated. For unbiased estimators the two are identical.

Biased estimators with smaller variances than unbiased estimators are easy to find. The minimum-MSE estimator has not been as popular as the best unbiased estimator because of the mathematical difficulties in its derivation. Furthermore, when it can be derived, its formula often involves unknown coefficients (the value of beta), making its application impossible. Monte Carlo studies have shown that approximating the estimator by using OLS estimates of the unknown parameters can sometimes circumvent this problem (I am still a little confused here: does this mean substituting the OLS estimates for the true beta?).

Note: the weighted squared error criterion can be a very interesting topic to explore!
Peter Kennedy: when the weights are equal, the criterion is the popular mean square error (MSE) criterion. It happens that the expected value of a loss function consisting of the square of the difference between beta and its estimate (i.e., the square of the estimation error) is the same as the sum of the variance and the squared bias.
Please refer to the following derivation:
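A standard way to write out the decomposition (with \hat{\beta} denoting an estimator of \beta):

\begin{aligned}
\mathrm{MSE}(\hat{\beta}) &= E\big[(\hat{\beta}-\beta)^2\big] \\
&= E\big[(\hat{\beta}-E\hat{\beta}+E\hat{\beta}-\beta)^2\big] \\
&= E\big[(\hat{\beta}-E\hat{\beta})^2\big] + (E\hat{\beta}-\beta)^2 + 2\,(E\hat{\beta}-\beta)\,E\big[\hat{\beta}-E\hat{\beta}\big] \\
&= \mathrm{Var}(\hat{\beta}) + \big[\mathrm{Bias}(\hat{\beta})\big]^2 ,
\end{aligned}

where the cross term vanishes because E[\hat{\beta}-E\hat{\beta}] = 0.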



OLS: It is not the case that the OLS estimator is the minimum mean square error estimator in the classical linear regression model. Even among linear estimators, a substantial reduction in variance can sometimes be obtained by adopting a slightly biased estimator.
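A tiny Monte Carlo sketch of this point (NumPy only; the one-regressor model and the shrinkage factor of 0.8 are illustrative choices, not a recommended estimator) shows that shrinking the OLS slope toward zero introduces bias but can still lower the MSE when the true coefficient is small relative to its sampling variance:

import numpy as np

rng = np.random.default_rng(1)
beta_true, n, reps, shrink = 0.3, 20, 10000, 0.8

ols_sqerr, shrunk_sqerr = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    y = beta_true * x + rng.normal(size=n)
    b_ols = (x @ y) / (x @ x)      # OLS slope (no intercept), unbiased
    b_shr = shrink * b_ols         # shrunk toward zero: biased, but lower variance
    ols_sqerr.append((b_ols - beta_true) ** 2)
    shrunk_sqerr.append((b_shr - beta_true) ** 2)

print("MSE of OLS      :", np.mean(ols_sqerr))
print("MSE of shrinkage:", np.mean(shrunk_sqerr))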

Feb 17, 2010

Toronto Milk Producers


The Toronto Sunday World, Mar 23, 1914.

A meeting of the Toronto Milk and Cream Producers' Association will be held at the Labor Temple on Thursday, commencing at 2 p.m. The meeting is called to discuss and decide on prices of milk and cream for the ensuing season, May to October, and any other business in the interest of the association. In the evening a banquet will be held at the Grand Union Hotel.

The Story of Maximum Likelihood

The theory of maximum likelihood is very beautiful indeed: a conceptually simple approach to an amazingly broad collection of problems. This theory provides a simple recipe that purports to lead to the optimum solution for all parametric problems and beyond, and not only promises an optimum estimate, but also a simple all-purpose assessment of its accuracy. And all this comes with no need for the specification of a priori probabilities, and no complicated derivation of distributions. Furthermore, it is capable of being automated in modern computers and extended to any number of dimensions. Maximum-likelihood estimation was recommended, analyzed and vastly popularized by R. A. Fisher between 1912 and 1922 (although it had been used earlier by Gauss, Laplace, Thiele, and F. Y. Edgeworth). Reviews of the development of maximum likelihood have been provided by a number of authors.


When we analyze an analysis of variance or linear regression, typically we estimate parameters for the model using the principle of least squares. The idea of least squares is that we choose parameter estimates that minimize the average squared difference between observed and predicted values. That is, we maximize the fit of the model to the data by choosing the model that is closest, on average, to the data.

For many other procedures such as logistic, Poisson, and proportional hazards regression, least squares usually cannot be used as an estimation method. Instead, most often we turn to the method of maximum likelihood. In maximum likelihood estimation, we search over all possible sets of parameter values for a specified model to find the set of values for which the observed sample was most likely. That is, we find the set of parameter values that, given a model, were most likely to have given us the data that we have in hand.
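As a toy illustration of this search (assuming NumPy and SciPy; the coin-flip data are made up), we can numerically maximize the Bernoulli log-likelihood and recover the familiar sample proportion as the maximum likelihood estimate:

import numpy as np
from scipy.optimize import minimize_scalar

data = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])   # observed 0/1 outcomes

def neg_log_likelihood(p):
    # minus the log-likelihood of a Bernoulli(p) model for the observed data
    return -np.sum(data * np.log(p) + (1 - data) * np.log(1 - p))

result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 1 - 1e-6), method="bounded")
print("MLE of p:", result.x)   # about 0.7, i.e. the sample proportion 7/10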

By way of analogy, imagine that you are in a jury for a civil trial. Four things are presented to you in the course of the trial: 1) charges that specify the purpose of the trial, 2) prosecution's version of the truth, 3) defendant's version of the truth, and 4) evidence. Your task on the jury is to decide, in the context of the specified charges and given the evidence presented, which of the two versions of the truth most likely occurred. You are asked to choose which version of the truth was most likely to have resulted in the evidence that was observed and presented.

Analogously, in statistical analysis with maximum likelihood, we are given: 1) a specified conceptual, mathematical, and statistical model, 2) one set of values for the parameters of the model, 3) another set of values for the parameters of the model, and 4) observed data. We want to find the set of values for the parameters of the model that are most likely to have resulted in the data that were actually observed. (We do this by searching over all possible sets of values for the parameters, not just two sets.)

In analysis of variance or linear regression, we measure the fit of the model to the data using the regression sum of squares. With maximum likelihood, the likelihood measures the fit of the model to the data; therefore, we want to choose parameter values that maximize the likelihood. In analysis of variance or linear regression, if we want to compare the fit of two models, we form the ratio of two mean squares to yield an F-test. With maximum likelihood, we compare fits by forming the ratio of two likelihoods, which yields a chi-square test.

Asymptotic Properties

Since econometricians quite often must work with small samples, choosing estimators on the basis of their asymptotic properties is legitimate only if estimators with desirable asymptotic properties also tend to have more desirable small-sample properties than estimators without them.

Feb 16, 2010

Rejection From Rice



We regret having to inform you that Rice University cannot offer you admission for graduate study.
You can be assured that your application received very careful consideration.
Our decision is based on high standards of selectivity and on the constraints of space, and faculty.
For these reasons, we must limit the number of admissions in all departments.
The other members of the departmental graduate committee join me in wishing you success in your
future endeavors.
Yours sincerely,
Simon Grant, Director
Economics Graduate Program