Amemiya, Introduction to Statistics and Econometrics (PDF)
In some applications we need to understand the characteristics of the joint distribution of a discrete and a continuous random variable. In others we must derive the joint distribution of transformed variables Y1 and Y2 from that of X1 and X2; why the transformation formula takes the form it does can best be understood by the following geometric consideration.

In this section we ask two questions: (i) how are the four quantities f(x), f(y), f(x, y), and f(y | x) related to one another if we assume that X and Y are related to each other, and (ii) how is the density of a transformation obtained? For the second question, consider a small rectangle in the (X1, X2) plane; its area is ΔX1ΔX2. By using matrix notation, the linear mapping from (X1, X2) to (Y1, Y2) can be written compactly, and solving it for (X1, X2) shows how the area of the image region is related to ΔX1ΔX2.

Let X be a continuous random variable with density f(x) and let Y be a discrete random variable taking values y1, y2, and so on; the joint behavior of such a pair is described by f(x) together with the conditional probabilities of Y given X. Chapter 11 supplies the matrix analysis used for linear mappings such as the one above. A useful exercise: specify a joint distribution of three variables in such a way that the three variables are pairwise independent but not mutually independent.

By the mean value theorem, the probability that a continuous random variable falls in a small interval is approximately the density at an interior point times the length of the interval. Conditional probability P(y | x) gives rise to natural questions; for instance, if a student's score is scaled to range between 0 and 1, what is the probability that he will get an A on the final if he got an A on the midterm? Typical exercises ask: given a joint density, calculate the marginal density f(x); or, given the distribution of X, obtain the density of Y. Now let X be the payoff of a gamble, and let Xi be the payoff of that particular gamble made for the ith time; we wish to define the expected value (expectation or mean) of X. The symbol log refers to the natural logarithm throughout.

We shall now define the expected value of a random variable. Let X be a discrete random variable taking the value x_i with probability P(x_i). Then the expected value (expectation or mean) of X is EX = Σ x_i P(x_i), provided the sum converges absolutely. The expected value has an important practical meaning. For the gamble that pays one dollar when a head comes up and nothing otherwise, EX = 0.5; it means that if we repeat this gamble many times we will gain 50 cents on the average. We can formalize this statement as a consequence of the convergence results of Chapter 6.
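As a quick numerical illustration of this practical meaning (a minimal sketch; the only choice beyond the text is the number of repetitions):

```python
import random

# The gamble from the text: win $1 if a fair coin lands heads, else $0.
# EX = 0.5, i.e., a gain of 50 cents per play on the average.
random.seed(0)
n = 100_000
total = sum(1 if random.random() < 0.5 else 0 for _ in range(n))
print(f"average payoff over {n} plays: {total / n:.4f}")  # close to 0.5
```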

Choosing the value of d that maximizes EX(d), where X(d) is the random gain in dollars that results from choosing a decision d, may not necessarily be a reasonable decision strategy. How much a gamble is worth depends upon the subjective evaluation of the risk involved: a risk taker may be willing to pay as much as 90 cents for the fifty-cent gamble above. We can think of many other strategies, such as the minimax strategy, which may be regarded as reasonable by certain people in certain situations. To illustrate this point, consider the following gamble. A coin is tossed repeatedly until a head comes up, and the payoff is represented by the random variable X that takes the value 2^i with probability 2^(-i). Then EX is infinite, yet hardly anyone would pay a large sum to play; this example is called the "St. Petersburg Paradox," after the journal of the St. Petersburg Academy in which it was first discussed. One way to resolve this paradox is to note that what one should maximize is not EX itself but the expected utility EU(X): a sufficiently risk-averse person will not undertake the St. Petersburg gamble for any price higher than two dollars, and by changing the utility function we can represent various attitudes toward risk. More generally, the decision to gamble for c cents or not can be thought of as choosing between two random variables X1 and X2. The other important measures of central location are the mode and the median: the mode is a value of x for which f(x) is the maximum, and if the density is positively skewed, the mean typically exceeds the median, which in turn exceeds the mode. The three measures of central location are computed in the following examples. Finally, EX is sometimes called the population mean, so that it may be distinguished from the sample mean; we shall learn in Chapter 7 that the sample mean is a good estimator of the population mean, in the sense of convergence in probability (defined precisely in Chapter 6).
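The arithmetic behind the paradox and its expected-utility resolution can be written out as follows; taking U(x) = log x is one illustrative choice of utility function, not necessarily the one used in the text.

```latex
EX = \sum_{i=1}^{\infty} 2^{i} \cdot 2^{-i} = \sum_{i=1}^{\infty} 1 = \infty,
\qquad
E[\log X] = \sum_{i=1}^{\infty} 2^{-i} \log 2^{i}
          = \log 2 \sum_{i=1}^{\infty} i \, 2^{-i} = 2 \log 2 .
```

So while the expected payoff diverges, the expected log-payoff is finite (log 4), which is why a maximizer of expected utility assigns the gamble only a modest value.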

These definitions extend to functions of two random variables. Let (X, Y) be a bivariate discrete random variable taking values (x_i, y_j); then Eg(X, Y) is the sum of g(x_i, y_j) weighted by the joint probabilities. Similarly, let (X, Y) be a bivariate continuous random variable with joint density function f(x, y); then Eg(X, Y) is the corresponding double integral, and the same value is obtained by either iterated order of summation or integration. The following three theorems characterize the properties of the operator E; their proofs for the cases where (X, Y) is either discrete or continuous follow easily from the definitions. For example, let X and Y denote the face numbers showing on two dice rolled independently.

Calculating EXY directly from the definition would be tedious, but by the theorems and independence, EXY = EX · EY. Next let X be a discrete random variable and let Y be a continuous random variable with density f(y), and define Z to be equal to X with probability p and equal to Y with probability 1 - p. A random variable such as Z is called a mixture random variable. Although we have defined the expected value so far only for a discrete or a continuous random variable, using the theorems we obtain EZ = pEX + (1 - p)EY, and we shall write a generalization of this result as a theorem. As an application, suppose a person's income X is a continuous random variable with uniform density defined over the interval [5, ...], and his expenditure on a car Y is related to X by a given rule; compute the expected amount of money he spends on a car.
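A simulation sketch of the mixture formula EZ = pEX + (1 - p)EY; the particular X (a fair die), Y (uniform on [0, 10]), and p = 0.3 below are invented for illustration.

```python
import random

random.seed(0)
p = 0.3          # probability that Z is drawn from the discrete component
n = 200_000

def draw_z():
    # Z equals the discrete X (a fair die, EX = 3.5) with probability p,
    # and the continuous Y (uniform on [0, 10], EY = 5.0) otherwise.
    if random.random() < p:
        return random.randint(1, 6)
    return random.uniform(0.0, 10.0)

avg = sum(draw_z() for _ in range(n)) / n
print(f"simulated EZ = {avg:.3f}")
print(f"p*EX + (1-p)*EY = {p * 3.5 + (1 - p) * 5.0:.3f}")
```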

Although the mean is probably the single most important measure of the characteristics of a probability distribution, it alone cannot capture all of them. For example, in the coin-tossing gamble of the previous section, suppose one must choose between two random variables X1 and X2 that have the same mean but different spreads, say X1 equal to 1 or 0 with probability 0.5 each and X2 taking more extreme values. Though the two random variables have the same mean, they are obviously very different.

The characteristics of the probability distribution of a random variable X can be represented by a sequence of moments, defined either as moments around zero, EX^k, or as moments around the mean, E(X - EX)^k. The expected value, or the mean, is the first moment around zero.

As we defined the sample mean in the previous section, we can similarly define the sample kth moment around zero. Like the sample mean, the sample kth moment converges to the population kth moment in probability, as will be shown in Chapter 6. As an example, a die is loaded so that the probability of a given face turning up is proportional to the number on that face; calculate the mean and variance of X, the face number showing.

We have, by the definition of variance, VX = E(X - EX)^2 = EX^2 - (EX)^2. The second expression gives a more convenient formula than the first: it says that the variance is the mean of the square minus the square of the mean. The square root of the variance is called the standard deviation and is denoted by σ. The variance measures the degree of dispersion of a probability distribution; as can be deduced from the definition, the variance of any constant is 0.
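A worked check of the loaded-die example above using VX = EX^2 - (EX)^2; exact fractions keep the arithmetic transparent.

```python
from fractions import Fraction

# Die loaded so that P(face i) is proportional to i: P(i) = i/21.
faces = range(1, 7)
probs = {i: Fraction(i, 21) for i in faces}

mean = sum(i * probs[i] for i in faces)          # EX   = 91/21 = 13/3
mean_sq = sum(i * i * probs[i] for i in faces)   # EX^2 = 441/21 = 21
var = mean_sq - mean ** 2                        # VX = 21 - 169/9 = 20/9

print(f"EX = {mean} = {float(mean):.4f}")
print(f"VX = {var} = {float(var):.4f}")
```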

The following three examples indicate that the variance is an effective measure of dispersion; in each one we compute VX from the definition. If α and β are constants, we have V(αX + β) = α^2 VX. Note that the additive constant β does not appear on the right-hand side. This makes intuitive sense because adding a constant changes only the central location of the probability distribution and not its dispersion, of which the variance is a measure.

We shall seldom need to know any other moment, but we mention the third moment around the mean, E(X - EX)^3. It is 0 if the probability distribution is symmetric around the mean, positive if it is positively skewed, and negative if it is negatively skewed. The next example shows that the converse of the preceding theorem is not true.

We next consider the covariance between two random variables. Covariance depends on the units of measurement: Cov(Income, Consumption) is larger if both variables are measured in cents than in dollars. In a joint probability table where the marginal probabilities are also shown, one can verify, using the results of Chapter 3, that X and Y are not independent. As an application, suppose there are five stocks and the returns from the five stocks are pairwise independent; then the covariances vanish, and the variance of the portfolio return is the sum of the variances of its components.

As a further application, let the joint probability distribution of (X, Y) be given by a table and calculate Cov(X, Y). Combining the theorems on variance and covariance gives V(X ± Y) = VX + VY ± 2Cov(X, Y); the proof follows immediately from the definitions of variance and covariance. The difference between Y and its predictor will be called either the prediction error or the residual.

The covariance's dependence on units is a weakness; this weakness is remedied by considering the correlation coefficient ρ = Cov(X, Y)/(σ_X σ_Y), which is free of units. We next consider the problem of finding the best predictor of one random variable by a linear function of another.

We shall interpret the word best in the sense of minimizing the mean squared error of prediction. The problem can be mathematically formulated as: choose α and β to minimize E(Y - α - βX)^2. We shall solve this problem by calculus. Expanding the squared term, differentiating with respect to α and β, and equating the derivatives to zero, we have β = Cov(X, Y)/VX and α = EY - βEX. The minimized mean squared error is VY(1 - ρ^2); this result suggests that the correlation coefficient is a measure of the degree of a linear relationship between a pair of random variables.
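A simulation sketch of these formulas; the joint distribution of (X, Y) below is invented for illustration, and numpy is assumed available.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(2.0, 1.5, n)
y = 1.0 + 0.8 * x + rng.normal(0.0, 1.0, n)   # a linear model plus noise

beta = np.cov(x, y, bias=True)[0, 1] / x.var()   # Cov(X, Y) / VX
alpha = y.mean() - beta * x.mean()               # EY - beta * EX
mse = ((y - alpha - beta * x) ** 2).mean()

rho = np.corrcoef(x, y)[0, 1]
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
print(f"MSE = {mse:.3f}, VY(1 - rho^2) = {y.var() * (1 - rho**2):.3f}")
```

The recovered alpha and beta match the coefficients used to generate the data, and the two expressions for the minimized mean squared error agree.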

As a further illustration of the point that ρ measures only linear dependence, one can construct examples in which Y is a function only of X and yet Cov(X, Y) = 0: a nonlinear dependence may imply a very small value of ρ, so this may be thought of as another example in which no correlation does not imply independence. In the next section we shall obtain the best predictor and compare it with the best linear predictor. For that we need the conditional mean. Here we shall give two definitions: one for discrete bivariate random variables, in terms of the conditional probability P(y | x), and the other for a bivariate continuous random variable, in terms of the conditional density f(y | x) given in Chapter 3. The conditional mean E(Y | X) may be evaluated at a particular value that X assumes; if instead we treat X as a random variable, then E(Y | X) is itself a random variable, a function only of X.

In Chapters 2 and 3 we noted that conditional probability and conditional density satisfy all the requirements for probability and density, so conditional expectations obey the usual rules. The symbol E_X indicates that the expectation is taken treating X as a random variable. The following theorem, the law of total variance, shows what happens: the variance is equal to the mean of the conditional variance plus the variance of the conditional mean, VY = E_X V(Y | X) + V_X E(Y | X). We call W the mean squared prediction error of the best linear predictor.

Proof: We shall prove it only for the case of continuous random variables; the case of a bivariate discrete random variable taking values (x_i, y_j) is analogous. As an exercise, let Y be the face number showing on a die, calculate EY, and find EY and W. This problem may be solved in two ways, and despite the apparent complexity, the same answer results either way. Here we shall consider the problem of optimally predicting Y by a general function of X; the best predictor (or, more exactly, the best predictor in the sense of minimum mean squared error) turns out to be the conditional mean E(Y | X).
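A numerical check of the decomposition VY = E_X V(Y | X) + V_X E(Y | X); the joint distribution below (X a fair die, Y a noisy function of X) is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000
x = rng.integers(1, 7, n)                  # X: a fair die
y = x ** 2 + rng.normal(0.0, 2.0, n)       # Y depends on X nonlinearly

vy = y.var()
# Group by the value of X to estimate the conditional mean and variance.
# Simple averages over the six groups suffice because each face of the
# die is equally likely.
cond_means = np.array([y[x == k].mean() for k in range(1, 7)])
cond_vars = np.array([y[x == k].var() for k in range(1, 7)])
total = cond_vars.mean() + cond_means.var()   # E V(Y|X) + V E(Y|X)

print(f"VY = {vy:.2f}, E V(Y|X) + V E(Y|X) = {total:.2f}")
```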

We shall compute the mean squared error of prediction for each predictor and compare them. The following table gives E(Y | X) at the values X = 0, 2, 4, 6; from it, the mean and variance of Y can be shown to be the same as obtained above.

In the previous section we solved the problem of optimally predicting Y by a linear function of X; we can now compare that solution with the best predictor. Exercises along these lines: find the best predictor and the best linear predictor of Y based on X, and compute EX and VX assuming the probability of heads is equal to p. When E(Y | X) is itself linear in X, inserting its coefficients into the equations of the previous section shows that the best predictor is identical with the best linear predictor.

Further exercises conclude the chapter. In the first line buses come at a regular interval of five minutes; you get on the first bus that comes. What is the expected waiting time? (Both relevant equations can be combined into one.) Other exercises: compute VX and Cov(X, Y) when Y takes the values 1 and 0; obtain Cov(X, Y) assuming X and Y have joint density f(x, y); calculate Cov(Z, X) for a given Z; supply your own definition of the phrase "no value in predicting X"; and compute the respective mean squared prediction errors of two competing predictors and directly compare them.
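For the regular five-minute line taken alone, a common modeling assumption (an assumption of this sketch, not necessarily the text's full problem, which involves more than one line) is that your arrival time is uniform within the interval between buses, so the wait is Uniform(0, 5) with mean 2.5 minutes:

```python
import random

# Assumption for illustration: arrival time is uniform within the
# five-minute interval between buses, so the wait is Uniform(0, 5).
random.seed(0)
n = 100_000
waits = [random.uniform(0.0, 5.0) for _ in range(n)]
print(f"average wait: {sum(waits) / n:.3f} minutes")  # about 2.5
```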

We turn to the binomial and normal random variables (see Section 5). Let Y1, Y2, ..., Yn be independent, each equal to 1 with probability p and to 0 with probability 1 - p. Then the random variable X defined by X = Y1 + Y2 + ... + Yn is called binomial; symbolically we write X ~ B(n, p). Such a random variable often appears in practice, for example as the number of successes in n independent trials. Note that deriving the mean and variance from this representation, EX = np and VX = np(1 - p), is much simpler than the direct derivation using the probability distribution of X. Since k successes can occur in any of C(n, k) arrangements, the binomial random variable X of the definition satisfies P(X = k) = C(n, k) p^k (1 - p)^(n-k). An accompanying exercise asks for the best predictor and the best linear predictor of Y given X.

Proof: The probability that the first k trials are successes and the remaining n - k are failures is p^k (1 - p)^(n-k); multiplying by the number of orderings gives the result. We now turn to the normal distribution. When X has the normal density, the direct evaluation of a general integral of that density over an interval has no closed form, so probabilities must be computed numerically. The fact that a suitably standardized binomial is approximately normal is a special case of the so-called central limit theorem.

The normal distribution is the most important continuous distribution, and many reasons for its importance will become apparent as we study its properties below. The normal density is completely characterized by two parameters, its mean and its variance. Examples of the normal approximation of the binomial are given in Section 6.
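A sketch of the normal approximation of a binomial probability; n = 12 and p = 0.3 are arbitrary choices (not necessarily the text's examples), and a continuity correction of 0.5 is applied.

```python
import math

def binom_cdf(k, n, p):
    # Exact P(X <= k) for X ~ B(n, p).
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p, k = 12, 0.3, 4
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
exact = binom_cdf(k, n, p)
approx = normal_cdf((k + 0.5 - mu) / sigma)   # continuity correction
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```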

Such an integral may be approximately evaluated from a normal probability table or by a computer program based on a numerical method (see Hoel). A useful corollary of the theorems of this section is that a linear function of a normal random variable is itself normal.

Defining Z = (X - μ)/σ in the above way reduces any normal probability to a standard normal one. To study the effect of σ on the shape of f(x), note that increasing σ spreads the density while lowering its peak. As an application, if it is required that the life of a machine should exceed 80 with at least a given probability, one can determine σ^2 so as to satisfy the requirement. In the above discussion we have also given the bivariate normal density; if X and Y are bivariate normal and α and β are constants, then αX + βY is normal, and all that is left to show in the proof concerns Correlation(X, Y).

The next theorem shows a very important property of the bivariate normal distribution: the marginal densities f(x) and f(y) and the conditional densities f(y | x) and f(x | y) are all univariate normal densities, and the correlation parameter of the joint density is in fact Correlation(X, Y).

In the bivariate normal case the conditional mean E(Y | X) is a linear function of X. Recall that for any pair of random variables X and Y there exists a random variable Z such that Y = α + βX + Z, where Z is the residual from the best linear predictor, with EZ = 0 and Cov(X, Z) = 0. The special feature of the bivariate normal case is that this Z is itself normal and, being normal and uncorrelated with X, is independent of X. We conclude, by applying the theorem, that the conditional density of Y given X is the density of a normal distribution whose mean is linear in X and whose variance does not depend on X.
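A simulation sketch of the linearity of E(Y | X) under bivariate normality; the slope should be Cov(X, Y)/VX = ρ σ_Y / σ_X. All parameter values below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, sx, sy = 0.6, 1.0, 2.0
cov = [[sx**2, rho * sx * sy], [rho * sx * sy, sy**2]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=500_000).T

# Estimate E(Y | X) within narrow bins of x; it should be a straight line
# with slope rho * sy / sx = 1.2.
bins = np.linspace(-2, 2, 9)
for lo, hi in zip(bins[:-1], bins[1:]):
    sel = (x >= lo) & (x < hi)
    mid = (lo + hi) / 2
    print(f"x in [{lo:+.1f},{hi:+.1f}): E(Y|X) ~ {y[sel].mean():+.3f}, "
          f"theory {rho * sy / sx * mid:+.3f}")
```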

The following is another important property of the bivariate normal distribution: if X and Y are bivariate normal and their correlation is zero, then X and Y are independent (see Ferguson). It is important to note that this conclusion requires joint normality; it does not hold, in general, when X and Y are merely marginally normal. Exercise: suppose X and Y are distributed jointly normal with EX = 1 and the remaining parameters as given, and compute the required probability.

We now consider the multivariate normal distribution. We sometimes write σ_i^2 for σ_ii, the ith diagonal element of the variance-covariance matrix, whose elements we write explicitly. The student unfamiliar with matrix analysis should read Chapter 11 before this section; the results of this section will not be used directly until Section 9. Throughout this section we use vector and matrix notation. We now state without proof generalizations of the earlier bivariate theorems; the following two examples are applications of them.

In this notation the variance-covariance matrix of a random vector z is E[(z - Ez)(z - Ez)'], and Σ may be partitioned conformably with a partition of z. Exercises: let X be the number of aces that turn up in a given number of draws and compute EX; find the best predictor and the best linear predictor of Y given X and calculate their respective mean squared prediction errors.

Let x be multivariate normal with variance-covariance matrix Σ and let A be an m × n matrix of constants such that m ≤ n and the rows of A are linearly independent. Then Ax is multivariate normal, and in particular any subvector of x is multivariate normal. Let y and z be defined as in the theorem just stated, with the moment matrices E[yy'] and the rest partitioned accordingly. A further exercise: where S and T are independent and distributed as N(0, 1) and X and Y are constructed from them, find the best predictor and the best linear predictor of Y^2 given X, together with their respective mean squared prediction errors.

We now take up large sample theory. Most of the theorems will be stated without proofs; for the proofs the reader should consult Chung, Rao, or Serfling. Let us first review the definition of the convergence of a sequence of real numbers; a definition can be found in any of the aforementioned books. Because a sequence of random variables cannot converge in that deterministic sense, we modify the definition so that the conclusion is stated in terms of probabilities; we write X_n →p c, which reads "the probability limit of X_n is c." The first two theorems below are examples of a law of large numbers.

We could only talk about the probability that X_n deviates from c by more than a given bound; hence the definition of convergence in probability. There are still other modes of convergence, namely convergence in mean square and convergence in distribution, and in this chapter we shall make the notions of these convergences more precise. The following two theorems state that convergence in mean square implies convergence in probability. Suppose that g is a continuous function except for finitely many discontinuities; we state without proof a generalization of the Slutsky theorem covering this case. (Here we have assumed the existence of the density for simplicity.)

For if the limit distribution of Z_n were that of a constant, it would be an uninteresting limit distribution. We do not use the strong law of convergence in this book. Chebyshev's inequality, P(|X - EX| ≥ ε) ≤ VX/ε^2, follows from the simple result that for a nonnegative function g, P(g(X) ≥ c) ≤ E g(X)/c.

A central limit theorem (CLT) specifies the conditions under which Z_n, the standardized sample mean, converges in distribution to N(0, 1). Since the sample mean itself converges to a constant, it is more meaningful to inquire into the limit distribution of Z_n. The convergence of the sample mean to the population mean is sometimes referred to as a weak law of large numbers, to distinguish it from a strong law of large numbers.
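A simulation sketch of both limit results for Bernoulli sampling (p, n, and the number of replications are arbitrary choices): the sample mean concentrates near p, and the standardized mean behaves like N(0, 1).

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.3, 2_000, 20_000
samples = rng.random((reps, n)) < p            # Bernoulli(p) draws
means = samples.mean(axis=1)

# LLN: the sample mean is concentrated near p.
print(f"P(|mean - p| > 0.02) ~ {(np.abs(means - p) > 0.02).mean():.4f}")

# CLT: Z_n = sqrt(n) (mean - p) / sqrt(p(1-p)) is approximately N(0, 1).
z = np.sqrt(n) * (means - p) / np.sqrt(p * (1 - p))
print(f"mean(Z) ~ {z.mean():+.3f}, var(Z) ~ {z.var():.3f}")
print(f"P(Z <= 1.96) ~ {(z <= 1.96).mean():.4f} (normal: 0.9750)")
```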

A law of large numbers (LLN) specifies the conditions under which the sample mean converges in probability to the population mean. In many applications the simplest way to show X_n →p c is to show convergence in mean square and appeal to the preceding theorem. The following two theorems, which concern continuous functions of convergent sequences, are very useful in proving the convergence of a sequence of functions of random variables.

Classical central limit theorems are associated with Levy and Liapounov. More precisely, let g be a function continuous at a constant vector point a; then g(x_n) →p g(a) whenever x_n →p a. Note that it would be meaningless to say that "the limit distribution of the sample mean is N(μ, σ^2/n)," since a limit cannot depend on n; we therefore introduce the term asymptotic distribution for such approximate statements. (An alternative term exists, but we shall not use it in this book; the former seems preferable.)

Both theorems are special cases of the most general CLT. We shall consider three examples of a normal approximation of a binomial: in one, X is as defined in Example 5; in another, X is the number of unemployed workers among twelve workers. The true probabilities and their approximations are summarized in Table 6.


This outstanding text by a foremost econometrician combines instruction in probability and statistics with econometrics in a rigorous but relatively nontechnical manner. It is available in hardcover. Unlike many econometrics texts, it offers a thorough treatment of statistics.

Unlike many statistics texts, it discusses regression analysis in depth. If you are completely new to the subject, you may want to get some supplemental introductory materials in addition.

Believing that students should acquire the habit of questioning conventional statistical techniques, Takeshi Amemiya discusses the problem of choosing estimators and compares various criteria for ranking them. The book is published by Harvard University Press.



The coverage of probability and statistics includes best prediction and best linear prediction, the joint distribution of a continuous and discrete random variable, large sample theory, and the properties of the maximum likelihood estimator.


Reviews: "Introduction to Statistics and Econometrics covers probability and statistics, with emphasis on certain topics that are important in econometrics but often overlooked by statistics textbooks at this level."



