SSC CGL Tier 2 Paper 3 Study Material Day 7 [statistics]
Different Types of Moments and Their Relationship
The mathematical roots of moment invariants lie in the 19th century, in the framework of group theory and the theory of algebraic invariants. The theory of algebraic invariants was thoroughly studied by the famous German mathematicians P. Gordan and D. Hilbert and was further developed in the 20th century by others.
Moment invariants were first introduced to the pattern recognition and image processing community in 1962, when Hu employed results from the theory of algebraic invariants and derived his seven famous invariants to rotation of 2-D objects. Since then, hundreds of papers have been devoted to various improvements, extensions and generalizations of moment invariants, and to their use in many areas of application. Moment invariants have become one of the most important and most frequently used shape descriptors. Even though they suffer from certain intrinsic limitations (the worst of which is their globalness, which prevents their direct use for occluded object recognition), they frequently serve as "first-choice descriptors" and as a reference method for evaluating the performance of other shape descriptors. Despite tremendous effort and a huge number of published papers, many open problems remain to be resolved.
Moments in mathematical statistics involve a basic calculation. These calculations can be used to find a probability distribution's mean, variance and skewness.
Suppose that we have a set of data with a total of n discrete points. One important calculation, which is actually several numbers, is called the s-th moment. The s-th moment of the data set with values x1, x2, x3, . . . , xn is given by the formula:
(x1^s + x2^s + x3^s + . . . + xn^s)/n
Using this formula requires us to be careful with our order of operations: we need to apply the exponents first, then add, then divide this sum by n, the total number of data values.
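The formula and order of operations above can be sketched as a small Python helper (the function name is our own choice; the data values are the example used later in this material):

```python
def raw_moment(data, s):
    """s-th raw moment: apply the exponent to each value first,
    then sum, then divide by n (the number of values)."""
    n = len(data)
    return sum(x ** s for x in data) / n

print(raw_moment([1, 3, 6, 10], 1))  # 5.0 (this is the mean)
```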
The term moment has been taken from physics. In physics the moment of a system of point masses is calculated with a formula identical to that above, and this formula is used in finding the center of mass of the points. In statistics the values are no longer masses, but as we will see, moments in statistics still measure something relative to the center of the values.
Moments are scalar quantities that have been used for hundreds of years to characterize a function and to capture its significant features. They have been widely used in statistics to describe the shape of a probability density function, and in classical rigid-body mechanics to measure the mass distribution of a body. From the mathematical point of view, moments are "projections" of a function onto a polynomial basis (similarly, the Fourier transform is a projection onto a basis of harmonic functions). For the sake of clarity, we introduce some basic terms and propositions, which we will use throughout this material.
Definition 1: By an image function (or image) we understand any piecewise continuous real function f(x, y) of two variables defined on a compact support D ⊂ R × R and having a finite nonzero integral.
Definition 2: The general moment M_pq^(f) of an image f(x, y), where p, q are non-negative integers and r = p + q is called the order of the moment, is defined as
M_pq^(f) = ∫∫_D p_pq(x, y) f(x, y) dx dy,
where p_00(x, y), p_10(x, y), . . . , p_kj(x, y), . . . are polynomial basis functions defined on D. (We omit the superscript (f) if there is no danger of confusion.)
Depending on the polynomial basis used, we recognize various systems of moments.
The nth raw moment µ′n (i.e., the moment about zero) of a distribution P(x) is defined by
µ′n = ⟨x^n⟩ = ∑ x^n P(x).
The first raw moment µ′1 is the mean and is usually simply denoted µ = µ′1. If the moment is instead taken about a point a,
µn(a) = ⟨(x − a)^n⟩ = ∑ (x − a)^n P(x).
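For a discrete distribution, the moment about a point a can be sketched directly from this sum (the function name is our own, and the fair six-sided die is a hypothetical example distribution):

```python
def moment_about(P, n, a):
    """n-th moment about the point a of a discrete distribution,
    given as a {value: probability} mapping: sum of (x - a)^n * P(x)."""
    return sum((x - a) ** n * p for x, p in P.items())

die = {k: 1 / 6 for k in range(1, 7)}  # fair six-sided die
print(moment_about(die, 1, 0))         # raw first moment: the mean, which is 3.5
```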
A statistical distribution is not uniquely specified by its moments, although it is by its characteristic function.
The moments are most commonly taken about the mean. These so-called central moments are denoted µn and are defined by
µn = ⟨(x − µ)^n⟩ = ∑ (x − µ)^n P(x),
with µ1 = 0. The second moment about the mean is equal to the variance, µ2 = σ², where σ = √µ2 is called the standard deviation.
The related characteristic function is defined by φ(t) = ⟨e^{itx}⟩, and the raw moments can be recovered from its derivatives at t = 0:
µ′n = (−i)^n φ^(n)(0) = (−i)^n [d^n φ / dt^n]_{t=0}.
The moments may also be computed simply from the moment-generating function M(t) = ⟨e^{tx}⟩, via
µ′n = M^(n)(0).
Different Types of Moments
The types of moments are:
1. First Moment
For the first moment we set s = 1. The formula for the first moment is thus:
(x1 + x2 + x3 + . . . + xn)/n
This is identical to the formula for the sample mean.
The first moment of the values 1, 3, 6, 10 is (1 + 3 + 6 + 10) / 4 = 20/4 = 5.
2. Second Moment
For the second moment we set s = 2. The formula for the second moment is:
(x1^2 + x2^2 + x3^2 + . . . + xn^2)/n
The second moment of the values 1, 3, 6, 10 is (1^2 + 3^2 + 6^2 + 10^2)/4 = (1 + 9 + 36 + 100)/4 = 146/4 = 36.5.
3. Third Moment
For the third moment we set s = 3. The formula for the third moment is:
(x1^3 + x2^3 + x3^3 + . . . + xn^3)/n
The third moment of the values 1, 3, 6, 10 is (1^3 + 3^3 + 6^3 + 10^3)/4 = (1 + 27 + 216 + 1000)/4 = 1244/4 = 311.
Higher moments can be calculated in a similar way: just replace s in the above formula with the number denoting the desired moment.
4. Fourth Moment
For the fourth moment we set s = 4. The formula for the fourth moment is:
(x1^4 + x2^4 + x3^4 + . . . + xn^4)/n
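The worked examples above can be reproduced in one loop (a quick sketch using the same values 1, 3, 6, 10 from the examples):

```python
data = [1, 3, 6, 10]
n = len(data)
for s in range(1, 5):
    # s-th raw moment: mean of the s-th powers of the values
    moment = sum(x ** s for x in data) / n
    print(f"moment {s}: {moment}")
# moment 1: 5.0
# moment 2: 36.5
# moment 3: 311.0
# moment 4: 2844.5
```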
Moments about the Mean
A related idea is that of the s-th moment about the mean. In this calculation we perform the following steps:
- First calculate the mean of the values.
- Next, subtract this mean from each value.
- Then raise each of these differences to the s-th power.
- Now add the numbers from step #3 together.
- Finally, divide this sum by the number of values we started with.
The formula for the s-th moment about the mean m of the values x1, x2, x3, . . . , xn is given by:
m_s = ((x1 − m)^s + (x2 − m)^s + (x3 − m)^s + . . . + (xn − m)^s)/n
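The five steps above can be sketched as a small function (the function name is our own choice; it mirrors the formula for the moment about the mean):

```python
def central_moment(data, s):
    n = len(data)
    m = sum(data) / n                     # step 1: the mean
    diffs = [(x - m) ** s for x in data]  # steps 2-3: subtract, raise to s
    return sum(diffs) / n                 # steps 4-5: sum, divide by n

print(central_moment([1, 3, 6, 10], 1))  # 0.0 (always zero)
print(central_moment([1, 3, 6, 10], 2))  # 11.5 (the variance)
```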
First Moment about the Mean
The first moment about the mean is always equal to zero, no matter what the data set is that we are working with. This can be seen in the following:
m1 = ((x1 − m) + (x2 − m) + (x3 − m) + . . . + (xn − m))/n = ((x1 + x2 + x3 + . . . + xn) − nm)/n = m − m = 0.
Second Moment about the Mean
The second moment about the mean is obtained from the above formula by setting s = 2:
m2 = ((x1 - m)2 + (x2 - m)2 + (x3 - m)2 + . . . + (xn - m)2)/n
This formula matches the variance of the data set (the version of the sample variance that divides by n rather than n − 1).
For example, consider the set 1, 3, 6, 10. We have already calculated the mean of this set to be 5. Subtract this from each of the data values to obtain differences of:
1 – 5 = -4
3 – 5 = -2
6 – 5 = 1
10 – 5 = 5
We square each of these values and add them together: (−4)^2 + (−2)^2 + 1^2 + 5^2 = 16 + 4 + 1 + 25 = 46. Finally, divide this number by the number of data points: 46/4 = 11.5.
Applications of Moments
As mentioned above, the first moment is the mean and the second moment about the mean is the sample variance. Pearson introduced the use of the third moment about the mean in calculating skewness and the fourth moment about the mean in the calculation of kurtosis.
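As a sketch of those applications, the third and fourth central moments of the running example give Pearson's moment coefficients of skewness (µ3/σ³) and kurtosis (µ4/σ⁴); the variable names below are illustrative:

```python
data = [1, 3, 6, 10]
n = len(data)
m = sum(data) / n

mu2 = sum((x - m) ** 2 for x in data) / n  # variance (second central moment)
mu3 = sum((x - m) ** 3 for x in data) / n  # third central moment
mu4 = sum((x - m) ** 4 for x in data) / n  # fourth central moment

skewness = mu3 / mu2 ** 1.5  # mu3 / sigma^3
kurtosis = mu4 / mu2 ** 2    # mu4 / sigma^4 (equals 3 for a normal law)
print(skewness, kurtosis)
```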
Uses of Moments In Statistics-
The central question in statistics is: given a set of data, how do we recover the random process that produced it (that is, the probability law of the population)? This question is extremely difficult in general, and in the absence of strong assumptions on the underlying random process you really can't get very far (those who work in nonparametric statistics may disagree with me on this). A natural way to approach the problem is to look for simple objects that do identify the population distribution if we make some reasonable assumptions.
The question then becomes what type of objects should we search for. The best arguments I know about why we should look at the Laplace (or Fourier; I'll show you what this is in a second if you don't know) transform of the probability measure are a bit complicated, but naively we can draw a good heuristic from elementary calculus: given all the derivatives of an analytic function evaluated at zero we know everything there is to know about the function through its Taylor series.
Suppose for a moment that the function f(t) = E[e^{tX}] exists and is well behaved in a neighborhood of zero. It is a theorem that this function (when it exists and behaves nicely) uniquely identifies the probability law of the random variable X. If we Taylor-expand what is inside the expectation, this becomes a power series in the moments of X:
E[e^{tX}] = ∑ t^n E[X^n]/n! (summing over n ≥ 0),
and so to completely identify the law of X we just need to know the population moments. In effect we reduce the question "identify the population law of X" to the question "identify the population moments of X".
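That power-series claim can be checked numerically: the sketch below compares the empirical E[e^{tX}] of a small data set with its truncated power series in the sample moments (the data set, t, and truncation order are illustrative choices):

```python
import math

data = [1, 3, 6, 10]
n = len(data)
t = 0.1

# Empirical E[e^{tX}] computed directly...
direct = sum(math.exp(t * x) for x in data) / n

# ...and via the power series in the raw moments, truncated at order N.
N = 12
moments = [sum(x ** k for x in data) / n for k in range(N + 1)]
series = sum(t ** k * moments[k] / math.factorial(k) for k in range(N + 1))

print(round(direct, 6), round(series, 6))  # the two agree closely
```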
It turns out that population moments are extremely well estimated by sample moments when they exist, and you can even get a good feel for how far off from the true moments you can be under some often realistic assumptions. Of course we can never get infinitely many moments with any degree of accuracy from a sample, so in practice we would do another round of approximation, but that is the general idea. For nice random variables, moments are sufficient to estimate the population law.
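A quick simulation illustrates the point: sample raw moments of draws from an exponential(1) distribution settle near the population moments E[X^n] = n! (a standard fact for that distribution; the sample size and seed are arbitrary choices):

```python
import random

random.seed(0)  # make the sketch reproducible
sample = [random.expovariate(1.0) for _ in range(100_000)]

for k, true in [(1, 1), (2, 2), (3, 6)]:  # E[X^k] = k! for exponential(1)
    est = sum(x ** k for x in sample) / len(sample)
    print(k, round(est, 2), true)
```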
I should mention that what I have said above is all heuristic and doesn't work in many interesting modern examples. In truth, I think the right answer is that we don't strictly need moments, because for many relevant applications (particularly in economics) it seems unlikely that all moments even exist. The catch is that when you drop moment assumptions you lose an enormous amount of information and power: without at least two finite moments, the Central Limit Theorem fails, and with it go most of the elementary statistical tests. If you do not want to work with moments, there is a whole theory of nonparametric statistics that makes essentially no assumptions on the random process.