
use of the criterion that the cause must be equal to the effect. Whenever one finds two quantities correlated he may properly proceed to test the hypotheses that one causes the other in part and that both are due in part to some common cause.

The point of view of this long chapter may be summed up in a few short practical precepts. They are:

Think what you are relating, and remember that any relationship is measured by a series of ratios.

If the measures are absolute amounts, bear in mind the significance of the zero points from which they are measured.

If the measures are deviations from some central tendency, bear in mind the nature of the group whose central tendency it is.

Keep before you always the total series of ratios found.

Do not be satisfied with crude means of measuring any presumably rectilinear relationship. The Pearson coefficient requires not much more time and is, for both exactness and convenience, far superior.
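The Pearson coefficient recommended in the last precept may be sketched in modern terms. The following fragment (an illustration in Python, not the hand-computation routine of the text) computes the product-moment coefficient for two paired series of measures:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den

# Two perfectly linear series correlate at exactly 1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # → 1.0
```

The coefficient is the sum of the products of paired deviations, divided by the geometric mean of the two sums of squared deviations; it is this ratio that the text holds superior to cruder measures of rectilinear relationship.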

PROBLEMS.

29. Calculate the relationship between changes in pauperism and changes in out-relief from the following data: *

PERCENTAGE RATIOS OF PAUPERISM.

15-25 25-35 35-45 45-55 55-65 65-75 75-85 85-95 95-105 105-115 115-125

[Body of the correlation table illegible in the scan.]

From an article by G. Udny Yule, in the Journal of the Royal Statistical Society, Vol. 62, p. 281.

Each figure in the table represents the number of cases of the relationship denoted by the figure above it in the horizontal scale taken with the figure opposite it in the vertical scale. Thus the second column reads: 'Of districts having a change of 25-35 in pauperism, one had a change of 25-35 in out-relief ratio, three had changes of 35-45 in out-relief ratio, and two had changes of 45-55.'


THE RELIABILITY OF MEASURES.

WHEN from a limited number of measurements of an individual fact, say of A's monthly expenses or B's ability in perception, we calculate its average, the result is not, except by chance, the true average. For, obviously, one more measurement will, unless it happens to coincide with the average obtained, change it. For instance, the first 30 measures of H's ability in reaction time gave the average .1405; the next seven measures being taken into account, the average became .1400; with the next seven it became .1406; with the next seven, .1406+. By the true average we mean the average that would come from all the possible tests of the trait in question. The actual average obtained from a limited finite number of these measures is, except by chance, only an approximation toward the true average. So also with the accuracy of the measure of variability obtained. The true variability is that manifested in the entire series of measurements of the trait; the actually obtained variability is an approximation toward it. The true average and the true variability of a group mean similarly the measures obtained from a study of all the members of the group.
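The shifting of the obtained average with each new batch of measures, as in the reaction-time example above, may be exhibited by a short computation (the figures below are hypothetical, not H's actual measures):

```python
def running_averages(measures):
    """Average of the first k measures, for each k from 1 to len(measures)."""
    out, total = [], 0.0
    for k, m in enumerate(measures, start=1):
        total += m
        out.append(total / k)
    return out

# Hypothetical reaction times, in seconds; each added measure
# nudges the obtained average toward (or away from) the true one.
times = [0.14, 0.15, 0.13, 0.14, 0.16, 0.14]
print(running_averages(times))
```

Unless a new measure happens to coincide with the average already obtained, the average must move; only the average over all possible measures would stand still.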

It is necessary, then, to know how many trials of an individual, how many members of a group, must be measured, to obtain as accurate knowledge as we need. Or, to speak more properly, it is necessary to know how close to the true measure the result obtained from a certain finite number of measures will be.

It is clear that the true average of any set of measures is the average calculated from all of them. If the average we actually obtain is calculated from samples chosen at random, it will probably diverge somewhat from the average calculated from all. So also with obtained and true measures of total distribution, variability, of difference and of relationship. We measure the unreliability of any obtained measure by its probable divergence from the true measure.

It is clear also that the divergence of any measure due to a limited number of measures from the corresponding measure due to the entire series, will vary according to what particular samples we

hit upon, and that if the samples are taken at random this variation in the amount of divergence will follow the laws of probability. For these laws, based on the algebraic law expressing the number of combinations of n things taken r at a time, will account for the difference between the constitution of a total series and the constitution of any group of things chosen at random from it, and consequently for the differences between any two measures due respectively to these two constitutions.
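The algebraic law mentioned, the number of combinations of n things taken r at a time, can be sketched thus, together with the way it yields the probabilities of random sampling (the ten-trial illustration is an assumption for the sake of the example):

```python
import math

def combinations(n, r):
    """Number of ways to choose r things from n: n! / (r! (n-r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# With 10 independent equally-likely two-way trials, each possible
# count k of one outcome occurs with probability C(10, k) / 2**10.
n = 10
probs = [combinations(n, k) / 2 ** n for k in range(n + 1)]

# The chance that all ten trials fall the same one way is 1 in 1,024.
print(probs[n])  # → 0.0009765625
```

It is from such combination counts that the expected difference between a total series and a random sample of it is computed.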

We have, consequently, to find the distribution of a divergence (of obtained from true or of true from obtained) and know beforehand, in cases of random sampling, that it will be of the type of the probability surfaces given in Figs. 12 and 49, will be symmetrical (since the true is as likely to be greater as to be less than the obtained) with its mode at 0 (since all that we do know about the true is that it is more likely to be the obtained measure than to be any other one measure). What we need to know is its form and variability, to know, that is, how often we may expect a divergence of .01, how often one of .02, how often one of .03, etc. Suppose our obtained measure to be 10.4 and the distribution of the probable divergence of its corresponding true measure from it to be known to be as follows:

[Table of the frequencies of each amount of divergence; the body of the table is illegible in the scan.]

We can say: 'The true measure will not rise above 11.3 (10.4 + .9) in more than one case out of 1,024,' or, 'The chances are over 1,000 to 1 against the measure being over 11.3,' or, 'The chances are nearly 99 to 1 against the true measure being over 11.1,' or, 'The chances are about 8 to 1 against the true measure differing from 10.4 either above or below by more than .5.'*

* It may appear strange to talk about the true measure, which is a fixed value, 'rising above' or 'being over,' but if the reader will bear in mind that we do not know just where it is fixed, but do know the probability of its being at this or that point, he will not misunderstand the terms used. They could not well be avoided without much circumlocution.

If the form of the distribution of the divergence were known, its variability would be the only measure needed. The form will always be fairly near to the normal surface of frequency and it is customary to disregard the very slight error involved and assume the form to be normal.

If we know the variability of the divergence, the probable frequency of any divergence or of divergences less than or greater than any given amount can be calculated from the table of frequencies of the normal probability surface. Conversely, the table will tell us the amount of divergence which will be exceeded (or not exceeded) by any given per cent. of comparisons of true and obtained. Illustrations of the use of the table will be given in Chapter XI.
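The use of the normal table described here can be sketched in modern form. Given the variability (standard deviation) of the divergence, the chance of a divergence beyond any given amount follows from the normal curve; the error function of the Python standard library serves in place of the printed table (the figures chosen are assumptions for illustration):

```python
import math

def prob_divergence_exceeds(amount, sd):
    """Chance that a normally distributed divergence exceeds the given
    amount in either direction, given its standard deviation."""
    # P(|d| > amount) = 1 - erf(amount / (sd * sqrt(2)))
    return 1.0 - math.erf(amount / (sd * math.sqrt(2.0)))

# If the variability of the divergence is .3, a divergence of more than
# .6 either way (two standard deviations) is expected in about 4.6%
# of cases -- odds of roughly 21 to 1 against.
print(round(prob_divergence_exceeds(0.6, 0.3), 3))  # → 0.046
```

Read the other way, the same table gives the amount of divergence that a stated per cent. of comparisons of true and obtained will exceed.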

The problem of determining the reliability of any measure due to a limited series of samples is, then, to determine the variability of the divergence of the true from the obtained measure (Mtrue − Mobt.).

It is clear that the more nearly the number of samples taken approaches the number of things they represent, the closer the obtained measure will, in general, be to the true measure, and the less will be the range of divergence.

It is clear that the less the variability amongst the individual samples, the less will be the divergence of the obtained from the true measure of central tendency. For instance, if men range from 4 to 7 feet in height, averaging 5 feet 8 inches, we can not possibly get an average more than 1 foot 8 inches wrong, while if they range from 2 to 10 feet, we may make an error of 3 feet 8 inches. The same holds true for the divergence of obtained from true variability.
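The bound argued from the men's heights may be verified by a trivial computation: since every obtained average must lie within the range of the measures themselves, the extreme possible errors in either direction are simply the distances from the true average to the ends of the range (5 ft. 8 in. is written as a decimal below):

```python
def extreme_errors(low, high, true_average):
    """Largest possible error of an obtained average, in each direction:
    below (true_average - low) and above (high - true_average)."""
    return (true_average - low, high - true_average)

five_eight = 5 + 8 / 12  # 5 ft. 8 in. in feet

# Narrow range, 4 to 7 ft.: error at most 1 ft. 8 in. below.
print(extreme_errors(4, 7, five_eight))
# Wide range, 2 to 10 ft.: an error of 3 ft. 8 in. below is possible.
print(extreme_errors(2, 10, five_eight))
```

The narrower the spread of the individual measures, the tighter the bound on how far any obtained average can stray.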

Upon these facts are based the formulæ for the calculation of the variability of the divergence of true measure from that obtained from any given series of samples. These formulæ take as the definition of 'true measure,' the measure which would be found if an infinite number of cases were studied.
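The chief of these formulæ, in its familiar modern form (the text states the formulæ themselves only later; this rendering is a sketch under that assumption), gives the variability of the divergence of the obtained average as the variability of the single measures divided by the square root of their number:

```python
import math

def standard_error_of_mean(measures):
    """Variability (standard deviation) of the divergence of the obtained
    average from the true average: sigma / sqrt(n).  Sketch only; sigma is
    here estimated from the sample itself."""
    n = len(measures)
    mean = sum(measures) / n
    sigma = math.sqrt(sum((m - mean) ** 2 for m in measures) / n)
    return sigma / math.sqrt(n)

# Eight measures with sigma = 2 give a variability of 2 / sqrt(8).
print(standard_error_of_mean([2, 4, 4, 4, 5, 5, 7, 9]))  # → 0.7071...
```

Both facts of the preceding paragraphs appear in the formula: the divergence shrinks as the number of samples grows, and as the variability among the single measures lessens.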

The formulæ to be given here for the reliability of central tendencies and variabilities are those in common use. They are absolutely exact only for a case where the distribution of the trait itself is that of the normal probability surface with extremes at minus infinity and plus infinity, and so are never absolutely exact for any real case. They are very inexact, except for a trait showing a clear central tendency with decreasing frequencies on either side. This,
