"Should any further irrelevant correlation, say rpw, be admitted,

[formula illegible in the scanned source]

"Distortion occurs whenever the two series to be compared together both correspond to any appreciable degree with the same third irrelevant variant. In this case, the relation is given by

$$r'_{pq} = \frac{r_{pq} - r_{pv}\, r_{qv}}{\sqrt{(1 - r_{pv}^{2})(1 - r_{qv}^{2})}}$$

where $r_{pq}$ = the apparent correlation between p and q, the two characteristics to be compared, $r_{pv}$ and $r_{qv}$ = the correlation of p and q with some third and perturbing variable v, and $r'_{pq}$ = the required real correlation between p and q, after compensating for the illegitimate influence of v." Should the common correspondence with v have been irrelevantly excluded instead of admitted, the relation becomes

$$r'_{pq} = r_{pq}\sqrt{(1 - r_{pv}^{2})(1 - r_{qv}^{2})} + r_{pv}\, r_{qv}$$
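A short numerical sketch may make the correction for distortion concrete. It assumes the correction takes the familiar form of a first-order partial correlation, (r_pq - r_pv r_qv) / sqrt((1 - r_pv^2)(1 - r_qv^2)), and that the reverse case is its algebraic inverse. The function names and the three coefficients below are hypothetical and chosen only for illustration.

```python
import math


def remove_distortion(r_pq, r_pv, r_qv):
    """Correlation of p and q after compensating for a third variable v.

    Assumed form: the first-order partial correlation.
    """
    return (r_pq - r_pv * r_qv) / math.sqrt((1 - r_pv**2) * (1 - r_qv**2))


def restore_distortion(r_pq, r_pv, r_qv):
    """Algebraic inverse: the correlation with v admitted,
    given the correlation obtained with v excluded."""
    return r_pq * math.sqrt((1 - r_pv**2) * (1 - r_qv**2)) + r_pv * r_qv


# Hypothetical coefficients, not taken from the text.
apparent = 0.50
corrected = remove_distortion(apparent, r_pv=0.60, r_qv=0.50)
recovered = restore_distortion(corrected, r_pv=0.60, r_qv=0.50)

print(round(corrected, 3))   # 0.289
print(round(recovered, 3))   # 0.5
```

With these illustrative values, an apparent correlation of .50 shrinks to roughly .29 once the common correspondence with v is allowed for, and applying the inverse recovers the original .50.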

§ 42. The Dependence of the Meaning of a Coefficient of Correlation upon the Values that Are Paired

The facts to be correlated in the mental and social sciences may be: (1) the varying conditions of a trait in an individual (to be correlated with corresponding conditions in him of some other trait), or (2) the varying conditions of a trait found in different individuals of a group (to be correlated with the conditions found in some other trait in the same individuals), or (3) the varying central tendencies of a trait found in different subgroups of a larger group or collection of groups (to be correlated with the central tendencies found in the case of some other trait in the same subgroups), or many other series of pairs.

For example, one may seek (Case 1) the correlation between the quickness of perception of an individual at various times and his quickness of movement at corresponding times. Or one may seek (Case 2) the correlation between the quickness of perception in general of Jones, Smith, Brown, etc., and the quickness of movement possessed in general by the same individuals. Or (Case 3) one may seek the correlation between the general quickness in perception of races and their quickness of movement.

It should be noted that the differences in the three cases have nothing to do with the mere number of individuals studied. The essential differences would remain if we used a million cases to determine the correlation of two traits within an individual, only a hundred thousand to determine the correlation among individuals and only ten thousand to determine it for races. The essential difference is in the questions to be solved. From them it follows also that in Case 1, if several individuals are studied, a number of pairs of measures for each individual will be used and the coefficient of correlation in each individual will be worked out separately. If the results from different individuals are then combined they will be combined as a group of facts according to the methods of Chapter III. In Case 2, on the contrary, a single pair of measures will represent the correlation in any one individual and these pairs will be combined according to the method of the present chapter. In Case 3 a single pair of figures will represent the correlation in each subgroup.

The problem of measurement itself is the same for the three cases, the difference being in the data used and the consequent meaning of the coefficient of correlation obtained. To any one of the following series of related pairs the mode of procedure discussed in this chapter is applicable.

RELATED BY IDENTITY OF CONDITIONS

Trait T and trait T1 in individual A under conditions C1
Trait T and trait T1 in individual A under conditions C2
Trait T and trait T1 in individual A under conditions C3

RELATED BY IDENTITY OF THE INDIVIDUAL

Trait T and trait T1 in group, ten-year-olds, in individual I1
Trait T and trait T1 in group, ten-year-olds, in individual I2
Trait T and trait T1 in group, ten-year-olds, in individual I3

RELATED BY IDENTITY OF THE SUB-GROUP

Trait T and trait T1 in group, all men, in sub-group Chinese
Trait T and trait T1 in group, all men, in sub-group Negroes
Trait T and trait T1 in group, all men, in sub-group Indians

It is perhaps needless to point out that the existence of a certain relation within an individual does not imply anything about the relation within a group of individuals, nor that again about the relation within a group of groups. Individuals may be happier when they are richer, but rich individuals amongst Americans may be no happier than poor individuals, and from neither fact could we infer that the American population would be happier or less happy than the Chinese or the Negro population.
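The point can be checked on a toy data set. The sketch below uses invented figures for three hypothetical individuals, each measured three times in trait T and trait T1. It computes the Pearson coefficient within each individual (Case 1) and then among the individuals' averages (Case 2). The helper `pearson` and all the numbers are assumptions made for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson coefficient from deviations about the two averages."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented measures: (trait T, trait T1) on three occasions per individual.
records = {
    "A": ([1, 2, 3], [7, 8, 9]),
    "B": ([4, 5, 6], [4, 5, 6]),
    "C": ([7, 8, 9], [1, 2, 3]),
}

# Case 1: one coefficient per individual, from that individual's own pairs.
within = {name: pearson(t, t1) for name, (t, t1) in records.items()}

# Case 2: a single pair (the two averages) represents each individual.
avg_t = [sum(t) / len(t) for t, _ in records.values()]
avg_t1 = [sum(t1) / len(t1) for _, t1 in records.values()]
among = pearson(avg_t, avg_t1)

print(within)
print(among)
```

In these invented figures the two traits rise and fall together within every individual, yet the individual who stands highest in T stands lowest in T1, so the coefficient among individuals is of the opposite sign.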

For similar reasons the nature and amount of a correlation will depend upon the group selected. If, for instance, the correlation between knowledge of history and knowledge of English literature is measured in the group, high-school graduates, by using the deviations of individuals from the high-school graduates' averages in the two traits, the correlation will be less close than if we use the group, all people. The correlation between height and weight will be less close if measured in the group, 18-year-olds, than if measured in all children under twenty. Any relation so calculated should always be thought of as the correlation of deviations from the assigned points of reference in the two traits in the case of the individuals of the group in question. To assume that the correlation found in any given group holds good also for a different group is valid only if the given group is a random selection from the other group.
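The dependence of the coefficient on the group selected can be sketched in the same way. The figures below are invented: ten hypothetical individuals form the wide group, and the four whose first trait lies between 5 and 8 form the narrow group. The helper `pearson` and the data are illustrative assumptions, not measurements from the text.

```python
import math


def pearson(xs, ys):
    """Pearson coefficient from deviations about the group's own averages."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented pairs for the wide group.
trait_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
trait_y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

# The narrow group keeps only individuals with trait_x from 5 to 8.
narrow = [(x, y) for x, y in zip(trait_x, trait_y) if 5 <= x <= 8]
narrow_x = [x for x, _ in narrow]
narrow_y = [y for _, y in narrow]

r_wide = pearson(trait_x, trait_y)
r_narrow = pearson(narrow_x, narrow_y)

print(round(r_wide, 3), round(r_narrow, 3))
```

The same individuals yield a closer correlation when their deviations are taken from the averages of the wide group than when they are taken from the averages of the narrow one, because the narrow group leaves little variation in either trait for the relation to show itself in.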

PROBLEMS

41. Arrange a correlation-table, pairing the two series given below so that r when calculated will be approximately .8.

42. Arrange a second correlation table, pairing them so that r will be approximately .5.

43. Pair them so that r will be approximately .2.

Series I. 6, 5, 5, -4, -4, -4, -3, -3, -3, -2, -2, -2, -2, -1, -1, -1, -1, +1, +1, +3, +3, +3, +4, +4,

Series II. 5, 4, 4, 3, 3, 3, 2, -2, -2, -2, -2, -2, -1, -1, -1, -1,

+1, +1, +1, +1, +1, +2, +2, +2,

1, 0, 0, 0, 0, +1, +1, +2, +2, +3, +3, +3, +3, +4, +4, +5.

44. Compute r by all possible methods for the following series of pairs, using 115 and 80 as C.T.'s.

[table of paired values illegible in the scanned source]

45. Compute the Pearson coefficients for A with B, A with C, and A with D in Table 31 (page 158).

46. Compute v1, v2, the mid x/y ratio, and the mid y/x ratio and r by the formula r = √[(mid x/y)v1][(mid y/x)] for the data of Table 42.

To save time, treat ratios with zero in the denominator as extreme negatives when the numerator is negative, and as extreme positives when the numerator is positive. In practice, the midratio method would not be used without distributing the zero measures. The principle of distribution would be that used for the "percentage of like-signed pairs" method.

CHAPTER XII

THE RELIABILITY OF MEASURES

§ 43. Dependence upon the Number of Separate Measures of the Fact in Question and upon their Variability

WHEN from a limited number of measurements of a fact, say of A's monthly expenses or B's ability in perception, we calculate its average, the result is not, except by chance, the true average. For, obviously, one more measurement will, unless it happens to coincide with the average obtained, change it. For instance, the first 30 measures of H's ability in reaction time gave the average .1405; the next seven measures being taken into account, the average became .1400; with the next seven it became .1406; with the next seven, .1406+. By the true average we mean the average that would come from all possible measures of the fact in question. The actual average obtained from a limited finite number of these measures is, except by chance, only an approximation toward the true average. So also with the accuracy of the measure of variability obtained. The true variability is that manifested in the entire series of measurements of the trait; the actually obtained variability is an approximation toward it. The true average and the true variability of a group mean similarly the measures obtained from a study of all the members of the group.
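The way an obtained average shifts as further measures are taken into account can be sketched with a few invented reaction times. The function name `running_averages`, the eight measures, and the checkpoints are all hypothetical; they are not the measures of H reported above.

```python
def running_averages(measures, checkpoints):
    """Average of the first n measures, for each n in checkpoints."""
    return [sum(measures[:n]) / n for n in checkpoints]


# Invented reaction times in seconds.
times = [0.14, 0.15, 0.13, 0.14, 0.16, 0.13, 0.14, 0.15]

# The average after 2, 4, 6 and all 8 measures.
averages = running_averages(times, [2, 4, 6, 8])

for n, avg in zip([2, 4, 6, 8], averages):
    print(n, round(avg, 4))
```

Each added pair of measures moves the average a little, and the movements tend to grow smaller as the number of measures behind the average increases.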

It is necessary, then, to know how many trials of an individual, how many members of a group, must be measured, to obtain as accurate knowledge as we need. Or, to speak more properly, it is necessary to know how close to the true measure the result obtained from a certain finite number of measures will be.
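A small simulation gives a feeling for how closeness to the true measure depends on the number of measures. It is a sketch under stated assumptions: the "fact" is an invented set of 1,000 values whose true average is known, samples are drawn at random without replacement, and the function name `typical_divergence`, the sample sizes, and the seed are arbitrary choices.

```python
import random


def mean(values):
    return sum(values) / len(values)


def typical_divergence(population, sample_size, repetitions, rng):
    """Average absolute gap between a sample average and the true average."""
    true_avg = mean(population)
    gaps = [
        abs(mean(rng.sample(population, sample_size)) - true_avg)
        for _ in range(repetitions)
    ]
    return mean(gaps)


rng = random.Random(0)              # fixed seed so the run is repeatable
population = list(range(1, 1001))   # invented measures; true average 500.5

small = typical_divergence(population, 10, 200, rng)
large = typical_divergence(population, 100, 200, rng)

print(round(small, 1), round(large, 1))
```

Averages from 100 measures lie, on the whole, much nearer the true average than averages from 10, though any single sample may still diverge by chance.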

It is clear that the true average of any set of measures is the average calculated from all of them. If the average we actually obtain is calculated from samples chosen at random, it will probably diverge somewhat from the average calculated from all. So also with obtained and true measures of total distribution, varia
