Sunday, April 12, 2020

The Validity and Utility of Selection Methods in Personnel Psychology

Psychological Bulletin, 1998, Vol. 124, No. 2, 262-274. Copyright 1998 by the American Psychological Association, Inc.

The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 85 Years of Research Findings

Frank L. Schmidt, University of Iowa
John E. Hunter, Michigan State University

This article summarizes the practical and theoretical implications of 85 years of research in personnel selection. On the basis of meta-analytic findings, this article presents the validity of 19 selection procedures for predicting job performance and training performance and the validity of paired combinations of general mental ability (GMA) and the 18 other selection procedures. Overall, the 3 combinations with the highest multivariate validity and utility for job performance were GMA plus a work sample test (mean validity of .63), GMA plus an integrity test (mean validity of .65), and GMA plus a structured interview (mean validity of .63). A further advantage of the latter 2 combinations is that they can be used for both entry level selection and selection of experienced employees. The practical utility implications of these summary findings are substantial. The implications of these research findings for the development of theories of job performance are discussed.

From the point of view of practical value, the most important property of a personnel assessment method is predictive validity: the ability to predict future job performance, job-related learning (such as amount of learning in training and development programs), and other criteria. The predictive validity coefficient is directly proportional to the practical economic value (utility) of the assessment method (Brogden, 1949; Schmidt, Hunter, McKenzie, & Muldrow, 1979). Use of hiring methods with increased predictive validity leads to substantial increases in employee performance as measured in percentage increases in output, increased monetary value of output, and increased learning of job-related skills (Hunter, Schmidt, & Judiesch, 1990).

Today, the validity of different personnel measures can be determined with the aid of 85 years of research. The most well-known conclusion from this research is that, for hiring employees without previous experience in the job, the most valid predictor of future performance and learning is general mental ability (GMA; i.e., intelligence or general cognitive ability; Hunter & Hunter, 1984; Ree & Earles, 1992). GMA can be measured using commercially available tests. However, many other measures can also contribute to the overall validity of the selection process. These include, for example, measures of conscientiousness and personal integrity, structured employment interviews, and (for experienced workers) job knowledge and work sample tests.
On the basis of meta-analytic findings, this article examines and summarizes what 85 years of research in personnel psychology has revealed about the validity of measures of 19 different selection methods that can be used in making decisions about hiring, training, and developmental assignments. In this sense, this article is an expansion and updating of Hunter and Hunter (1984). In addition, this article examines how well certain combinations of these methods work. These 19 procedures do not all work equally well; the research evidence indicates that some work very well and some work very poorly. Measures of GMA work very well, for example, and graphology does not work at all. The cumulative findings show that the research knowledge now available makes it possible for employers today to substantially increase the productivity, output, and learning ability of their workforces by using procedures that work well and by avoiding those that do not. Finally, we look at the implications of these research findings for the development of theories of job performance.

Author note: Frank L. Schmidt, Department of Management and Organization, University of Iowa; John E. Hunter, Department of Psychology, Michigan State University. An earlier version of this article was presented to Korean Human Resource Managers in Seoul, South Korea, June 11, 1996. The presentation was sponsored by the Tong Yang Company. We would like to thank President Wang-Ha Cho of Tong Yang for his support and efforts in this connection. We would also like to thank Deniz Ones and Kuh Yoon for their assistance in preparing Tables 1 and 2 and Gershon Ben-Shakhar for his comments on research on graphology. Correspondence concerning this article should be addressed to Frank L. Schmidt, Department of Management and Organization, College of Business, University of Iowa, Iowa City, Iowa 52240. Electronic mail may be sent to fra[...]edu.

Determinants of Practical Value (Utility) of Selection Methods

The validity of a hiring method is a direct determinant of its practical value, but not the only determinant. Another direct determinant is the variability of job performance. At one extreme, if variability were zero, then all applicants would have exactly the same level of later job performance if hired. In this case, the practical value or utility of all selection procedures would be zero. In such a hypothetical case, it does not matter who is hired, because all workers are the same. At the other extreme, if performance variability is very large, it then becomes important to hire the best performing applicants and the practical utility of valid selection methods is very large. As it happens, this extreme case appears to be the reality for most jobs.

Research over the last 15 years has shown that the variability of performance and output among (incumbent) workers is very large and that it would be even larger if all job applicants were hired or if job applicants were selected randomly from among those that apply (cf. Hunter et al., 1990; Schmidt & Hunter, 1983; Schmidt et al., 1979). This latter variability is called the applicant pool variability, and in hiring this is the variability that operates to determine practical value. This is because one is selecting new employees from the applicant pool, not from among those already on the job in question.
The variability of employee job performance can be measured in a number of ways, but two scales have typically been used: dollar value of output and output as a percentage of mean output. The standard deviation across individuals of the dollar value of output (called SD_y) has been found to be at minimum 40% of the mean salary of the job (Schmidt & Hunter, 1983; Schmidt et al., 1979; Schmidt, Mack, & Hunter, 1984). The 40% figure is a lower bound value; actual values are typically considerably higher. Thus, if the average salary for a job is $40,000, then SD_y is at least $16,000. If performance has a normal distribution, then workers at the 84th percentile produce $16,000 more per year than average workers (i.e., those at the 50th percentile). And the difference between workers at the 16th percentile (below average workers) and those at the 84th percentile (superior workers) is twice that: $32,000 per year. Such differences are large enough to be important to the economic health of an organization.

Employee output can also be measured as a percentage of mean output; that is, each employee's output is divided by the output of workers at the 50th percentile and then multiplied by 100. Research shows that the standard deviation of output as a percentage of average output (called SD_p) varies by job level. For unskilled and semi-skilled jobs, the average SD_p figure is 19%. For skilled work, it is 32%, and for managerial and professional jobs, it is 48% (Hunter et al., 1990). These figures are averages based on all available studies that measured or counted the amount of output for different employees. If a superior worker is defined as one whose performance (output) is at the 84th percentile (that is, 1 SD above the mean), then a superior worker in a lower level job produces 19% more output than an average worker, a superior skilled worker produces 32% more output than the average skilled worker, and a superior manager or professional produces output 48% above the average for those jobs. These differences are large, and they indicate that the payoff from using valid hiring methods to predict later job performance is quite large.

Another determinant of the practical value of selection methods is the selection ratio: the proportion of applicants who are hired. At one extreme, if an organization must hire all who apply for the job, no hiring procedure has any practical value. At the other extreme, if the organization has the luxury of hiring only the top scoring 1%, the practical value of gains from selection per person hired will be extremely large. But few organizations can afford to reject 99% of all job applicants. Actual selection ratios are typically in the .30 to .70 range, a range that still produces substantial practical utility.

The actual formula for computing practical gains per person hired per year on the job is a three way product (Brogden, 1949; Schmidt et al., 1979):

ΔU/hire/year = Δr_xy × SD_y × Z̄_x (1)
(when performance is measured in dollar value)

ΔU/hire/year = Δr_xy × SD_p × Z̄_x (2)
(when performance is measured in percentage of average output)

In these equations, Δr_xy is the difference between the validity of the new (more valid) selection method and the validity of the old selection method. If the old selection method has no validity (that is, selection is random), then Δr_xy is the same as the validity of the new procedure; that is, Δr_xy = r_xy. Hence, relative to random selection, practical value (utility) is directly proportional to validity.
If the old procedure has some validity, then the utility gain is directly proportional to Δr_xy. Z̄_x is the average score on the employment procedure of those hired (in z-score form), as compared to the general applicant pool. The smaller the selection ratio, the higher this value will be.

The first equation expresses selection utility in dollars. For example, a typical final figure for a medium complexity job might be $18,000, meaning that increasing the validity of the hiring methods leads to an average increase in output per hire of $18,000 per year. To get the full value, one must of course multiply by the number of workers hired. If 100 are hired, then the increase would be (100)($18,000) = $1,800,000. Finally, one must consider the number of years these workers remain on the job, because the $18,000 per worker is realized each year that worker remains on the job. Of all these factors that affect the practical value, only validity is a characteristic of the personnel measure itself.

The second equation expresses the practical value in percentage of increase in output. For example, a typical figure is 9%, meaning that workers hired with the improved selection method will have on average 9% higher output. A 9% increase in labor productivity would typically be very important economically for the firm, and might make the difference between success and bankruptcy.

What we have presented here is not, of course, a comprehensive discussion of selection utility. Readers who would like more detail are referred to the research articles cited above and to Boudreau (1983a, 1983b, 1984), Cascio and Silbey (1979), Cronshaw and Alexander (1985), Hunter, Schmidt, and Coggin (1988), Hunter and Schmidt (1982a, 1982b), Schmidt and Hunter (1983), Schmidt, Hunter, Outerbridge, and Trattner (1986), Schmidt, Hunter, and Pearlman (1982), and Schmidt et al. (1984). Our purpose here is to make three important points: (a) the economic value of gains from improved hiring methods is typically quite large, (b) these gains are directly proportional to the size of the increase in validity when moving from the old to the new selection methods, and (c) no other characteristic of a personnel measure is as important as predictive validity. If one looks at the two equations above, one sees that practical value per person hired is a three way product. One of the three elements in that three way product is predictive validity. The other two, SD_y (or SD_p) and Z̄_x, are equally important, but they are characteristics of the job or the situation, not of the personnel measure.
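To make the mechanics of Equations 1 and 2 concrete, here is a minimal sketch (our illustration, not part of the original article; all input values are hypothetical). It computes Z̄_x for top-down hiring from a normally distributed applicant pool, using the standard result that the mean standard score of those above a cutoff is phi(z_cut)/SR, and then applies both equations.

```python
import math
from statistics import NormalDist

def mean_z_of_hires(selection_ratio: float) -> float:
    """Average standard score (Z-bar) of those hired under top-down
    selection from a normal applicant pool: phi(z_cut) / SR."""
    z_cut = NormalDist().inv_cdf(1.0 - selection_ratio)
    phi = math.exp(-z_cut ** 2 / 2) / math.sqrt(2 * math.pi)
    return phi / selection_ratio

# Hypothetical inputs, for illustration only.
delta_r = 0.51 - 0.18  # validity gain, e.g., a GMA test replacing years of experience
sd_y = 16_000          # dollar SD of output: 40% of a $40,000 mean salary
sd_p = 19              # percentage SD of output for a lower-level job
sr = 0.30              # hire the top 30% of applicants

z_bar = mean_z_of_hires(sr)            # about 1.16 when SR = .30
dollar_gain = delta_r * sd_y * z_bar   # Equation 1: dollars per hire per year
percent_gain = delta_r * sd_p * z_bar  # Equation 2: percentage-of-output points

print(f"Mean z of hires: {z_bar:.2f}")
print(f"Utility gain: ${dollar_gain:,.0f} per hire per year")
print(f"Output gain: {percent_gain:.1f}% per hire")
```

With these made-up inputs the gain is roughly $6,100 per hire per year; multiplying by the number of hires and their average tenure shows how figures on the scale of the article's $1,800,000 example arise.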
Validity of Personnel Assessment Methods: 85 Years of Research Findings

Research studies assessing the ability of personnel assessment methods to predict future job performance and future learning (e.g., in training programs) have been conducted since the first decade of the 20th century. However, as early as the 1920s it became apparent that different studies conducted on the same assessment procedure did not appear to agree in their results. Validity estimates for the same method and same job were quite different for different studies. During the 1930s and 1940s the belief developed that this state of affairs resulted from subtle differences between jobs that were difficult or impossible for job analysts and job analysis methodology to detect. That is, researchers concluded that the validity of a given procedure really was different in different settings for what appeared to be basically the same job, and that the conflicting findings in validity studies were just reflecting this fact of reality. This belief, called the theory of situational specificity, remained dominant in personnel psychology until the late 1970s, when it was discovered that most of the differences across studies were due to statistical and measurement artifacts and not to real differences in the jobs (Schmidt & Hunter, 1977; Schmidt, Hunter, Pearlman, & Shane, 1979). The largest of these artifacts was simple sampling error variation, caused by the use of small samples in the studies. (The number of employees per study was usually in the 40-70 range.)

This realization led to the development of quantitative techniques collectively called meta-analysis that could combine validity estimates across studies and correct for the effects of these statistical and measurement artifacts (Hunter & Schmidt, 1990; Hunter, Schmidt, & Jackson, 1982). Studies based on meta-analysis provided more accurate estimates of the average operational validity and showed that the level of real variability of validities was usually quite small and might in fact be zero (Schmidt, 1992; Schmidt et al., 1993). In fact, the findings indicated that the variability of validity was not only small or zero across settings for the same type of job, but was also small across different kinds of jobs (Hunter, 1980; Schmidt, Hunter, & Pearlman, 1980). These findings made it possible to select the most valid personnel measures for any job. They also made it possible to compare the validity of different personnel measures for jobs in general, as we do in this article.
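The core of the sampling-error argument can be shown in a few lines. The following sketch (ours; the five study results are invented) applies the basic Hunter-Schmidt decomposition: it compares the observed variance of validities across studies with the variance expected from sampling error alone, given the mean validity and the average study size.

```python
# Bare-bones sampling-error decomposition in the Hunter-Schmidt style.
# The five (validity, sample size) study results below are invented.
studies = [(0.18, 45), (0.35, 60), (0.05, 40), (0.42, 55), (0.22, 70)]

total_n = sum(n for _, n in studies)
r_bar = sum(r * n for r, n in studies) / total_n  # N-weighted mean validity

# Observed (N-weighted) variance of validities across studies.
var_obs = sum(n * (r - r_bar) ** 2 for r, n in studies) / total_n

# Variance expected from sampling error alone, given the mean r and the
# average study size: (1 - r_bar^2)^2 / (N_bar - 1).
n_bar = total_n / len(studies)
var_err = (1 - r_bar ** 2) ** 2 / (n_bar - 1)

# Whatever is left over is an upper bound on "real" situational variance.
var_residual = max(0.0, var_obs - var_err)

print(f"Mean validity: {r_bar:.3f}")
print(f"Observed variance: {var_obs:.4f}")
print(f"Expected from sampling error: {var_err:.4f}")
print(f"Residual (real) variance: {var_residual:.4f}")
```

With these invented numbers the observed spread (about .015) is smaller than what sampling error alone would produce (about .016), so the residual is zero: the apparent disagreement among the five "studies" is entirely a small-sample artifact, which is the pattern the meta-analytic findings described above revealed for real validity studies.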
Table 1 summarizes research findings for the prediction of performance on the job. The first column of numbers in Table 1 shows the estimated mean validity of 19 selection methods for predicting performance on the job, as revealed by meta-analyses conducted over the last 20 years. Performance on the job was typically measured using supervisory ratings of job performance, but production records, sales records, and other measures were also used. The sources and other information about these validity figures are given in the notes to Table 1. Many of the selection methods in Table 1 also predict job-related learning; that is, the acquisition of job knowledge with experience on the job, and the amount learned in training and development programs. However, the overall amount of research on the prediction of learning is less. For many of the procedures in Table 1, there is little research evidence on their ability to predict future job-related learning.

Table 2 summarizes available research findings for the prediction of performance in training programs. The first column in Table 2 shows the mean validity of 10 selection methods as revealed by available meta-analyses. In the vast majority of the studies included in these meta-analyses, performance in training was assessed using objective measures of amount learned on the job; trainer ratings of amount learned were used in about 5% of the studies. Unless otherwise noted in Tables 1 and 2, all validity estimates in Tables 1 and 2 are corrected for the downward bias due to measurement error in the measures of job performance and to range restriction on the selection method in incumbent samples relative to applicant populations. Observed validity estimates so corrected estimate operational validities of selection methods when used to hire from applicant pools. Operational validities are also referred to as true validities.

In the pantheon of 19 personnel measures in Table 1, GMA (also called general cognitive ability and general intelligence) occupies a special place, for several reasons. First, of all procedures that can be used for all jobs, whether entry level or advanced, it has the highest validity and lowest application cost. Work sample measures are slightly more valid but are much more costly and can be used only with applicants who already know the job or have been trained for the occupation or job. Structured employment interviews are more costly and, in some forms, contain job knowledge components and therefore are not suitable for inexperienced, entry level applicants. The assessment center and job tryout are both much more expensive and have less validity. Second, the research evidence for the validity of GMA measures for predicting job performance is stronger than that for any other method (Hunter, 1986; Hunter & Schmidt, 1996; Ree & Earles, 1992; Schmidt & Hunter, 1981). Literally thousands of studies have been conducted over the last nine decades. By contrast, only 89 validity studies of the structured interview have been conducted (McDaniel, Whetzel, Schmidt, & Maurer, 1994). Third, GMA has been shown to be the best available predictor of job-related learning. It is the best predictor of acquisition of job knowledge on the job (Schmidt & Hunter, 1992; Schmidt, Hunter, & Outerbridge, 1986) and of performance in job training programs (Hunter, 1986; Hunter & Hunter, 1984; Ree & Earles, 1992). Fourth, the theoretical foundation for GMA is stronger than for any other personnel measure. Theories of intelligence have been developed and tested by psychologists for over 90 years (Brody, 1992; Carroll, 1993; Jensen, 1998). As a result of this massive related research literature, the meaning of the construct of intelligence is much clearer than, for example, the meaning of what is measured by interviews or assessment centers (Brody, 1992; Hunter, 1986; Jensen, 1998).

The value of .51 in Table 1 for the validity of GMA is from a very large meta-analytic study conducted for the U.S. Department of Labor (Hunter, 1980; Hunter & Hunter, 1984). The database for this unique meta-analysis included over 32,000 employees in 515 widely diverse civilian jobs. This meta-analysis examined both performance on the job and performance in job training programs. This meta-analysis found that the validity of GMA for predicting job performance was .58 for professional-managerial jobs, .56 for high level complex technical jobs, .51 for medium complexity jobs, .40 for semi-skilled jobs, and .23 for completely unskilled jobs. The validity for the middle complexity level of jobs (.51), which includes 62% of all the jobs in the U.S. economy, is the value entered in Table 1.
This category includes skilled blue collar jobs and mid-level white collar jobs, such as upper level clerical and lower level administrative jobs. Hence, the conclusions in this article apply mainly to the middle 62% of jobs in the U.S. economy in terms of complexity. The validity of .51 is representative of findings for GMA measures in other meta-analyses (e.g., Pearlman et al., 1980) and it is a value that produces high practical utility.

As noted above, GMA is also an excellent predictor of job-related learning. It has been found to have high and essentially equal predictive validity for performance (amount learned) in job training programs for jobs at all job levels studied.

Table 1
Predictive Validity for Overall Job Performance of General Mental Ability (GMA) Scores Combined With a Second Predictor Using (Standardized) Multiple Regression

Personnel measure | Validity (r) | Multiple R | Gain in validity from adding supplement | % increase in validity | Beta: GMA | Beta: supplement
GMA tests [a] | .51 | -- | -- | -- | -- | --
Work sample tests [b] | .54 | .63 | .12 | 24% | .36 | .41
Integrity tests [c] | .41 | .65 | .14 | 27% | .51 | .41
Conscientiousness tests [d] | .31 | .60 | .09 | 18% | .51 | .31
Employment interviews (structured) [e] | .51 | .63 | .12 | 24% | .39 | .39
Employment interviews (unstructured) [f] | .38 | .55 | .04 | 8% | .43 | .22
Job knowledge tests [g] | .48 | .58 | .07 | 14% | .36 | .31
Job tryout procedure [h] | .44 | .58 | .07 | 14% | .40 | .29
Peer ratings [i] | .49 | .58 | .07 | 14% | .35 | .31
T & E behavioral consistency method [j] | .45 | .58 | .07 | 14% | .39 | .31
Reference checks [k] | .26 | .57 | .06 | 12% | .51 | .26
Job experience (years) [l] | .18 | .54 | .03 | 6% | .51 | .18
Biographical data measures [m] | .35 | .52 | .01 | 2% | .45 | .13
Assessment centers [n] | .37 | .53 | .02 | 4% | .43 | .15
T & E point method [o] | .11 | .52 | .01 | 2% | .51 | .11
Years of education [p] | .10 | .52 | .01 | 2% | .51 | .10
Interests [q] | .10 | .52 | .01 | 2% | .51 | .10
Graphology [r] | .02 | .51 | .00 | 0% | .51 | .02
Age [s] | -.01 | .51 | .00 | 0% | .51 | -.01

Note. T & E = training and experience. The percentage of increase in validity is also the percentage of increase in utility (practical value). All of the validities presented are based on the most current meta-analytic results for the various predictors. See Schmidt, Ones, and Hunter (1992) for an overview. All of the validities in this table are for the criterion of overall job performance. Unless otherwise noted, all validity estimates are corrected for the downward bias due to measurement error in the measure of job performance and range restriction on the predictor in incumbent samples relative to applicant populations. The correlations between GMA and other predictors are corrected for range restriction but not for measurement error in either measure (thus they are smaller than fully corrected mean values in the literature). These correlations represent observed score correlations between selection methods in applicant populations.

a. From Hunter (1980). The value used for the validity of GMA is the average validity of GMA for medium complexity jobs (covering more than 60% of all jobs in the United States). Validities are higher for more complex jobs and lower for less complex jobs, as described in the text.
b. From Hunter and Hunter (1984, Table 10). The correction for range restriction was not possible in these data. The correlation between work sample scores and ability scores is .38 (Schmidt, Hunter, & Outerbridge, 1986).
c, d. From Ones, Viswesvaran, and Schmidt (1993, Table 8). The figure of .41 is from predictive validity studies conducted on job applicants. The validity of .31 for conscientiousness measures is from Mount and Barrick (1995, Table 2). The correlation between integrity and ability is zero, as is the correlation between conscientiousness and ability (Ones, 1993; Ones et al., 1993).
e, f. From McDaniel, Whetzel, Schmidt, and Maurer (1994, Table 4). Values used are those from studies in which the job performance ratings were for research purposes only (not administrative ratings). The correlations between interview scores and ability scores are from Huffcutt, Roth, and McDaniel (1996, Table 3). The correlation for structured interviews is .30 and for unstructured interviews, .38.
g. From Hunter and Hunter (1984, Table 11). The correction for range restriction was not possible in these data. The correlation between job knowledge scores and GMA scores is .48 (Schmidt, Hunter, & Outerbridge, 1986).
h. From Hunter and Hunter (1984, Table 9). No correction for range restriction (if any) could be made. (Range restriction is unlikely with this selection method.) The correlation between job tryout ratings and ability scores is estimated at .38 (Schmidt, Hunter, & Outerbridge, 1986); that is, it was taken to be the same as that between job sample tests and ability. Use of the mean correlation between supervisory performance ratings and ability scores yields a similar value (.35, uncorrected for measurement error).
i. From Hunter and Hunter (1984, Table 10). No correction for range restriction (if any) could be made. The average fully corrected correlation between ability and peer ratings of job performance is approximately .55. If peer ratings are based on an average rating from 10 peers, the familiar Spearman-Brown formula indicates that the interrater reliability of peer ratings is approximately .91 (Viswesvaran, Ones, & Schmidt, 1996). Assuming a reliability of .90 for the ability measure, the correlation between ability scores and peer ratings is .55 × sqrt(.91 × .90) = .50.
j. From McDaniel, Schmidt, and Hunter (1988a). These calculations are based on an estimate of the correlation between T & E behavioral consistency and ability of .40. This estimate reflects the fact that the achievements measured by this procedure depend not only on personality and other noncognitive characteristics, but also on mental ability.
k. From Hunter and Hunter (1984, Table 9). No correction for range restriction (if any) was possible. In the absence of any data, the correlation between reference checks and ability was taken as .00. Assuming a larger correlation would lead to lower estimated incremental validity.
l. From Hunter (1980), McDaniel, Schmidt, and Hunter (1988b), and Hunter and Hunter (1984). In the only relevant meta-analysis, Schmidt, Hunter, and Outerbridge (1986, Table 5) found the correlation between job experience and ability to be .00. This value was used here.
m. The correlation between biodata scores and ability scores is .50 (Schmidt, 1988). Both the validity of .35 used here and the intercorrelation of .50 are based on the Supervisory Profile Record Biodata Scale (Rothstein, Schmidt, Erwin, Owens, & Sparks, 1990). (The validity for the Managerial Profile Record Biodata Scale in predicting managerial promotion and advancement is higher [.52; Carlson, Scullen, Schmidt, Rothstein, & Erwin, 1998]. However, rate of promotion is a measure different from overall performance on one's current job, and managers are less representative of the general working population than are first line supervisors.)
n. From Gaugler, Rosenthal, Thornton, and Bentson (1987, Table 8). The correlation between assessment center ratings and ability is estimated at .50 (Collins, 1998). It should be noted that most assessment centers use ability tests as part of the evaluation process; Gaugler et al. (1987) found that 74% of the 106 assessment centers they examined used a written test of intelligence (see their Table 4).
o. From McDaniel, Schmidt, and Hunter (1988a, Table 3). The calculations here are based on a zero correlation between the T & E point method and ability; the assumption of a positive correlation would at most lower the estimate of incremental validity from .01 to .00.
p. From Hunter and Hunter (1984, Table 9). For purposes of these calculations, we assumed a zero correlation between years of education and ability. The reader should remember that this is the correlation within the applicant pool of individuals who apply to get a particular job. In the general population, the correlation between education and ability is about .55. Even within applicant pools there is probably at least a small positive correlation; thus, our figure of .01 probably overestimates the incremental validity of years of education over general mental ability. Assuming even a small positive value for the correlation between education and ability would drive the validity increment of .01 toward .00.
q. From Hunter and Hunter (1984, Table 9). The general finding is that interests and ability are uncorrelated (Holland, 1986), and that was assumed to be the case here.
r. From Neter and Ben-Shakhar (1989), Ben-Shakhar (1989), Ben-Shakhar, Bar-Hillel, Bilu, Ben-Abba, and Flug (1986), and Bar-Hillel and Ben-Shakhar (1986). Graphology scores were assumed to be uncorrelated with mental ability.
s. From Hunter and Hunter (1984, Table 9). Age was assumed to be unrelated to ability within applicant pools.

Table 2
Predictive Validity for Overall Performance in Job Training Programs of General Mental Ability (GMA) Scores Combined With a Second Predictor Using (Standardized) Multiple Regression

Personnel measure | Validity (r) | Multiple R | Gain in validity from adding supplement | % increase in validity | Beta: GMA | Beta: supplement
GMA tests [a] | .56 | -- | -- | -- | -- | --
Integrity tests [b] | .38 | .67 | .11 | 20% | .56 | .38
Conscientiousness tests [c] | .30 | .65 | .09 | 16% | .56 | .30
Employment interviews (structured and unstructured) [d] | .35 | .59 | .03 | 5% | .50 | .19
Peer ratings [e] | .36 | .57 | .01 | 2% | .51 | .11
Reference checks [f] | .23 | .61 | .05 | 9% | .56 | .23
Job experience (years) [g] | .01 | .56 | .00 | 0% | .56 | .01
Biographical data measures [h] | .30 | .56 | .00 | 0% | .55 | .03
Years of education [i] | .20 | .60 | .04 | 7% | .56 | .20
Interests [j] | .18 | .59 | .03 | 5% | .56 | .18

Note. The percentage of increase in validity is also the percentage of increase in utility (practical value). All of the validities presented are based on the most current meta-analytic results reported for the various predictors. All of the validities in this table are for the criterion of overall performance in job training programs. Unless otherwise noted, all validity estimates are corrected for the downward bias due to measurement error in the measure of job performance and range restriction on the predictor in incumbent samples relative to applicant populations. All correlations between GMA and other predictors are corrected for range restriction but not for measurement error. These correlations represent observed score correlations between selection methods in applicant populations.

a. The validity of GMA is from Hunter and Hunter (1984, Table 2). It can also be found in Hunter (1980).
b. The validity of .38 for integrity tests is from Schmidt, Ones, and Viswesvaran (1994). Integrity tests and conscientiousness tests have been found to correlate zero with GMA (Ones, 1993; Ones, Viswesvaran, & Schmidt, 1993).
c. The validity of .30 for conscientiousness measures is from the meta-analysis presented by Mount and Barrick (1995, Table 2).
d. The validity of interviews is from McDaniel, Whetzel, Schmidt, and Maurer (1994, Table 5). McDaniel et al. reported values of .34 and .36 for structured and unstructured interviews, respectively. However, this small difference of .02 appears to be a result of second order sampling error (Hunter & Schmidt, 1990, Ch. 9). We therefore used the average value of .35 as the validity estimate for structured and unstructured interviews. The correlation between interviews and ability scores (.32) is the overall figure from Huffcutt, Roth, and McDaniel (1996, Table 3) across all levels of interview structure.
e. The validity for peer ratings is from Hunter and Hunter (1984, Table 8). These calculations are based on an estimate of the correlation between ability and peer ratings of .50. (See note i to Table 1.) No correction for range restriction (if any) was possible in the data.
f. The validity of reference checks is from Hunter and Hunter (1984, Table 8). The correlation between reference checks and ability was taken as .00. Assumption of a larger correlation will reduce the estimate of incremental validity. No correction for range restriction was possible.
g. The validity of job experience is from Hunter and Hunter (1984, Table 6). These calculations are based on an estimate of the correlation between job experience and ability of zero. (See note l to Table 1.)
h. The validity of biographical data measures is from Hunter and Hunter (1984, Table 8). This validity estimate is not adjusted for range restriction (if any). The correlation between biographical data measures and ability is estimated at .50 (Schmidt, 1988).
i. The validity of education is from Hunter and Hunter (1984, Table 6). The correlation between education and ability within applicant pools was taken as zero. (See note p to Table 1.)
j. The validity of interests is from Hunter and Hunter (1984, Table 8). The correlation between interests and ability was taken as zero (Holland, 1986).
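The multiple R values, gains, and standardized regression weights in Tables 1 and 2 all follow from the standard formulas for two correlated predictors of a single criterion. As a check on the tables, here is a small sketch (our illustration, not from the article) that reproduces the work sample row of Table 1 from the three correlations reported in the text and notes.

```python
def two_predictor_regression(r1: float, r2: float, r12: float):
    """Standardized betas and multiple R for two predictors of one
    criterion, given validities r1 and r2 and intercorrelation r12."""
    beta1 = (r1 - r2 * r12) / (1 - r12 ** 2)
    beta2 = (r2 - r1 * r12) / (1 - r12 ** 2)
    big_r = (beta1 * r1 + beta2 * r2) ** 0.5
    return beta1, beta2, big_r

# Work sample row of Table 1: GMA validity .51, work sample validity .54,
# GMA-work sample correlation .38 (Schmidt, Hunter, & Outerbridge, 1986).
b_gma, b_ws, big_r = two_predictor_regression(0.51, 0.54, 0.38)
gain = big_r - 0.51  # incremental validity over GMA alone

print(f"beta(GMA) = {b_gma:.2f}, beta(work sample) = {b_ws:.2f}")
print(f"multiple R = {big_r:.2f}, gain = {gain:.2f} ({100 * gain / 0.51:.0f}% increase)")
```

This reproduces the .63 multiple R, .12 gain, and 24% increase of the work sample row; the betas come out to .36 and .40, close to the .36 and .41 shown in Table 1 (the small difference reflects rounding in the published values).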