The site contains several tools for analyzing psychometric test results, such as calculating confidence intervals and comparing discrepancies.
Psychometric test results always contain measurement error. A confidence interval estimates the range in which the true value lies with a given probability. The calculator below determines the confidence interval based on the standard error of estimation. The result can optionally be corrected for regression to the mean; in that case, the estimated value is displayed as well.
Inputs: type of score, score, reliability, confidence, regression to the mean
Outputs: estimated value, confidence interval
The calculation uses the following formula (the correction for regression to the mean uses the estimated value z_predicted = z_score × rel; for the standard error of estimation, see Krum, Amelang & Schmidt-Atzert, 2022, p. 149):
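The calculation can be sketched in Python as follows. This is a minimal illustration, not the site's own code: it assumes the standard error of estimation in z units is sqrt(rel × (1 − rel)), that the interval is two-sided, and that scores are on an IQ scale (mean 100, SD 15). The function name `confidence_interval` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(score, rel, confidence=0.95, mean=100.0, sd=15.0,
                        regression_to_mean=True):
    """Return (estimated_value, lower, upper) on the norm scale."""
    z = (score - mean) / sd                       # observed score in z units
    # Optional correction for regression to the mean: z_predicted = z_score * rel
    z_center = z * rel if regression_to_mean else z
    # Assumed standard error of estimation in z units
    se_est = sqrt(rel * (1.0 - rel))
    # Two-sided critical value, e.g. about 1.96 for 95 % confidence
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z_crit * se_est

    def to_scale(value):
        return mean + sd * value                  # back to the norm scale

    return (to_scale(z_center),
            to_scale(z_center - half_width),
            to_scale(z_center + half_width))


estimate, lower, upper = confidence_interval(130, rel=0.90)
print(f"{estimate:.1f} [{lower:.1f}, {upper:.1f}]")  # prints 127.0 [118.2, 135.8]
```

With the correction switched off (`regression_to_mean=False`), the interval is centered on the observed score instead of the estimated value.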
To test against a fixed value, it is sufficient to use a one-sided confidence interval and test with z_(1−α) (Krum et al., 2022, p. 151 f.). It is still necessary to specify the direction of the test, which depends on the exact formulation of the hypothesis. Because it makes the calculation more accurate, the correction for regression to the mean should be applied here as well.
For example, if a person scores an IQ of 135, one can investigate whether the result is significantly higher than 130, the value commonly considered the threshold for giftedness. In this case, the score would have to be significantly higher than 130 (the direction must be "... is higher than ..."). Another question may be whether giftedness can be ruled out, for example, if the result is 122. In this case, one would test upwards (direction "... is lower than ...") and look for a non-significant result, which corresponds to the statement that giftedness cannot be ruled out.
Inputs: type of score, result, direction of test, cutoff, reliability, significance level, regression to the mean
Output: result of the hypothesis test
The test is one-tailed and uses the standard error of measurement in the following formula (the correction for regression to the mean uses the estimated value z_predicted = z_score × rel):
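A possible Python sketch of this one-tailed test is shown below. It is an illustration under assumptions, not the site's implementation: the standard error of measurement in z units is taken as sqrt(1 − rel), the test statistic is the distance between the (optionally corrected) score and the cutoff divided by that error, and scores are on an IQ scale (mean 100, SD 15). The name `compare_to_cutoff` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_to_cutoff(score, cutoff, rel, alpha=0.05, direction="higher",
                      mean=100.0, sd=15.0, regression_to_mean=True):
    """Return (z_test, significant) for a one-tailed test against a cutoff."""
    if direction not in ("higher", "lower"):
        raise ValueError("direction must be 'higher' or 'lower'")
    z_score = (score - mean) / sd
    z_cutoff = (cutoff - mean) / sd
    # Optional correction for regression to the mean: z_predicted = z_score * rel
    z_used = z_score * rel if regression_to_mean else z_score
    sem = sqrt(1.0 - rel)                         # assumed standard error of measurement
    z_test = (z_used - z_cutoff) / sem
    z_crit = NormalDist().inv_cdf(1.0 - alpha)    # one-sided critical value z_(1-alpha)
    if direction == "higher":
        significant = z_test > z_crit             # "... is higher than ..."
    else:
        significant = z_test < -z_crit            # "... is lower than ..."
    return z_test, significant


z_test, significant = compare_to_cutoff(135, 130, rel=0.95, direction="higher")
print(f"z = {z_test:.2f}, significant: {significant}")
```

The `direction` argument mirrors the choice between "... is higher than ..." and "... is lower than ..." described above; only the sign of the comparison with the critical value changes.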
When a test is repeated with the same individual, the so-called Reliable Change Index (RCI; Jacobson & Truax, 1991; see also Krum et al., 2022, p. 153) can be determined. The RCI can be interpreted as the test statistic of a z-test. It indicates whether two test scores differ significantly, e.g. whether an intervention has led to a significant change in the measured characteristic.
Inputs: type of score, result 1, result 2, reliability
Outputs: test value, interpretation
As in Calculators 1 and 2, percentiles are converted to z-scores using the inverse cumulative normal distribution before the calculation. The formula for the RCI follows Jacobson and Truax (1991; see Krum et al., 2022, p. 153):
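The following Python sketch illustrates the RCI in the form given by Jacobson and Truax (1991): the difference between the two scores divided by the standard error of the difference, sqrt(2 × SEM²), with SEM = SD × sqrt(1 − rel). It works in z units (SD = 1); the helper names `percentile_to_z` and `reliable_change_index` are hypothetical, and the two-sided p-value is added only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def percentile_to_z(percentile):
    """Convert a percentile (0 < p < 100) to a z-score via the inverse normal CDF."""
    return NormalDist().inv_cdf(percentile / 100.0)


def reliable_change_index(z1, z2, rel):
    """RCI for two z-scores obtained with the same test (SD = 1 in z units)."""
    sem = sqrt(1.0 - rel)            # standard error of measurement
    s_diff = sqrt(2.0 * sem ** 2)    # standard error of the difference
    return (z2 - z1) / s_diff


# Example: percentile 50 at the first testing, percentile 84 at the second
rci = reliable_change_index(percentile_to_z(50), percentile_to_z(84), rel=0.90)
p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(rci)))
print(f"RCI = {rci:.2f}, p = {p_two_sided:.3f}")
```

Because the RCI is read as a z statistic, an absolute value above about 1.96 corresponds to a significant change at the 5 % level (two-sided).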
When a person is tested with different tests or scales of a test, it can be interesting to compare the results. For example, one might want to investigate whether logical reasoning is better developed than verbal comprehension, if intelligence tests do not already provide such analysis options. Or one might want to clarify whether the stress levels of different clinical symptoms differ.
Inputs: type of score, result 1, result 2, reliability 1, reliability 2
Outputs: test value, interpretation
As with the previous calculators, percentiles are converted to z-scores using the inverse cumulative normal distribution before the calculation. In general, the procedure is also suitable for raw values, provided that the population mean is known, which is automatically the case when norm scores are used. For both test results, a value Y_i must first be calculated. Then the test statistic z can be determined. The test statistic for comparing the two results is calculated with the following formulas (Krum et al., 2022, p. 154 f.):
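As a rough illustration of such a two-step procedure, the Python sketch below uses one common approach, tau-normalization, and should be read as an assumption rather than as the formulas from the cited source. Each score is first rescaled to Y_i = M + (X_i − M) / sqrt(rel_i), which is why the population mean M is needed; the difference Y_1 − Y_2 is then divided by its standard error, assuming independent measurement errors. The function names and the IQ scale defaults (mean 100, SD 15) are hypothetical.

```python
from math import sqrt


def tau_normalize(x, rel, mean=100.0):
    """Stretch the deviation from the population mean by 1 / sqrt(rel)."""
    return mean + (x - mean) / sqrt(rel)


def discrepancy_z(x1, x2, rel1, rel2, mean=100.0, sd=15.0):
    """z statistic for the difference between two tau-normalized scores."""
    y1 = tau_normalize(x1, rel1, mean)
    y2 = tau_normalize(x2, rel2, mean)
    # Error variance of each Y_i is sd^2 * (1 - rel_i) / rel_i;
    # the two measurement errors are assumed to be independent.
    se_diff = sd * sqrt((1.0 - rel1) / rel1 + (1.0 - rel2) / rel2)
    return (y1 - y2) / se_diff


z = discrepancy_z(110, 95, rel1=0.90, rel2=0.80)
print(f"z = {z:.2f}")
```

The resulting z can be compared with the usual critical values of the standard normal distribution, for example 1.96 for a two-sided test at the 5 % level.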
Profiles of psychometric results can be analyzed regarding equality (= profile identity), structure (= profile shape), or magnitude (= profile height) (cf. Huber 1973, chap. 10). Such an approach can be applied, for example, to intelligence profiles or the clinical stress spectrum before and after therapy. Huber (1973, p.) gives as an example the result of a 35-year-old person in the Intelligence Structure Test (IST; Amthauer, 1953) with the following results:
Subtest | Testing 1 | Testing 2 |
Sentence completion | 92 | 113 |
Vocabulary | 103 | 96 |
Analogy | 93 | 113 |
Similarities | 100 | 116 |
Memory tasks | 94 | 103 |
Calculation tasks | 102 | 104 |
Cube tasks | 109 | 98 |
Fields: hypothesis test, norm scale, number of scales, reliability, total reliability, test results
Chi-square tests are computed with jStat.
Citeable source:
Lenhard, W. & Lenhard, A. (2023). Confidence intervals, test of discrepancy and profile analysis for psychometric results. available: https://www.psychometrica.de/discrepancy.html. Psychometrica.