An important consideration for studies that derive utility scores using multi-attribute utility measures is the psychometric integrity of the measurement instrument. Of particular importance is the requirement to establish the empirical validity of multi-attribute utility measures; that is, whether they generate utility scores that, in practice, reflect people's preferences. We compared the empirical validity of EQ-5D versus SF-6D utility scores based on hypothetical preferences in a large, representative sample of the English population.
Adult participants in the 1996 Health Survey for England (n=16 443) formed the basis of the investigation. The subjects were asked to complete the EQ-5D and SF-36 measures. Their responses were converted into utility scores using the York A1 tariff set and the SF-6D utility algorithm, respectively. One-way analysis of variance was used to test the hypothetically constructed preference rule that each set of utility scores differs significantly by self-reported health status (categorised as very good, good, fair, bad or very bad). The degree to which EQ-5D and SF-6D utility scores reflect alternative configurations of self-reported health status; illness, disability or infirmity; and medication use was tested using the relative efficiency statistic and receiver operating characteristic (ROC) curves.
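The statistical comparison described above can be illustrated with a minimal sketch. It assumes the relative efficiency statistic is taken as the ratio of the one-way ANOVA F statistics of the two measures across the same health-status groups, and that the area under the ROC curve is estimated as the probability that a respondent without a condition has a higher utility score than a respondent with it. The function names and the small data sets are hypothetical and are not drawn from the Health Survey for England.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of utility scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of health-status categories
    n = len(all_scores)    # total number of respondents
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_efficiency(groups_a, groups_b):
    """Ratio of F statistics (assumed definition); > 1 favours measure A."""
    return f_statistic(groups_a) / f_statistic(groups_b)


def auc(without_condition, with_condition):
    """Area under the ROC curve via pairwise comparison; ties count half."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in without_condition
        for q in with_condition
    )
    return wins / (len(without_condition) * len(with_condition))


# Hypothetical utility scores grouped by three self-reported health categories.
measure_a = [[0.90, 1.00, 0.95], [0.70, 0.80, 0.75], [0.40, 0.50, 0.45]]
measure_b = [[0.80, 0.90, 0.70], [0.70, 0.80, 0.60], [0.50, 0.70, 0.60]]

print("Relative efficiency (A vs B):", relative_efficiency(measure_a, measure_b))
print("AUC:", auc([0.9, 1.0, 0.8], [0.5, 0.9, 0.4]))
```

In this sketch a relative efficiency above 1 indicates that the first measure separates the health-status groups more sharply than the second, and an AUC of 0.5 indicates discrimination no better than chance.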
The mean utility score for the EQ-5D was 0.845 (95% CI: 0.842, 0.849), whilst the mean utility score for the SF-6D was 0.799 (95% CI: 0.797, 0.802), representing a mean difference in utility score of 0.046 (95% CI: 0.044, 0.049; p