OBJECTIVE: This article compares preference-based utilities from the multiattribute utility instrument 15D with those derived from the EQ-5D and the Short Form 36 (SF-6D) in patients with HIV/AIDS. In particular, we wanted to examine whether the finer descriptive system of the 15D would result in better discriminative capacity or responsiveness. METHODS: In a prospective observational study of 60 Norwegian patients with HIV/AIDS from two hospitals, we compared scores, assessed associations with disease staging systems, and assessed test-retest reliability and responsiveness of the instruments. RESULTS: On average, the 15D gave higher utility scores than the other two measures; the mean utility scores were 0.86 for the 15D, 0.73 for the SF-6D, and 0.77 for the EQ-5D Index. Test-retest reliability was acceptable for all measures, with intraclass correlation coefficients between 0.78 and 0.94. The correlation between scores of the three scales was substantial (rho = 0.74-0.80). There was no major difference in responsiveness between the measures. CONCLUSIONS: The different measures gave different utility values in this sample of patients with HIV/AIDS, although many of the measurement properties were similar. There was no evidence of better discriminative capacity or responsiveness for the 15D than for the two other multiattribute measures.
The aim of the study was to validate the Norwegian version of a self-administered 43-item questionnaire designed to assess quality of life in kidney transplant recipients, the End-Stage Renal Disease Symptom Checklist--Transplantation Module (ESRD-SCL).
In total, 53 kidney transplant recipients from one university-affiliated hospital responded to a questionnaire including the ESRD-SCL and the Short Form 36 (SF-36). We assessed internal consistency reliability and test-retest reliability with 2 weeks between assessments. Construct validity was assessed by correlations of the ESRD-SCL subscales with related and unrelated SF-36 scales, demographic, and clinical characteristics.
Subscales of the ESRD-SCL showed good internal consistency reliability (Cronbach's alpha = 0.72-0.81), and for the aggregate total scale alpha was 0.94. Test-retest reliability, with a median of 14 days between assessments, was excellent, with intraclass correlation coefficients ranging from 0.87 to 0.95. The pattern of correlations of the ESRD-SCL scales with related and unrelated SF-36 scales and with demographic and clinical characteristics supported the construct validity of the ESRD-SCL.
The Norwegian translation of the ESRD-SCL showed satisfactory internal consistency reliability, test-retest reliability and construct validity, at the level of the original German version.
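Internal consistency reliability in these studies is reported as Cronbach's alpha. As a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), the following could be used. The function name and the demonstration data are hypothetical and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent (hypothetical layout,
    no missing values; at least two items and two respondents).
    """
    k = len(rows[0])                       # number of items in the scale
    items = list(zip(*rows))               # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(r) for r in rows])  # variance of sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up responses: 3 respondents x 2 items (illustration only)
demo = [[1, 1], [2, 3], [3, 2]]
print(round(cronbach_alpha(demo), 3))
```

The sketch uses sample variances (n - 1 in the denominator) and fails if all respondents have the same total score, because the total variance is then zero.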
The aim of this study was to validate the Norwegian version of the Seattle Angina Questionnaire (SAQ), a self-administered 19-item questionnaire designed to assess health-related quality of life in patients with chest pain or coronary artery disease. In 885 patients with prior myocardial infarction (MI), we abstracted clinical data from the patients' medical records. Two to three years after the MI, we mailed a self-administered questionnaire including the SAQ, the Short Form 36 (SF-36), and questions about current medication to the 548 patients still alive. The response rate was 74%. Internal consistency reliability of the SAQ, assessed with Cronbach's alpha, ranged from 0.75 to 0.92. Test-retest reliability, assessed with an intraclass correlation coefficient, ranged from 0.29 to 0.84. The pattern of association between similar and dissimilar scales of the SAQ and SF-36 mainly supported the construct validity of the SAQ. Four of the five SAQ scales discriminated between patients with different medication regimens, used as a proxy for severity of angina pectoris. We conclude that the Norwegian version of the SAQ showed acceptable reliability and cross-sectional validity following MI, with properties in line with the original US version.
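Test-retest reliability is summarized above with an intraclass correlation coefficient (ICC). The abstracts do not state which ICC form was used, so the sketch below assumes the one-way random-effects form, ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the between-subject and within-subject mean squares. The function name and the data are hypothetical.

```python
def icc_oneway(rows):
    """One-way random-effects ICC for a single measurement.

    rows: one list per subject holding that subject's k repeated scores
    (for test-retest data, k = 2). This ICC form is an assumption; the
    studies may have used a different one.
    """
    n = len(rows)                          # number of subjects
    k = len(rows[0])                       # measurements per subject
    grand = sum(sum(r) for r in rows) / (n * k)
    means = [sum(r) / k for r in rows]
    # Between-subject and within-subject mean squares
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for r, m in zip(rows, means) for x in r) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Made-up test and retest scores for 3 subjects (illustration only)
print(round(icc_oneway([[1, 2], [2, 1], [3, 3]]), 3))
```

Other ICC forms (for example two-way models that separate a systematic shift between test and retest) can give different values for the same data.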
The objective of this study was to assess the reliability and validity of a Norwegian version of the self-administered Epworth sleepiness scale (ESS).
Two samples responded to the ESS: (1) 226 patients previously evaluated for obstructive sleep apnea, of whom 51 also responded to a retest 2 weeks later, and (2) 37 ambulant patients complaining of excessive daytime sleepiness, who were referred to multiple sleep latency testing (MSLT). We assessed internal consistency reliability with Cronbach's alpha and test-retest reliability with weighted kappa (Kw) or an intraclass correlation coefficient (ICC). The validity of the Norwegian ESS was assessed by correlating ESS item and total scores with the number of times a patient fell asleep and the mean latency found on the MSLT.
Internal consistency reliability, as assessed with Cronbach's alpha, was 0.84 (n = 154). Test-retest reliability for the eight ESS items ranged from a Kw of 0.61 to 0.80 (n = 50), and for the total score the ICC was 0.81. There was only fair to moderate correlation of ESS item and total scores with MSLT variables, mainly in a subset of patients with a total ESS score >10.
The Norwegian version of the ESS had acceptable internal consistency and test-retest reliability. The association of the ESS items and total score with the MSLT was only fair to moderate, in line with previous studies.
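Item-level test-retest agreement for ordinal items such as those of the ESS is reported above as weighted kappa (Kw). The weighting scheme is not stated in the abstract, so the sketch below assumes linear disagreement weights, |i - j| / (k - 1), and computes Kw as 1 minus the ratio of observed to chance-expected weighted disagreement. The function name and the example ratings are hypothetical.

```python
def weighted_kappa(first, second, n_categories):
    """Cohen's weighted kappa with linear weights (an assumption).

    first, second: integer category codes 0..n_categories-1 from the two
    occasions (or two raters), in the same subject order.
    """
    n = len(first)
    k = n_categories
    # Observed joint proportions
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(first, second):
        observed[x][y] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Weighted disagreement: observed, and expected under independence
    obs = sum(abs(i - j) / (k - 1) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(abs(i - j) / (k - 1) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - obs / exp


# Made-up ratings on a 0-3 item at test and retest (illustration only)
print(weighted_kappa([0, 1, 2, 3, 2], [0, 1, 2, 3, 3], n_categories=4))
```

With quadratic instead of linear weights, large disagreements are penalized more heavily and the resulting value usually differs.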
Pain is a cardinal symptom of osteoarthritis (OA) of the hip and important for deciding when to operate. This study assessed the internal consistency reliability, validity and responsiveness of the Brief Pain Inventory (BPI) among patients with OA undergoing total hip replacement (THR).
We prospectively included 250 of 356 patients who were accepted to the waiting list for primary THR surgery. All participants responded to the BPI, WOMAC and SF-36 at baseline and 1 year after surgery.
Internal consistency reliability (Cronbach's alpha) was >0.80 for the BPI, the WOMAC and five of the eight SF-36 scales. The pattern of associations of the two BPI scales with corresponding and non-corresponding scales of the WOMAC and SF-36 largely supported the construct validity of the BPI. The responsiveness indices for change from baseline to 1 year after THR ranged from 1.52 to 2.05 for the BPI scales, from 1.69 to 2.84 for the WOMAC scales, and from 0.25 (general health) to 2.77 (bodily pain) for the SF-36 scales.
The BPI showed acceptable reliability, construct validity and responsiveness in patients with OA undergoing THR. The BPI is short and therefore easy to use and score, but it offers few advantages over more comprehensive instruments such as the WOMAC and SF-36, and it duplicates some of their scales.
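The abstract reports responsiveness indices without naming the index. One common choice is the standardized response mean (SRM), the mean change divided by the standard deviation of the change. The sketch below assumes that index; the study may have used another, such as an effect size based on the baseline standard deviation. The function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean(change) / SD(change), with change = baseline - follow-up.

    For a pain score where lower is better, a positive SRM indicates
    improvement. Using the SRM here is an assumption, not the study's
    documented method.
    """
    change = [b - f for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


# Made-up pain scores before and 1 year after surgery (illustration only)
before = [8, 7, 9, 6]
after = [2, 3, 3, 2]
print(round(standardized_response_mean(before, after), 2))
```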
Stroke severity is an important determinant of outcome; however, quantitative data on the initial neurological status may be lacking in retrospective studies. We wanted to assess the reliability and validity of the retrospective use of the Canadian Neurological Scale (CNS).
In 181 patients with validated stroke, two raters scored the CNS based on medical record review. We assessed interrater reliability and construct validity of the CNS. Predictive validity was assessed by the ability of the CNS to predict 30-day and 1-year mortality.
Interrater reliability was high (kappa or weighted kappa 0.76-0.96). Correlations between similar items of prospective Scandinavian Stroke Scale scores and retrospective CNS scores ranged from 0.54 to 0.85. CNS total score was a strong predictor of death within 30 days and 1 year in multivariate models.
The retrospective algorithm for the CNS had a high to substantial interrater reliability and predictive validity. Accordingly, in retrospective stroke studies using medical record information, the CNS can be a feasible instrument to adjust for differences in stroke severity.
Misinterpretation of radiological examinations is an important contributing factor to diagnostic errors. Double reading reduces interpretation errors and increases sensitivity. Consultant radiologists in Norwegian hospitals submit 39% of computed tomography (CT) reports for quality assurance by double reading. Our objective was to estimate the proportion of radiology reports that were changed during double reading and to assess the potential clinical impact of these changes.
In this retrospective cross-sectional study we acquired preliminary and final reports from 1023 consecutive double read chest CT examinations conducted at five public hospitals. The preliminary and final reports were compared for changes in content. Three experienced pulmonologists independently rated the clinical importance of these changes. The severity of the radiological findings in clinically important changes was classified as increased, unchanged, or decreased.
Changes were classified as clinically important in 91 (9%) of 1023 reports. Of these, 3 were critical (demanding immediate action), 15 were major (implying a change in treatment), and 73 were intermediate (affecting subsequent investigations). More clinically important changes were made to reports of urgent examinations, and fewer to reports by female first readers. Chest radiologists made more clinically important changes than other second readers. The severity of the radiological findings was increased in 73 (80%) of the clinically important changes.
A 9% rate of clinically important changes made during double reading may justify quality assurance of radiological interpretation. Using expert second readers and targeting a selection of urgent cases prospectively may increase the yield of discrepant cases and reduce harm to patients.
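The reported proportions can be recomputed from the counts given above (91 of 1023 reports; 73 of 91 changes). As an illustration only, the sketch below also attaches a 95% Wilson score interval to each proportion. The abstract does not report confidence intervals, so the interval method and the function name are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


# Counts taken from the abstract
important, total = 91, 1023
severity_increased = 73

for label, x, n in [("clinically important changes", important, total),
                    ("severity increased", severity_increased, important)]:
    low, high = wilson_interval(x, n)
    print(f"{label}: {x}/{n} = {100 * x / n:.1f}% "
          f"(95% CI {100 * low:.1f}% to {100 * high:.1f}%)")
```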