Health administrative data can be a valuable tool for disease surveillance and research. Few studies have rigorously evaluated the accuracy of administrative databases for identifying rheumatoid arthritis (RA) patients. Our aim was to validate administrative data algorithms to identify RA patients in Ontario, Canada.
We performed a retrospective review of a random sample of 450 patients from 18 rheumatology clinics. Using rheumatologist-reported diagnosis as the reference standard, we tested and validated different combinations of physician billing, hospitalization, and pharmacy data.
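Validation against a reference standard of this kind comes down to cross-tabulating each algorithm's classification with the rheumatologist-reported diagnosis. The following is a minimal sketch of that calculation, not the study's actual analysis code; the function name `diagnostic_accuracy` and the eight-patient toy data are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_accuracy(
    reference: Sequence[bool], predicted: Sequence[bool]
) -> Dict[str, float]:
    """Compare algorithm output with the reference standard, patient by patient.

    reference[i] is True if patient i has RA according to the reference standard;
    predicted[i] is True if the administrative data algorithm flags patient i.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # true positives
    fn = sum(1 for r, p in pairs if r and not p)      # false negatives
    fp = sum(1 for r, p in pairs if not r and p)      # false positives
    tn = sum(1 for r, p in pairs if not r and not p)  # true negatives

    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data (hypothetical, NOT study data): 3 patients with RA, 5 without.
reference = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
result = diagnostic_accuracy(reference, predicted)
```

Sensitivity and specificity depend only on how the algorithm behaves within the RA and non-RA groups, whereas PPV and negative predictive value also depend on the prevalence of RA in the sample being tested.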
One hundred forty-nine rheumatology patients were classified as having RA and 301 were classified as not having RA based on our reference standard definition (study RA prevalence 33%). Overall, algorithms that included physician billings had excellent sensitivity (range 94-100%). Specificity and positive predictive value (PPV) were modest to excellent and increased when algorithms included multiple physician claims or specialist claims. The addition of RA medications did not significantly improve algorithm performance. The algorithm of "(1 hospitalization RA code ever) OR (3 physician RA diagnosis codes [claims] with ≥1 by a specialist in a 2-year period)" had a sensitivity of 97%, specificity of 85%, PPV of 76%, and negative predictive value of 98%. Most RA patients (84%) had an RA diagnosis code present in the administrative data within ±1 year of a rheumatologist's documented diagnosis date.
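The case definition quoted above can be sketched as a simple rule over a patient's claims history. This is one possible reading, not the authors' implementation: it assumes "3 physician RA diagnosis codes" means at least 3 claims, that the 2-year period is any 730-day window starting at one of the patient's claims, and that at least one claim inside that same window must come from a specialist. The `Claim` type, its field names, and `meets_ra_algorithm` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Claim:
    """A physician billing claim carrying an RA diagnosis code (hypothetical schema)."""
    service_date: date
    by_specialist: bool


def meets_ra_algorithm(
    has_ra_hospitalization: bool,
    claims: List[Claim],
    min_claims: int = 3,      # assumption: "3" is read as "at least 3"
    window_days: int = 730,   # assumption: 2 years approximated as 730 days
) -> bool:
    """Return True if the patient satisfies either arm of the case definition."""
    # Arm 1: any hospitalization with an RA code, ever.
    if has_ra_hospitalization:
        return True

    # Arm 2: enough RA claims inside one window, at least one by a specialist.
    # Checking windows that start at each claim is sufficient, because any
    # qualifying window can be shifted to start at its earliest claim.
    ordered = sorted(claims, key=lambda c: c.service_date)
    for i, first in enumerate(ordered):
        window_end = first.service_date + timedelta(days=window_days)
        in_window = [c for c in ordered[i:] if c.service_date <= window_end]
        if len(in_window) >= min_claims and any(c.by_specialist for c in in_window):
            return True
    return False


# Hypothetical patient: three RA claims within 2 years, one from a specialist.
example_claims = [
    Claim(date(2005, 1, 10), by_specialist=False),
    Claim(date(2005, 6, 2), by_specialist=True),
    Claim(date(2006, 3, 15), by_specialist=False),
]
example_flag = meets_ra_algorithm(False, example_claims)
```

Requiring several claims and a specialist claim is the kind of restriction the results associate with higher specificity and PPV, while the hospitalization arm keeps patients who would otherwise be missed.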
We demonstrated that administrative data can be used to identify RA patients with a high degree of accuracy. RA diagnosis date and disease duration can be estimated reasonably well from administrative data in jurisdictions with universal health care insurance.