Human immunodeficiency virus (HIV) is a serious health problem in the Russian Federation, yet the true scale of the epidemic in Russia has long been the subject of considerable debate. Digital surveillance of disease has become increasingly popular in high-income countries. However, Internet users may not be representative of the overall population, and the characteristics of the Internet-using population cannot be directly ascertained from search pattern data. This exploratory infoveillance study examined whether Internet search patterns can be used for disease surveillance in a large middle-income country with a dispersed population.
This study had two main objectives: (1) to validate Internet search patterns against national HIV prevalence data, and (2) to investigate the relationship between search patterns and the determinants of Internet access.
We first assessed whether online surveillance is a valid and reliable method for monitoring HIV in the Russian Federation. Yandex and Google both provided tools to study search patterns in the Russian Federation. We evaluated the relationship between aggregated search patterns from Yandex and Google and HIV prevalence in 2011 at the national and regional levels. Second, we analyzed the determinants of Internet access to determine the extent to which they explained regional variations in searches for the Russian terms for "HIV" and "AIDS". We sought to extend understanding of the characteristics of Internet searching populations by matching data on the determinants of Internet access (age, education, income, broadband access price, and urbanization ratios) with searches for the term "HIV" using principal component analysis (PCA).
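As an illustration of how a rank-based association between search volume and prevalence can be quantified, the following minimal sketch computes a Spearman rank-order coefficient (the statistic reported in the results) as the Pearson correlation of tie-averaged ranks. The function names and the six regional values are hypothetical and are not data from the study; the authors' actual software and procedure are not stated here.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rs(x, y):
    """Spearman rank-order coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical illustrative values for six regions (NOT study data).
search_index = [12.0, 30.5, 18.2, 44.1, 25.0, 9.3]
hiv_prevalence = [110.0, 390.0, 240.0, 520.0, 210.0, 95.0]

rs = spearman_rs(search_index, hiv_prevalence)
print(f"rs = {rs:.3f}")
```

Because the coefficient depends only on ranks, it is insensitive to the arbitrary scaling of search indices, which is one reason a rank-based measure is a natural fit for aggregated search data.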
We found generally strong correlations between HIV prevalence and searches for the terms "HIV" and "AIDS". At the national level, Yandex searches for "HIV" were very strongly correlated with HIV prevalence (Spearman rank-order coefficient [rs] = .881, P = .001), and Yandex searches for "AIDS" were strongly correlated (rs = .714, P = .001). The strength of the correlations varied across Russian regions. National correlations for Google searches for "HIV" (rs = .672, P = .004) and "AIDS" (rs = .584, P = .001) were weaker than those for Yandex. Second, we examined the relationship between the determinants of Internet access and search patterns for the term "HIV" across Russia using PCA. At the national level, Principal Component 1 accounted for 32% of the variance, with loadings including age (-0.56), HIV search (-0.533), and education (-0.479). Principal Component 2 accounted for 22% of the national variance, with loadings including income (-0.652) and broadband price (-0.460).
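To make the terms "loadings" and "share of variance" concrete, the sketch below extracts the first principal component from standardized variables by power iteration on their correlation matrix. It is an assumption-laden illustration only: the helper names, the three variables, and the six regional rows are hypothetical, and the study's own PCA software and settings are not described here.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Convert each column to z-scores (assumes no column is constant)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(p)]
            for i in range(p)]

def first_component(matrix, iterations=500):
    """Dominant eigenvector and eigenvalue via power iteration.

    Assumes the largest eigenvalue is distinct and that the starting
    vector is not orthogonal to the dominant eigenvector.
    """
    p = len(matrix)
    v = [float(i + 1) for i in range(p)]  # arbitrary non-uniform start
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

# Hypothetical rows: (median age, % with higher education, HIV search index).
# These are illustrative values, NOT data from the study.
regions = [
    (38.0, 22.0, 30.0),
    (40.0, 18.0, 22.0),
    (36.0, 27.0, 41.0),
    (42.0, 15.0, 18.0),
    (39.0, 20.0, 27.0),
    (37.0, 25.0, 35.0),
]

corr = correlation_matrix(standardize(regions))
loadings, eigenvalue = first_component(corr)
# The trace of a correlation matrix equals the number of variables,
# so the component's share of total variance is eigenvalue / p.
explained_share = eigenvalue / len(corr)
print([round(x, 3) for x in loadings], round(explained_share, 3))
```

The sign of an eigenvector is arbitrary, so a component with all-negative loadings describes the same axis as its all-positive mirror image; only the relative signs and magnitudes of the loadings carry meaning.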
This study contributes to the methodological literature on search patterns in public health. Based on our preliminary research, we suggest that PCA may be used to evaluate the relationship between the determinants of Internet access and searches for health problems beyond high-income countries. We believe it is in middle-income countries that search methods can make the greatest contribution to public health.