We present a general algorithm for the detection of genomic variants using the Illumina iSelect platform. The Illumina iSelect platform is designed to detect SNPs, but our algorithm allows for the detection of more general forms of variation, including copy number polymorphisms and microsatellites. The algorithm does not rely on a priori information about the type of polymorphism being studied and is designed to call genotypes for a large number of individuals simultaneously. The algorithm proceeds by first normalizing intensities and correcting for batch effects. Each marker is then clustered using a modified Gaussian mixture model that accounts for both the variance in an individual's expression and the variance measured in the bead-level intensities of a probe/marker pair. Finally, these clusters are used to determine genotypes. The algorithm was run on a dataset of 35,000 Icelandic individuals.
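The clustering step can be illustrated with a minimal sketch. It fits a plain one-dimensional Gaussian mixture by expectation-maximization to a per-sample allele intensity ratio and assigns each sample to its most probable cluster. This is not the modified model described above: it ignores per-individual and bead-level variances, normalization, and batch correction. The function names, the intensity-ratio input, the three-cluster SNP layout, and the example values are all assumptions made for illustration.

```python
import math


def _weighted_densities(x, weights, means, variances):
    """Return weight * Normal(x; mean, variance) for every mixture component."""
    return [
        w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]


def fit_gmm_1d(values, init_means, n_iter=50, min_var=1e-4):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    means = list(init_means)
    k = len(means)
    variances = [0.01] * k       # assumed starting variance
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = _weighted_densities(x, weights, means, variances)
            total = sum(dens)
            resp.append([d / total if total > 0 else 1.0 / k for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(min_var, var)  # floor avoids collapsing clusters
    return weights, means, variances


def call_genotypes(values, labels=("AA", "AB", "BB"), init_means=(0.0, 0.5, 1.0)):
    """Assign each value to the label of its most probable component."""
    weights, means, variances = fit_gmm_1d(values, init_means)
    calls = []
    for x in values:
        dens = _weighted_densities(x, weights, means, variances)
        calls.append(labels[dens.index(max(dens))])
    return calls


# Hypothetical intensity ratios for nine samples at one marker.
intensity_ratios = [0.02, 0.05, 0.03, 0.48, 0.52, 0.50, 0.97, 0.95, 0.99]
print(call_genotypes(intensity_ratios))
```

For markers that are not biallelic SNPs, such as copy number polymorphisms, the number of components and their starting means would need to differ; the sketch fixes them only to keep the example short.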
Genetic studies have evaluated the influence of blood lipid levels on the risk of coronary artery disease (CAD), but less is known about how they are associated with the extent of coronary atherosclerosis.
To estimate the contributions of genetically predicted blood lipid levels on the extent of coronary atherosclerosis.
This genetic study included Icelandic adults who had undergone coronary angiography or assessment of coronary artery calcium using cardiac computed tomography. The study incorporates data collected from January 1987 to December 2017 in Iceland in the Swedish Coronary Angiography and Angioplasty Registry and 2 registries of individuals who had undergone percutaneous coronary interventions and coronary artery bypass grafting. For each participant, genetic scores were calculated for levels of non-high-density lipoprotein cholesterol (non-HDL-C), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides, based on reported effect sizes of 345 independent, lipid-associated variants. The genetic scores' predictive ability for lipid levels was assessed in more than 87 000 Icelandic adults. A mendelian randomization approach was used to estimate the contribution of each lipid trait.
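A genetic score of this kind is commonly computed as a weighted sum of allele counts, with the reported effect sizes as weights. The sketch below shows that calculation under stated assumptions: the variant names, effect sizes, and allele counts are hypothetical, only 3 variants are used rather than 345, and the standardization step is one common convention rather than a detail taken from the study.

```python
from statistics import mean, pstdev


def genetic_score(allele_counts, effect_sizes):
    """Weighted sum of effect-allele counts (0, 1, or 2) over all scored variants."""
    return sum(beta * allele_counts[variant] for variant, beta in effect_sizes.items())


def standardize(scores):
    """Rescale scores to mean 0 and SD 1 across the sample."""
    mu = mean(scores)
    sd = pstdev(scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical per-allele effects on a lipid trait, in SD units.
effect_sizes = {"variant_1": 0.10, "variant_2": -0.05, "variant_3": 0.20}

# Hypothetical effect-allele counts for three participants.
participants = [
    {"variant_1": 2, "variant_2": 1, "variant_3": 0},
    {"variant_1": 0, "variant_2": 2, "variant_3": 1},
    {"variant_1": 1, "variant_2": 0, "variant_3": 2},
]

raw_scores = [genetic_score(p, effect_sizes) for p in participants]
print([round(s, 2) for s in raw_scores])
print([round(z, 2) for z in standardize(raw_scores)])
```

With imputed data the allele counts would typically be fractional dosages; the same weighted sum applies.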
Genetic scores for levels of non-HDL-C, LDL-C, HDL-C, and triglycerides.
The extent of angiographic CAD and coronary artery calcium quantity.
A total of 12 460 adults (mean [SD] age, 65.1 [10.7] years; 8383 men [67.3%]) underwent coronary angiography, and 4837 had coronary artery calcium assessed by computed tomography. A genetically predicted increase in non-HDL-C levels by 1 SD (38 mg/dL [to convert to millimoles per liter, multiply by 0.0259]) was associated with greater odds of obstructive CAD (odds ratio [OR], 1.83 [95% CI, 1.63-2.07]; P = 2.8 × 10(-23)). Among patients with obstructive CAD, there were significant associations with multivessel disease (OR, 1.26 [95% CI, 1.11-1.44]; P = 4.1 × 10(-4)) and 3-vessel disease (OR, 1.47 [95% CI, 1.26-1.72]; P = 9.2 × 10(-7)). There were also significant associations with the presence of coronary artery calcium (OR, 2.04 [95% CI, 1.70-2.44]; P = 5.3 × 10(-15)) and natural log-transformed coronary artery calcium (effect, 0.70 [95% CI, 0.53-0.87]; P = 1.0 × 10(-15)). Genetically predicted levels of non-HDL-C remained associated with obstructive CAD and coronary artery calcium extent even after accounting for the association with LDL-C. Genetically predicted levels of HDL-C and triglycerides were associated individually with the extent of coronary atherosclerosis, but not after accounting for the association with non-HDL cholesterol.
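As a worked example of the unit conversion quoted above, the short sketch below expresses the 1-SD increase in non-HDL-C in millimoles per liter. The factor 0.0259 and the 38 mg/dL value are taken from the text; the function and constant names are hypothetical.

```python
# Conversion factor for cholesterol quoted in the abstract (mg/dL to mmol/L).
CHOLESTEROL_MGDL_TO_MMOLL = 0.0259


def mgdl_to_mmoll(value_mgdl, factor=CHOLESTEROL_MGDL_TO_MMOLL):
    """Convert a cholesterol concentration from mg/dL to mmol/L."""
    return value_mgdl * factor


one_sd_mmoll = mgdl_to_mmoll(38)
print(round(one_sd_mmoll, 2))  # 0.98
```

So the reported odds ratios correspond to a genetically predicted increase of roughly 0.98 mmol/L in non-HDL-C.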
In this study, genetically predicted levels of non-HDL-C were associated with the extent of coronary atherosclerosis as estimated by 2 different methods. The association was stronger than for genetically predicted levels of LDL-C. These findings further support the notion that non-HDL-C may be a better marker of the overall burden of atherogenic lipoproteins than LDL-C.
Chronic kidney disease (CKD) is a worldwide public health problem that is associated with substantial morbidity and mortality. To search for sequence variants that associate with CKD, we conducted a genome-wide association study (GWAS) that included a total of 3,203 Icelandic cases and 38,782 controls. We observed an association between CKD and a variant with 80% population frequency, rs4293393-T, positioned next to the UMOD gene (GeneID: 7369) on chromosome 16p12 (OR = 1.25, P = 4.1x10(-10)). This gene encodes uromodulin (Tamm-Horsfall protein), the most abundant protein in mammalian urine. The variant also associates significantly with serum creatinine concentration (SCr) in Icelandic subjects (N = 24,635, P = 1.3 x 10(-23)) but not in a smaller set of healthy Dutch controls (N = 1,819, P = 0.39). Our findings validate the association between the UMOD variant and both CKD and SCr recently discovered in a large GWAS. In the Icelandic dataset, we demonstrate that the effect on SCr increases substantially with both age (P = 3.0 x 10(-17)) and number of comorbid diseases (P = 0.008). The association with CKD is also stronger in the older age groups. These results suggest that the UMOD variant may influence the adaptation of the kidney to age-related risk factors of kidney disease such as hypertension and diabetes. The variant also associates with serum urea (P = 1.0 x 10(-6)), uric acid (P = 0.0064), and suggestively with gout. In contrast to CKD, the UMOD variant confers protection against kidney stones when studied in 3,617 Icelandic and Dutch kidney stone cases and 43,201 controls (OR = 0.88, P = 5.7 x 10(-5)).
Attention-deficit/hyperactivity disorder (ADHD) is a highly heritable common childhood-onset neurodevelopmental disorder. Some rare copy number variations (CNVs) affect multiple neurodevelopmental disorders such as intellectual disability, autism spectrum disorders (ASD), schizophrenia and ADHD. The aim of this study is to determine to what extent ADHD shares high-risk CNV alleles with schizophrenia and ASD. We compiled 19 neuropsychiatric CNVs and tested 14, with sufficient power, for association with ADHD in Icelandic and Norwegian samples. Eight associate with ADHD: deletions at 2p16.3 (NRXN1), 15q11.2, 15q13.3 (BP4 & BP4.5-BP5) and 22q11.21, and duplications at 1q21.1 distal, 16p11.2 proximal, 16p13.11 and 22q11.21. Six of the CNVs have not been associated with ADHD before. As a group, the 19 CNVs associate with ADHD (OR = 2.43, P = 1.6 × 10(-21)), even when comorbid ASD and schizophrenia are excluded from the sample. These results highlight the pleiotropic effect of the neuropsychiatric CNVs and add evidence for ADHD, ASD and schizophrenia being related neurodevelopmental disorders rather than distinct entities.
BACKGROUND: The contribution of low-penetrant susceptibility variants to cancer is not clear. With the aim of searching for genetic factors that contribute to cancer at one or more sites in the body, we have analyzed familial aggregation of cancer in extended families based on all cancer cases diagnosed in Iceland over almost half a century. METHODS AND FINDINGS: We have estimated risk ratios (RRs) of cancer for first- and up to fifth-degree relatives both within and between all types of cancers diagnosed in Iceland from 1955 to 2002 by linking patient information from the Icelandic Cancer Registry to an extensive genealogical database, containing all living Icelanders and most of their ancestors since the settlement of Iceland. We evaluated the significance of the familial clustering for each relationship separately, all relationships combined (first- to fifth-degree relatives) and for close (first- and second-degree) and distant (third- to fifth-degree) relatives. Most cancer sites demonstrate a significantly increased RR for the same cancer, beyond the nuclear family. Significantly increased familial clustering between different cancer sites is also documented in both close and distant relatives. Some of these associations have been suggested previously but others not. CONCLUSION: We conclude that genetic factors are involved in the etiology of many cancers and that these factors are in some cases shared by different cancer sites. However, a significantly increased RR conferred upon mates of patients with cancer at some sites indicates that shared environment or nonrandom mating for certain risk factors also play a role in the familial clustering of cancer. Our results indicate that cancer is a complex, often non-site-specific disease for which increased risk extends beyond the nuclear family.
Genetic diversity arises from recombination and de novo mutation (DNM). Using a combination of microarray genotype and whole-genome sequence data on parent-child pairs, we identified 4,531,535 crossover recombinations and 200,435 DNMs. The resulting genetic map has a resolution of 682 base pairs. Crossovers exhibit a mutagenic effect, with overrepresentation of DNMs within 1 kilobase of crossovers in males and females. In females, a higher mutation rate is observed up to 40 kilobases from crossovers, particularly for complex crossovers, which increase with maternal age. We identified 35 loci associated with the recombination rate or the location of crossovers, demonstrating extensive genetic control of meiotic recombination, and our results highlight genes linked to the formation of the synaptonemal complex as determinants of crossovers.
Erratum in: Science. 2019 Feb 8;363(6427). PMID: 30733390
Meiotic recombination contributes to genetic diversity by yielding new combinations of alleles. Individuals vary with respect to the genome-wide recombination counts in their gametes. Exploiting data resources in Iceland, we compiled a data set consisting of 35,927 distinct parents and 71,929 parent-offspring pairs. Within this data set, we called over 2.2 million recombination events and imputed variants with sequence-level resolution from 2,261 whole genome-sequenced individuals into the parents to search for variants influencing recombination rate. We identified 13 variants in 8 regions that are associated with genome-wide recombination rate, 8 of which were previously unknown. Three of these variants associate with male recombination rate only, seven variants associate with female recombination rate only and three variants affect both. Two are low-frequency variants with large effects, one of which is estimated to increase the male and female genetic maps by 111 and 416 cM, respectively. This variant, located in an intron, would not be found by exome sequencing.
Alpha-fetoprotein (AFP), cancer antigens 15.3, 19.9, and 125, carcinoembryonic antigen, and alkaline phosphatase (ALP) are widely measured in attempts to detect cancer and to monitor treatment response. However, due to lack of sensitivity and specificity, their utility is debated. The serum levels of these markers are affected by a number of nonmalignant factors, including genotype. Thus, it may be possible to improve both sensitivity and specificity by adjusting test results for genetic effects.
We performed genome-wide association studies of serum levels of AFP (N = 22,686), carcinoembryonic antigen (N = 22,309), cancer antigens 15.3 (N = 7,107), 19.9 (N = 9,945), and 125 (N = 9,824), and ALP (N = 162,774). We also examined the correlations between levels of these biomarkers and the presence of cancer, using data from a nationwide cancer registry.
We report a total of 84 associations of 79 sequence variants with levels of the six biomarkers, explaining between 2.3% and 42.3% of the phenotypic variance. Among the 79 variants, 22 are cis (in or near the gene encoding the biomarker), 18 have minor allele frequency less than 1%, 31 are coding variants, and 7 are associated with gene expression in whole blood. We also find multiple conditions associated with higher biomarker levels.
Our results provide insights into the genetic contribution to diversity in concentration of tumor biomarkers in blood.
Genetic correction of biomarker values could improve prediction algorithms and decision-making based on these biomarkers.
Kidney stone disease is a complex disorder with a strong genetic component. We conducted a genome-wide association study of 28.3 million sequence variants detected through whole-genome sequencing of 2,636 Icelanders that were imputed into 5,419 kidney stone cases, including 2,172 cases with a history of recurrent kidney stones, and 279,870 controls. We identify sequence variants associating with kidney stones at ALPL (rs1256328[T], odds ratio (OR)=1.21, P=5.8 × 10(-10)) and a suggestive association at CASR (rs7627468[A], OR=1.16, P=2.0 × 10(-8)). Focusing our analysis on coding sequence variants in 63 genes with preferential kidney expression, we identify two rare missense variants, SLC34A1 p.Tyr489Cys (OR=2.38, P=2.8 × 10(-5)) and TRPV5 p.Leu530Arg (OR=3.62, P=4.1 × 10(-5)), associating with recurrent kidney stones. We also observe associations of the identified kidney stone variants with biochemical traits in a large population set, indicating a potential biological mechanism.
Creatine kinase (CK) and lactate dehydrogenase (LDH) are widely used markers of tissue damage. To search for sequence variants influencing serum levels of CK and LDH, 28.3 million sequence variants identified through whole-genome sequencing of 2,636 Icelanders were imputed into 63,159 and 98,585 people with CK and LDH measurements, respectively. Here we describe 13 variants associating with serum CK and 16 with LDH levels, including four that associate with both. Among those, 15 are non-synonymous variants and 12 have a minor allele frequency below 5%. We report sequence variants in genes encoding the enzymes being measured (CKM and LDHA), as well as in genes linked to muscular (ANO5) and immune/inflammatory function (CD163/CD163L1, CSF1, CFH, HLA-DQB1, LILRB5, NINJ1 and STAB1). A number of the genes are linked to the mononuclear/phagocyte system and clearance of enzymes from the serum. This highlights the variety in the sources of normal diversity in serum levels of enzymes.