Use of support vector machines for disease risk prediction in genome-wide association studies: concerns and opportunities.


The success of genome-wide association studies (GWAS) in deciphering the genetic architecture of complex diseases has fueled the expectations whether the individual risk can also be quantified based on the genetic architecture. So far, disease risk prediction based on top-validated single-nucleotide polymorphisms (SNPs) showed little predictive value. Here, we applied a support vector machine (SVM) to Parkinson disease (PD) and type 1 diabetes (T1D), to show that apart from magnitude of effect size of risk variants, heritability of the disease also plays an important role in disease risk prediction. Furthermore, we performed a simulation study to show the role of uncommon (frequency 1-5%) as well as rare variants (frequency