# Genome-wide association studies in R

This time I cover a much more specific subject, one that will mostly concern biologists and geneticists. I will do my best to outline the approach so that non-experts can still follow the basics. This tutorial illustrates the power of genome-wide association (GWA) studies by mapping the genetic determinants of cholesterol levels using …

# Partial least squares in R

My last entry introduced principal component analysis (PCA), one of many unsupervised learning tools. I concluded the post with a demonstration of principal component regression (PCR), which is essentially an ordinary least squares (OLS) fit using the first $k$ principal components (PCs) of the predictors. This brings about many advantages: There is virtually no …
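As a minimal sketch of the PCR idea described above (an OLS fit on the first $k$ PC scores), the following uses simulated data. The data, the variable names, and the choice `k <- 2` are illustrative assumptions, not taken from the original post.

```r
# Simulated predictors with deliberate collinearity (assumed example data)
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 5), nrow = n, ncol = 5)
X[, 2] <- X[, 1] + rnorm(n, sd = 0.1)   # column 2 nearly duplicates column 1
y <- X[, 1] - 2 * X[, 3] + rnorm(n)

# Step 1: PCA on the centered and scaled predictors
pca_fit <- prcomp(X, center = TRUE, scale. = TRUE)

# Step 2: keep the scores of the first k components (k chosen arbitrarily here)
k <- 2
scores <- pca_fit$x[, 1:k, drop = FALSE]

# Step 3: ordinary least squares of the response on those scores
pcr_fit <- lm(y ~ scores)
pcr_coef <- coef(pcr_fit)   # intercept plus one slope per retained PC
print(pcr_coef)
```

Because the PC scores are mutually uncorrelated, the collinearity between the first two columns of `X` does not destabilize the fit. In practice `k` would be chosen by cross-validation rather than fixed by hand.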

# Principal Component Analysis in R

Principal component analysis (PCA) is routinely employed on a wide range of problems. From the detection of outliers to predictive modeling, PCA projects observations described by $p$ variables onto a few orthogonal components defined along the directions in which the data 'stretch' the most, rendering a simplified overview. PCA is particularly powerful in dealing with multicollinearity …
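To make the projection concrete, here is a small sketch using base R's `prcomp()` on the built-in `USArrests` data set. The choice of data set is an assumption made for illustration only. It shows the share of variance captured by each component and checks that the component directions are orthogonal.

```r
# PCA on a built-in data set with p = 4 variables (illustrative choice)
usa_pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each component,
# i.e. how much the data 'stretch' along each direction
var_explained <- usa_pca$sdev^2 / sum(usa_pca$sdev^2)
print(round(var_explained, 3))

# The loading vectors are orthonormal: t(R) %*% R is the identity matrix
orthogonality <- crossprod(usa_pca$rotation)
print(round(orthogonality, 10))

# Observations projected onto the first two components
head(usa_pca$x[, 1:2])
```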

# Probability distributions in R

Some of the most fundamental functions in R, in my opinion, are those that deal with probability distributions. Whenever you compute a P-value you rely on a probability distribution, and there are many types out there. In this exercise I will cover four: the Bernoulli, binomial, Poisson, and normal distributions. Let me begin with some theory first: Bernoulli …
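For orientation, base R exposes each distribution through a `d`/`p`/`q`/`r` family (density or mass, cumulative probability, quantile, random draws). The sketch below touches the four distributions named above. All parameter values are arbitrary choices for illustration, and the Bernoulli case is treated as a binomial with `size = 1`, since base R has no separate Bernoulli functions.

```r
set.seed(42)

# Bernoulli: a binomial with a single trial
p_success <- dbinom(1, size = 1, prob = 0.3)          # P(X = 1)
bern_draws <- rbinom(5, size = 1, prob = 0.3)         # five 0/1 draws

# Binomial: number of successes in 10 trials
p_three <- dbinom(3, size = 10, prob = 0.5)           # P(X = 3)
p_at_most_three <- pbinom(3, size = 10, prob = 0.5)   # P(X <= 3)

# Poisson: event counts with mean rate lambda
p_two_events <- dpois(2, lambda = 4)                  # P(X = 2)

# Normal: upper-tail probability, the building block of many P-values
upper_tail <- pnorm(1.96, lower.tail = FALSE)         # P(Z > 1.96)
z_crit <- qnorm(0.975)                                # 97.5% quantile

print(c(p_success, p_three, p_at_most_three, p_two_events, upper_tail, z_crit))
```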