R/dim_reduction.R
calculate_sample_mahalanobis_distances.Rd
Determine each sample's distance from the center of the data using Mahalanobis distance.
calculate_sample_mahalanobis_distances(
tomic,
value_var = NULL,
max_pcs = 10,
scale = FALSE
)
The samples tibble with a new column `pc_distance`, which contains the Mahalanobis distance of each sample from the PC ellipsoid.
Since `romic` is built around tall data, where there are more features than samples, Mahalanobis distance cannot be calculated directly from the covariance matrix: with fewer samples than features, that matrix is singular and cannot be inverted. Instead, we use SVD to create a low-dimensional representation of the covariance matrix and calculate distances from the center of the data in this space. This essentially involves weighting the principal components by their loadings.
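To make the idea concrete, here is a minimal base-R sketch of a Mahalanobis-style distance computed in a truncated principal component space. It is an illustration under stated assumptions, not the `romic` implementation: the helper `pc_mahalanobis()` is hypothetical, it assumes a plain numeric matrix with samples in rows and features in columns, and it uses the textbook weighting (each PC score divided by that PC's variance), which may differ from the weighting `romic` applies.

```r
# Hypothetical helper (not part of romic): textbook Mahalanobis distance
# restricted to the leading principal components.
pc_mahalanobis <- function(x, max_pcs = 10, scale = FALSE) {
  # x: numeric matrix, samples in rows and features in columns (assumption)
  x_centered <- scale(x, center = TRUE, scale = scale)
  svd_out <- svd(x_centered)

  # Centering removes one degree of freedom, so at most n - 1 PCs carry variance
  n_pcs <- min(max_pcs, nrow(x) - 1, ncol(x))
  u <- svd_out$u[, seq_len(n_pcs), drop = FALSE]
  d <- svd_out$d[seq_len(n_pcs)]

  # Sample scores on each retained PC, and the variance of each PC
  scores <- sweep(u, 2, d, `*`)
  pc_var <- d^2 / (nrow(x) - 1)

  # Distance from the center, with each PC rescaled by its variance
  sqrt(rowSums(sweep(scores^2, 2, pc_var, `/`)))
}

# Simulated tall data: 6 samples, 50 features
set.seed(1)
x <- matrix(rnorm(6 * 50), nrow = 6)
pc_dist <- pc_mahalanobis(x, max_pcs = 3)
```

Because the retained PCs are uncorrelated, the covariance matrix in this space is diagonal and always invertible, which is what sidesteps the singular feature-level covariance matrix.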
calculate_sample_mahalanobis_distances(brauer_2008_tidy)
#> # A tibble: 36 × 4
#> sample nutrient DR pc_distance
#> <chr> <chr> <dbl> <dbl>
#> 1 G0.05 G 0.05 188.
#> 2 G0.1 G 0.1 125.
#> 3 G0.15 G 0.15 128.
#> 4 G0.2 G 0.2 125.
#> 5 G0.25 G 0.25 83.4
#> 6 G0.3 G 0.3 101.
#> 7 N0.05 N 0.05 371.
#> 8 N0.1 N 0.1 226.
#> 9 N0.15 N 0.15 123.
#> 10 N0.2 N 0.2 100.
#> # ℹ 26 more rows