| tags: [ GCTA Heritability ] categories: [Coding Experiments Reading ]
Re-estimating heritability of traits - GCTA
Introduction
To determine whether the low frequency variants were affecting the heritability estimates, heritabiity analysis must be re-performed with the new genomic data set.
Since ‘gaston’ is still unable to produce p-values for the heritability estimates, GCTA will be used to perform the analysis.
Methods and Results
1. Generate a GRM
./gcta64 --bfile merged_nies/merged_nies_geno071118 --make-grm --out gcta_output/merged_nies_grm091118
2. Perform PCA for inclusion as covariates
./gcta64 --grm gcta_output/merged_nies_grm091118 --pca 20 --out gcta_output/merged_nies_pca091118
This step performs a PCA on the genomic data, like in PLINK, however it only generates results for 2 principal components. The results from this step becomes the input as population structure covariate for the heritability analysis. I would like to consider varying the number of principal components included however, GCTA is not as flexible as gaston i.e. I would have to repeat this step and generate multiple sets of results to do so.
3. Perform heritability analysis on R K-value V
I am re-testing the potential effects of adding principal components to the heritability estimates. I will include age and sex as covariates to all tests and subsequently include 0, 1, 2, 3, 5, and 20 components.
*0 PC
./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno nies_pheno_020918.txt --mpheno 11 --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_age_covar091118.txt --out gcta_output/rkvalv_nopc_her_test
h2 = 0.529352, pval = 8.076e-05
*1 PC
./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno nies_pheno_020918.txt --mpheno 11 --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_1pc_cont_covar091118.txt --out gcta_output/rkvalv_1pc_her_test
h2 = 0.533720, pval = 1.508e-04
*2 PCs
./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno nies_pheno_020918.txt --mpheno 11 --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_2pc_cont_covar091118.txt --out gcta_output/rkvalv_2pc_her_test
h2 = 0.506478, pval = 1.556e-03
*3 PCs
./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno nies_pheno_020918.txt --mpheno 11 --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_3pc_cont_covar091118.txt --out gcta_output/rkvalv_3pc_her_test
h2 = 0.517451, pval = 1.4176e-03
*5 PCs
./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno nies_pheno_020918.txt --mpheno 11 --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_5pc_cont_covar091118.txt --out gcta_output/rkvalv_5pc_her_test
h2 = 0.533488 pval = 1.4065e-03
*20 PCs
./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno nies_pheno_020918.txt --mpheno 11 --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_20pc_cont_covar091118.txt --out gcta_output/rkvalv_20pc_her_test
h2 = 0.686707, pval = 1.7098e-05
4. Compare results
rkvalv_pc_test <- read.csv('C:/Users/Martha/Documents/Honours/Project/honours.project/Data/gcta_output/rkvalv_pc_her_test.csv')
rkvalv_pc_test
## R.K.val.V.heritability.estimates X X.1 X.2 X.3 PC1
## 1 NA h2 p-val NA
## 2 NA low freq 0.732774 1.68E-07 NA low freq
## 3 NA 0 0.529352 8.08E-05 NA 0
## 4 NA 1 0.53372 1.51E-03 NA 1
## 5 NA 2 0.506478 1.56E-03 NA 2
## 6 NA 3 0.517451 1.42E-03 NA 3
## 7 NA 5 0.533488 1.41E-03 NA 5
## 8 NA 20 0.6867 1.71E-05 NA 20
## X.4 X.5
## 1 h2 p-val
## 2 0.584877 1.47E-05
## 3 0.558604 2.65E-06
## 4 0.559359 7.03E-06
## 5 0.55413 2.35E-05
## 6 0.564873 1.46E-05
## 7 0.572998 1.62E-05
## 8 0.660708 8.97E-07
5. Repeat comparison with Component 1
*0 PC
./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno pheno_pca_coord.txt --mpheno 1 --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_age_covar091118.txt --out gcta_output/pc1_nopc_her_test
And so on, including 1, 2, 3, 4, 5, and 20 PCs.
pc_her_test <- read.csv('C:/Users/Martha/Documents/Honours/Project/honours.project/Data/gcta_output/rkvalv_pc1_pc_hertest.csv')
pc_her_test
## R.K.val.V X X.1 X.2 PC1 X.3 X.4
## 1 h2 p-val NA h2 p-val
## 2 low freq 0.732774 1.68E-07 NA low freq 0.584877 1.47E-05
## 3 0 0.529352 8.08E-05 NA 0 0.558604 2.65E-06
## 4 1 0.53372 1.51E-03 NA 1 0.559359 7.03E-06
## 5 2 0.506478 1.56E-03 NA 2 0.55413 2.35E-05
## 6 3 0.517451 1.42E-03 NA 3 0.564873 1.46E-05
## 7 5 0.533488 1.41E-03 NA 5 0.572998 1.62E-05
## 8 20 0.6867 1.71E-05 NA 20 0.660708 8.97E-07
Interestingly, these results indicate that removing the low frequency variants from the genomic data singificantly reduces the heritability estimate for this particular trait (R Kval V). Also, as per the previous analysis, the addition of 1 to 5 components does not significantly alter the estimates, but the addition of 2PCs produced the lowest estimate. These results also indicate that adding 20 components produces a significantly higher heritability estimate. Why?
5. Perform heritability analysis on all traits and components
Traits
for i in {1..27}
do ./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno nies_pheno_020918.txt --mpheno $i --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_2pc_cont_covar091118.txt --out gcta_output/$i_est
head $i_est.hsq >> trait.est.head.txt
done
Components
for i in {1..9}
do ./gcta64 --reml --grm gcta_output/merged_nies_grm0911178 --pheno pheno_pca_coord.txt --mpheno $i --reml-pred-rand --covar nies_sex_covar.txt --qcovar nies_2pc_cont_covar091118.txt --out gcta_output/$i_pc_est
done
new_heritability091118 <- read.csv('C:/Users/Martha/Documents/Honours/Project/honours.project/Data/gcta_output/heritability_161118/new_heritability161118.csv')
new_heritability091118
## X Trait h2 pval X.1 PC h2.1 pval.1
## 1 18 L.Pachimetry 0.6211 2.48e-06 NA 1 0.55410 2.35e-05
## 2 17 R.Pachimetry 0.5696 1.29e-05 NA 4 0.46860 6.74e-05
## 3 25 R.CDR 0.5628 3.13e-05 NA 7 0.23790 3.65e-02
## 4 27 totaluvafmm 0.5330 2.48e-05 NA 2 0.22110 7.83e-02
## 5 11 R.K.value.V 0.5065 1.56e-03 NA 8 0.20280 1.50e-02
## 6 26 L.CDR 0.4554 7.00e-04 NA 6 0.09980 2.04e-01
## 7 15 L.K.value.V 0.4466 8.98e-04 NA 3 0.04100 3.49e-01
## 8 9 R.K.value.H 0.4157 8.20e-04 NA 9 0.02296 4.10e-01
## 9 20 L.Axial.Length 0.3918 1.38e-03 NA 5 0.00000 5.00e-01
## 10 19 R.Axial.Length 0.3709 2.13e-03 NA NA NA NA
## 11 13 L.K.value.H 0.2997 1.48e-02 NA NA NA NA
## 12 22 L.AC.Depth 0.2372 1.21e-02 NA NA NA NA
## 13 23 R.IOP.mmHg 0.2298 2.84e-02 NA NA NA NA
## 14 21 R.AC.Depth 0.2221 1.46e-02 NA NA NA NA
## 15 24 L.IOP.mmHg 0.1944 3.90e-02 NA NA NA NA
## 16 8 L.Axis..pre.dilate. 0.1212 1.80e-01 NA NA NA NA
## 17 5 R.Axis..pre.dilate. 0.0310 4.13e-01 NA NA NA NA
## 18 1 R.Logmar.VA 0.0000 5.00e-01 NA NA NA NA
## 19 2 L.Logmar.VA 0.0000 5.00e-01 NA NA NA NA
## 20 3 R.Sph..pre.dilate. 0.0000 5.00e-01 NA NA NA NA
## 21 4 R.Cyl..pre.dilate. 0.0000 5.00e-01 NA NA NA NA
## 22 6 L.Sph..pre.dilate. 0.0000 5.00e-01 NA NA NA NA
## 23 7 L.Cyl..pre.dilate. 0.0000 5.00e-01 NA NA NA NA
## 24 10 R.K.Value.H.Axis 0.0000 5.00e-01 NA NA NA NA
## 25 12 R.K.value.V.Axis 0.0000 5.00e-01 NA NA NA NA
## 26 14 L.K.value.H.Axis 0.0000 5.00e-01 NA NA NA NA
## 27 16 L.K.value.V.Axis 0.0000 5.00e-01 NA NA NA NA
estimates are lower than the previous estimates
results seem more symmetrical (i.e. L/R values are significantly heritable but some L/R values still have ~10% difference)
Of note, R cyl and component 3 are no longer significantly heritable despite the associated variants identified in the previous GWAS analysis. Some concern that removing the rare variants may be producing less accurate results.
Include all variants for PCA and heritability analysis then filter out for GWAS analysis.