rm(list=ls())
If you haven’t already installed the necessary packages with
libraries, please do so!
For this practical, you will need:
install.packages(“metafor”)
install.packages(“plyr”)
install.packages(“meta”)
install.packages(“rmeta”)
library(metafor)
library(plyr)
library(meta)
library(rmeta)
Set working directory to your local directory.
For example:
setwd("/home/christopher/MR/PRACTICAL2/")
Identify independent SNPs from BMI GWAS for use as the
instrument variable of MR.
Merge and harmonize with SNPs from the CHD
GWAS.
Check for palindromic SNPs and for SNPs in opposing
directions.
Estimate Wald Ratio and meta analyze
results.
Calculate heterogeneity
statistics.
Run sensitivity analyses.
1. We will be using results from the Locke et al. 2015
paper using data from the GIANT Consortium (downloaded from https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files)
Note: If you want to have a look at the full GIANT data, then
download, save to your working directory and load in the full results
(this is a fairly large file) using the following code:
giant_full
<- read.table(“./GIANT_raw.uniq”, header = T)
In the Locke et al. paper, the authors describe a certain number of
SNPs that are “approximately independently associated with BMI” across
all ancestries.
Let’s have a look at these SNPs.
Note: This file was generated by taking the first rows from the
Supplementary Table 8 of the Locke et al. paper.
snps <- read.csv("./Data/giant_snps_all.csv", header = T)
a. How many SNPs do authors describe as being independently
associated with BMI in all ancestries?
dim(snps)
## [1] 97 14
head(snps)
## SNP CHR BP NearestGene Effect_Allele Other_Allele Beta SE
## 1 rs1558902 16 52,361,075 FTO A T 0.081 0.003
## 2 rs6567160 18 55,980,115 MC4R C T 0.056 0.004
## 3 rs13021737 2 622,348 TMEM18 G A 0.060 0.004
## 4 rs10938397 4 44,877,284 GNPDA2 G A 0.040 0.003
## 5 rs543874 1 176,156,103 SEC16B G A 0.050 0.004
## 6 rs2207139 6 50,953,449 TFAP2B G A 0.045 0.004
## EAF Variance N P Sig_Analysis Sig_P
## 1 0.409 0.316% 336,974 1.13e-156 European sex-combined 7.51e-153
## 2 0.236 0.114% 339,006 6.68e-59 European sex-combined 3.93e-53
## 3 0.830 0.103% 333,169 5.44e-54 European sex-combined 1.11e-50
## 4 0.428 0.078% 337,092 1.42e-40 European sex-combined 3.21e-38
## 5 0.195 0.077% 339,078 2.29e-40 European sex-combined 2.62e-35
## 6 0.176 0.058% 339,089 8.06e-31 European sex-combined 4.13e-29
b. What information does this file contain that are needed for
Mendelian randomization analyses?
colnames(snps)
## [1] "SNP" "CHR" "BP" "NearestGene"
## [5] "Effect_Allele" "Other_Allele" "Beta" "SE"
## [9] "EAF" "Variance" "N" "P"
## [13] "Sig_Analysis" "Sig_P"
c. How many are associated within only Europeans?
table(snps$Sig_Analysis)
##
## All Ancestries European Men European Population Based
## 10 3 4
## European sex-combined European Women
## 77 3
d. Why might it be best to use the SNPs that have been
identified as being associated with BMI in Europeans only?
2. Read in the second sheet of this file to get the estimates of
the SNPs associated with BMI in Europeans.
euro_snps <- read.csv("./Data/giant_snps_euro.csv", header = T)
dim(euro_snps)
## [1] 77 12
head(euro_snps)
## SNP CHR BP NearestGene Effect_Allele Other_Allele Beta SE
## 1 rs1558902 16 52,361,075 FTO A T 0.082 0.003
## 2 rs6567160 18 55,980,115 MC4R C T 0.056 0.004
## 3 rs13021737 2 622,348 TMEM18 G A 0.060 0.004
## 4 rs10938397 4 44,877,284 GNPDA2 G A 0.040 0.003
## 5 rs543874 1 176,156,103 SEC16B G A 0.048 0.004
## 6 rs2207139 6 50,953,449 TFAP2B G A 0.045 0.004
## EAF Variance N P
## 1 0.415 0.325% 320,073 7.51e-153
## 2 0.236 0.111% 321,958 3.93e-53
## 3 0.828 0.103% 318,287 1.11e-50
## 4 0.434 0.079% 320,955 3.21e-38
## 5 0.193 0.072% 322,008 2.62e-35
## 6 0.177 0.058% 322,019 4.13e-29
a. Check that these are all associated with BMI at a
conventional level of genome-wide significance.
sort(euro_snps$P)
## [1] 7.51e-153 3.93e-53 1.11e-50 3.21e-38 2.62e-35 4.13e-29 5.56e-28
## [8] 2.66e-26 8.15e-24 8.78e-24 3.14e-23 1.89e-22 1.48e-18 4.59e-18
## [15] 1.91e-17 5.15e-17 6.19e-17 3.28e-15 4.81e-15 1.23e-14 1.74e-14
## [22] 6.61e-14 7.03e-14 5.48e-13 1.09e-12 1.31e-12 1.83e-12 2.07e-12
## [29] 1.11e-11 1.14e-11 2.07e-11 2.25e-11 2.90e-11 5.94e-11 1.15e-10
## [36] 1.17e-10 1.63e-10 1.75e-10 1.83e-10 1.94e-10 2.29e-10 3.55e-10
## [43] 6.33e-10 7.47e-10 7.91e-10 8.11e-10 1.33e-09 1.92e-09 2.48e-09
## [50] 2.49e-09 3.99e-09 4.56e-09 7.34e-09 7.41e-09 7.76e-09 8.45e-09
## [57] 1.20e-08 1.25e-08 1.28e-08 1.39e-08 1.48e-08 1.61e-08 1.83e-08
## [64] 1.89e-08 2.02e-08 2.31e-08 2.41e-08 2.55e-08 2.60e-08 2.67e-08
## [71] 2.96e-08 2.97e-08 3.19e-08 3.42e-08 3.86e-08 4.17e-08 4.89e-08
length(which(euro_snps$P<=5E-8))
## [1] 77
b. Are all of these SNPs “good instruments”?
What else might
we want to check to see if they are strongly and independently
associated with BMI?
3. We’re going to make sure
the effect allele is the allele that increases BMI using the effect
allele and beta column. Browse the data, are all SNPs coded so that the
effect allele increases BMI?
a. Are all SNP effects in
the same direction?
euro_snps[,c("SNP","Effect_Allele","Beta","SE","P")]
## SNP Effect_Allele Beta SE P
## 1 rs1558902 A 0.082 0.003 7.51e-153
## 2 rs6567160 C 0.056 0.004 3.93e-53
## 3 rs13021737 G 0.060 0.004 1.11e-50
## 4 rs10938397 G 0.040 0.003 3.21e-38
## 5 rs543874 G 0.048 0.004 2.62e-35
## 6 rs2207139 G 0.045 0.004 4.13e-29
## 7 rs11030104 A 0.041 0.004 5.56e-28
## 8 rs3101336 C 0.033 0.003 2.66e-26
## 9 rs7138803 A 0.032 0.003 8.15e-24
## 10 rs10182181 G 0.031 0.003 8.78e-24
## 11 rs3888190 A 0.031 0.003 3.14e-23
## 12 rs1516725 C 0.045 0.005 1.89e-22
## 13 rs12446632 G 0.040 0.005 1.48e-18
## 14 rs2287019 C 0.036 0.004 4.59e-18
## 15 rs16951275 T 0.031 0.004 1.91e-17
## 16 rs3817334 T 0.026 0.003 5.15e-17
## 17 rs2112347 T 0.026 0.003 6.19e-17
## 18 rs12566985 G 0.024 0.003 3.28e-15
## 19 rs3810291 A 0.028 0.004 4.81e-15
## 20 rs7141420 T 0.024 0.003 1.23e-14
## 21 rs13078960 G 0.030 0.004 1.74e-14
## 22 rs10968576 G 0.025 0.003 6.61e-14
## 23 rs17024393 C 0.066 0.009 7.03e-14
## 24 rs657452 A 0.023 0.003 5.48e-13
## 25 rs12429545 A 0.033 0.005 1.09e-12
## 26 rs12286929 G 0.022 0.003 1.31e-12
## 27 rs13107325 T 0.048 0.007 1.83e-12
## 28 rs11165643 T 0.022 0.003 2.07e-12
## 29 rs7903146 C 0.023 0.003 1.11e-11
## 30 rs10132280 C 0.023 0.003 1.14e-11
## 31 rs17405819 T 0.022 0.003 2.07e-11
## 32 rs1016287 T 0.023 0.003 2.25e-11
## 33 rs4256980 G 0.021 0.003 2.90e-11
## 34 rs17094222 C 0.025 0.004 5.94e-11
## 35 rs12401738 A 0.021 0.003 1.15e-10
## 36 rs7599312 G 0.022 0.003 1.17e-10
## 37 rs2365389 C 0.020 0.003 1.63e-10
## 38 rs205262 G 0.022 0.004 1.75e-10
## 39 rs2820292 C 0.020 0.003 1.83e-10
## 40 rs12885454 C 0.021 0.003 1.94e-10
## 41 rs12016871 T 0.030 0.005 2.29e-10
## 42 rs16851483 T 0.048 0.008 3.55e-10
## 43 rs1167827 G 0.020 0.003 6.33e-10
## 44 rs758747 T 0.023 0.004 7.47e-10
## 45 rs1928295 T 0.019 0.003 7.91e-10
## 46 rs9925964 A 0.019 0.003 8.11e-10
## 47 rs11126666 A 0.021 0.003 1.33e-09
## 48 rs2650492 A 0.021 0.004 1.92e-09
## 49 rs6804842 G 0.019 0.003 2.48e-09
## 50 rs12940622 G 0.018 0.003 2.49e-09
## 51 rs11847697 T 0.049 0.008 3.99e-09
## 52 rs4740619 T 0.018 0.003 4.56e-09
## 53 rs13191362 A 0.028 0.005 7.34e-09
## 54 rs3736485 A 0.018 0.003 7.41e-09
## 55 rs17001654 G 0.031 0.005 7.76e-09
## 56 rs11191560 C 0.031 0.005 8.45e-09
## 57 rs1528435 T 0.018 0.003 1.20e-08
## 58 rs2075650 A 0.026 0.005 1.25e-08
## 59 rs1000940 G 0.019 0.003 1.28e-08
## 60 rs2033529 G 0.019 0.003 1.39e-08
## 61 rs11583200 C 0.018 0.003 1.48e-08
## 62 rs9400239 C 0.019 0.003 1.61e-08
## 63 rs10733682 A 0.017 0.003 1.83e-08
## 64 rs11688816 G 0.017 0.003 1.89e-08
## 65 rs11057405 G 0.031 0.006 2.02e-08
## 66 rs2121279 T 0.025 0.004 2.31e-08
## 67 rs29941 G 0.018 0.003 2.41e-08
## 68 rs11727676 T 0.036 0.006 2.55e-08
## 69 rs3849570 A 0.019 0.003 2.60e-08
## 70 rs6477694 C 0.017 0.003 2.67e-08
## 71 rs7899106 G 0.040 0.007 2.96e-08
## 72 rs2176598 T 0.020 0.004 2.97e-08
## 73 rs2245368 C 0.032 0.006 3.19e-08
## 74 rs17724992 A 0.019 0.004 3.42e-08
## 75 rs7243357 T 0.022 0.004 3.86e-08
## 76 rs1808579 C 0.017 0.003 4.17e-08
## 77 rs2033732 C 0.019 0.004 4.89e-08
summary(euro_snps$Beta)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.01700 0.02000 0.02300 0.02847 0.03200 0.08200
1. These were downloaded from the CARDIOGRAM website http://www.cardiogramplusc4d.org/ and provided for you
(however, this is quite a large file so we have truncated for you using
the following code):
CARDIOGRAM <-
read.table(“./CARDIoGRAM_GWAS_RESULTS.txt”, header = T)
BMI_SNPS_in_CARDIOGRAM <- CARDIOGRAM[CARDIOGRAM$SNP %in% euro_snps$SNP,]
top_1000 <- CARDIOGRAM[c(1:1000),]
bottom_1000 <- CARDIOGRAM[c((nrow(CARDIOGRAM)-1000):(nrow(CARDIOGRAM))),]
CARDIOGRAM_TRUNC <- rbind(BMI_SNPS_in_CARDIOGRAM, top_1000, bottom_1000)
write.table(CARDIOGRAM_TRUNC, “./CARDIOGRAM_CLEANED.txt”, quote = F,
col.names = T, row.names = F, sep = “)
a. How many SNPs are
in this truncated CARDIOGRAM dataset?
CARDIOGRAM <- read.table("./Data/CARDIOGRAM_CLEANED.txt", sep="\t", header=T, colClasses = "character")
dim(CARDIOGRAM)
## [1] 2078 12
head(CARDIOGRAM)
## SNP chr_pos_.b36. reference_allele other_allele ref_allele_frequency
## 1 rs657452 chr1:49362434 G A 0.60513584
## 2 rs11583200 chr1:50332407 C T 0.39664739
## 3 rs3101336 chr1:72523773 C T 0.60256842
## 4 rs12566985 chr1:74774781 G A 0.44899777
## 5 rs12401738 chr1:78219349 G A 0.62028006
## 6 rs11165643 chr1:96696685 C T 0.41074541
## pvalue het_pvalue log_odds log_odds_se N_case N_control model
## 1 0.2262602 0.27224751 -0.0174996 0.0144619 19723 57869 FE
## 2 0.1818971 0.45532089 0.0190464 0.0142676 20247 58547 FE
## 3 0.6796559 0.57600141 0.005948 0.0144044 19760 58014 FE
## 4 0.3564933 0.42546079 -0.0128532 0.0139396 20806 61606 FE
## 5 0.1955482 0.24955897 -0.0195047 0.0150693 18196 58218 FE
## 6 0.9557793 0.82853378 -0.0007751 0.0139775 20644 61437 FE
b. Does this file contain everything that is required to perform
a two-sample Mendelian randomization analysis?
colnames(CARDIOGRAM)
## [1] "SNP" "chr_pos_.b36." "reference_allele"
## [4] "other_allele" "ref_allele_frequency" "pvalue"
## [7] "het_pvalue" "log_odds" "log_odds_se"
## [10] "N_case" "N_control" "model"
2. How many of the BMI SNPs are included in the
CARDIOGRAM dataset?
BMI_SNPs <- euro_snps$SNP
BMI_SNPs <- as.vector(BMI_SNPs)
matches <- unique(grep(paste(BMI_SNPs, collapse="|"), CARDIOGRAM$SNP, value=TRUE))
matches
## [1] "rs657452" "rs11583200" "rs3101336" "rs12566985" "rs12401738"
## [6] "rs11165643" "rs17024393" "rs543874" "rs2820292" "rs13021737"
## [11] "rs10182181" "rs11126666" "rs1016287" "rs11688816" "rs2121279"
## [16] "rs1528435" "rs7599312" "rs6804842" "rs2365389" "rs3849570"
## [21] "rs13078960" "rs16851483" "rs1516725" "rs10938397" "rs17001654"
## [26] "rs13107325" "rs11727676" "rs2112347" "rs205262" "rs2033529"
## [31] "rs2207139" "rs9400239" "rs13191362" "rs1167827" "rs2245368"
## [36] "rs17405819" "rs2033732" "rs4740619" "rs10968576" "rs6477694"
## [41] "rs1928295" "rs10733682" "rs7899106" "rs17094222" "rs11191560"
## [46] "rs7903146" "rs4256980" "rs11030104" "rs2176598" "rs3817334"
## [51] "rs12286929" "rs7138803" "rs11057405" "rs12016871" "rs12429545"
## [56] "rs10132280" "rs12885454" "rs11847697" "rs7141420" "rs3736485"
## [61] "rs16951275" "rs758747" "rs12446632" "rs2650492" "rs3888190"
## [66] "rs9925964" "rs1558902" "rs1000940" "rs12940622" "rs1808579"
## [71] "rs7243357" "rs6567160" "rs17724992" "rs29941" "rs2075650"
## [76] "rs2287019" "rs3810291"
View the data from CARDIOGRAM for our BMI SNPs.
CARDIOGRAM_BMI <- CARDIOGRAM[grepl(paste(BMI_SNPs, collapse="|"), CARDIOGRAM$SNP),]
CARDIOGRAM_BMI
## SNP chr_pos_.b36. reference_allele other_allele
## 1 rs657452 chr1:49362434 G A
## 2 rs11583200 chr1:50332407 C T
## 3 rs3101336 chr1:72523773 C T
## 4 rs12566985 chr1:74774781 G A
## 5 rs12401738 chr1:78219349 G A
## 6 rs11165643 chr1:96696685 C T
## 7 rs17024393 chr1:109956211 C T
## 8 rs543874 chr1:176156103 G A
## 9 rs2820292 chr1:200050910 C A
## 10 rs13021737 chr2:622348 G A
## 11 rs10182181 chr2:25003800 G A
## 12 rs11126666 chr2:26782315 G A
## 13 rs1016287 chr2:59159129 C T
## 14 rs11688816 chr2:62906552 G A
## 15 rs2121279 chr2:142759755 C T
## 16 rs1528435 chr2:181259207 C T
## 17 rs7599312 chr2:213121476 G A
## 18 rs6804842 chr3:25081441 G A
## 19 rs2365389 chr3:61211502 C T
## 20 rs3849570 chr3:81874802 C A
## 21 rs13078960 chr3:85890280 G T
## 22 rs16851483 chr3:142758126 G T
## 23 rs1516725 chr3:187306698 C T
## 24 rs10938397 chr4:44877284 G A
## 25 rs17001654 chr4:77348592 C G
## 26 rs13107325 chr4:103407732 C T
## 27 rs11727676 chr4:145878514 C T
## 28 rs2112347 chr5:75050998 G T
## 29 rs205262 chr6:34671142 G A
## 30 rs2033529 chr6:40456631 G A
## 31 rs2207139 chr6:50953449 G A
## 32 rs9400239 chr6:109084356 C T
## 33 rs13191362 chr6:162953340 G A
## 34 rs1167827 chr7:75001105 G A
## 35 rs2245368 chr7:76446079 C T
## 36 rs17405819 chr8:76969139 C T
## 37 rs2033732 chr8:85242264 C T
## 38 rs4740619 chr9:15624326 C T
## 39 rs10968576 chr9:28404339 G A
## 40 rs6477694 chr9:110972163 C T
## 41 rs1928295 chr9:119418304 C T
## 42 rs10733682 chr9:128500735 G A
## 43 rs7899106 chr10:87400884 G A
## 44 rs17094222 chr10:102385430 C T
## 45 rs11191560 chr10:104859028 C T
## 46 rs7903146 chr10:114748339 C T
## 47 rs4256980 chr11:8630515 C G
## 48 rs11030104 chr11:27641093 G A
## 49 rs2176598 chr11:43820854 C T
## 50 rs3817334 chr11:47607569 C T
## 51 rs12286929 chr11:114527614 G A
## 52 rs7138803 chr12:48533735 G A
## 53 rs11057405 chr12:121347850 G A
## 54 rs12016871 chr13:26915782 C T
## 55 rs12429545 chr13:53000207 G A
## 56 rs10132280 chr14:24998019 C A
## 57 rs12885454 chr14:28806589 C A
## 58 rs11847697 chr14:29584863 C T
## 59 rs7141420 chr14:78969207 C T
## 60 rs3736485 chr15:49535902 G A
## 61 rs16951275 chr15:65864222 C T
## 62 rs758747 chr16:3567359 C T
## 63 rs12446632 chr16:19842890 G A
## 64 rs2650492 chr16:28240912 G A
## 65 rs3888190 chr16:28796987 C A
## 66 rs9925964 chr16:31037396 G A
## 67 rs1558902 chr16:52361075 T A
## 68 rs1000940 chr17:5223976 G A
## 69 rs12940622 chr17:76230166 G A
## 70 rs1808579 chr18:19358886 C T
## 71 rs7243357 chr18:55034299 G T
## 72 rs6567160 chr18:55980115 C T
## 73 rs17724992 chr19:18315825 G A
## 74 rs29941 chr19:39001372 G A
## 75 rs2075650 chr19:50087459 G A
## 76 rs2287019 chr19:50894012 C T
## 77 rs3810291 chr19:52260843 G A
## ref_allele_frequency pvalue het_pvalue log_odds log_odds_se N_case
## 1 0.60513584 0.2262602 0.27224751 -0.0174996 0.0144619 19723
## 2 0.39664739 0.1818971 0.45532089 0.0190464 0.0142676 20247
## 3 0.60256842 0.6796559 0.57600141 0.005948 0.0144044 19760
## 4 0.44899777 0.3564933 0.42546079 -0.0128532 0.0139396 20806
## 5 0.62028006 0.1955482 0.24955897 -0.0195047 0.0150693 18196
## 6 0.41074541 0.9557793 0.82853378 -0.0007751 0.0139775 20644
## 7 0.03280444 0.7560841 0.13856056 0.0139407 0.0448792 14934
## 8 0.19669364 0.7931315 0.27599476 -0.004636 0.0176779 21134
## 9 0.56965298 0.0099703 0.58433751 0.0359029 0.0139328 21198
## 10 0.80895378 0.3153078 0.69171591 0.0186889 0.0186117 20834
## 11 0.46240927 0.9498548 0.61182332 0.0009032 0.0143625 19568
## 12 0.71929726 0.1809574 0.89136444 0.0209674 0.0156729 21443
## 13 0.72117719 0.102763 0.12279346 -0.0254175 0.0155782 20468
## 14 0.5348187 0.9710981 0.35503175 0.0005135 0.0141725 19915
## 15 0.86261919 0.8737069 0.41814919 -0.0032328 0.0203383 20892
## 16 0.37962868 0.7415367 0.81685878 0.0047002 0.0142509 20460
## 17 0.70337496 0.9192913 0.10235164 0.0016027 0.0158173 20703
## 18 0.57965224 0.9107156 0.67473071 0.0016094 0.0143525 20403
## 19 0.55986741 0.3350863 0.91057566 0.0136276 0.0141377 20635
## 20 0.61947841 0.126133 0.00935007 -0.0390811 0.0255511 20556
## 21 0.22900043 0.1345178 0.03395648 -0.0258172 0.0172515 20549
## 22 0.93429797 0.5088742 0.05493564 -0.0188109 0.0284759 19390
## 23 0.8590188 0.1629709 0.22517731 -0.0279354 0.0200233 20476
## 24 0.41363733 0.0278084 0.68643434 0.0349131 0.0158698 16024
## 25 0.83729352 0.5035566 0.53373783 -0.0126514 0.0189136 21412
## 26 0.97601496 0.9074287 0.45904911 -0.0049381 0.0424663 14269
## 27 0.08996777 0.4402455 0.48987311 0.0253447 0.0328393 15668
## 28 0.36318919 0.5414854 0.58598695 -0.0088627 0.0145155 20579
## 29 0.25943383 0.0001102 0.4801711 0.0614484 0.0158908 20409
## 30 0.26951838 0.6939016 0.9186536 -0.0061959 0.015743 18908
## 31 0.17342856 0.3550604 0.6093131 0.0170642 0.0184515 20539
## 32 0.71596318 0.0227834 0.07735518 0.0348705 0.0153139 20460
## 33 0.12900275 0.5722799 0.54836822 -0.0116552 0.0206397 21401
## 34 0.59298726 0.2370525 0.84489268 0.0190735 0.0161314 15647
## 35 0.16123262 0.5501828 0.14094314 0.0303632 0.0508183 3340
## 36 0.32156614 0.9167122 0.22795208 -0.0015648 0.0149631 20724
## 37 0.75652234 0.1335914 0.04706672 -0.0253631 0.0169077 18943
## 38 0.46589474 0.2675504 0.95611066 -0.0154694 0.0139524 20779
## 39 0.31119502 0.5871441 0.53064937 0.00811 0.0149361 20590
## 40 0.3768822 0.0596142 0.73673365 -0.026955 0.0143101 21572
## 41 0.44991607 0.970387 0.77734102 0.00052 0.0140084 21633
## 42 0.52449229 0.1829671 0.5076359 -0.0212797 0.0159796 15670
## 43 0.06251531 0.5857598 0.35079691 -0.0163075 0.0299225 20393
## 44 0.22353262 0.72768 0.77948036 -0.0059791 0.0171708 21528
## 45 0.11109357 0.0001767 0.53911165 -0.0966196 0.0257636 20098
## 46 0.69817348 0.2034467 0.36081083 -0.0192779 0.0151581 19910
## 47 0.34826703 0.299491 0.02913706 0.0150656 0.0145207 20711
## 48 0.19185393 0.0496195 0.36961605 -0.034644 0.0176464 21638
## 49 0.74580823 0.1264641 0.05596993 -0.0247371 0.0161872 20452
## 50 0.58779101 0.1194239 0.81445328 -0.0219146 0.0140731 21525
## 51 0.50744154 0.9892883 0.46198211 0.0001851 0.0137839 21869
## 52 0.61809264 0.9656206 0.9396865 0.0006419 0.0148916 19151
## 53 0.92089578 0.4796771 0.59224188 -0.0214346 0.0303253 13629
## 54 0.80407573 0.3017645 0.36210717 -0.0181606 0.0175863 20330
## 55 0.845308 0.0396211 0.93056094 -0.0480905 0.0233712 16413
## 56 0.66888613 0.302347 0.2972794 0.0169764 0.0164594 16945
## 57 0.631088 0.9719424 0.66514854 -0.0005252 0.0149317 19926
## 58 0.96069173 0.0476032 0.26232231 -0.0707465 0.0357144 20155
## 59 0.48985341 0.8946607 0.95298609 -0.0018271 0.0137989 21258
## 60 0.55461993 0.5162199 0.10728276 0.0091099 0.0140328 20940
## 61 0.235351 0.0024846 0.22809659 -0.0513109 0.0169611 20781
## 62 0.74188258 0.0076412 0.91893399 0.0613185 0.022987 8437
## 63 0.84624483 0.6290302 0.33198375 -0.0101474 0.0210052 20232
## 64 0.70112768 0.5639757 0.05832099 0.0126597 0.0219426 8628
## 65 0.57882566 0.8709159 0.00489802 -0.0042697 0.026276 19770
## 66 0.38141338 0.6696603 0.2066786 0.0060442 0.0141679 20484
## 67 0.58450108 0.0205892 0.26380736 -0.0324866 0.0140305 20777
## 68 0.31640352 0.5722296 0.79718646 -0.0091901 0.0162722 16633
## 69 0.55719561 0.6813264 0.13926311 -0.005711 0.0139071 20824
## 70 0.54056958 0.500074 0.17404987 0.0096342 0.0142861 19215
## 71 0.19721707 0.5943844 0.73397076 -0.0098297 0.0184597 20404
## 72 0.26395759 0.13325 0.33646334 0.024367 0.0162294 20445
## 73 0.25712419 0.002919 0.87146506 -0.0493236 0.016573 19109
## 74 0.67182131 0.7661974 0.88467459 -0.0044497 0.0149645 20307
## 75 0.1705933 0.1957039 0.33474452 0.0378692 0.0292677 6913
## 76 0.79836678 0.0618941 0.086903 0.045534 0.0243882 8640
## 77 0.33939316 0.0303086 0.59137189 -0.0422661 0.0195131 11287
## N_control model
## 1 57869 FE
## 2 58547 FE
## 3 58014 FE
## 4 61606 FE
## 5 58218 FE
## 6 61437 FE
## 7 53634 FE
## 8 61643 FE
## 9 61870 FE
## 10 61762 FE
## 11 56734 FE
## 12 59360 FE
## 13 61571 FE
## 14 58201 FE
## 15 61352 FE
## 16 61311 FE
## 17 57717 FE
## 18 61351 FE
## 19 61523 FE
## 20 61394 RE
## 21 61416 FE
## 22 59683 FE
## 23 61368 FE
## 24 55685 FE
## 25 61536 FE
## 26 35628 FE
## 27 53910 FE
## 28 61408 FE
## 29 58712 FE
## 30 56847 FE
## 31 61411 FE
## 32 61373 FE
## 33 61861 FE
## 34 51329 FE
## 35 18930 FE
## 36 61620 FE
## 37 59038 FE
## 38 61682 FE
## 39 61535 FE
## 40 59220 FE
## 41 59309 FE
## 42 51340 FE
## 43 60788 FE
## 44 62065 FE
## 45 60493 FE
## 46 58247 FE
## 47 61646 FE
## 48 59361 FE
## 49 61406 FE
## 50 61809 FE
## 51 62138 FE
## 52 56023 FE
## 53 49312 FE
## 54 61194 FE
## 55 52969 FE
## 56 58264 FE
## 57 60456 FE
## 58 61186 FE
## 59 62144 FE
## 60 61375 FE
## 61 61718 FE
## 62 44300 FE
## 63 58533 FE
## 64 44465 FE
## 65 59987 RE
## 66 61554 FE
## 67 61645 FE
## 68 54057 FE
## 69 61691 FE
## 70 57225 FE
## 71 61241 FE
## 72 61296 FE
## 73 59755 FE
## 74 58647 FE
## 75 27904 FE
## 76 44554 FE
## 77 50470 FE
3. Merge the GIANT and CARDIOGRAM SNP summary
associations.
First, make sure the column headings are
easy to understand (i.e., add “BMI” and “CHD” onto the respective
datasets).
colnames(euro_snps) <- paste("BMI", colnames(euro_snps), sep = "_")
head(euro_snps)
## BMI_SNP BMI_CHR BMI_BP BMI_NearestGene BMI_Effect_Allele
## 1 rs1558902 16 52,361,075 FTO A
## 2 rs6567160 18 55,980,115 MC4R C
## 3 rs13021737 2 622,348 TMEM18 G
## 4 rs10938397 4 44,877,284 GNPDA2 G
## 5 rs543874 1 176,156,103 SEC16B G
## 6 rs2207139 6 50,953,449 TFAP2B G
## BMI_Other_Allele BMI_Beta BMI_SE BMI_EAF BMI_Variance BMI_N BMI_P
## 1 T 0.082 0.003 0.415 0.325% 320,073 7.51e-153
## 2 T 0.056 0.004 0.236 0.111% 321,958 3.93e-53
## 3 A 0.060 0.004 0.828 0.103% 318,287 1.11e-50
## 4 A 0.040 0.003 0.434 0.079% 320,955 3.21e-38
## 5 A 0.048 0.004 0.193 0.072% 322,008 2.62e-35
## 6 A 0.045 0.004 0.177 0.058% 322,019 4.13e-29
colnames(CARDIOGRAM) <- paste("CHD", colnames(CARDIOGRAM), sep = "_")
head(CARDIOGRAM)
## CHD_SNP CHD_chr_pos_.b36. CHD_reference_allele CHD_other_allele
## 1 rs657452 chr1:49362434 G A
## 2 rs11583200 chr1:50332407 C T
## 3 rs3101336 chr1:72523773 C T
## 4 rs12566985 chr1:74774781 G A
## 5 rs12401738 chr1:78219349 G A
## 6 rs11165643 chr1:96696685 C T
## CHD_ref_allele_frequency CHD_pvalue CHD_het_pvalue CHD_log_odds
## 1 0.60513584 0.2262602 0.27224751 -0.0174996
## 2 0.39664739 0.1818971 0.45532089 0.0190464
## 3 0.60256842 0.6796559 0.57600141 0.005948
## 4 0.44899777 0.3564933 0.42546079 -0.0128532
## 5 0.62028006 0.1955482 0.24955897 -0.0195047
## 6 0.41074541 0.9557793 0.82853378 -0.0007751
## CHD_log_odds_se CHD_N_case CHD_N_control CHD_model
## 1 0.0144619 19723 57869 FE
## 2 0.0142676 20247 58547 FE
## 3 0.0144044 19760 58014 FE
## 4 0.0139396 20806 61606 FE
## 5 0.0150693 18196 58218 FE
## 6 0.0139775 20644 61437 FE
merged <- merge(euro_snps,CARDIOGRAM, by.x="BMI_SNP", by.y="CHD_SNP")
dim(merged)
## [1] 77 23
head(merged)
## BMI_SNP BMI_CHR BMI_BP BMI_NearestGene BMI_Effect_Allele
## 1 rs1000940 17 5,223,976 RABEP1 G
## 2 rs10132280 14 24,998,019 STXBP6 C
## 3 rs1016287 2 59,159,129 LINC01122 T
## 4 rs10182181 2 25,003,800 ADCY3 G
## 5 rs10733682 9 128,500,735 LMX1B A
## 6 rs10938397 4 44,877,284 GNPDA2 G
## BMI_Other_Allele BMI_Beta BMI_SE BMI_EAF BMI_Variance BMI_N BMI_P
## 1 A 0.019 0.003 0.320 0.016% 321,836 1.28e-08
## 2 A 0.023 0.003 0.682 0.023% 321,797 1.14e-11
## 3 C 0.023 0.003 0.287 0.021% 321,969 2.25e-11
## 4 A 0.031 0.003 0.462 0.047% 321,759 8.78e-24
## 5 G 0.017 0.003 0.478 0.015% 320,727 1.83e-08
## 6 A 0.040 0.003 0.434 0.079% 320,955 3.21e-38
## CHD_chr_pos_.b36. CHD_reference_allele CHD_other_allele
## 1 chr17:5223976 G A
## 2 chr14:24998019 C A
## 3 chr2:59159129 C T
## 4 chr2:25003800 G A
## 5 chr9:128500735 G A
## 6 chr4:44877284 G A
## CHD_ref_allele_frequency CHD_pvalue CHD_het_pvalue CHD_log_odds
## 1 0.31640352 0.5722296 0.79718646 -0.0091901
## 2 0.66888613 0.302347 0.2972794 0.0169764
## 3 0.72117719 0.102763 0.12279346 -0.0254175
## 4 0.46240927 0.9498548 0.61182332 0.0009032
## 5 0.52449229 0.1829671 0.5076359 -0.0212797
## 6 0.41363733 0.0278084 0.68643434 0.0349131
## CHD_log_odds_se CHD_N_case CHD_N_control CHD_model
## 1 0.0162722 16633 54057 FE
## 2 0.0164594 16945 58264 FE
## 3 0.0155782 20468 61571 FE
## 4 0.0143625 19568 56734 FE
## 5 0.0159796 15670 51340 FE
## 6 0.0158698 16024 55685 FE
1. Make sure that the effect alleles in the CARDIOGRAM and
GIANT datasets are the same. We want the cardiogram effect allele to be
the allele that increases BMI.
But be careful of
palindromic SNPs or SNPs on different strands.
First we need to see
whether the effect alleles are the same.
Browse the data.
merged[,c("BMI_SNP", "BMI_Effect_Allele","CHD_reference_allele","BMI_Other_Allele","CHD_other_allele","BMI_EAF","CHD_ref_allele_frequency")]
## BMI_SNP BMI_Effect_Allele CHD_reference_allele BMI_Other_Allele
## 1 rs1000940 G G A
## 2 rs10132280 C C A
## 3 rs1016287 T C C
## 4 rs10182181 G G A
## 5 rs10733682 A G G
## 6 rs10938397 G G A
## 7 rs10968576 G G A
## 8 rs11030104 A G G
## 9 rs11057405 G G A
## 10 rs11126666 A G G
## 11 rs11165643 T C C
## 12 rs11191560 C C T
## 13 rs11583200 C C T
## 14 rs1167827 G G A
## 15 rs11688816 G G A
## 16 rs11727676 T C C
## 17 rs11847697 T C C
## 18 rs12016871 T C C
## 19 rs12286929 G G A
## 20 rs12401738 A G G
## 21 rs12429545 A G G
## 22 rs12446632 G G A
## 23 rs12566985 G G A
## 24 rs12885454 C C A
## 25 rs12940622 G G A
## 26 rs13021737 G G A
## 27 rs13078960 G G T
## 28 rs13107325 T C C
## 29 rs13191362 A G G
## 30 rs1516725 C C T
## 31 rs1528435 T C C
## 32 rs1558902 A T T
## 33 rs16851483 T G G
## 34 rs16951275 T C C
## 35 rs17001654 G C C
## 36 rs17024393 C C T
## 37 rs17094222 C C T
## 38 rs17405819 T C C
## 39 rs17724992 A G G
## 40 rs1808579 C C T
## 41 rs1928295 T C C
## 42 rs2033529 G G A
## 43 rs2033732 C C T
## 44 rs205262 G G A
## 45 rs2075650 A G G
## 46 rs2112347 T G G
## 47 rs2121279 T C C
## 48 rs2176598 T C C
## 49 rs2207139 G G A
## 50 rs2245368 C C T
## 51 rs2287019 C C T
## 52 rs2365389 C C T
## 53 rs2650492 A G G
## 54 rs2820292 C C A
## 55 rs29941 G G A
## 56 rs3101336 C C T
## 57 rs3736485 A G G
## 58 rs3810291 A G G
## 59 rs3817334 T C C
## 60 rs3849570 A C C
## 61 rs3888190 A C C
## 62 rs4256980 G C C
## 63 rs4740619 T C C
## 64 rs543874 G G A
## 65 rs6477694 C C T
## 66 rs6567160 C C T
## 67 rs657452 A G G
## 68 rs6804842 G G A
## 69 rs7138803 A G G
## 70 rs7141420 T C C
## 71 rs7243357 T G G
## 72 rs758747 T C C
## 73 rs7599312 G G A
## 74 rs7899106 G G A
## 75 rs7903146 C C T
## 76 rs9400239 C C T
## 77 rs9925964 A G G
## CHD_other_allele BMI_EAF CHD_ref_allele_frequency
## 1 A 0.320 0.31640352
## 2 A 0.682 0.66888613
## 3 T 0.287 0.72117719
## 4 A 0.462 0.46240927
## 5 A 0.478 0.52449229
## 6 A 0.434 0.41363733
## 7 A 0.320 0.31119502
## 8 A 0.792 0.19185393
## 9 A 0.901 0.92089578
## 10 A 0.283 0.71929726
## 11 T 0.583 0.41074541
## 12 T 0.089 0.11109357
## 13 T 0.396 0.39664739
## 14 A 0.553 0.59298726
## 15 A 0.525 0.5348187
## 16 T 0.910 0.08996777
## 17 T 0.042 0.96069173
## 18 T 0.203 0.80407573
## 19 A 0.523 0.50744154
## 20 A 0.352 0.62028006
## 21 A 0.133 0.845308
## 22 A 0.865 0.84624483
## 23 A 0.446 0.44899777
## 24 A 0.642 0.631088
## 25 A 0.575 0.55719561
## 26 A 0.828 0.80895378
## 27 T 0.196 0.22900043
## 28 T 0.072 0.97601496
## 29 A 0.879 0.12900275
## 30 T 0.872 0.8590188
## 31 T 0.631 0.37962868
## 32 A 0.415 0.58450108
## 33 T 0.066 0.93429797
## 34 T 0.784 0.235351
## 35 G 0.153 0.83729352
## 36 T 0.040 0.03280444
## 37 T 0.211 0.22353262
## 38 T 0.700 0.32156614
## 39 A 0.746 0.25712419
## 40 T 0.534 0.54056958
## 41 T 0.548 0.44991607
## 42 A 0.293 0.26951838
## 43 T 0.747 0.75652234
## 44 A 0.273 0.25943383
## 45 A 0.848 0.1705933
## 46 T 0.629 0.36318919
## 47 T 0.152 0.86261919
## 48 T 0.251 0.74580823
## 49 A 0.177 0.17342856
## 50 T 0.180 0.16123262
## 51 T 0.804 0.79836678
## 52 T 0.582 0.55986741
## 53 A 0.303 0.70112768
## 54 A 0.555 0.56965298
## 55 A 0.669 0.67182131
## 56 T 0.613 0.60256842
## 57 A 0.454 0.55461993
## 58 A 0.666 0.33939316
## 59 T 0.407 0.58779101
## 60 A 0.359 0.61947841
## 61 A 0.403 0.57882566
## 62 G 0.646 0.34826703
## 63 T 0.542 0.46589474
## 64 A 0.193 0.19669364
## 65 T 0.365 0.3768822
## 66 T 0.236 0.26395759
## 67 A 0.394 0.60513584
## 68 A 0.575 0.57965224
## 69 A 0.384 0.61809264
## 70 T 0.527 0.48985341
## 71 T 0.812 0.19721707
## 72 T 0.265 0.74188258
## 73 A 0.724 0.70337496
## 74 A 0.052 0.06251531
## 75 T 0.713 0.69817348
## 76 T 0.688 0.71596318
## 77 A 0.620 0.38141338
a. How can we tell if the CARDIOGRAM and GIANT SNPs are coded
using the same reference strand?
b. Are CARDIOGRAM and the
GIANT SNPs coded using the same reference strand?
c. Are there
any palindromic SNPs?
palindromic_at<-subset(merged,BMI_Effect_Allele %in% "A" & BMI_Other_Allele %in% "T")
palindromic_ta<-subset(merged,BMI_Effect_Allele %in% "T" & BMI_Other_Allele %in% "A")
palindromic_gc<-subset(merged,BMI_Effect_Allele %in% "G" & BMI_Other_Allele %in% "C")
palindromic_cg<-subset(merged,BMI_Effect_Allele %in% "C" & BMI_Other_Allele %in% "G")
dim(palindromic_at)
## [1] 1 23
dim(palindromic_ta)
## [1] 0 23
dim(palindromic_gc)
## [1] 2 23
dim(palindromic_cg)
## [1] 0 23
d. How can we tell whether the effect alleles are the same in
both datasets for palindromic SNPs (i.e., the allele that increases BMI
is the same as the reference allele in CARDIOGRAM)?
2. Make sure the CARDIOGRAM log odds ratio reflects the allele
that increases BMI in the GIANT data.
First, find the
positions of SNPs with different effect alleles.
head(merged)
## BMI_SNP BMI_CHR BMI_BP BMI_NearestGene BMI_Effect_Allele
## 1 rs1000940 17 5,223,976 RABEP1 G
## 2 rs10132280 14 24,998,019 STXBP6 C
## 3 rs1016287 2 59,159,129 LINC01122 T
## 4 rs10182181 2 25,003,800 ADCY3 G
## 5 rs10733682 9 128,500,735 LMX1B A
## 6 rs10938397 4 44,877,284 GNPDA2 G
## BMI_Other_Allele BMI_Beta BMI_SE BMI_EAF BMI_Variance BMI_N BMI_P
## 1 A 0.019 0.003 0.320 0.016% 321,836 1.28e-08
## 2 A 0.023 0.003 0.682 0.023% 321,797 1.14e-11
## 3 C 0.023 0.003 0.287 0.021% 321,969 2.25e-11
## 4 A 0.031 0.003 0.462 0.047% 321,759 8.78e-24
## 5 G 0.017 0.003 0.478 0.015% 320,727 1.83e-08
## 6 A 0.040 0.003 0.434 0.079% 320,955 3.21e-38
## CHD_chr_pos_.b36. CHD_reference_allele CHD_other_allele
## 1 chr17:5223976 G A
## 2 chr14:24998019 C A
## 3 chr2:59159129 C T
## 4 chr2:25003800 G A
## 5 chr9:128500735 G A
## 6 chr4:44877284 G A
## CHD_ref_allele_frequency CHD_pvalue CHD_het_pvalue CHD_log_odds
## 1 0.31640352 0.5722296 0.79718646 -0.0091901
## 2 0.66888613 0.302347 0.2972794 0.0169764
## 3 0.72117719 0.102763 0.12279346 -0.0254175
## 4 0.46240927 0.9498548 0.61182332 0.0009032
## 5 0.52449229 0.1829671 0.5076359 -0.0212797
## 6 0.41363733 0.0278084 0.68643434 0.0349131
## CHD_log_odds_se CHD_N_case CHD_N_control CHD_model
## 1 0.0162722 16633 54057 FE
## 2 0.0164594 16945 58264 FE
## 3 0.0155782 20468 61571 FE
## 4 0.0143625 19568 56734 FE
## 5 0.0159796 15670 51340 FE
## 6 0.0158698 16024 55685 FE
effect_diff <- which(merged$BMI_Effect_Allele != merged$CHD_reference_allele) # The position of SNPs where effect alleles are different
merged[effect_diff,c("BMI_SNP", "BMI_Effect_Allele","CHD_reference_allele","BMI_Other_Allele","CHD_other_allele","BMI_EAF","CHD_ref_allele_frequency")]
## BMI_SNP BMI_Effect_Allele CHD_reference_allele BMI_Other_Allele
## 3 rs1016287 T C C
## 5 rs10733682 A G G
## 8 rs11030104 A G G
## 10 rs11126666 A G G
## 11 rs11165643 T C C
## 16 rs11727676 T C C
## 17 rs11847697 T C C
## 18 rs12016871 T C C
## 20 rs12401738 A G G
## 21 rs12429545 A G G
## 28 rs13107325 T C C
## 29 rs13191362 A G G
## 31 rs1528435 T C C
## 32 rs1558902 A T T
## 33 rs16851483 T G G
## 34 rs16951275 T C C
## 35 rs17001654 G C C
## 38 rs17405819 T C C
## 39 rs17724992 A G G
## 41 rs1928295 T C C
## 45 rs2075650 A G G
## 46 rs2112347 T G G
## 47 rs2121279 T C C
## 48 rs2176598 T C C
## 53 rs2650492 A G G
## 57 rs3736485 A G G
## 58 rs3810291 A G G
## 59 rs3817334 T C C
## 60 rs3849570 A C C
## 61 rs3888190 A C C
## 62 rs4256980 G C C
## 63 rs4740619 T C C
## 67 rs657452 A G G
## 69 rs7138803 A G G
## 70 rs7141420 T C C
## 71 rs7243357 T G G
## 72 rs758747 T C C
## 77 rs9925964 A G G
## CHD_other_allele BMI_EAF CHD_ref_allele_frequency
## 3 T 0.287 0.72117719
## 5 A 0.478 0.52449229
## 8 A 0.792 0.19185393
## 10 A 0.283 0.71929726
## 11 T 0.583 0.41074541
## 16 T 0.910 0.08996777
## 17 T 0.042 0.96069173
## 18 T 0.203 0.80407573
## 20 A 0.352 0.62028006
## 21 A 0.133 0.845308
## 28 T 0.072 0.97601496
## 29 A 0.879 0.12900275
## 31 T 0.631 0.37962868
## 32 A 0.415 0.58450108
## 33 T 0.066 0.93429797
## 34 T 0.784 0.235351
## 35 G 0.153 0.83729352
## 38 T 0.700 0.32156614
## 39 A 0.746 0.25712419
## 41 T 0.548 0.44991607
## 45 A 0.848 0.1705933
## 46 T 0.629 0.36318919
## 47 T 0.152 0.86261919
## 48 T 0.251 0.74580823
## 53 A 0.303 0.70112768
## 57 A 0.454 0.55461993
## 58 A 0.666 0.33939316
## 59 T 0.407 0.58779101
## 60 A 0.359 0.61947841
## 61 A 0.403 0.57882566
## 62 G 0.646 0.34826703
## 63 T 0.542 0.46589474
## 67 A 0.394 0.60513584
## 69 A 0.384 0.61809264
## 70 T 0.527 0.48985341
## 71 T 0.812 0.19721707
## 72 T 0.265 0.74188258
## 77 A 0.620 0.38141338
a. How many SNPs have effect alleles that are coded in the
opposite direction?
b. Where the effect alleles are different,
flip the direction of the log odds ratio by multiplying it by -1.
If you want, you can also generate new columns that reflect the
effect allele change but this isn’t used in the causal estimate.
merged$CHD_flip_log_odds <- as.numeric(merged$CHD_log_odds) # Make log odds ratio numeric
merged$CHD_log_odds_se <- as.numeric(merged$CHD_log_odds_se) # Make standard error numeric
merged$CHD_flip_log_odds[effect_diff] <- merged$CHD_flip_log_odds[effect_diff]*(-1)
head(merged)
## BMI_SNP BMI_CHR BMI_BP BMI_NearestGene BMI_Effect_Allele
## 1 rs1000940 17 5,223,976 RABEP1 G
## 2 rs10132280 14 24,998,019 STXBP6 C
## 3 rs1016287 2 59,159,129 LINC01122 T
## 4 rs10182181 2 25,003,800 ADCY3 G
## 5 rs10733682 9 128,500,735 LMX1B A
## 6 rs10938397 4 44,877,284 GNPDA2 G
## BMI_Other_Allele BMI_Beta BMI_SE BMI_EAF BMI_Variance BMI_N BMI_P
## 1 A 0.019 0.003 0.320 0.016% 321,836 1.28e-08
## 2 A 0.023 0.003 0.682 0.023% 321,797 1.14e-11
## 3 C 0.023 0.003 0.287 0.021% 321,969 2.25e-11
## 4 A 0.031 0.003 0.462 0.047% 321,759 8.78e-24
## 5 G 0.017 0.003 0.478 0.015% 320,727 1.83e-08
## 6 A 0.040 0.003 0.434 0.079% 320,955 3.21e-38
## CHD_chr_pos_.b36. CHD_reference_allele CHD_other_allele
## 1 chr17:5223976 G A
## 2 chr14:24998019 C A
## 3 chr2:59159129 C T
## 4 chr2:25003800 G A
## 5 chr9:128500735 G A
## 6 chr4:44877284 G A
## CHD_ref_allele_frequency CHD_pvalue CHD_het_pvalue CHD_log_odds
## 1 0.31640352 0.5722296 0.79718646 -0.0091901
## 2 0.66888613 0.302347 0.2972794 0.0169764
## 3 0.72117719 0.102763 0.12279346 -0.0254175
## 4 0.46240927 0.9498548 0.61182332 0.0009032
## 5 0.52449229 0.1829671 0.5076359 -0.0212797
## 6 0.41363733 0.0278084 0.68643434 0.0349131
## CHD_log_odds_se CHD_N_case CHD_N_control CHD_model CHD_flip_log_odds
## 1 0.0162722 16633 54057 FE -0.0091901
## 2 0.0164594 16945 58264 FE 0.0169764
## 3 0.0155782 20468 61571 FE 0.0254175
## 4 0.0143625 19568 56734 FE 0.0009032
## 5 0.0159796 15670 51340 FE 0.0212797
## 6 0.0158698 16024 55685 FE 0.0349131
dim(merged)
## [1] 77 24
c. Check that all of the effect estimates have been flipped
appropriately.
merged[effect_diff,c("BMI_SNP", "BMI_Effect_Allele","CHD_reference_allele", "CHD_log_odds", "CHD_flip_log_odds")]
## BMI_SNP BMI_Effect_Allele CHD_reference_allele CHD_log_odds
## 3 rs1016287 T C -0.0254175
## 5 rs10733682 A G -0.0212797
## 8 rs11030104 A G -0.034644
## 10 rs11126666 A G 0.0209674
## 11 rs11165643 T C -0.0007751
## 16 rs11727676 T C 0.0253447
## 17 rs11847697 T C -0.0707465
## 18 rs12016871 T C -0.0181606
## 20 rs12401738 A G -0.0195047
## 21 rs12429545 A G -0.0480905
## 28 rs13107325 T C -0.0049381
## 29 rs13191362 A G -0.0116552
## 31 rs1528435 T C 0.0047002
## 32 rs1558902 A T -0.0324866
## 33 rs16851483 T G -0.0188109
## 34 rs16951275 T C -0.0513109
## 35 rs17001654 G C -0.0126514
## 38 rs17405819 T C -0.0015648
## 39 rs17724992 A G -0.0493236
## 41 rs1928295 T C 0.00052
## 45 rs2075650 A G 0.0378692
## 46 rs2112347 T G -0.0088627
## 47 rs2121279 T C -0.0032328
## 48 rs2176598 T C -0.0247371
## 53 rs2650492 A G 0.0126597
## 57 rs3736485 A G 0.0091099
## 58 rs3810291 A G -0.0422661
## 59 rs3817334 T C -0.0219146
## 60 rs3849570 A C -0.0390811
## 61 rs3888190 A C -0.0042697
## 62 rs4256980 G C 0.0150656
## 63 rs4740619 T C -0.0154694
## 67 rs657452 A G -0.0174996
## 69 rs7138803 A G 0.0006419
## 70 rs7141420 T C -0.0018271
## 71 rs7243357 T G -0.0098297
## 72 rs758747 T C 0.0613185
## 77 rs9925964 A G 0.0060442
## CHD_flip_log_odds
## 3 0.0254175
## 5 0.0212797
## 8 0.0346440
## 10 -0.0209674
## 11 0.0007751
## 16 -0.0253447
## 17 0.0707465
## 18 0.0181606
## 20 0.0195047
## 21 0.0480905
## 28 0.0049381
## 29 0.0116552
## 31 -0.0047002
## 32 0.0324866
## 33 0.0188109
## 34 0.0513109
## 35 0.0126514
## 38 0.0015648
## 39 0.0493236
## 41 -0.0005200
## 45 -0.0378692
## 46 0.0088627
## 47 0.0032328
## 48 0.0247371
## 53 -0.0126597
## 57 -0.0091099
## 58 0.0422661
## 59 0.0219146
## 60 0.0390811
## 61 0.0042697
## 62 -0.0150656
## 63 0.0154694
## 67 0.0174996
## 69 -0.0006419
## 70 0.0018271
## 71 0.0098297
## 72 -0.0613185
## 77 -0.0060442
merged[-effect_diff,c("BMI_SNP", "BMI_Effect_Allele","CHD_reference_allele", "CHD_log_odds", "CHD_flip_log_odds")]
## BMI_SNP BMI_Effect_Allele CHD_reference_allele CHD_log_odds
## 1 rs1000940 G G -0.0091901
## 2 rs10132280 C C 0.0169764
## 4 rs10182181 G G 0.0009032
## 6 rs10938397 G G 0.0349131
## 7 rs10968576 G G 0.00811
## 9 rs11057405 G G -0.0214346
## 12 rs11191560 C C -0.0966196
## 13 rs11583200 C C 0.0190464
## 14 rs1167827 G G 0.0190735
## 15 rs11688816 G G 0.0005135
## 19 rs12286929 G G 0.0001851
## 22 rs12446632 G G -0.0101474
## 23 rs12566985 G G -0.0128532
## 24 rs12885454 C C -0.0005252
## 25 rs12940622 G G -0.005711
## 26 rs13021737 G G 0.0186889
## 27 rs13078960 G G -0.0258172
## 30 rs1516725 C C -0.0279354
## 36 rs17024393 C C 0.0139407
## 37 rs17094222 C C -0.0059791
## 40 rs1808579 C C 0.0096342
## 42 rs2033529 G G -0.0061959
## 43 rs2033732 C C -0.0253631
## 44 rs205262 G G 0.0614484
## 49 rs2207139 G G 0.0170642
## 50 rs2245368 C C 0.0303632
## 51 rs2287019 C C 0.045534
## 52 rs2365389 C C 0.0136276
## 54 rs2820292 C C 0.0359029
## 55 rs29941 G G -0.0044497
## 56 rs3101336 C C 0.005948
## 64 rs543874 G G -0.004636
## 65 rs6477694 C C -0.026955
## 66 rs6567160 C C 0.024367
## 68 rs6804842 G G 0.0016094
## 73 rs7599312 G G 0.0016027
## 74 rs7899106 G G -0.0163075
## 75 rs7903146 C C -0.0192779
## 76 rs9400239 C C 0.0348705
## CHD_flip_log_odds
## 1 -0.0091901
## 2 0.0169764
## 4 0.0009032
## 6 0.0349131
## 7 0.0081100
## 9 -0.0214346
## 12 -0.0966196
## 13 0.0190464
## 14 0.0190735
## 15 0.0005135
## 19 0.0001851
## 22 -0.0101474
## 23 -0.0128532
## 24 -0.0005252
## 25 -0.0057110
## 26 0.0186889
## 27 -0.0258172
## 30 -0.0279354
## 36 0.0139407
## 37 -0.0059791
## 40 0.0096342
## 42 -0.0061959
## 43 -0.0253631
## 44 0.0614484
## 49 0.0170642
## 50 0.0303632
## 51 0.0455340
## 52 0.0136276
## 54 0.0359029
## 55 -0.0044497
## 56 0.0059480
## 64 -0.0046360
## 65 -0.0269550
## 66 0.0243670
## 68 0.0016094
## 73 0.0016027
## 74 -0.0163075
## 75 -0.0192779
## 76 0.0348705
d. Check that the effect allele frequencies are correlated.
Check correlation of effect allele frequency between BMI and
CARDIOGRAM datasets before harmonising alleles.
merged$BMI_EAF <- as.numeric(merged$BMI_EAF)
merged$CHD_ref_allele_frequency <- as.numeric(merged$CHD_ref_allele_frequency)
cor(merged$BMI_EAF,merged$CHD_ref_allele_frequency)
## [1] -0.007221126
Check correlation of effect allele frequency between BMI and
CARDIOGRAM datasets after harmonising alleles
merged$CHD_ref_allele_frequency[effect_diff] <- 1-merged$CHD_ref_allele_frequency[effect_diff]
cor(merged$BMI_EAF,merged$CHD_ref_allele_frequency)
## [1] 0.9980439
e. What happened and why?
1. Estimate the Wald ratios for each SNP and their delta approximated standard errors.
gp <- merged$BMI_Beta # The effect of the SNP on BMI
segp <- merged$BMI_SE # The standard error of the SNP effect on BMI
gd <- merged$CHD_flip_log_odds # The log odds ratio for CHD (that were harmonized to reflect an increase in BMI)
segd <- merged$CHD_log_odds_se # Standard error of the log odds ratio
wald_ratio <- gd/gp # The log odds ratios of CHD per unit change in BMI
Cov <- 0 # Only required when the SNP-BMI and SNP-CHD associations are estimated in the same participants (therefore for two-sample MR with non-overlapping samples, this is set to 0)
wald_ratio_se <- sqrt((segd^2/gp^2) + (gd^2/gp^4)*segp^2 - 2*(gd/gp^3)*Cov) # Delta approximated standard error of the wald ratio; see Thomas, D. C., Lawlor, D. a, & Thompson, J. R. (2007). Re: Estimation of bias in nongenetic observational studies using Mendelian triangulation by Bautista et al. Annals of Epidemiology, 17(7), 5113. doi:10.1016/j.annepidem.2006.12.005
z <- wald_ratio/wald_ratio_se # Z statistic for the wald ratio
p <- 2*pnorm(abs(z) ,lower.tail=F) # P value for the z statistics under the null hypothesis that there is not effect
wald_ratio_var = wald_ratio_se^2 # Variance
weight <- 1/wald_ratio_var # Inverse variance weight
snps <- merged$BMI_SNP # SNPs that we will use in the estimates
2. Combine the Wald ratios by fixed effects
meta-analysis.
meta_results <- metagen(wald_ratio,wald_ratio_se,comb.fixed=T,sm="OR") #combine the SNPs by fixed effects meta-analysis
meta_results
## Number of studies: k = 77
##
## OR 95%-CI z p-value
## Common effect model 1.3189 [1.1558; 1.5051] 4.11 < 0.0001
## Random effects model 1.2950 [1.1081; 1.5133] 3.25 0.0012
##
## Quantifying heterogeneity:
## tau^2 = 0.0723 [0.0447; 0.6449]; tau = 0.2689 [0.2114; 0.8031]
## I^2 = 31.5% [9.0%; 48.5%]; H = 1.21 [1.05; 1.39]
##
## Test of heterogeneity:
## Q d.f. p-value
## 110.97 76 0.0055
##
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
3. Create a nice table of the results, which you could
export to other programs e.g. excel, STATA etc.
mr_results <- data.frame(matrix(c(as.character(merged$BMI_SNP),round(wald_ratio,2),round(wald_ratio_se,2),round(p,3)),nrow=length(merged$BMI_SNP),ncol=4))
names(mr_results) <- c("SNP","log_odds","se","p")
mr_results_order <- mr_results[order(mr_results$log_odds),]
overall_genetic_effect <- data.frame(matrix(c("Overall genetic effect",meta_results$TE.fixed,meta_results$seTE.fixed,meta_results$pval.fixed),nrow=1,ncol=4))
names(overall_genetic_effect) <- c("SNP","log_odds","se","p")
overall_genetic_effect
## SNP log_odds se
## 1 Overall genetic effect 0.276824963891921 0.0673660975621878
## p
## 1 3.96925063511262e-05
twosampleResults <- rbind.fill(mr_results_order,overall_genetic_effect)
twosampleResults
## SNP log_odds se
## 1 rs7138803 -0.02 0.47
## 2 rs12885454 -0.03 0.71
## 3 rs1928295 -0.03 0.74
## 4 rs543874 -0.1 0.37
## 5 rs17094222 -0.24 0.69
## 6 rs12446632 -0.25 0.53
## 7 rs29941 -0.25 0.83
## 8 rs1528435 -0.26 0.79
## 9 rs12940622 -0.32 0.77
## 10 rs9925964 -0.32 0.75
## 11 rs2033529 -0.33 0.83
## 12 rs7899106 -0.41 0.75
## 13 rs1000940 -0.48 0.86
## 14 rs3736485 -0.51 0.78
## 15 rs12566985 -0.54 0.58
## 16 rs2650492 -0.6 1.05
## 17 rs1516725 -0.62 0.45
## 18 rs11057405 -0.69 0.99
## 19 rs11727676 -0.7 0.92
## 20 rs4256980 -0.72 0.7
## 21 rs7903146 -0.84 0.67
## 22 rs13078960 -0.86 0.59
## 23 rs11126666 -1 0.76
## 24 rs2033732 -1.33 0.93
## 25 rs2075650 -1.46 1.16
## 26 rs6477694 -1.59 0.89
## 27 rs758747 -2.67 1.1
## 28 rs11191560 -3.12 0.97
## 29 rs12286929 0.01 0.63
## 30 rs10182181 0.03 0.46
## 31 rs11688816 0.03 0.83
## 32 rs11165643 0.04 0.64
## 33 rs17405819 0.07 0.68
## 34 rs7599312 0.07 0.72
## 35 rs6804842 0.08 0.76
## 36 rs7141420 0.08 0.58
## 37 rs13107325 0.1 0.88
## 38 rs2121279 0.13 0.81
## 39 rs3888190 0.14 0.85
## 40 rs3101336 0.18 0.44
## 41 rs17024393 0.21 0.68
## 42 rs13021737 0.31 0.31
## 43 rs10968576 0.32 0.6
## 44 rs2112347 0.34 0.56
## 45 rs2207139 0.38 0.41
## 46 rs16851483 0.39 0.6
## 47 rs1558902 0.4 0.17
## 48 rs17001654 0.41 0.61
## 49 rs13191362 0.42 0.74
## 50 rs6567160 0.44 0.29
## 51 rs7243357 0.45 0.84
## 52 rs1808579 0.57 0.85
## 53 rs12016871 0.61 0.59
## 54 rs2365389 0.68 0.71
## 55 rs10132280 0.74 0.72
## 56 rs657452 0.76 0.64
## 57 rs11030104 0.84 0.44
## 58 rs3817334 0.84 0.55
## 59 rs4740619 0.86 0.79
## 60 rs10938397 0.87 0.4
## 61 rs12401738 0.93 0.73
## 62 rs1167827 0.95 0.82
## 63 rs2245368 0.95 1.6
## 64 rs11583200 1.06 0.81
## 65 rs1016287 1.11 0.69
## 66 rs2176598 1.24 0.85
## 67 rs10733682 1.25 0.97
## 68 rs2287019 1.26 0.69
## 69 rs11847697 1.44 0.77
## 70 rs12429545 1.46 0.74
## 71 rs3810291 1.51 0.73
## 72 rs16951275 1.66 0.59
## 73 rs2820292 1.8 0.75
## 74 rs9400239 1.84 0.86
## 75 rs3849570 2.06 1.38
## 76 rs17724992 2.6 1.03
## 77 rs205262 2.79 0.88
## 78 Overall genetic effect 0.276824963891921 0.0673660975621878
## p
## 1 0.966
## 2 0.972
## 3 0.97
## 4 0.793
## 5 0.728
## 6 0.63
## 7 0.766
## 8 0.742
## 9 0.682
## 10 0.67
## 11 0.694
## 12 0.587
## 13 0.574
## 14 0.519
## 15 0.36
## 16 0.566
## 17 0.168
## 18 0.484
## 19 0.444
## 20 0.305
## 21 0.21
## 22 0.142
## 23 0.189
## 24 0.153
## 25 0.209
## 26 0.074
## 27 0.016
## 28 0.001
## 29 0.989
## 30 0.95
## 31 0.971
## 32 0.956
## 33 0.917
## 34 0.919
## 35 0.911
## 36 0.895
## 37 0.907
## 38 0.874
## 39 0.871
## 40 0.68
## 41 0.756
## 42 0.316
## 43 0.588
## 44 0.542
## 45 0.357
## 46 0.511
## 47 0.021
## 48 0.506
## 49 0.574
## 50 0.135
## 51 0.596
## 52 0.503
## 53 0.309
## 54 0.34
## 55 0.307
## 56 0.232
## 57 0.054
## 58 0.125
## 59 0.276
## 60 0.03
## 61 0.203
## 62 0.244
## 63 0.553
## 64 0.193
## 65 0.111
## 66 0.144
## 67 0.195
## 68 0.068
## 69 0.059
## 70 0.049
## 71 0.039
## 72 0.005
## 73 0.016
## 74 0.032
## 75 0.137
## 76 0.012
## 77 0.002
## 78 3.96925063511262e-05
Calculate heterogeneity p value.
p_het <- pchisq(meta_results$Q,meta_results$df.Q,lower.tail=F)
twosampleResults$p_chi<-NA
Add the p value for heterogeneity to the results table.
twosampleResults$p_chi[twosampleResults$SNP=="Overall genetic effect"] <- p_het
write.table(twosampleResults,"./twosample_results_BMI_CHD.txt",sep="\t",col.names=T,row.names=F,quote=F)
twosampleResults
## SNP log_odds se
## 1 rs7138803 -0.02 0.47
## 2 rs12885454 -0.03 0.71
## 3 rs1928295 -0.03 0.74
## 4 rs543874 -0.1 0.37
## 5 rs17094222 -0.24 0.69
## 6 rs12446632 -0.25 0.53
## 7 rs29941 -0.25 0.83
## 8 rs1528435 -0.26 0.79
## 9 rs12940622 -0.32 0.77
## 10 rs9925964 -0.32 0.75
## 11 rs2033529 -0.33 0.83
## 12 rs7899106 -0.41 0.75
## 13 rs1000940 -0.48 0.86
## 14 rs3736485 -0.51 0.78
## 15 rs12566985 -0.54 0.58
## 16 rs2650492 -0.6 1.05
## 17 rs1516725 -0.62 0.45
## 18 rs11057405 -0.69 0.99
## 19 rs11727676 -0.7 0.92
## 20 rs4256980 -0.72 0.7
## 21 rs7903146 -0.84 0.67
## 22 rs13078960 -0.86 0.59
## 23 rs11126666 -1 0.76
## 24 rs2033732 -1.33 0.93
## 25 rs2075650 -1.46 1.16
## 26 rs6477694 -1.59 0.89
## 27 rs758747 -2.67 1.1
## 28 rs11191560 -3.12 0.97
## 29 rs12286929 0.01 0.63
## 30 rs10182181 0.03 0.46
## 31 rs11688816 0.03 0.83
## 32 rs11165643 0.04 0.64
## 33 rs17405819 0.07 0.68
## 34 rs7599312 0.07 0.72
## 35 rs6804842 0.08 0.76
## 36 rs7141420 0.08 0.58
## 37 rs13107325 0.1 0.88
## 38 rs2121279 0.13 0.81
## 39 rs3888190 0.14 0.85
## 40 rs3101336 0.18 0.44
## 41 rs17024393 0.21 0.68
## 42 rs13021737 0.31 0.31
## 43 rs10968576 0.32 0.6
## 44 rs2112347 0.34 0.56
## 45 rs2207139 0.38 0.41
## 46 rs16851483 0.39 0.6
## 47 rs1558902 0.4 0.17
## 48 rs17001654 0.41 0.61
## 49 rs13191362 0.42 0.74
## 50 rs6567160 0.44 0.29
## 51 rs7243357 0.45 0.84
## 52 rs1808579 0.57 0.85
## 53 rs12016871 0.61 0.59
## 54 rs2365389 0.68 0.71
## 55 rs10132280 0.74 0.72
## 56 rs657452 0.76 0.64
## 57 rs11030104 0.84 0.44
## 58 rs3817334 0.84 0.55
## 59 rs4740619 0.86 0.79
## 60 rs10938397 0.87 0.4
## 61 rs12401738 0.93 0.73
## 62 rs1167827 0.95 0.82
## 63 rs2245368 0.95 1.6
## 64 rs11583200 1.06 0.81
## 65 rs1016287 1.11 0.69
## 66 rs2176598 1.24 0.85
## 67 rs10733682 1.25 0.97
## 68 rs2287019 1.26 0.69
## 69 rs11847697 1.44 0.77
## 70 rs12429545 1.46 0.74
## 71 rs3810291 1.51 0.73
## 72 rs16951275 1.66 0.59
## 73 rs2820292 1.8 0.75
## 74 rs9400239 1.84 0.86
## 75 rs3849570 2.06 1.38
## 76 rs17724992 2.6 1.03
## 77 rs205262 2.79 0.88
## 78 Overall genetic effect 0.276824963891921 0.0673660975621878
## p p_chi
## 1 0.966 NA
## 2 0.972 NA
## 3 0.97 NA
## 4 0.793 NA
## 5 0.728 NA
## 6 0.63 NA
## 7 0.766 NA
## 8 0.742 NA
## 9 0.682 NA
## 10 0.67 NA
## 11 0.694 NA
## 12 0.587 NA
## 13 0.574 NA
## 14 0.519 NA
## 15 0.36 NA
## 16 0.566 NA
## 17 0.168 NA
## 18 0.484 NA
## 19 0.444 NA
## 20 0.305 NA
## 21 0.21 NA
## 22 0.142 NA
## 23 0.189 NA
## 24 0.153 NA
## 25 0.209 NA
## 26 0.074 NA
## 27 0.016 NA
## 28 0.001 NA
## 29 0.989 NA
## 30 0.95 NA
## 31 0.971 NA
## 32 0.956 NA
## 33 0.917 NA
## 34 0.919 NA
## 35 0.911 NA
## 36 0.895 NA
## 37 0.907 NA
## 38 0.874 NA
## 39 0.871 NA
## 40 0.68 NA
## 41 0.756 NA
## 42 0.316 NA
## 43 0.588 NA
## 44 0.542 NA
## 45 0.357 NA
## 46 0.511 NA
## 47 0.021 NA
## 48 0.506 NA
## 49 0.574 NA
## 50 0.135 NA
## 51 0.596 NA
## 52 0.503 NA
## 53 0.309 NA
## 54 0.34 NA
## 55 0.307 NA
## 56 0.232 NA
## 57 0.054 NA
## 58 0.125 NA
## 59 0.276 NA
## 60 0.03 NA
## 61 0.203 NA
## 62 0.244 NA
## 63 0.553 NA
## 64 0.193 NA
## 65 0.111 NA
## 66 0.144 NA
## 67 0.195 NA
## 68 0.068 NA
## 69 0.059 NA
## 70 0.049 NA
## 71 0.039 NA
## 72 0.005 NA
## 73 0.016 NA
## 74 0.032 NA
## 75 0.137 NA
## 76 0.012 NA
## 77 0.002 NA
## 78 3.96925063511262e-05 0.00550122
4. Create a forest plot of the results and compare
the genetic and observational associations.
The
observational effect is 1.23 (95% CI: 1.17, 1.29) per 4.56 kg/m2 (i.e.,
per SD) increase in BMI.
Formula for SE from 95% confidence
interval: (log(uci)-log(lci))/(1.96*2)
effect <- c(wald_ratio[order(wald_ratio)],meta_results$TE.fixed,log(1.23))
se <- c(wald_ratio_se[order(wald_ratio)],meta_results$seTE.fixed,(log(1.29)-log(1.17))/(1.96*2))
snps <- c(as.character(merged$BMI_SNP)[order(wald_ratio)],"Overall genetic effect","Observational effect")
metaplot(effect,se,labels=snps,conf.level=0.95,logeffect=T,nn=0.1,boxsize=1,xlab="Odds ratio & 95% confidence interval",ylab="SNP",cex=0.7)
5. Interpret the results.
a. Is the
MR-derived effect similar to the observational association?
b.
Is there evidence of heterogeneity in the genetic effects? How do you
interpret this?
c. Can you think of reasons for caution?
All that is required is summary level results for each SNP (remember
gp, segp, gd, segd from PART 4).
gp <- merged$BMI_Beta (The effect of the SNP on BMI)
segp <- merged$BMI_SE (The standard error of the SNP effect on BMI)
gd <- merged$CHD_flip_log_odd (The log odds ratio for CHD (that were harmonized to reflect an increase in BMI)
segd <- merged$CHD_log_odds_se (Standard error of the log odds ratio)
1. These functions define the IVW, MR-Egger, weighted
median and weighted mode estimators, respectively, and a function that
wraps up the results.
set seed for replication
purposes
set.seed(50)
two.sample.iv.ivw <- function(x, y, sigmax, sigmay) {
beta.ivw.fit = summary(lm(y~x-1, weights=sigmay^-2))
beta.ivw.fit.only = lm(y~x-1, weights=sigmay^-2)
beta.ivw = beta.ivw.fit$coef[1,1]
beta.se.ivw = beta.ivw.fit$coef[1,2]/min(beta.ivw.fit$sigma,1)
beta.df.ivw = length(y) - 1
beta.p.ivw = 2*(1-pt(abs(beta.ivw/beta.se.ivw),beta.df.ivw))
beta.lower.ivw = beta.ivw + (-1*qt(df=beta.df.ivw, 0.975)*beta.se.ivw)
beta.upper.ivw = beta.ivw + (1*qt(df=beta.df.ivw, 0.975)*beta.se.ivw)
return(list(beta.ivw=beta.ivw,beta.se.ivw=beta.se.ivw,beta.lower.ivw=beta.lower.ivw,beta.upper.ivw=beta.upper.ivw,beta.t.ivw=beta.ivw/beta.se.ivw,beta.p.ivw=beta.p.ivw,
beta.ivw.fit.only=beta.ivw.fit.only,beta.df.ivw=beta.df.ivw,beta.ivw.fit=beta.ivw.fit))
}
weighted.median <- function(x, w) {
N = length(x)
ord = order(x);
x = x[ord];
w = w[ord];
Sn = cumsum(w)
S_N = Sn[N]
Pn = (100/S_N)*(Sn-w/2)
if(sort(abs(Pn-50))[1] == 0){M = which(Pn==50); return(x[M])}
Q = length(Pn[sign(Pn-50)==-1])
V1 = Q; V2 = Q+1
M = x[V1] + (50 - Pn[V1])*(x[V2]-x[V1])/(Pn[V2]-Pn[V1])
return(list(beta.median=M,CumSum.median=Sn,ordX.median=x))
}
weighted.median.boot <- function(x, y, sigmax, sigmay, Nsim, alpha, W) {
med = NULL
for (i in 1:Nsim){
y_boot = rnorm(length(y), mean=y, sd=sigmay)
x_boot = rnorm(length(x), mean=x, sd=sigmax)
iv_boot = y_boot/x_boot
run = weighted.median(iv_boot,W)
med[i] = run$beta.median
}
lower = Nsim*alpha/2
upper = Nsim*(1-alpha/2)
Sort = sort(med)
lowerCI = Sort[lower]
upperCI = Sort[upper]
se = sd(med)
t = mean(med)/se
p = 2*(1-pt(abs(t),length(y)-1))
return(list(beta.se.median=se,beta.lower.median=lowerCI,beta.upper.median=upperCI,beta.t.median=t,beta.p.median=p))
}
two.sample.iv.egger <- function(x, y, sigmax, sigmay) {
egger.fit = summary(lm(y~x, weights=sigmay^-2))
df.egger = length(y) - 2
beta.egger = egger.fit$coef[2,1]
beta.se.egger = egger.fit$coef[2,2] / min(egger.fit$sigma, 1)
beta.p.egger = 2*(1-pt(abs(beta.egger/beta.se.egger),df.egger))
beta.lower.egger = beta.egger + (-1*qt(df=df.egger, 0.975)*beta.se.egger)
beta.upper.egger = beta.egger + (1*qt(df=df.egger, 0.975)*beta.se.egger)
alpha.egger = egger.fit$coef[1,1]
alpha.se.egger = egger.fit$coef[1,2] / min(egger.fit$sigma, 1)
alpha.p.egger = 2*(1-pt(abs(alpha.egger/alpha.se.egger),df.egger))
alpha.lower.egger = alpha.egger + (-1*qt(df=df.egger, 0.975)*alpha.se.egger)
alpha.upper.egger = alpha.egger + (1*qt(df=df.egger, 0.975)*alpha.se.egger)
return(list(beta.egger=beta.egger,beta.se.egger=beta.se.egger,beta.lower.egger=beta.lower.egger,beta.upper.egger=beta.upper.egger,beta.t.egger=beta.egger/beta.se.egger,beta.p.egger=beta.p.egger,
alpha.egger=alpha.egger,alpha.se.egger=alpha.se.egger,alpha.lower.egger=alpha.lower.egger,alpha.upper.egger=alpha.upper.egger,alpha.t.egger=alpha.egger/alpha.se.egger,alpha.p.egger=alpha.p.egger))
}
ModeEstimator <- function(x, y, sigmax, sigmay, phi=c(1,0.5,0.25), n_boot=1e4, alpha=0.05) {
beta <- function(BetaIV.in, seBetaIV.in) {
s <- 0.9*(min(sd(BetaIV.in), mad(BetaIV.in)))/length(BetaIV.in)^(1/5)
weights <- seBetaIV.in^-2/sum(seBetaIV.in^-2)
beta <- NULL
for(cur_phi in phi) {
h <- s*cur_phi
densityIV <- density(BetaIV.in, weights=weights, bw=h)
beta[length(beta)+1] <- densityIV$x[densityIV$y==max(densityIV$y)]
}
return(beta)
}
boot <- function(BetaIV.in, seBetaIV.in, beta_Mode.in) {
beta.boot <- matrix(nrow=n_boot, ncol=length(beta_Mode.in))
for(i in 1:n_boot) {
BetaIV.boot <- rnorm(length(BetaIV.in), mean=BetaIV.in, sd=seBetaIV.in[,1])
BetaIV.boot_NOME <- rnorm(length(BetaIV.in), mean=BetaIV.in, sd=seBetaIV.in[,2])
beta.boot[i,1:length(phi)] <- beta(BetaIV.in=BetaIV.boot, seBetaIV.in=rep(1, length(BetaIV)))
beta.boot[i,(length(phi)+1):(2*length(phi))] <- beta(BetaIV.in=BetaIV.boot, seBetaIV.in=seBetaIV.in[,1])
beta.boot[i,(2*length(phi)+1):(3*length(phi))] <- beta(BetaIV.in=BetaIV.boot_NOME, seBetaIV.in=rep(1, length(BetaIV)))
beta.boot[i,(3*length(phi)+1):(4*length(phi))] <- beta(BetaIV.in=BetaIV.boot_NOME, seBetaIV.in=seBetaIV.in[,2])
}
return(beta.boot)
}
BetaIV <- y/x
seBetaIV <- cbind(sqrt((sigmay^2)/(x^2) + ((y^2)*(sigmax^2))/(x^4)), sigmay/abs(x))
beta_SimpleMode <- beta(BetaIV.in=BetaIV, seBetaIV.in=rep(1, length(BetaIV)))
beta_WeightedMode <- beta(BetaIV.in=BetaIV, seBetaIV.in=seBetaIV[,1])
beta_WeightedMode_NOME <- beta(BetaIV.in=BetaIV, seBetaIV.in=seBetaIV[,2])
beta_Mode <- rep(c(beta_SimpleMode, beta_WeightedMode,
beta_SimpleMode, beta_WeightedMode_NOME))
beta_Mode.boot <- boot(BetaIV.in=BetaIV, seBetaIV.in=seBetaIV, beta_Mode.in=beta_Mode)
se_Mode <- apply(beta_Mode.boot, 2, mad)
CIlow_Mode <- beta_Mode-qnorm(1-alpha/2)*se_Mode
CIupp_Mode <- beta_Mode+qnorm(1-alpha/2)*se_Mode
P_Mode <- pt(abs(beta_Mode/se_Mode), df=length(x)-1, lower.tail=F)*2
Method <- rep(c('Simple', 'Weighted', 'Simple (NOME)', 'Weighted (NOME)'), each=length(phi))
Results <- data.frame(Method, phi, beta_Mode, se_Mode, CIlow_Mode, CIupp_Mode, P_Mode)
colnames(Results) <- c('Method', 'phi', 'Estimate', 'SE', 'CI_low', 'CI_upp', 'P')
return(Results)
}
MR_output <- function(ivw,egger,median, mode) {
output = data.frame(matrix(NA, nrow=5, ncol=7))
names(output) = c("test", "parameter", "estimate", "se", "lower_CI", "upper_CI","p_value")
output[1:5,1] = c("IVW","MR-Egger","MR-Egger","Weighted_median","Weighted_mode")
output[1:5,2] = c("beta","beta","alpha","beta","beta")
output[1,3:7] = c(IVW$beta.ivw,IVW$beta.se.ivw,IVW$beta.lower.ivw,IVW$beta.upper.ivw,IVW$beta.p.ivw)
output[2,3:7] = c(Egger$beta.egger,Egger$beta.se.egger,Egger$beta.lower.egger,Egger$beta.upper.egger,Egger$beta.p.egger)
output[3,3:7] = c(Egger$alpha.egger,Egger$alpha.se.egger,Egger$alpha.lower.egger,Egger$alpha.upper.egger,Egger$alpha.p.egger)
output[4,3:7] = c(Median$beta.median,MedianBoot$beta.se.median,MedianBoot$beta.lower.median,MedianBoot$beta.upper.median,MedianBoot$beta.p.median)
output[5,3:7] = c(Mode$Estimate[Mode$Method=="Weighted (NOME)" & Mode$phi==1.00],Mode$SE[Mode$Method=="Weighted (NOME)" & Mode$phi==1.00],Mode$CI_low[Mode$Method=="Weighted (NOME)" & Mode$phi==1.00],Mode$CI_upp[Mode$Method=="Weighted (NOME)" & Mode$phi==1.00],Mode$P[Mode$Method=="Weighted (NOME)" & Mode$phi==1.00])
return(output)
}
2. Use the functions to estimate the results.
IVW <- two.sample.iv.ivw(gp,gd,segp,segd)
Egger <- two.sample.iv.egger(gp,gd,segp,segd)
Median <- weighted.median(wald_ratio,weight)
MedianBoot <- weighted.median.boot(gp,gd,segp,segd,1000,0.05,weight)
Mode <- ModeEstimator(gp,gd,segp,segd)
sensitivity <- MR_output(IVW,Egger,Median,Mode)
sensitivity
## test parameter estimate se lower_CI upper_CI
## 1 IVW beta 0.287658553 0.086237732 0.11590122 0.459415882
## 2 MR-Egger beta 0.375935570 0.209803912 -0.04201525 0.793886395
## 3 MR-Egger alpha -0.002791481 0.006041587 -0.01482694 0.009243978
## 4 Weighted_median beta 0.379652982 0.117839188 0.08299413 0.554455224
## 5 Weighted_mode beta 0.311258179 0.129495876 0.05745093 0.565065432
## p_value
## 1 0.001318512
## 2 0.077191677
## 3 0.645387108
## 4 0.005920437
## 5 0.018671609
write.table(sensitivity,"./twosample_sensitivity_BMI_CHD.txt",sep="\t",col.names=T,row.names=F,quote=F)
IVW - Inverse Variance Weighted - Combines the Wald
ratios using an inverse variance weighted meta-analysis, where the
weight of each ratio is the inverse of the variance of the association
between the SNP and the outcome.
MR-Egger -
combines the Wald ratio’s together into a meta-regression to estimate
the causal effect adjusted for any directional pleiotropy. This approach
is less powered than the IVW.
MR-Egger (alpha)
- the intercept of the MR-egger meta-regression. Provides an indication
of horizontal pleiotropy when it is not null.
Weighted_median - assigns a weight to each SNP derived
from the inverse variance of each SNP’s effect on the outcome. This
robust method requires only 50% of the variants to be valid and not
exhibit horizontal pleiotropy, unmeasured confounding, etc.
Weighted_mode - also assigns a weight to each SNP
derived from the inverse variance of each SNP’s effect on the outcome.
used to provide a robust estimate of the causal effect in the presence
of heterogeneity.