This runs on the GGWS server.
We provide summary statistics for two GWAS of Waist-to-Hip Ratio (WHR) in males and females conducted by the GIANT consortium.
Go to /scratch/module3/, create a directory in which to run the practical, and copy the GWAS summary statistics there
cd
cd module3
mkdir prac5/
cd prac5
cp /data/module3/Prac5_LDSC/GIANT_2015_WHR_*_EUR.txt .
Load the LDSC software and required libraries (more details are available on GitHub: https://github.com/bulik/ldsc)
conda activate ldsc
Format GWAS summary statistics for LDSC analysis
/software/ldsc/munge_sumstats.py \
--sumstats GIANT_2015_WHR_FEMALES_EUR.txt \
--merge-alleles /data/module3/Prac5_LDSC/eur_w_ld_chr/w_hm3.snplist --chunksize 1000000 \
--out giant_whr_females
/software/ldsc/munge_sumstats.py \
--sumstats GIANT_2015_WHR_MALES_EUR.txt \
--merge-alleles /data/module3/Prac5_LDSC/eur_w_ld_chr/w_hm3.snplist --chunksize 1000000 \
--out giant_whr_males
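Before moving on, it can help to check how many SNPs in each formatted file carry a usable test statistic. The sketch below assumes that munge_sumstats.py wrote a gzipped, tab-delimited file whose header contains a column named Z, and that SNPs from the merge list that are absent from the GWAS are left with an empty Z field. The helper name count_nonmissing_z is ours and is not part of LDSC.

```shell
# Count rows with a non-empty Z column in a gzipped, tab-delimited sumstats file.
# Assumption: the header row contains a column literally named "Z".
count_nonmissing_z() {
    gzip -dc "$1" | awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Z") z = i; next }
        z && $z != "" { n++ }
        END { print n + 0 }'
}

# Report the count for each munged file that exists in the current directory.
for f in giant_whr_females.sumstats.gz giant_whr_males.sumstats.gz; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_nonmissing_z "$f")"
    fi
done
```

The count should be close to the number of SNPs that munge_sumstats.py reports as remaining at the end of its own log.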
Estimate the SNP heritability of WHR in females and in males by running the following commands
/software/ldsc/ldsc.py --h2 giant_whr_females.sumstats.gz \
--ref-ld-chr /data/module3/Prac5_LDSC/eur_w_ld_chr/ \
--w-ld-chr /data/module3/Prac5_LDSC/eur_w_ld_chr/ \
--out h2_giant_whr_females
/software/ldsc/ldsc.py --h2 giant_whr_males.sumstats.gz \
--ref-ld-chr /data/module3/Prac5_LDSC/eur_w_ld_chr/ \
--w-ld-chr /data/module3/Prac5_LDSC/eur_w_ld_chr/ \
--out h2_giant_whr_males
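Each run writes a log file named after the --out prefix (for example h2_giant_whr_females.log). The following sketch pulls out the lines that summarise a heritability run. The search patterns are assumptions based on the usual wording of LDSC logs, and the helper name summarise_h2_log is hypothetical; read the full log if any expected line is missing.

```shell
# Print the key summary lines of an LDSC --h2 log.
# Assumption: the log uses the labels matched below.
summarise_h2_log() {
    grep -E 'SNPs remain|scale h2:|^Lambda GC:|^Mean Chi\^2:|^Intercept:|^Ratio' "$1"
}

# Summarise the female and male logs if they are present.
for sex in females males; do
    log="h2_giant_whr_${sex}.log"
    if [ -f "$log" ]; then
        echo "== $log"
        summarise_h2_log "$log" || echo "no matching lines found"
    fi
done
```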
Question 1
How many SNPs were used in each regression?
What is the heritability of WHR in males and females?
What is the LD score intercept? Is this expected? If not, what could explain this observation?
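When interpreting the intercept, it is useful to relate it to the mean chi-square statistic. LDSC reports a ratio, (intercept - 1) / (mean chi-square - 1), which approximates the share of the inflation in test statistics that is not attributable to polygenic signal. The sketch below recomputes it with awk. The function name and the example numbers are made up for illustration and are not results from this practical.

```shell
# Attenuation ratio: (intercept - 1) / (mean chi-square - 1).
# Prints NA when the mean chi-square is not above 1, where the ratio is undefined.
attenuation_ratio() {
    awk -v b="$1" -v m="$2" 'BEGIN {
        if (m <= 1) { print "NA"; exit }
        printf "%.4f\n", (b - 1) / (m - 1)
    }'
}

# Hypothetical values: intercept 1.05, mean chi-square 1.50.
attenuation_ratio 1.05 1.50
```

An intercept close to 1 gives a ratio close to 0, whereas an intercept well above 1 gives a larger ratio and suggests that part of the inflation comes from something other than polygenicity.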
Estimate the genetic correlation between WHR in females and WHR in males by running the following command
/software/ldsc/ldsc.py --rg giant_whr_females.sumstats.gz,giant_whr_males.sumstats.gz \
--ref-ld-chr /data/module3/Prac5_LDSC/eur_w_ld_chr/ \
--w-ld-chr /data/module3/Prac5_LDSC/eur_w_ld_chr/ --out rg_giant_whr
Question 2
How many SNPs were used in each regression?
What is the heritability of WHR in males and females? Is this different from your previous results? Rerun the LDSC commands from Question 1, adding the following flag:
--two-step INFINITY
What can you conclude?
Interpret the bivariate LD score intercept.
What can you conclude regarding the genetic architecture of WHR in males and females?
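For the last point, one common check is whether the estimated genetic correlation differs significantly from 1, using z = (rg - 1) / SE with the estimate and its standard error taken from the rg log. This is a minimal sketch; the helper name and the example values are hypothetical and are not results from this practical.

```shell
# z statistic for the null hypothesis rg = 1.
# Arguments: estimated rg, standard error of rg.
rg_vs_one_z() {
    awk -v rg="$1" -v se="$2" 'BEGIN {
        if (se <= 0) { print "NA"; exit }
        printf "%.4f\n", (rg - 1) / se
    }'
}

# Hypothetical values: rg = 0.8 with standard error 0.1.
rg_vs_one_z 0.8 0.1
```

An absolute z above roughly 1.96 corresponds to a two-sided p-value below 0.05, which would point to a genetic correlation below 1 between the sexes.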
We now provide summary statistics for two GWAS of Waist-to-Hip Ratio (WHR) in males and females in the UK Biobank (UKB). The files are already formatted for LDSC.
Copy the new GWAS summary statistics
cp /data/module3/Prac5_LDSC/ukb_whr_*.sumstats.gz .
Question 3
Estimate the heritability of WHR in males and females from the UKB. How does it compare with your previous results? Is this expected?
Estimate the genetic correlation between UKB males and UKB females for WHR. How does it compare with your previous analyses using data from GIANT?
Interpret the bivariate LDSC intercept for the two WHR GWAS in the UK Biobank.
Estimate the genetic correlation between UKB (fe)males and GIANT (fe)males for WHR.
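The heritability part of Question 3 can be scripted as a loop over the copied files. The sketch below reuses the ldsc.py path and reference directory from the earlier commands. The h2_ output prefix and the helper h2_prefix are our own naming choices, and the UKB file names are whatever the ukb_whr_*.sumstats.gz pattern matches on the server.

```shell
# Reference LD score directory used throughout this practical.
LD=/data/module3/Prac5_LDSC/eur_w_ld_chr/

# Derive an output prefix from a sumstats file name (hypothetical naming scheme).
h2_prefix() {
    printf 'h2_%s\n' "$(basename "$1" .sumstats.gz)"
}

# Run a heritability analysis for every UKB file in the current directory.
for f in ukb_whr_*.sumstats.gz; do
    [ -e "$f" ] || continue   # skip if the pattern matched nothing
    /software/ldsc/ldsc.py --h2 "$f" \
        --ref-ld-chr "$LD" \
        --w-ld-chr "$LD" \
        --out "$(h2_prefix "$f")"
done
```

The genetic correlations follow the same pattern as the earlier --rg command, with the two input files separated by a comma.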
Download LDSC files
git clone https://github.com/bulik/ldsc
Download LD scores calculated in European-ancestry individuals from the 1000 Genomes Project
wget https://data.broadinstitute.org/alkesgroup/LDSCORE/eur_w_ld_chr.tar.bz2
tar -xvf eur_w_ld_chr.tar.bz2
Unfortunately, the download link above is not active at the moment.