GCTA

a tool for Genome-wide Complex Trait Analysis

 

--make-grm

or

--make-grm-bin

Estimate the genetic relationship matrix (GRM) between pairs of individuals from a set of SNPs and save the lower-triangle elements of the GRM to binary files (e.g. test.grm.bin and test.grm.N.bin), with the individual IDs in a plain text file (e.g. test.grm.id).

Output files

test.grm.bin (a binary file containing the lower-triangle elements of the GRM).

test.grm.N.bin (a binary file containing the number of SNPs used to calculate the GRM).

test.grm.id (no header line; columns are family ID and individual ID).

You cannot open test.grm.bin or test.grm.N.bin in a text editor, but you can use the following R script to read them into R.

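# R script to read the GRM binary files
ReadGRMBin <- function(prefix, AllN = FALSE, size = 4) {
  BinFileName <- paste(prefix, ".grm.bin", sep = "")
  NFileName <- paste(prefix, ".grm.N.bin", sep = "")
  IDFileName <- paste(prefix, ".grm.id", sep = "")
  # Read the IDs as text so that leading zeros (e.g. 0101) are preserved
  id <- read.table(IDFileName, colClasses = "character")
  n <- nrow(id)
  # The lower triangle including the diagonal holds n*(n+1)/2 values
  BinFile <- file(BinFileName, "rb")
  grm <- readBin(BinFile, n = n * (n + 1) / 2, what = numeric(0), size = size)
  close(BinFile)
  # Read the SNP count for every pair (AllN = TRUE) or only the first value
  NFile <- file(NFileName, "rb")
  if (AllN) {
    N <- readBin(NFile, n = n * (n + 1) / 2, what = numeric(0), size = size)
  } else {
    N <- readBin(NFile, n = 1, what = numeric(0), size = size)
  }
  close(NFile)
  # Positions of the diagonal elements: 1, 3, 6, ..., n*(n+1)/2
  i <- cumsum(seq_len(n))
  list(diag = grm[i], off = grm[-i], id = id, N = N)
}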
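The layout that the R reader above relies on can also be checked from the shell before loading anything. The sketch below assumes what the script implies: 4-byte values (its size=4 default) and n*(n+1)/2 stored elements, ordered row by row through the lower triangle. The function names are hypothetical helpers, not part of GCTA, and grm_value additionally assumes the file was written in the byte order of the machine reading it.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting <prefix>.grm.bin; not part of GCTA.

# Expected size in bytes of <prefix>.grm.bin, given the n rows of <prefix>.grm.id.
grm_expected_bytes() {
  n=$(awk 'END { print NR }' "$1.grm.id")
  echo $(( n * (n + 1) / 2 * 4 ))
}

# Print "ok" if the actual file size matches the expected size, else "mismatch".
grm_check_size() {
  actual=$(( $(wc -c < "$1.grm.bin") ))
  expected=$(grm_expected_bytes "$1")
  if [ "$actual" -eq "$expected" ]; then echo ok; else echo mismatch; fi
}

# Zero-based element offset of pair (i, j), where i and j are 1-based
# row numbers of the .grm.id file and j <= i.
grm_offset() {
  echo $(( $1 * ($1 - 1) / 2 + $2 - 1 ))
}

# Print the stored value for pair (i, j) as a 4-byte float (host byte order assumed).
grm_value() {
  off=$(grm_offset "$2" "$3")
  od -An -t f4 -j "$(( off * 4 ))" -N 4 "$1.grm.bin"
}

# Usage (with the files written by --make-grm --out test):
#   grm_check_size test
#   grm_value test 2 1
```

A size mismatch usually means the .grm.id file and the .grm.bin file do not belong to the same run, which is worth ruling out before reading the matrix into R.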

 

--make-grm-gz

Estimate the GRM, save the lower triangle elements to a compressed text file (e.g. test.grm.gz) and save the IDs in a plain text file (e.g. test.grm.id).

Output file format

test.grm.gz (no header line; columns are indices of pairs of individuals (row numbers of the test.grm.id), number of non-missing SNPs and the estimate of genetic relatedness)

1    1    1000     1.0021
2    1     998     0.0231
2    2     999     0.9998
3    1    1000    -0.0031

……

test.grm.id (no header line; columns are family ID and individual ID)

011      0101

012      0102

013      0103

……
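Because the compressed text format stores row numbers rather than IDs, a common first step is to join the two files. The sketch below lists off-diagonal pairs whose estimated relatedness exceeds a cutoff, printing the family and individual IDs of both members. The function name and the cutoff are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper; not part of GCTA.
# Usage: grm_related_pairs <prefix> <cutoff>
# Reads <prefix>.grm.id and <prefix>.grm.gz and prints, for each pair of
# different individuals with relatedness above the cutoff:
#   FID1 IID1 FID2 IID2 relatedness
grm_related_pairs() {
  prefix=$1
  cutoff=$2
  gzip -dc "$prefix.grm.gz" | awk -v cutoff="$cutoff" '
    # First input (the ID file): remember the IDs by row number
    NR == FNR { fid[FNR] = $1; iid[FNR] = $2; next }
    # Second input (the GRM on stdin): skip the diagonal, apply the cutoff
    $1 != $2 && $4 > cutoff { print fid[$1], iid[$1], fid[$2], iid[$2], $4 }
  ' "$prefix.grm.id" -
}

# Usage (with the files written by --make-grm-gz --out test):
#   grm_related_pairs test 0.025
```

The IDs are carried through as text, so values such as 011 and 0101 keep their leading zeros.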

 

 

--make-grm-xchr

Estimate the GRM from SNPs on the X-chromosome. The GRM will be saved in the same binary format as above (*.grm.bin, *.grm.N.bin and *.grm.id). Because the X-chromosome GRM has special properties, it is not recommended to manipulate the matrix with --grm-cutoff or --grm-adj, or to merge it with the GRMs for autosomes (see below for the options for manipulating the GRM).

 

--make-grm-xchr-gz

Same as --make-grm-xchr, but the GRM will be saved in compressed text files (see --make-grm-gz for the format of the output files).

 

--ibc

Estimate the inbreeding coefficient from the SNPs using three different methods (see the software paper for details).

Output file format

test.ibc (one header line; columns are family ID, individual ID, number of nonmissing SNPs, estimator 1, estimator 2 and estimator 3)

FID    IID     NOMISS    Fhat1       Fhat2       Fhat3
011    0101    999        0.00210     0.00198     0.00229
012    0102    1000      -0.0033     -0.0029     -0.0031
013    0103    988        0.00120     0.00118     0.00134

……
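The .ibc file is plain text with one header line, so it can be summarised directly with awk. The sketch below lists individuals whose third estimator exceeds a cutoff and computes the mean of that estimator. The function names, the choice of Fhat3 and the cutoff are assumptions made for illustration; nothing here recommends one estimator over the others.

```shell
#!/bin/sh
# Hypothetical helpers for a .ibc file; not part of GCTA.

# Print FID, IID and Fhat3 for individuals with Fhat3 above a cutoff.
# Usage: ibc_above <file.ibc> <cutoff>
ibc_above() {
  awk -v cutoff="$2" 'NR > 1 && $6 > cutoff { print $1, $2, $6 }' "$1"
}

# Print the mean of Fhat3 over all individuals, to five decimal places.
# Usage: ibc_mean_fhat3 <file.ibc>
ibc_mean_fhat3() {
  awk 'NR > 1 { sum += $6; n++ } END { if (n > 0) printf "%.5f\n", sum / n }' "$1"
}

# Usage (with the file written by --ibc --out test):
#   ibc_above test.ibc 0.05
#   ibc_mean_fhat3 test.ibc
```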

 

Examples

# Estimate the GRM from all the autosomal SNPs

gcta64  --bfile test  --autosome  --make-grm  --out test

# Estimate the GRM from the SNPs on the X-chromosome

gcta64  --bfile test  --make-grm-xchr  --out test_xchr

# Estimate the GRM from the SNPs on chromosome 1 with MAF from 0.1 to 0.4

gcta64  --bfile test  --chr 1  --maf 0.1  --max-maf 0.4  --make-grm  --out test

# Estimate the GRM using a subset of individuals and a subset of autosomal SNPs with MAF > 0.01

gcta64  --bfile test  --keep test.indi.list  --extract test.snp.list  --autosome  --maf 0.01 --make-grm  --out test

# Estimate the GRM from the imputed dosage scores for the SNPs with MAF > 0.01 and imputation R2 > 0.3

gcta64  --dosage-mach  test.mldose.gz  test.mlinfo.gz  --imput-rsq  0.3  --maf 0.01  --make-grm --out test

# Estimate the GRM from the imputed dosage scores for a subset of individuals and a subset of SNPs

gcta64  --dosage-mach  test.mldose.gz  test.mlinfo.gz  --keep test.indi.list  --extract test.snp.list  --make-grm --out test

# Estimate the inbreeding coefficient from all the autosomal SNPs

gcta64  --bfile test  --autosome  --ibc  --out test

 
 

References

 

Method for estimating the GRM: Yang et al. (2010) Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 42(7): 565-9. [PubMed ID: 20562875]

 

Method for estimating the inbreeding coefficients and GCTA software: Yang J, Lee SH, Goddard ME and Visscher PM (2011) GCTA: a tool for Genome-wide Complex Trait Analysis. Am J Hum Genet. 88(1): 76-82. [PubMed ID: 21167468]

 


GCTA-GRM: estimating the genetic relationships between individuals using SNP data