pleioh2g-tutorial

library(pleioh2g)

pleioh2g-tutorial.R ### Installation:

install.packages("devtools")
devtools::install_github("yjzhao1004/pleioh2g")
library(pleioh2g)

Steps

Step 1: Prepare LDSC input-format data for multiple traits.

PHBC needs LDSC .sumstats format data as input, so you first need to reformat the GWAS summary statistics as LDSC requested. We strongly recommend that you use the script munge_sumstats.py included in LDSC python package (https://github.com/bulik/ldsc) to convert your own GWAS summary statistics into LDSC format. (ref. Bulik-Sullivan et al. 2015b Nat Genet) We also provide all LDSC-format .sumstat.gz data used in our analyses. (See Data preparation) * Examples of three phenotypes (“401.1”, “250.2”, “296.22”) have been implemented in our package. You can use codes below to reload them.

library(pleioh2g)
munged_sumstats = list("401.1" = sumstats_munged_example_input(example = "401.1"), "250.2" = sumstats_munged_example_input(example = "250.2"),"296.22" = sumstats_munged_example_input(example = "296.22"))

Step 2: Compute pleiotropic heritability with bias correction

The function pruning_pleioh2g_wrapper() (See example as below) is to compute h2pleio / h2 while performing pruning and bias correction with ldsc-format GWAS summary statistics (.sumstat.gz) as input.

# Specify phenotype names
phenotype<-c("401.1","250.2","296.22")

# First to determine which disease in your list is the target disease
G = 1 # Index of target disease in trait list - this example is to compute pleiotropic heritability for "401.1".

# Input ldsc format .sumstat.gz data
munged_sumstats = list("401.1" = sumstats_munged_example_input(example = "401.1"), "250.2" = sumstats_munged_example_input(example = "250.2"),"296.22" = sumstats_munged_example_input(example = "296.22"))

# Specify reference LD data: ld and wld path; and hapmap 3 SNPs list
ld_path<-fs::path(fs::path_package("extdata/eur_w_ld_chr", package = "pleioh2g"))
wld_path<-fs::path(fs::path_package("extdata/eur_w_ld_chr", package = "pleioh2g"))
hmp3<-fs::path(fs::path_package("extdata/w_hm3.snplist", package = "pleioh2g"))

# If you trait is disease phenotype or the other binary trait, specify prevalence to compute the liability-scale heritability; If you don't specify this, it will compute observed-scale heritability.
sample_prev <- c(0.37,0.1,0.17)
population_prev <- c(0.37,0.1,0.17)

# Specify number of genomic-jackknife block; We use n_block = 5 as example for quick computation, but we recommand to use 200 jackknife blocks; If you don't specify this, the default number is 200.
n_block<-5
 
# Specify number of Monte Carlo sampling iterations in bias correction; If you don't specify this, the default number is 1000.
sample_rep <- 1000 

post_correction_results<-pruning_pleioh2g_wrapper(G,phenotype,munged_sumstats,ld_path, wld_path, sample_prev, population_prev,n_block, hmp3,sample_rep)

An output line will provide your post-correction h2pleio / h2 estimate, along with a result list post_correction_results, containing the following elements: - target_disease (character): The value “401.1”. - target_disease_h2_est (numeric): target disease h2. - target_disease_h2_se (numeric): target disease h2 s.e.. - selected_auxD (character): auxiliary diseases. - h2pleio_uncorr (numeric): pre-correction h2pleio estimate. - h2pleio_uncorr_se (numeric): pre-correction h2pleio jackknife s.e. estimate. - percentage_h2pleio_uncorr (numeric): pre-correction h2pleio / h2 estimate. - percentage_h2pleio_uncorr_se (numeric): pre-correction h2pleio / h2 jackknife s.e. estimate. - percentage_h2pleio_jackknife_uncorr (numeric): vector of all pre-correction h2pleio / h2 jackknife estimates in default 200 blocks. - h2pleio_corr (numeric): post-correction h2pleio estimate. - h2pleio_corr_se (numeric): post-correction h2pleio jackknife s.e. estimate. - percentage_h2pleio_corr (numeric): post-correction h2pleio / h2 estimate. - percentage_h2pleio_corr_se (numeric): post-correction h2pleio / h2 jackknife s.e. estimate. - corrected_weight (numeric): corrected weight ξc in bias correction.

Leave-category-out analysis instruction