1 Citation

Genome Medicine 2016;8:129 (DOI: 10.1186/s13073-016-0384-y)

2 Summary

We introduce XGR (eXploring Genomic Relations), released as an R package (http://cran.r-project.org/package=XGR) and a web-app (http://galahad.well.ox.ac.uk/XGR), which enables downstream knowledge discovery from genomic summary data. Biological interpretation of genomic summary data, such as those resulting from GWAS and eQTL mapping, is a major bottleneck in human disease genomics and calls for efficient, integrative tools. This kind of research has been under-appreciated in the past but is becoming increasingly important with the growing availability of the ontology, annotation, and network data essential for precise interpretation.

XGR is designed to make a user-defined list of genes or SNPs (or genomic regions) more interpretable by comprehensively utilising ontology and network information to generate more informative results than conventional methods. XGR is unique in supporting a broad range of ontologies (including knowledge of biological and molecular functions, pathways, diseases and phenotypes - in both human and mouse) and different types of networks (including functional, physical and pathway interactions).

In this user manual, you will be guided through the steps necessary to use this tool. After going through the manual, particularly the Showcases section, which includes several demos with published data, you will be able to: 1) perform enrichment analysis using either built-in or custom ontologies, 2) calculate semantic similarity between genes (or between SNPs) based on their ontology annotation profiles, 3) identify a gene subnetwork given your query list of (significant) genes, SNPs or genomic regions, and 4) interpret genomic regions using co-localised functional genomic annotations and using nearby gene annotations by ontologies. End-users who are unfamiliar with R can instead use our user-friendly web-app.

3 Web-app (http://galahad.well.ox.ac.uk/XGR)

4 Installation

4.1 R

R (http://www.r-project.org) is a language and environment for statistical computing and graphics. It can be installed on Windows, Mac OS X, and Linux; on Linux, it can be built from source as shown below.

  • If you have root (sudo) privileges:

    # Below are shell command lines in Terminal
    sudo su
    # here enter your password
    wget http://www.stats.bris.ac.uk/R/src/base/R-3/R-3.2.5.tar.gz
    tar xvfz R-3.2.5.tar.gz
    cd R-3.2.5
    ./configure
    make
    make check
    make install
    R # start R
  • If you do NOT have root (sudo) privileges and want to install R under your home directory ($HOME):

    # Below are shell command lines in Terminal
    wget http://www.stats.bris.ac.uk/R/src/base/R-3/R-3.2.5.tar.gz
    tar xvfz R-3.2.5.tar.gz
    cd R-3.2.5
    ./configure --prefix=$HOME/R-3.2.5
    make
    make check
    make install
    $HOME/R-3.2.5/bin/R # start R

4.2 Packages

For installation of the XGR package, please follow the instructions below:

  • Install XGR (the latest stable release from CRAN):

    source("http://bioconductor.org/biocLite.R")
    biocLite("XGR")
  • Also install the latest development version from GitHub (highly recommended):

    if(!("devtools" %in% rownames(installed.packages()))) install.packages("devtools")
    devtools::install_github(c("hfang-bristol/XGR"))

4.3 BugReports

We welcome your feedback, particularly bug reports. To help streamline bug reports and fixes, please file an issue on the GitHub repository (hfang-bristol/XGR).

5 Source data

All source data are represented uniformly as well-documented RData-formatted files, taking advantage of the R software environment and its infrastructure packages such as igraph (Csardi and Nepusz 2006) and GenomicRanges (Lawrence et al. 2013). These data are subject to regular updates, and are also regularly supplemented to keep pace with the explosive nature of big data in modern genome biology.

5.1 Ontologies and annotations at the gene level

Ontologies and their identifier codes used in XGR are summarised below.

Category | Ontology | Identifier Code
Disease | Disease Ontology | DO
Function | Gene Ontology Molecular Function | GOMF
Function | Gene Ontology Biological Process | GOBP
Function | Gene Ontology Cellular Component | GOCC
Phenotype | Human Phenotype Phenotypic Abnormality | HPPA
Phenotype | Human Phenotype Mode of Inheritance | HPMI
Phenotype | Human Phenotype Clinical Modifier | HPCM
Phenotype | Human Phenotype Mortality Aging | HPMA
Phenotype | Mammalian/Mouse Phenotype | MP
Trait | Experimental Factor Ontology | EF
Druggability | DGI druggable gene categories | DGIdb
Domain | SCOP domain superfamilies | SF
Domain | Pfam domain families | Pfam
MsigDB | Hallmark gene sets | MsigdbH
MsigDB | Chromosome and cytogenetic band positional gene sets | MsigdbC1
MsigDB | Chemical and genetic perturbation gene sets | MsigdbC2CGP
MsigDB | All pathway gene sets | MsigdbC2CPall
MsigDB | Canonical pathway gene sets | MsigdbC2CP
MsigDB | KEGG pathway gene sets | MsigdbC2KEGG
MsigDB | Reactome pathway gene sets | MsigdbC2REACTOME
MsigDB | BioCarta pathway gene sets | MsigdbC2BIOCARTA
MsigDB | Transcription factor target gene sets | MsigdbC3TFT
MsigDB | microRNA target gene sets | MsigdbC3MIR
MsigDB | Cancer gene neighborhood gene sets | MsigdbC4CGN
MsigDB | Cancer module gene sets | MsigdbC4CM
MsigDB | GO biological process gene sets | MsigdbC5BP
MsigDB | GO molecular function gene sets | MsigdbC5MF
MsigDB | GO cellular component gene sets | MsigdbC5CC
MsigDB | Oncogenic signature gene sets | MsigdbC6
MsigDB | Immunologic signature gene sets | MsigdbC7
eGenes | GTEx eGene tissues | GTExV4
Evolution | phylostratific age information (our ancestors) | PS2
KEGG | all pathways | KEGG
KEGG | Metabolism | KEGGmetabolism
KEGG | Genetic Information Processing | KEGGgenetic
KEGG | Environmental Information Processing | KEGGenvironmental
KEGG | Cellular Processes | KEGGcellular
KEGG | Organismal Systems | KEGGorganismal
KEGG | Human Diseases | KEGGdisease
REACTOME | all pathways | REACTOME

5.2 Annotations at the genomic region level

Data types, sources, and identifier codes used in XGR are summarised below.

Category | Source | Genomic Annotations | Identifier Code
TFBS | ENCODE | Cell-type-specific TFBS uniformly identified | Uniform_TFBS
TFBS | ENCODE | Clustered TFBS | ENCODE_TFBS_ClusteredV3
TFBS | ENCODE | Cell-type-specific clustered TFBS | ENCODE_TFBS_ClusteredV3_CellTypes
DHS | ENCODE | Cell-type-specific DHS uniformly identified | Uniform_DNaseI_HS
DHS | ENCODE | Clustered DHS | ENCODE_DNaseI_ClusteredV3
DHS | ENCODE | Cell-type-specific clustered DHS | ENCODE_DNaseI_ClusteredV3_CellTypes
Histone Modifications | ENCODE | Cell-type-specific histone modifications | Broad_Histone
Histone Modifications | ENCODE | Cell-type-specific histone modifications | SYDH_Histone
Histone Modifications | ENCODE | Cell-type-specific histone modifications | UW_Histone
Expressed Enhancers | FANTOM5 | Cell-type-specific expressed enhancers | FANTOM5_Enhancer_Cell
Expressed Enhancers | FANTOM5 | Tissue-specific expressed enhancers | FANTOM5_Enhancer_Tissue
Expressed Enhancers | FANTOM5 | Extensive enhancers | FANTOM5_Enhancer_Extensive
Expressed Enhancers | FANTOM5 | Full collections of enhancers | FANTOM5_Enhancer
Genome Segmentations | ENCODE | Combined genome segmentation for GM12878 | Segment_Combined_Gm12878
Genome Segmentations | ENCODE | Combined genome segmentation for H1-hESC | Segment_Combined_H1hesc
Genome Segmentations | ENCODE | Combined genome segmentation for HeLa S3 | Segment_Combined_Helas3
Genome Segmentations | ENCODE | Combined genome segmentation for HepG2 | Segment_Combined_Hepg2
Genome Segmentations | ENCODE | Combined genome segmentation for HUVEC | Segment_Combined_Huvec
Genome Segmentations | ENCODE | Combined genome segmentation for K562 | Segment_Combined_K562
Conserved TFBS | TRANSFAC | PWM human/mouse/rat conserved TFBS | TFBS_Conserved
miRNA regulatory sites | TargetScan | miRNA regulatory sites | TS_miRNA
Cancer mutations | TCGA | Tumor-type-specific exome mutations | TCGA
TFBS | ReMap | GSE-derived TFBS | ReMap_Public_TFBS
TFBS | ReMap | Merged TFBS across GSE studies | ReMap_Public_mergedTFBS
TFBS | ReMap | Merged TFBS across GSE studies and ENCODE | ReMap_PublicAndEncode_mergedTFBS
TFBS | ReMap | ENCODE-derived TFBS (ignoring cell type information) | ReMap_Encode_TFBS
Histone Modifications | Blueprint | Bone-marrow-specific histone modifications | Blueprint_BoneMarrow_Histone
Histone Modifications | Blueprint | Cell-line-specific histone modifications | Blueprint_CellLine_Histone
Histone Modifications | Blueprint | Cord-blood-specific histone modifications | Blueprint_CordBlood_Histone
Histone Modifications | Blueprint | Thymus-specific histone modifications | Blueprint_Thymus_Histone
Histone Modifications | Blueprint | Venous-blood-specific histone modifications | Blueprint_VenousBlood_Histone
DHS | Blueprint | Sample-specific DHS | Blueprint_DNaseI
Genome Segmentations | Roadmap Epigenomics | E029 (Primary monocytes from peripheral blood) | EpigenomeAtlas_15Segments_E029
Genome Segmentations | Roadmap Epigenomics | E030 (Primary neutrophils from peripheral blood) | EpigenomeAtlas_15Segments_E030
Genome Segmentations | Roadmap Epigenomics | E031 (Primary B cells from cord blood) | EpigenomeAtlas_15Segments_E031
Genome Segmentations | Roadmap Epigenomics | E032 (Primary B cells from peripheral blood) | EpigenomeAtlas_15Segments_E032
Genome Segmentations | Roadmap Epigenomics | E033 (Primary T cells from cord blood) | EpigenomeAtlas_15Segments_E033
Genome Segmentations | Roadmap Epigenomics | E034 (Primary T cells from peripheral blood) | EpigenomeAtlas_15Segments_E034
Genome Segmentations | Roadmap Epigenomics | E035 (Primary hematopoietic stem cells) | EpigenomeAtlas_15Segments_E035
Genome Segmentations | Roadmap Epigenomics | E036 (Primary hematopoietic stem cells short term culture) | EpigenomeAtlas_15Segments_E036
Genome Segmentations | Roadmap Epigenomics | E037 (Primary T helper memory cells from peripheral blood 2) | EpigenomeAtlas_15Segments_E037
Genome Segmentations | Roadmap Epigenomics | E038 (Primary T helper naive cells from peripheral blood) | EpigenomeAtlas_15Segments_E038
Genome Segmentations | Roadmap Epigenomics | E039 (Primary T helper naive cells from peripheral blood) | EpigenomeAtlas_15Segments_E039
Genome Segmentations | Roadmap Epigenomics | E040 (Primary T helper memory cells from peripheral blood 1) | EpigenomeAtlas_15Segments_E040
Genome Segmentations | Roadmap Epigenomics | E041 (Primary T helper cells PMA-I stimulated) | EpigenomeAtlas_15Segments_E041
Genome Segmentations | Roadmap Epigenomics | E042 (Primary T helper 17 cells PMA-I stimulated) | EpigenomeAtlas_15Segments_E042
Genome Segmentations | Roadmap Epigenomics | E043 (Primary T helper cells from peripheral blood) | EpigenomeAtlas_15Segments_E043
Genome Segmentations | Roadmap Epigenomics | E044 (Primary T regulatory cells from peripheral blood) | EpigenomeAtlas_15Segments_E044
Genome Segmentations | Roadmap Epigenomics | E045 (Primary T cells effector/memory enriched from peripheral blood) | EpigenomeAtlas_15Segments_E045
Genome Segmentations | Roadmap Epigenomics | E046 (Primary Natural Killer cells from peripheral blood) | EpigenomeAtlas_15Segments_E046
Genome Segmentations | Roadmap Epigenomics | E047 (Primary T killer naive cells from peripheral blood) | EpigenomeAtlas_15Segments_E047
Genome Segmentations | Roadmap Epigenomics | E048 (Primary T killer memory cells from peripheral blood) | EpigenomeAtlas_15Segments_E048
Genome Segmentations | Roadmap Epigenomics | E050 (Primary hematopoietic stem cells G-CSF-mobilized Female) | EpigenomeAtlas_15Segments_E050
Genome Segmentations | Roadmap Epigenomics | E051 (Primary hematopoietic stem cells G-CSF-mobilized Male) | EpigenomeAtlas_15Segments_E051
Genome Segmentations | Roadmap Epigenomics | E062 (Primary mononuclear cells from peripheral blood) | EpigenomeAtlas_15Segments_E062

5.3 Ontology annotations at the SNP level

SNP annotations are based on the Experimental Factor Ontology (EFO). EFO standardises GWAS traits from the NHGRI GWAS Catalog using well-defined terms (Welter et al. 2014). Knowledge of co-inherited variants is also used to include additional SNPs that are in Linkage Disequilibrium (LD) with GWAS lead SNPs. LD SNPs are calculated based on the 1000 Genomes Project data (1000 Genomes Project Consortium 2012) and are defined as any SNPs having R² > 0.8 with GWAS lead SNPs.

List of populations used to calculate LD SNPs

Identifier Code | Population | Project
AFR | African | 1000 Genomes Project
AMR | Admixed American | 1000 Genomes Project
EAS | East Asian | 1000 Genomes Project
EUR | European | 1000 Genomes Project
SAS | South Asian | 1000 Genomes Project
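
The LD-based expansion described in this section can be sketched in base R as follows. This is a minimal illustration rather than XGR's own code: the data frame `ld`, its column names, and the SNP identifiers are all hypothetical, and only the R² > 0.8 cut-off is taken from the text.

```r
# Hypothetical LD table: one row per (GWAS lead SNP, candidate SNP) pair.
# Column names and SNP identifiers are made up for illustration.
ld <- data.frame(
  lead = c("rs1", "rs1", "rs1", "rs2", "rs2"),
  snp  = c("rs11", "rs12", "rs13", "rs21", "rs22"),
  r2   = c(0.95, 0.80, 0.42, 0.81, 0.10),
  stringsAsFactors = FALSE
)

# Cut-off stated in the text: LD SNPs have R2 strictly greater than 0.8
r2_cutoff <- 0.8
ld_snps <- ld[ld$r2 > r2_cutoff, ]

# Lead SNPs plus the LD SNPs that pass the cut-off
expanded <- unique(c(ld$lead, ld_snps$snp))
print(expanded)
```

Because the inequality is strict, a pair with R² exactly equal to 0.8 is not retained in this sketch.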

5.4 Interaction networks at the gene level

XGR supports networks of different interaction types (functional, physical, and pathway-derived), of varying interaction quality (highest, high, and medium) and of two interaction directions (directed versus undirected). These are mainly sourced from the STRING database (Szklarczyk et al. 2015) and the Pathway Commons database (Cerami et al. 2011). STRING is a meta-integration of undirected interactions from a functional aspect, while Pathway Commons contains both undirected and directed interactions from a physical/pathway aspect. In addition to interaction type, users can choose interactions of varying quality:

Database, interaction type and quality, and identifier codes used in XGR are summarised below.

Identifier Code | Interaction (type and quality) | Database
STRING_high | Functional interactions (with high confidence scores>=700) | STRING
STRING_medium | Functional interactions (with medium confidence scores>=400) | STRING
PCommonsUN_high | Physical/undirected interactions (with references & >=2 sources) | Pathway Commons
PCommonsUN_medium | Physical/undirected interactions (with references & >=1 sources) | Pathway Commons
PCommonsDN_high | Pathway/directed interactions (with references & >=2 sources) | Pathway Commons
PCommonsDN_medium | Pathway/directed interactions (with references & >=1 sources) | Pathway Commons
KEGG | Pathway/directed interactions (all) | KEGG
KEGG_metabolism | Pathway/directed interactions (Metabolism) | KEGG
KEGG_genetic | Pathway/directed interactions (Genetic Information Processing) | KEGG
KEGG_environmental | Pathway/directed interactions (Environmental Information Processing) | KEGG
KEGG_cellular | Pathway/directed interactions (Cellular Processes) | KEGG
KEGG_organismal | Pathway/directed interactions (Organismal Systems) | KEGG
KEGG_disease | Pathway/directed interactions (Human Diseases) | KEGG

For the pathway-merged directed interactions, networks sourced individually are also supported.

Identifier Code | Interaction (source) | Database
PCommonsDN_Reactome | Pathway/directed interactions (only from Reactome) | Pathway Commons
PCommonsDN_KEGG | Pathway/directed interactions (only from KEGG) | Pathway Commons
PCommonsDN_HumanCyc | Pathway/directed interactions (only from HumanCyc) | Pathway Commons
PCommonsDN_PID | Pathway/directed interactions (only from PID) | Pathway Commons
PCommonsDN_PANTHER | Pathway/directed interactions (only from PANTHER) | Pathway Commons
PCommonsDN_ReconX | Pathway/directed interactions (only from ReconX) | Pathway Commons
PCommonsDN_PhosphoSite | Pathway/directed interactions (only from PhosphoSite) | Pathway Commons
PCommonsDN_CTD | Pathway/directed interactions (only from CTD) | Pathway Commons
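
To make the quality thresholds concrete, the sketch below filters a toy edge list by confidence score. The edge list, gene labels, and column names are hypothetical; only the cut-offs (>=700 for high, >=400 for medium) come from the table of STRING networks, and the code does not reflect how XGR stores its networks.

```r
# Toy edge list with STRING-style confidence scores (0-1000); values are made up.
edges <- data.frame(
  from  = c("A", "A", "B", "C"),
  to    = c("B", "C", "C", "D"),
  score = c(950, 650, 400, 150),
  stringsAsFactors = FALSE
)

# Thresholds as listed for STRING_high and STRING_medium
string_high   <- edges[edges$score >= 700, ]
string_medium <- edges[edges$score >= 400, ]

# The medium-quality network contains every high-quality edge plus extra ones
print(nrow(string_high))
print(nrow(string_medium))
```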

5.5 Useful datasets

5.5.1 Human genes

  • org.Hs.eg: contains Entrez Gene information for human.
  • UCSC_knownCanonical: contains UCSC known canonical genes (together with genomic locations) based on human genome assembly hg19.
  • UCSC_knownGene: contains UCSC known genes (together with genomic locations) based on human genome assembly hg19.

5.5.2 dbSNPs

  • dbSNP_Common: contains common SNPs from dbSNP (version 146) plus GWAS SNPs and their LD SNPs (hg19).
  • dbSNP_GWAS: contains SNPs from dbSNP (version 146) restricted to GWAS SNPs and their LD SNPs (hg19).

5.5.3 ImmunoBase

  • ImmunoBase: contains information on immune-disease associated variants, regions and genes from ImmunoBase (hg19).
  • GWAS_IB: contains GWAS Catalog variants associated with traits that are mappable to immune diseases defined in ImmunoBase.
  • ImmunoBase_LD: contains LD of ImmunoBase SNPs calculated by PLINK based on the 1000 Genomes Project data (phase 3).
  • GWAS_LD: contains LD of GWAS Catalog SNPs calculated by PLINK based on the 1000 Genomes Project data (phase 3).

5.5.4 EFO and annotations

  • ig.EF: contains information on Experimental Factor Ontology terms.
  • GWAS2EF: annotates GWAS Catalog SNPs by Experimental Factor Ontology terms.
  • Target2EF: annotates ChEMBL targets by Experimental Factor Ontology terms.
  • org.Hs.egEF: annotates GWAS Catalog Human Entrez Genes by Experimental Factor Ontology terms.

5.5.5 Drugs and targets

  • DrugBank: contains drugs and their target genes from DrugBank.
  • ChEMBL: contains drugs and their target genes from ChEMBL.

6 Functionality

The functions in the package XGR are categorised into five groups according to the tasks they complete. They are summarised below.

6.1 Enrichment functions

Enrichment functions perform enrichment analysis using one of several statistical tests (Fisher’s exact test, the hypergeometric test, or the binomial test). The test estimates the significance of the overlap between, for example, an input group of genes and the group of genes annotated by an ontology term. By default, all annotatable genes are used as the test background, but the user can specify a different background. If ontology terms are organised in a tree-like structure, this structure can also be taken into account to produce more informative results. For non-structured ontologies (eg a collection of pathways), a filtering procedure is provided to generate non-redundant but informative results.
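
The overlap test itself can be illustrated with base R alone. The sketch below is not XGR code; the gene identifiers and set sizes are invented, and it only shows that the one-sided Fisher's exact test and the upper tail of the hypergeometric distribution give the same p-value for a single term.

```r
# Toy universe of 1000 annotatable genes (hypothetical identifiers)
background  <- paste0("g", 1:1000)
term_genes  <- paste0("g", 1:50)                # genes annotated by one term
input_genes <- paste0("g", c(1:10, 501:530))    # user-defined gene list

k <- length(intersect(input_genes, term_genes)) # overlap between the two sets
n <- length(input_genes)                        # size of the input list
K <- length(term_genes)                         # size of the term's gene set
N <- length(background)                         # size of the test background

# Hypergeometric test: probability of observing k or more overlapping genes
p_hyper <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# One-sided Fisher's exact test on the equivalent 2x2 contingency table
# (columns: in input / not in input; rows: in term / not in term)
tab <- matrix(c(k, n - k, K - k, N - K - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(hypergeometric = p_hyper, fisher = p_fisher))
```

Changing `background` to a user-specified gene set changes `N` (and possibly `K` and `n` after restricting to that set), which is why the choice of background affects the resulting significance.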

6.1.1 xEnricherGenes

xEnricherGenes: conducts gene-based enrichment analysis given a list of genes and the ontology in query. It supports two types of ontologies: 1) structured ontologies including Gene Ontology (Ashburner et al. 2000), Disease Ontology (Schriml et al. 2012), and Phenotype Ontologies in human and mouse (Köhler et al. 2013; C. L. Smith and Eppig 2009), and 2) non-structured ontologies/categories; for example, a collection of pathways, gene expression signatures, transcription factor targets, and druggable gene categories.

6.1.2 xEnricherSNPs

xEnricherSNPs: conducts SNP-based enrichment analysis using GWAS Catalog traits mapped to Experimental Factor Ontology (Welter et al. 2014). Additional SNPs that are in linkage disequilibrium (LD) with input SNPs can also be included in the enrichment analysis.

6.1.3 xEnricherYours

xEnricherYours: conducts custom enrichment analysis given an entity file and an annotation file.

6.1.4 xEnricher

xEnricher: acts as a template for enrichment analysis. It is an internal function upon which high-level functions (ie xEnricherGenes, xEnricherSNPs and xEnricherYours) rely.

6.1.5 xEnrichViewer

xEnrichViewer: views enrichment results as a data frame that is also useful for the subsequent file saving.

6.1.6 xEnrichConciser

xEnrichConciser: makes enrichment results much clearer by removing redundant terms. A term is considered redundant if its overlap with a more significant term meets both criteria: the overlap covers more than 95% of the redundant term and more than 50% of the more significant term. In doing so, only non-redundant but informative terms are retained.
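
The redundancy rule can be sketched in base R as follows. This is an illustration of the two criteria stated above, not the xEnrichConciser implementation: the list `terms`, the helper `is_redundant`, and its argument names are hypothetical, and each term is compared against every more significant term.

```r
# Hypothetical enriched terms: member genes plus an adjusted p-value
terms <- list(
  A = list(genes = paste0("g", 1:20), adjp = 1e-8),
  B = list(genes = paste0("g", 1:12), adjp = 1e-5),           # nested within A
  C = list(genes = paste0("g", c(1:3, 30:40)), adjp = 1e-4)   # mostly distinct
)

# TRUE if the overlap covers > 95% of the less significant term
# and > 50% of the more significant term
is_redundant <- function(less, more, cut_less = 0.95, cut_more = 0.5) {
  shared <- length(intersect(less, more))
  shared / length(less) > cut_less && shared / length(more) > cut_more
}

# Order terms from most to least significant
ord <- names(terms)[order(sapply(terms, function(x) x$adjp))]

# Flag each term that is redundant with respect to any more significant term
redundant <- vapply(seq_along(ord), function(i) {
  any(vapply(ord[seq_len(i - 1)], function(j) {
    is_redundant(terms[[ord[i]]]$genes, terms[[j]]$genes)
  }, logical(1)))
}, logical(1))

kept <- ord[!redundant]
print(kept)
```

In this toy input, term B shares all of its genes with A and covers 60% of A, so it is dropped, while C overlaps A only slightly and is kept.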

6.1.7 xEnrichBarplot

xEnrichBarplot: visualises enrichment results using a barplot.

6.1.8 xEnrichDAGplot

xEnrichDAGplot: visualises enrichment results using a DAG plot. This function is only useful for tree-like structured ontologies. Significant terms (of interest) are highlighted by box-shaped nodes, and the others by ellipse nodes.

6.1.9 xEnrichCompare

xEnrichCompare: compares enrichment results using side-by-side barplots. This function is useful when comparing enrichment results for different inputs but based on the same ontology.

6.1.10 xEnrichDAGplotAdv

xEnrichDAGplotAdv: visualises comparative enrichment results using a DAG plot. This function takes as input the output of the function xEnrichCompare to further illustrate differences and commonalities of comparative enrichment results in the context of the ontology tree.

6.2 Similarity functions

Similarity functions conduct similarity analysis by calculating semantic similarity, a type of comparison that assesses the degree of relatedness between two entities (eg genes or SNPs) based on their annotation profiles (by ontology terms) (Pesquita et al. 2009). To do so, the information content (IC) of a term is first defined to measure how informative the term is when used for annotating genes: -log10(frequency of genes annotated to this term). Similarity between two terms is then measured based on IC, usually at the most informative common ancestor (MICA). Finally, similarity between two entities (eg genes) is derived from pairwise term similarity using best-matching based methods: average, maximum, and complete.
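
The three steps (IC, term similarity at the MICA, and best-matching aggregation) can be sketched on a toy ontology in base R. Everything here is hypothetical: the ontology, the gene annotations, and the helper names `term_sim` and `gene_sim`. Term similarity is taken to be the IC of the MICA, and the "average" best-matching method is assumed to be the mean of the row-wise and column-wise best matches; the exact definitions used by XGR may differ.

```r
# Toy ontology: root -> {a, b}; a -> {c, d}.
# Each entry lists a term's ancestors, including the term itself.
ancestors <- list(
  root = "root",
  a = c("a", "root"),
  b = c("b", "root"),
  c = c("c", "a", "root"),
  d = c("d", "a", "root")
)

# Direct annotations of four hypothetical genes
direct <- list(g1 = "c", g2 = "d", g3 = "b", g4 = c("c", "b"))

# Propagate annotations to all ancestor terms
full <- lapply(direct, function(terms) {
  unique(unlist(ancestors[terms], use.names = FALSE))
})

# Information content: -log10(frequency of genes annotated to the term)
n_genes <- length(full)
freq <- sapply(names(ancestors), function(term) {
  sum(sapply(full, function(x) term %in% x)) / n_genes
})
ic <- -log10(freq)

# Term similarity: IC of the most informative common ancestor (MICA)
term_sim <- function(t1, t2) {
  common <- intersect(ancestors[[t1]], ancestors[[t2]])
  max(ic[common])
}

# Gene similarity: best-matching average over direct annotations
gene_sim <- function(ga, gb) {
  m <- outer(direct[[ga]], direct[[gb]], Vectorize(term_sim))
  mean(c(apply(m, 1, max), apply(m, 2, max)))
}

print(round(ic, 3))
print(gene_sim("g1", "g2"))  # c vs d: the MICA is term a
```

The root term annotates every gene, so its IC is zero and it contributes nothing to similarity; rarer terms deeper in the ontology carry more information.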

6.2.1 xSocialiserGenes

xSocialiserGenes: conducts gene-based similarity analysis given a list of genes and the ontology in query. It supports several structured ontologies including Gene Ontology, Disease Ontology, and Phenotype Ontologies (in human and mouse), and returns socialised genes represented as a network with nodes for input genes and edges for pair-wise semantic similarity between them.

6.2.2 xSocialiserSNPs

xSocialiserSNPs: conducts SNP-based similarity analysis using GWAS Catalog traits mapped to Experimental Factor Ontology. Additional SNPs that are in linkage disequilibrium (LD) with input SNPs can also be included in the similarity analysis. It returns socialised SNPs represented as a network with nodes for input SNPs and edges for pair-wise semantic similarity between them.

6.2.3 xSocialiser

xSocialiser: acts as a template for similarity analysis. It is an internal function upon which high-level functions (ie xSocialiserGenes and xSocialiserSNPs) rely.

6.2.4 xCircos

xCircos: visualises the similarity results using a circos plot. The degree of similarity between SNPs (or genes) is visualised by the colour of links. This function can be used either to visualise the most similar links or to plot links involving an input SNP (or gene).

6.2.5 xSocialiserDAGplot

xSocialiserDAGplot: visualises terms used to annotate an input SNP (or gene) using a DAG plot. Terms used for direct/original annotations are shown as box-shaped nodes, and terms for indirect/inherited annotations as ellipse nodes. This function is one of the utilities for understanding calculated similarity.

6.2.6 xSocialiserDAGplotAdv

xSocialiserDAGplotAdv: uses a DAG plot to visualise and compare two sets of terms used to annotate two input SNPs (or genes) that are predicted to be similar. This function is part of utilities in understanding calculated similarity.

6.3 Network functions

Network functions identify a gene subnetwork from a gene interaction network with node/gene significance information. The node/gene information can be directly provided (eg user-defined genes with the significance level; p-values or FDR); see the function xSubneterGenes. The node/gene information can also be indirectly provided, for example, nearby genes of user-defined SNPs with the significance level (eg GWAS reported p-values; see the function xSubneterSNPs), or more generally, nearby genes of user-defined genomic regions with the significance level (eg differentially methylated regions together with FDR; see the function xSubneterGR). Given a gene interaction network with nodes labelled with gene information, the algorithm searching for a maximum-scoring gene subnetwork has been reported in our previous publication (Fang and Gough 2014) and is briefly described as follows:

  1. score transformation, that is, given the threshold of tolerable p-value, nodes with p-values below this threshold (nodes of interest) are scored positively, and nodes with p-values above this threshold (intolerable) are scored negatively;

  2. subnetwork identification, that is, to find an interconnected gene subnetwork enriched with positive-score nodes, but allowing for a few negative-score nodes as linkers;

  3. controlling the subnetwork size, that is, an iterative procedure is provided to fine-tune the tolerable threshold so as to identify a gene subnetwork with a desired number of nodes.
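
The first and third steps can be illustrated with a small base R sketch. The scoring formula used here, log10(threshold) - log10(p), is an assumption chosen only because it has the sign behaviour described in step 1; the published algorithm may use a different transformation. The p-values are made up, and `score_nodes` is a hypothetical helper. Subnetwork identification (step 2) is not attempted.

```r
# Made-up node p-values for illustration
pvals <- c(TP53 = 1e-6, BRCA1 = 1e-3, MYC = 0.02, EGFR = 0.2, KRAS = 0.8)

# Assumed transformation: positive below the threshold, negative above it
score_nodes <- function(p, threshold) log10(threshold) - log10(p)

scores <- score_nodes(pvals, threshold = 0.01)
print(round(scores, 2))

# Tightening the threshold leaves fewer positive-score nodes, which is the
# lever an iterative size-control procedure can adjust
thresholds <- c(0.05, 0.01, 1e-4)
n_positive <- sapply(thresholds, function(th) sum(score_nodes(pvals, th) > 0))
print(data.frame(threshold = thresholds, n_positive = n_positive))
```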

6.3.1 xSubneterGenes

xSubneterGenes: takes as input a list of user-defined genes with the significance level (p-values), superposes these genes onto a gene interaction network, and outputs a maximum-scoring gene subnetwork that contains as many highly significant (highly scored) genes as possible, together with a few less significant (lower-scored) genes as linkers.

6.3.2 xSNP2GeneScores

xSNP2GeneScores: takes as input a list of user-defined SNPs with the significance level (eg GWAS reported p-values), and defines and scores nearby genes, taking into account both the distance to and the significance of input SNPs.
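
One way to combine distance and significance is sketched below in base R. It is a simplified illustration under stated assumptions, not the xSNP2GeneScores implementation: the SNP and gene coordinates are invented, `distance_max` is a hypothetical window size, the distance weight is assumed to decay linearly to zero at the window edge, and each gene takes the maximum weighted score over SNPs.

```r
# Invented SNPs (position and p-value) and genes (single reference position)
snps <- data.frame(
  snp = c("rs1", "rs2", "rs3"),
  pos = c(1000, 52000, 400000),
  p   = c(1e-8, 1e-4, 1e-6),
  stringsAsFactors = FALSE
)
genes <- data.frame(
  gene = c("GENE_A", "GENE_B", "GENE_C"),
  pos  = c(11000, 50000, 900000),
  stringsAsFactors = FALSE
)

distance_max <- 50000  # hypothetical distance window (bp)

gene_scores <- sapply(seq_len(nrow(genes)), function(i) {
  d <- abs(snps$pos - genes$pos[i])      # distance from each SNP to the gene
  w <- pmax(0, 1 - d / distance_max)     # assumed linear decay, 0 outside window
  max(w * (-log10(snps$p)))              # best distance-weighted significance
})
names(gene_scores) <- genes$gene
print(gene_scores)
```

Genes with no SNP inside the window receive a score of zero in this sketch and would not be treated as nearby genes.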

6.3.3 xSubneterSNPs

xSubneterSNPs: identifies a gene subnetwork that is likely modulated by input SNPs and/or their Linkage Disequilibrium (LD) SNPs, in two major steps. The first step uses xSNP2GeneScores to define and score nearby genes located within a distance window of input and/or LD SNPs. The second step uses xSubneterGenes to identify a maximum-scoring gene subnetwork.

6.3.4 xGR2GeneScores

xGR2GeneScores: takes as input a list of user-defined genomic regions (GR) with the significance level (eg p-values), and defines and scores nearby genes, taking into account both the distance to and the significance of input GR.

6.3.5 xSubneterGR

xSubneterGR: identifies a gene subnetwork that is likely modulated by input genomic regions (GR), in two major steps. The first step uses xGR2GeneScores to define and score nearby genes located within a distance window of input genomic regions. The second step uses xSubneterGenes to identify a maximum-scoring gene subnetwork.

6.4 Annotation functions

Annotation functions interpret a user-defined list of genomic regions, either by looking at nearby gene annotations by ontologies or by looking at co-localised functional genomic annotations.

6.