Introduction
ultragsea is a novel, ultrafast and memory optimized gene set enrichment scoring algorithm not unlike fGSEA. ultragsea demonstrates highly correlated gene set scores and p-values and typically is 100 times faster than fGSEA.
Example
Preparing gensets
For this vignette, we retrieve the Hallmark gene sets using the
msigdbr package. Of course you can use your own gene sets
or download from somewhere else.
Performing the statistical test
Just for this example we use some random values for fc
but in general this is a fold change from some statistical comparison of
your experiment.
G <- gmt2mat(gmt)
fc <- rnorm(nrow(G))
names(fc) <- rownames(G)
res <- ultragsea(G, fc, format='as.gsea', method='cor')
res <- ultragsea(G, fc, format='as.gsea', method='ztest')
head(res)
#> pathway pval padj log2err ES
#> <char> <num> <num> <num> <num>
#> 1: HALLMARK_APICAL_JUNCTION 0.8346855 0.9696221 NA 0.08236827
#> 2: HALLMARK_HEME_METABOLISM 0.8726599 0.9696221 NA 0.06325971
#> 3: HALLMARK_INFLAMMATORY_RESPONSE 0.7118918 0.9696221 NA 0.14576208
#> 4: HALLMARK_KRAS_SIGNALING_UP 0.9757536 0.9757536 NA -0.01199556
#> 5: HALLMARK_OXIDATIVE_PHOSPHORYLATION 0.8677728 0.9696221 NA -0.06570964
#> 6: HALLMARK_ADIPOGENESIS 0.2364614 0.7389418 NA 0.46725361
#> NES size leadingEdge
#> <num> <int> <list>
#> 1: 0.19461526 200 CRAT,HRAS,INSIG1,PIK3R3,ACTN2,ADAM9,...[32]
#> 2: 0.14946660 200 CA2,BNIP3L,CDC27,H1-0,BCAM,BLVRA,...[33]
#> 3: 0.34439870 200 IL4R,CCL2,HBEGF,INHBA,ITGB3,EREG,...[37]
#> 4: -0.02834246 200 BMP2,CCND2,LIF,CFB,CSF2,PLAT,...[33]
#> 5: -0.15525515 200 ACAA1,SDHC,ECHS1,GPI,GRPEL1,ATP6V1F,...[27]
#> 6: 1.10400136 200 ALDOA,ECH1,GPX4,CRAT,DHCR7,GADD45A,...[40]For the fastest speed, especially when computing enrichment for
multiple foldchanges, it is better to call the low-level functions for
geneset correlation or geneset z-test directly. They do not return the
nicely formatted output tables as ultragsea() but they are
much faster as they are optimized for matrix inputs F.
F <- cbind(fc,fc,fc,fc,fc)
res1 <- gset.cortest(G, F, compute.p=TRUE)
res2 <- gset.ztest(G, F)Session info
sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] msigdbr_25.1.1 ultragsea_0.1.12 BiocStyle_2.38.0
#>
#> loaded via a namespace (and not attached):
#> [1] Matrix_1.7-4 babelgene_22.9 jsonlite_2.0.0
#> [4] dplyr_1.1.4 compiler_4.5.2 BiocManager_1.30.27
#> [7] tidyselect_1.2.1 slam_0.1-55 parallel_4.5.2
#> [10] assertthat_0.2.1 jquerylib_0.1.4 systemfonts_1.3.1
#> [13] textshaping_1.0.4 yaml_2.3.12 fastmap_1.2.0
#> [16] lattice_0.22-7 R6_2.6.1 generics_0.1.4
#> [19] qlcMatrix_0.9.9 curl_7.0.0 knitr_1.51
#> [22] tibble_3.3.0 bookdown_0.46 desc_1.4.3
#> [25] bslib_0.9.0 pillar_1.11.1 rlang_1.1.6
#> [28] cachem_1.1.0 xfun_0.55 fs_1.6.6
#> [31] sass_0.4.10 otel_0.2.0 cli_3.6.5
#> [34] pkgdown_2.2.0 withr_3.0.2 magrittr_2.0.4
#> [37] grid_4.5.2 digest_0.6.39 docopt_0.7.2
#> [40] lifecycle_1.0.4 vctrs_0.6.5 sparsesvd_0.2-3
#> [43] data.table_1.18.0 evaluate_1.0.5 glue_1.8.0
#> [46] ragg_1.5.0 rmarkdown_2.30 matrixStats_1.5.0
#> [49] tools_4.5.2 pkgconfig_2.0.3 htmltools_0.5.9