It is our mission to bring attention to complex challenges concerning personalized care and medication for diverse cancer patients. Such multifaceted problems bring with them a number of obstacles, many of which are too substantial to tackle alone. By fostering a collaborative environment and freely sharing resources, we believe that we can accelerate our collective efforts.

This is how we hope to improve prospects for potential patients, ensuring they receive the best and most effective care, tailored to each individual. This commitment is not just about improved care in the current age of technology, but about the beginnings of a brighter future still to come. Feel free to refer to the rest of this page to get to know the tools we have been working on over the years.

GeneLLM: Unveiling zero-shot prediction for gene attributes through interpretable AI.

GeneLLM is an interpretable transformer-based model that integrates textual information through contrastive learning to refine gene representations.

Bipotent Target Activity Score: AI predictor of immunotherapy response.

The website provides an interface to our computational approach, where users can predict patient response to immunotherapy in arbitrary melanoma cohorts using patient gene expression as input.

BipotentR: Method to identify gene targets that kill tumors by two mechanisms concurrently.

BipotentR identifies targets for bipotent drugs: single drugs that can kill cancer by multiple mechanisms. BipotentR was developed to address the low response rates of existing single-mechanism therapies, which also suffer from high relapse rates due to evolved resistance to those mechanisms.

DeepImmune: AI predictor of immunotherapy response.

DeepImmune is an AI predictor that employs drug-induced transcriptomic changes to identify drugs that synergize with immune checkpoint blockade (ICB) and prioritize them for clinical trials. The parameters of DeepImmune are trained on 40,000 patient tumors. It uses transfer learning to predict response to ICB. Then it implements a perturbation model to estimate the effect of drugs on ICB response.

MSSC: Differential gene analysis across multiple scRNAseq datasets.

A major challenge in scRNA data analysis is that cells from the same sample are not independent. Commonly used differential expression methods fail to account for this pseudoreplication bias, leading to inflated false positives. Several published scRNA results reported differential genes that are driven not by biological differences but by pseudoreplication, which has exacerbated the replicability crisis in the field. To control for false positives due to pseudoreplication, this pipeline uses a Bayesian model (MSSC) to identify differentially expressed genes in scRNA by modeling pseudoreplication.
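To make the pseudoreplication problem concrete, here is a minimal sketch on made-up numbers. It is not the MSSC Bayesian model: it only contrasts a naive cell-level Welch t statistic with the same statistic computed on per-sample averages (a simple pseudobulk stand-in). The function names, sample means, and cell counts are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of observations."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def make_sample(sample_mean, n_cells=50):
    """Hypothetical cells from one sample: tight scatter around a sample-specific mean."""
    return [sample_mean + (0.1 if i % 2 else -0.1) for i in range(n_cells)]


# Toy expression of one gene. Samples within a condition differ far more
# than cells within a sample, so cells are not independent observations.
condition_a = [make_sample(1.0), make_sample(3.0)]
condition_b = [make_sample(2.0), make_sample(4.0)]

# Naive analysis: every cell is treated as an independent observation.
cells_a = [x for sample in condition_a for x in sample]
cells_b = [x for sample in condition_b for x in sample]
cell_level_t = welch_t(cells_a, cells_b)

# Sample-aware analysis: one averaged value per sample (pseudobulk).
sample_level_t = welch_t(
    [mean(sample) for sample in condition_a],
    [mean(sample) for sample in condition_b],
)

print(round(cell_level_t, 2), round(sample_level_t, 2))
```

In this toy setup the cell-level statistic is roughly ten times larger in magnitude than the sample-level one, even though both describe the same two samples per condition. That gap is the inflation that a model of pseudoreplication is meant to remove.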

scdiffpop: Single cell differential immune cell population.

There is a need to determine the role of specific immune cells that drive disease severity in COVID-19 patients. This involves comparing immune cell fractions between samples (e.g., COVID-infected vs. healthy samples) using a nonparametric test. However, such predictions are not robust, and the test is underpowered due to small sample sizes in single-cell RNAseq datasets. To address this, we are devising a statistical tool, scDiffPop, to robustly identify differential immune populations between any two phenotypes (infected vs. healthy, or severe vs. asymptomatic).
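The baseline comparison described above can be sketched as follows. This is not scDiffPop itself; it is an assumed illustration of the nonparametric approach it aims to improve on, using an exact permutation test on per-sample cell-type fractions. The cell labels, sample sizes, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cell_type_fraction(cell_labels, cell_type):
    """Fraction of cells in one sample annotated as `cell_type`."""
    return Counter(cell_labels)[cell_type] / len(cell_labels)


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    n_extreme = 0
    n_total = 0
    # Enumerate every way of relabelling n_a of the pooled samples as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


# Hypothetical per-cell annotations for three infected and three healthy samples.
infected = [["T"] * 60 + ["B"] * 40, ["T"] * 70 + ["B"] * 30, ["T"] * 65 + ["B"] * 35]
healthy = [["T"] * 30 + ["B"] * 70, ["T"] * 40 + ["B"] * 60, ["T"] * 35 + ["B"] * 65]

infected_frac = [cell_type_fraction(sample, "T") for sample in infected]
healthy_frac = [cell_type_fraction(sample, "T") for sample in healthy]
p_value = exact_permutation_pvalue(infected_frac, healthy_frac)
print(p_value)
```

With three samples per group there are only 20 possible relabellings, so the smallest two-sided p-value this test can return is 2/20 = 0.1, however many cells each sample contains. The toy data are perfectly separated and still only reach that floor, which illustrates why a sample-level nonparametric test is underpowered at typical single-cell cohort sizes.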

TRIM: Statistical approach to identify Transcription Regulator of Immune-Metabolism.

Abnormal energy metabolism is a common theme for immune evasion in tumors. Targeting cancer energy metabolism can reinvigorate tumor immunity. Focusing on energy metabolism by oxidative phosphorylation (OXPHOS), TRIM identifies transcriptional regulators of immune-metabolism. In particular, TRIM analyzes 21,000 ChIP-seq experiments, and then determines their immune-modulatory potential in 11,000 tumors and 160,000 single cells from cancer patients.

INCISOR: Identifying synthetic rescue interactions in humans.

INCISOR prioritizes clinically relevant synthetic rescue (SR) interactions by analyzing functional genomic and clinical survival data in an integrated manner. Due to the scarcity of published gold standards of SR interactions, we conducted new large-scale in vitro experiments to validate our predictions. The paucity of known rescue interactions in the literature further underscores the importance of developing tools like INCISOR. Overall, INCISOR attained a precision of 48% on average (at 50% recall) in identifying true SR interactions across all published and new experiments. Finally, we show that SR mediates both primary and adaptive resistance in patients.
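For readers unfamiliar with the "precision at 50% recall" metric quoted above, the following is a minimal sketch of how such a number can be computed from ranked predictions and validation labels. The scores, labels, and function name are hypothetical, and this is not the evaluation code used for INCISOR.

```python
def precision_at_recall(scores, labels, target_recall=0.5):
    """Precision at the first ranked cutoff whose recall reaches `target_recall`.

    `labels` holds 1 for a validated interaction and 0 otherwise.
    Tied scores are not handled specially in this sketch.
    """
    n_positive = sum(labels)
    if n_positive == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    for n_called, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives / n_positive >= target_recall:
            return true_positives / n_called
    return true_positives / len(ranked)


# Hypothetical prediction scores and validation outcomes (1 = confirmed).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0, 1, 0, 1]
result = precision_at_recall(scores, labels)
print(result)
```

Here four interactions are truly positive, so 50% recall is reached once two of them have been called. That happens at the third-ranked prediction, giving a precision of 2/3 on this toy data.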

GOAL: Bayesian tool to find causal expression regulatory polymorphisms by integration of genetic and epigenetic data.

GOAL implements the eQTeL model, which integrates genetic and epigenetic data to find SNPs causal to expression variance.

ISLE: Identification of clinically relevant synthetic lethal interactions.

ISLE takes lab-screened SL interactions as inputs and analyzes tumor molecular profiles, patient clinical data, and gene phylogeny relations to identify SL interactions that are predictive of patients’ drug response. The ISLE-identified SL interactions are shown to predict response to a wide variety of drugs both in vitro and in vivo, providing a basis for rational design of synergistic drug combinations.

Hridaya: SVM predictor of driver genes in idiopathic dilated cardiomyopathy.

Hridaya is a novel machine learning approach to predict functional genes of idiopathic dilated cardiomyopathy (DCM). The Hridaya-potential of a gene is the probability that the gene is functionally linked to DCM (the higher the value, the more likely it is to be functional). Here the term ‘functional’ refers to genes that are involved in processes and pathways whose disruption is functionally linked to DCM.

CellToPhenotype: Supervised regression predictors of cell migration and proliferation from tumor gene expression.


CellToPhenotype consists of two expression-based supervised regression predictors: one for predicting cell migration and the other for predicting cell proliferation. These predictors were constructed using least absolute shrinkage and selection operator (LASSO) regression, taking as features the genes whose expression is significantly associated with survival in the METABRIC breast cancer collection. Given an input tumor sample, each predictor (migration or proliferation) receives as input the expression levels of its feature genes in that sample and outputs the predicted migration or proliferation level.
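As a rough illustration of the LASSO step described above, the sketch below fits a sparse linear predictor by coordinate descent and applies it to expression values. It is a generic textbook formulation on made-up data, not the CellToPhenotype code; the matrix, the phenotype values, the penalty `alpha`, and the function names are all hypothetical, and no intercept is fitted.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_fit(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xw||^2 + alpha * ||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue
            # Correlation of gene j with the residual that excludes gene j.
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_w
    return w


def predict(X, w):
    """Predicted phenotype level for each sample (row) in X."""
    return [sum(x * coef for x, coef in zip(row, w)) for row in X]


# Hypothetical centered expression of three feature genes in four samples.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
# Hypothetical measured migration level: driven by gene 0, faintly by gene 2.
y = [2.05, 1.95, -2.05, -1.95]

weights = lasso_fit(X, y, alpha=0.1)
predicted = predict(X, weights)
print(weights)
```

On this toy data the penalty keeps only the first gene (its weight shrinks from 2 to about 1.9) and sets the other two weights to exactly zero. This is the feature-selection behavior that makes LASSO a natural choice when many candidate genes are available.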

ChIPnorm: Statistical method for normalizing and identifying differential regions in histone modification ChIP-seq libraries.

The advent of high-throughput technologies such as ChIP-seq has made possible the study of histone modifications. A problem of particular interest is the identification of regions of the genome where different cell types from the same organism exhibit different patterns of histone enrichment. The ChIPnorm method removes most of the noise and bias in the data and outperforms other normalization methods.