Shicheng Guo
The Story

About

Human data, genomics, and agentic AI for end-to-end drug discovery.


Dr. Shicheng Guo

I am Shicheng Guo, Ph.D., Senior Director of Translational Genetics & Data Science at Arrowhead Pharmaceuticals, where I lead human-genetics-driven target discovery and biomarker strategy for RNA-based therapeutics. My work sits where human data, genomics, and AI meet drug development — using population-scale biology, biobanks, and real-world evidence to power an end-to-end, AI-native discovery engine, increasingly orchestrated by agentic AI systems that reason, plan, and act across the target-to-medicine pipeline.

Background — the road to an AI-native, end-to-end discovery vision

My vision is an end-to-end drug-discovery engine built on human data and run by AI — where genomics, biobanks, and real-world evidence feed agentic systems that move autonomously from biology to validated medicine. Every stage of my career has been a building block toward it.

Learning the language of human data. I earned my Ph.D. at Fudan University in 2014 under Prof. Li Jin, with the support of Prof. Jiucun Wang and Prof. Momiao Xiong. There I learned to read population-scale human biology as data, the raw material of everything that follows.

Genomics & epigenomics at scale. In postdoctoral training at the University of Texas Health Science Center at Houston (2014–2015) and the University of California, San Diego (2015–2017), I contributed to landmark studies of the human PBMC methylome, the silkworm methylome, hepatocellular and pancreatic cancer methylomes, CD4+ methylomes in rheumatoid arthritis, and tissue-of-origin mapping from circulating cell-free DNA methylation. The common thread was turning genomic and epigenomic signals into diagnostic and prognostic biomarkers.

Biobanks, RWE & disease genes. From 2017, at the Marshfield Clinic Research Institute and the University of Wisconsin–Madison, I worked at the intersection of genetics and the clinic, linking genomic data to electronic health records and registries, which together form the foundation of real-world evidence (RWE). I applied large-scale bioinformatics and data mining to the Personalized Medicine Research Project (PMRP) cohort alongside Roadmap, eMERGE, GTEx, UK Biobank, TCGA, and other public resources to map disease susceptibility genes and biomarkers. In 2019, I introduced a novel gene-based recessive diplotype approach that identified FGF6 as a new iron-metabolism gene, work published in Blood.

From human genetics to medicines, now run by AI. At Arrowhead, these threads converge: human-genetics-driven target discovery, biomarkers, and patient stratification for RNA therapeutics, together with the build-out of AI and agentic systems that connect human data, genomics, and RWE into one autonomous, end-to-end discovery loop. That is the vision the rest of my work serves.

Focus today

My work centers on a simple thesis: the best medicines start in human data, and AI is how we read that data end to end. I build the bridge from population-scale genetics to validated programs, and from manual analysis to autonomous, agent-driven discovery.

Human data, genomics & biobanks

  • Human-genetics-first target discovery — anchoring targets in genetic causal evidence across the druggable genome, drawing on biobanks (UK Biobank, FinnGen, All of Us, eMERGE, PMRP) and rare-variant consortia such as BRaVa.
  • Multi-omic & functional genomics — integrating GWAS, exome/rare-variant, eQTL, methylation, and single-cell data to move from association to mechanism.
  • Real-world evidence (RWE) — linking EHR, claims, and registry data to genetics for patient stratification, indication selection, and biomarker strategy.

AI, agents & agentic systems

  • AI/ML for discovery — predictive models for target prioritization, variant effect, and biomarker discovery across multi-omic and clinical data.
  • Agentic AI & autonomous systems — designing AI agents and multi-agent workflows that plan, retrieve, analyze, and verify across the discovery stack — turning fragmented, manual steps into reproducible, self-checking pipelines.
  • End-to-end drug discovery — connecting human-data evidence, target validation, biomarker and patient-selection strategy, and program decisions into one AI-native loop.

How I got here

Across my career I have applied case-control, pedigree-based linkage, association, and transmission disequilibrium analyses under additive, dominant, recessive, and compound-heterozygous models to find and validate disease genes, supported by mixed-model correction for population structure and relatedness, multi-phenotype joint analysis, and causal inference. That statistical-genetics foundation is what I now scale with AI and agentic systems.

