Who Am I?

Hi, my name is Duo Peng. I am a computational (cell) biologist trailblazing deeper, more scalable exploration of cellular biology.

Currently, I am a senior computational biologist at the Chan Zuckerberg Biohub San Francisco.

My work focuses on developing species-level data-mining and standardized processing tools that help both human scientists and AI agents fully harness the potential of vast public datasets. I also develop computational tools that allow us to (1) design experiments to collect cellular organization/architecture data at scale and (2) effectively analyze the resulting high-dimensional measurements, empowering deeper and more scalable exploration of cellular biology.

Education

University of Georgia, Athens, Georgia, U.S.A.

2012-2017

Dissertation: Developing CRISPR/Cas9 for Genome-Wide Gene Editing in the Human Pathogen Trypanosoma cruzi

University of Georgia, Athens, Georgia, U.S.A.

2012-2016

Dissertation: Frequent Intra-Family Recombination in the Largest Repository of Antigen Variants in The Protozoan Pathogen Trypanosoma cruzi

Wuhan University, Wuhan, Hubei, P.R.China

2006-2010

Thesis: Predicting Trans-splicing by Analysis of RNA-seq Sequencing Data

Selected Publications

See Google Scholar for a complete list

1.  D. Peng*, M. Vangipuram, J. Wong, M.D. Leonetti*. (2024) protoSpaceJAM: an open-source, customizable and web-accessible design platform for CRISPR/Cas insertional knock-in. Nucleic Acids Research (* corresponding authors) [link]

2.  D. Peng, E.G. Kakani, E. Mameli, C. Vidoudez, S.N. Mitchell, G.E. Merrihew, M.J. MacCoss, K. Adams, T.A. Rinvee, W.R. Shaw, F. Catteruccia. (2022) A male steroid controls female sexual behaviour in the malaria mosquito. Nature [link]

3.  D. Peng, R. Tarleton. (2015) EuPaGDT: A Web Tool Tailored to Design CRISPR Guide RNAs for Eukaryotic Pathogens. Microbial Genomics [link]

4.  D. Peng, S.P. Kurup, P.Y. Yao, T.A. Minning, R.L. Tarleton. (2014) CRISPR-Cas9-mediated Single-gene and Gene Family Disruption in Trypanosoma cruzi. mBio [link]

5.  D. Peng, X. Gu, L.J. Xue, J.H. Leebens-Mack, C.J. Tsai. (2014) Bayesian phylogeny of sucrose transporters: Ancient Origins, Differential Expansion and Convergent Evolution in Monocots and Dicots. Frontiers in Plant Science [link]

6.  D.B. Weatherly*, D. Peng*, R.L. Tarleton. (2016) Recombination-driven Generation of the Largest Pathogen Repository of Antigen Variants in the Protozoan Trypanosoma cruzi. BMC Genomics (* equal contribution) [link]

7.  Z. Zuo*, D. Peng*, X. Yin, X. Zhou, H. Cheng, R. Zhou. (2013) Genome-wide Analysis Reveals Origin of Transfer RNA Genes From tRNA Halves. Molecular Biology and Evolution (* equal contribution) [link]

8.  K. Werling, R. Shaw, M. Itoe, K. Westervelt, P. Marcenac, D. Paton, D. Peng, N. Singh, A. Smidler, A. South, A. Deik, L. Mancio-Silva, A. Demas, E. Calvo, S. Bhatia, C. Clish, F. Catteruccia. (2018) Steroid Hormone Function Controls Non-competitive Plasmodium Development in Anopheles. Cell [link]

9.  W. Wang, D. Peng, R.P. Baptista, Y. Li, J.C. Kissinger, R.L. Tarleton. (2021) Strain-specific genome evolution in Trypanosoma cruzi, the agent of Chagas disease. PLOS Pathogens [link]

Work Experience

Senior computational biologist
2024.07-present

1. Build machine learning models to resolve host gene signatures at different resolutions, and predict host response under perturbations.
2. Data-driven understanding of the landscape of cellular responses from multimodal assays.

Bioinformatics data scientist II
2023.01-2024.06

1. Data-driven understanding of subcellular architecture (preprint).
2. Species-wide data mining for paired host and viral gene expression, and machine learning models that resolve host gene signatures at different resolutions.

Bioinformatics data scientist I
2021.11-2022.12

1. ProtoSpaceJAM: Genome-wide CRISPR knock-in design at scale using biologically informed algorithms (paper, webapp).
2. DeepGenotype: Calculate frequencies of protein-level mutations from deep-sequencing reads of CRISPR-edited cells (codebase).

Software developed

1.  Data portal for paper: Global organelle profiling reveals the human proteome’s subcellular landscape and its dynamic remodeling
    online access (hosted by the Chan Zuckerberg Biohub San Francisco)

2.  Web-based App: ProtoSpaceJAM - CRISPR knock-in design at scale
    online access (hosted by the Chan Zuckerberg Biohub San Francisco)
    code base

3.  Web-based App: Eukaryotic Pathogen gRNA design tool (This webserver had 24,907 users, 49,267 visits, 17,972 job requests from 91 countries [Google Analytics, 2021])
    online access (hosted by the University of Georgia)

4.  Automated Image Preprocessing and Malaria-oocyst Recognition Tool
    online access (hosted by AWS cloud)
    code base: Preprocessing, Recognition