📒 Goldberg 2017

Emerging whole-cell modeling principles and methods 1

Whole-cell (WC) computational models aim to predict cellular phenotypes from genotype and the environment by representing the function of each gene, gene product, and metabolite.

We propose that WC models aim to represent all of the chemical reactions in a cell and all of the physical processes that influence their rates. This requires representing:

  • the sequence of each chromosome, RNA, and protein
  • the structure of each molecule
  • the subcellular organization of cells into organelles and microdomains
  • the participants and effect of each molecular interaction
  • the kinetic parameters of each interaction
  • the concentration of each species in each organelle and microdomain
  • the concentration of each species in the extracellular environment
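The items above can be read as a data schema. The following is a minimal sketch of such a schema, not the paper's own representation: every class name, field name, and example value here is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Species:
    """One molecular species (hypothetical schema)."""
    id: str
    sequence: Optional[str] = None   # for chromosomes, RNAs, and proteins
    structure: Optional[str] = None  # e.g. a formula or structure identifier


@dataclass
class Interaction:
    """One molecular interaction with its participants and kinetics."""
    id: str
    participants: Dict[str, int]     # species id -> net stoichiometry
    compartment: str                 # organelle or microdomain
    kinetic_parameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class WholeCellModelInputs:
    """Container for the kinds of information listed above."""
    species: Dict[str, Species] = field(default_factory=dict)
    compartments: List[str] = field(default_factory=list)
    interactions: List[Interaction] = field(default_factory=list)
    # (species id, compartment) -> concentration, in arbitrary units
    concentrations: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def missing_concentrations(self) -> List[Tuple[str, str]]:
        """List (species, compartment) pairs used by an interaction
        but lacking a concentration entry."""
        missing = []
        for rxn in self.interactions:
            for sp in rxn.participants:
                key = (sp, rxn.compartment)
                if key not in self.concentrations and key not in missing:
                    missing.append(key)
        return missing


# Made-up example: one gene's mRNA and protein in the cytosol.
inputs = WholeCellModelInputs(compartments=["cytosol", "extracellular"])
inputs.species["mRNA_geneA"] = Species("mRNA_geneA", sequence="AUGGCUUAA")
inputs.species["prot_geneA"] = Species("prot_geneA", sequence="MA")
inputs.interactions.append(
    Interaction(
        id="translation_geneA",
        participants={"mRNA_geneA": 0, "prot_geneA": 1},  # mRNA is not consumed
        compartment="cytosol",
        kinetic_parameters={"k_translation": 0.1},  # made-up value
    )
)
inputs.concentrations[("mRNA_geneA", "cytosol")] = 2.0
print(inputs.missing_concentrations())
```

A check like `missing_concentrations` hints at why data aggregation is a bottleneck: every species in every compartment needs a value before the model can be simulated.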

The things WC models aim to predict:

  • stochastic dynamics of each molecular interaction
  • temporal dynamics of the concentration of each species
  • spatial dynamics of the concentration of each species in each organelle and microdomain
  • complex phenotypes such as cell shape, growth rate, motility, and fate, as well as the variation in the behavior of single cells within clonal populations

Together, this would enable WC models to generate predictions that could be embedded into higher-order multiscale models.
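To make "stochastic dynamics" and "variation in the behavior of single cells within clonal populations" concrete, here is a minimal sketch of the Gillespie stochastic simulation algorithm for a single species that is synthesized at a constant rate and degraded in proportion to its copy number. The model, the rate constants, and the function name are assumptions for illustration and do not come from the paper.

```python
import random


def gillespie_birth_death(k_syn, k_deg, n0, t_end, rng):
    """Simulate synthesis (rate k_syn) and degradation (rate k_deg * n)
    of one species with the Gillespie direct method."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_syn = k_syn          # propensity of synthesis
        a_deg = k_deg * n      # propensity of degradation (zero when n == 0)
        a_total = a_syn + a_deg
        if a_total == 0:
            break
        t += rng.expovariate(a_total)   # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * a_total < a_syn:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


# Simulate 200 "cells" that share parameters but differ in random seed.
final_counts = []
for seed in range(200):
    _, counts = gillespie_birth_death(
        k_syn=10.0, k_deg=1.0, n0=0, t_end=10.0, rng=random.Random(seed)
    )
    final_counts.append(counts[-1])

mean_count = sum(final_counts) / len(final_counts)
print("mean copy number:", mean_count)
print("range across cells:", min(final_counts), "to", max(final_counts))
```

Identical parameters still give different copy numbers per cell, which is the kind of cell-to-cell variation the list above refers to.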


WC models require a very large number of parameters.

Measurement methods

Single-cell and genomic measurement.

Data repositories

A list of public repositories (in Excel file).

Prediction tools

However, many current prediction tools lack sufficient accuracy for WC modeling.

Published models

A list of public model repos (in Excel file).

Emerging methods and tools, Data aggregation and organization

  • (a) Data should be aggregated from thousands of publications, repositories, and prediction tools and organized into a pathway/genome database (PGDB), e.g. Pathway Tools or WholeCellKB.
  • (b) Models should be designed, calibrated, and validated from PGDBs and described using rules.
  • (c) Models should be simulated using parallel, network-free, multi-algorithmic simulators and their results should be stored in a database.
  • (d) Simulation results should be visualized and analyzed.
  • (e) Results should be validated by comparison to experimental measurements. Importantly, all of these steps should be collaborative.
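Step (c) mentions multi-algorithmic simulation. One way to picture it is two submodels that use different algorithms and exchange values through a shared state at each time step. The sketch below assumes a toy cell with a deterministic "metabolism" submodel (forward Euler) and a stochastic "transcription" submodel (Gillespie events within the step). All names and rate constants are made up, and the sequential coupling shown here is only one simple option; it is neither parallel nor network-free.

```python
import random


def metabolism_step(state, dt, k_supply=50.0):
    """Deterministic submodel: forward-Euler step of d[ntp]/dt = k_supply."""
    state["ntp"] += k_supply * dt


def transcription_step(state, dt, rng, k_tx=0.05, ntp_per_rna=10):
    """Stochastic submodel: Gillespie events within one time step.
    Each event consumes ntp_per_rna NTPs and produces one RNA."""
    t = 0.0
    while state["ntp"] >= ntp_per_rna:
        propensity = k_tx * state["ntp"]
        t += rng.expovariate(propensity)
        if t >= dt:
            break
        state["ntp"] -= ntp_per_rna
        state["rna"] += 1


def simulate(t_end, dt, seed=0):
    """Advance both submodels in turn over a shared state."""
    rng = random.Random(seed)
    state = {"ntp": 100.0, "rna": 0}
    history = []
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        metabolism_step(state, dt)          # algorithm 1: ODE integration
        transcription_step(state, dt, rng)  # algorithm 2: stochastic simulation
        history.append(((i + 1) * dt, dict(state)))
    return history


history = simulate(t_end=10.0, dt=0.1)
final_time, final_state = history[-1]
print(final_time, final_state)
```

The shared `state` dictionary stands in for the simulator's global cell state. Because NTPs only enter through metabolism and leave through transcription, `ntp + 10 * rna` should equal the initial pool plus the total supplied, which gives a quick sanity check on the coupling.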

Scalable model design

Model languages

Simulation tools


  • saCeSS: distributed calibration of large biochemical models
  • surrogate models to calibrate large models
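The idea behind surrogate-based calibration is to run the expensive model only a few times, fit a cheap approximation to those runs, and estimate parameters against the approximation. The sketch below is a toy illustration under stated assumptions: `expensive_model` is a closed-form stand-in for a costly simulation, the "observation" is synthetic, and the surrogate is a quadratic through three samples. It is not how saCeSS or any published WC calibration works.

```python
import math


def expensive_model(k_deg, k_syn=10.0, t=5.0):
    """Stand-in for a costly simulation: mean copy number at time t of a
    synthesis/degradation process that starts from zero copies."""
    return (k_syn / k_deg) * (1.0 - math.exp(-k_deg * t))


def fit_quadratic_surrogate(xs, ys):
    """Return the quadratic through three (x, y) samples (Lagrange form)."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys

    def surrogate(x):
        l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
        l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
        l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        return y0 * l0 + y1 * l1 + y2 * l2

    return surrogate


def calibrate(surrogate, observed, lo, hi, n_grid=1501):
    """Grid search for the parameter whose surrogate output best matches
    the observation (squared error)."""
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    return min(grid, key=lambda k: (surrogate(k) - observed) ** 2)


# Synthetic "measurement" generated with a known parameter value.
true_k_deg = 1.2
observed = expensive_model(true_k_deg)

# Only three expensive runs are used to build the surrogate.
samples = [0.5, 1.0, 2.0]
surrogate = fit_quadratic_surrogate(samples, [expensive_model(k) for k in samples])

estimate = calibrate(surrogate, observed, lo=0.5, hi=2.0)
print("true k_deg:", true_k_deg, "estimated k_deg:", round(estimate, 3))
```

The estimate is close to the true value but not exact, because the quadratic only approximates the model between samples. In practice the surrogate would be refined with additional runs near the current estimate.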


Simulation results analysis

  • Virtual Cell
  • WholeCellSimDB and WholeCellViz


  • Experimental measurement: Measuring kinetic parameters at the interactome scale.
  • Prediction tools: Accurately predict the molecular effects of insertions, deletions, and structural variants.
  • Data aggregation
  • Scalable, data-driven model design
  • Rule-based model representation: No existing language supports all of the biological processes that WC models must represent.
  • Scalable multi-algorithmic simulation
  • Calibration and verification
  • Simulation analysis
  • Collaboration
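As a reminder of what "rule-based model representation" means in the list above: one rule describes a whole class of reactions, and the individual reactions are generated from it. The following is a minimal sketch with a hypothetical dictionary format and made-up gene names; it does not use the syntax of any existing rule-based language.

```python
def expand_translation_rule(genes, k_translation):
    """Expand one rule, "each mRNA is translated into its protein",
    into one reaction per gene."""
    reactions = []
    for gene in genes:
        reactions.append(
            {
                "id": f"translation_{gene}",
                "reactants": [f"mRNA_{gene}"],
                # The mRNA is regenerated; the protein is the net product.
                "products": [f"mRNA_{gene}", f"protein_{gene}"],
                "rate_constant": k_translation,  # shared by all instances
            }
        )
    return reactions


reactions = expand_translation_rule(["geneA", "geneB", "geneC"], k_translation=0.1)
for rxn in reactions:
    print(rxn["id"], rxn["reactants"], "->", rxn["products"])
```

One rule plus a gene list replaces a hand-written reaction per gene, which is what makes this style attractive at genome scale.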


  • Whole-cell models predict phenotype from genotype by representing each gene function
  • Whole-cell models could transform bioscience, bioengineering, and medicine
  • There are many challenges to achieve whole-cell models
  • New measurement and modeling technologies are rapidly enabling whole-cell modeling
  • Ongoing efforts to build scalable modeling tools will accelerate whole-cell modeling

  1. Goldberg AP, Szigeti B, Chew YH, Sekar JA, Roth YD, Karr JR. Emerging whole-cell modeling principles and methods. Curr Opin Biotechnol. 2018;51:97–102. PMC5997489