Notable examples:
Only present in Gram-negative bacteria.
- www.cell.com/cell/fulltext/S0092-8674(15)00568-1 2015. Using Genome-scale Models to Predict Biological Capabilities. Edward J. O'Brien, Jonathan M. Monk, Bernhard O. Palsson.
- www.quora.com/What-are-some-good-books-on-Escherichia-Coli-E-Coli
Size: 1-2 micrometers long and about 0.25 micrometer in diameter, so:
2 * 0.5 * 0.5 * 10e-18 and thus 0.5 micrometer square.Reference strain: E. Coli K-12 MG1655.
Genome:
- 4k genes
- 5 Mbps
- www.ncbi.nlm.nih.gov/genome/167
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.fna.gzwget -O NC_000913.3.fasta 'https://www.ncbi.nlm.nih.gov/search/api/sequence/NC_000913.3/?report=fasta'
Synthesis project: www.sciencemag.org/news/2016/08/biologists-are-close-reinventing-genetic-code-life
Omics modeling: www.ncbi.nlm.nih.gov/pmc/articles/PMC5611438/ Tools for Genomic and Transcriptomic Analysis of Microbes at Single-Cell Level Zixi Chen, Lei Chen, Weiwen Zhang.
20 minutes in optimal conditions, with a crazy multiple start sites mechanism: E. Coli starts DNA replication before the previous one finished.
Otherwise, naively, would take 60-90 minutes just to replicate and segregate the full DNA otherwise. So it starts copying multiple times.
- biology.stackexchange.com/questions/30080/how-can-e-coli-proliferate-so-rapidly
- stochasticscientist.blogspot.co.uk/2012/02/how-e-coli-grows-so-fast.html
- www.ncbi.nlm.nih.gov/pmc/articles/PMC2063475/ Organization of sister origins and replisomes during multifork DNA replication in Escherichia coli by Fossum et al (2007)
E. Coli starts DNA replication before the previous one finished by
Ciro Santilli 37 Updated 2025-07-16
The conventional starting point is not at the E. Coli K-12 MG1655 origin of replication.
biocyc.org/ECOLI/NEW-IMAGE?type=EXTRAGENIC-SITE&object=G0-10506 explains:If it is a bit hard to understand what they mean by "origin of transfer" though, as that term is usually associated with the origin of transfer of bacterial conjugation.
This site is the origin of replication of the E. coli chromosome. It contains the binding sites for DnaA, which is critical for initiation of replication. Replication proceeds bidirectionally. For historical reasons, the numbering of E. coli's circular chromosome does not start at the origin of replication, but at the origin of transfer during conjugation.
github.com/CovertLab/WholeCellEcoliRelease is a whole cell simulation model created by Covert Lab and other collaborators.
The project is written in Python, hurray!
But according to te README, it seems to be the use a code drop model with on-request access to master. Ciro Santilli asked at rationale on GitHub discussion, and they confirmed as expected that it is to:
- to prevent their publication ideas from being stolen. Who would steal publication ideas with public proof in an issue tracker without crediting original authors? Academia is broken. Academia should be the most open form of knowledge sharing. But instead we get this silly competition for publication points.
- to prevent noise from non-collaborators. But they only get like 2 issues as year on such a meganiche subject... Did you know that you can ignore people, and even block them if they are particularly annoying? Much more likely is that no one will every hear about your project and that it will die with its last graduate student slave.
The project is a followup to the earlier M. genitalium whole cell model by Covert lab which modelled Mycoplasma genitalium. E. Coli has 8x more genes (500 vs 4k), but it the undisputed bacterial model organism and as such has been studied much more thoroughly. It also reproduces faster than Mycoplasma (20 minutes vs a few hours), which is a huge advantages for validation/exploratory experiments.
The project has a partial dependency on the proprietary optimization software CPLEX which is freeware, for students, not sure what it is used for exactly, from the comment in the
requirements.txt the dependency is only partial.This project makes Ciro Santilli think of the E. Coli as an optimization problem. Given such external nutrient/temperature condition, which DNA sequence makes the cell grow the fastest? Balancing metabolites feels like designing a Factorio speedrun.
There is one major thing missing thing in the current model: promoters/transcription factor interactions are not modelled due to lack/low quality of experimental data: github.com/CovertLab/WholeCellEcoliRelease/issues/21. They just have a magic direct "transcription factor to gene" relationship, encoded at reconstruction/ecoli/flat/foldChanges.tsv in terms of type "if this is present, such protein is expressed 10x more". Transcription units are not implemented at all it appears.
Everything in this section refers to version 7e4cc9e57de76752df0f4e32eca95fb653ea64e4, the code drop from November 2020, and was tested on Ubuntu 21.04 with a docker install of
docker.pkg.github.com/covertlab/wholecellecolirelease/wcm-full with image id 502c3e604265, unless otherwise noted. E. Coli Whole Cell Model by Covert Lab Install and first run by
Ciro Santilli 37 Updated 2025-07-16
At 7e4cc9e57de76752df0f4e32eca95fb653ea64e4 you basically need to use the Docker image on Ubuntu 21.04 due to pip breaking changes... (not their fault). Perhaps pyenv would solve things, but who has the patience for that?!?!
The Docker setup from README does just work. The image download is a bit tedius, as it requires you to create a GitHub API key as described in the README, but there must be reasons for that.
Once the image is downloaded, you really want to run is from the root of the source tree:This mounts the host source under The meaning of each of the analysis commands is described at Section "Output overview".
sudo docker run --name=wcm -it -v "$(pwd):/wcEcoli" docker.pkg.github.com/covertlab/wholecellecolirelease/wcm-full/wcEcoli, so you can easily edit and view output images from your host. Once inside Docker we can compile, run the simulation, and analyze results with:make clean compile &&
python runscripts/manual/runFitter.py &&
python runscripts/manual/runSim.py &&
python runscripts/manual/analysisVariant.py &&
python runscripts/manual/analysisCohort.py &&
python runscripts/manual/analysisMultigen.py &&
python runscripts/manual/analysisSingle.pyAs a Docker refresher, after you stop the container, e.g. by restarting your computer or running
sudo docker stop wcm, you can get back into it with:sudo docker start wcm
sudo docker run -it wcm bashrunscripts/manual/runFitter.py takes about 15 minutes, and it generates files such as reconstruction/ecoli/dataclasses/process/two_component_system.py (related) which is required to run the simulation, it is basically a part of the build.runSim.py does the main simulation, progress output contains lines of type:Time (s) Dry mass Dry mass Protein RNA Small mol Expected
(fg) fold change fold change fold change fold change fold change
======== ======== =========== =========== =========== =========== ===========
0.00 403.09 1.000 1.000 1.000 1.000 1.000
0.20 403.18 1.000 1.000 1.000 1.000 1.000 2569.18 783.09 1.943 1.910 2.005 1.950 1.963
Simulation finished:
- Length: 0:42:49
- Runtime: 0:09:13Run output is placed under
out/:Some of the output data is stored as
.cpickle files. To observe those files, you need the original Python classes, and therefore you have to be inside Docker, from the host it won't work.We can list all the plots that have been produced under Plots are also available in SVG and PDF formats, e.g.:
out/ withfind -name '*.png'The output directory has a hierarchical structure of type:where:
./out/manual/wildtype_000000/000000/generation_000000/000000/wildtype_000000: variant conditions.wildtypeis a human readable label, and000000is an index amongst the possiblewildtypeconditions. For example, we can have different simulations with different nutrients, or different DNA sequences. An example of this is shown at run variants.000000: initial random seed for the initial cell, likely fed to NumPy'snp.random.seedgenereation_000000: this will increase with generations if we simulate multiple cells, which is supported by the model000000: this will presumably contain the cell index within a generation
We also understand that some of the top level directories contain summaries over all cells, e.g. the
massFractionSummary.pdf plot exists at several levels of the hierarchy:./out/manual/plotOut/massFractionSummary.pdf
./out/manual/wildtype_000000/plotOut/massFractionSummary.pdf
./out/manual/wildtype_000000/000000/plotOut/massFractionSummary.pdf
./out/manual/wildtype_000000/000000/generation_000000/000000/plotOut/massFractionSummary.pdfEach of thoes four levels of
plotOut is generated by a different one of the analysis scripts:./out/manual/plotOut: generated bypython runscripts/manual/analysisVariant.py. Contains comparisons of different variant conditions. We confirm this by looking at the results of run variants../out/manual/wildtype_000000/plotOut: generated bypython runscripts/manual/analysisCohort.py --variant_index 0. TODO not sure how to differentiate between two different labels e.g.wildtype_000000andsomethingElse_000000. If-vis not given, a it just picks the first one alphabetically. TODO not sure how to automatically generate all of those plots without inspecting the directories../out/manual/wildtype_000000/000000/plotOut: generated bypython runscripts/manual/analysisMultigen.py --variant_index 0 --seed 0./out/manual/wildtype_000000/000000/generation_000000/000000/plotOut: generated bypython runscripts/manual/analysisSingle.py --variant_index 0 --seed 0 --generation 0 --daughter 0. Contains information about a single specific cell.
It would be boring if we could only simulate the same condition all the time, so let's have a look at the different boundary conditions that we can apply to the cell!
We are able to alter things like the composition of the external medium, and the genome of the bacteria, which will make the bacteria behave differently.
The variant selection is a bit cumbersome as we have to use indexes instead of names, but one you know what you are doing, it is fine.
Of course, genetic modification is limited only to experimentally known protein interactions due to the intractability of computational protein folding and computational chemistry in general, solving those would bsai.
The default run variant, if you don't pass any options, just has the minimal growth conditions set. What this means can be seen at condition.
Notably, this implies a growth medium that includes glucose and salt. It also includes oxygen, which is not strictly required, but greatly benefits cell growth, and is of course easier to have than not have as it is part of the atmosphere!
But the medium does not include amino acids, which the bacteria will have to produce by itself.
Pinned article: Introduction to the OurBigBook Project
Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
Intro to OurBigBook
. Source. We have two killer features:
- topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculusArticles of different users are sorted by upvote within each article page. This feature is a bit like:
- a Wikipedia where each user can have their own version of each article
- a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.Figure 1. Screenshot of the "Derivative" topic page. View it live at: ourbigbook.com/go/topic/derivativeVideo 2. OurBigBook Web topics demo. Source. - local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
- to OurBigBook.com to get awesome multi-user features like topics and likes
- as HTML files to a static website, which you can host yourself for free on many external providers like GitHub Pages, and remain in full control
Figure 3. Visual Studio Code extension installation.Figure 4. Visual Studio Code extension tree navigation.Figure 5. Web editor. You can also edit articles on the Web editor without installing anything locally.Video 3. Edit locally and publish demo. Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.Video 4. OurBigBook Visual Studio Code extension editing and navigation demo. Source. - Infinitely deep tables of contents:
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact





