Source: cirosantilli/e-coli-whole-cell-model-by-covert-lab

= E. Coli Whole Cell Model by Covert Lab
{c}
{numbered}
{scope}
{tag=articles}
{title2=CovertLab/WholeCellEcoliRelease}

https://github.com/CovertLab/WholeCellEcoliRelease is a <whole cell simulation> model created by <Covert Lab> and other collaborators.

The project is written in <Python>, hurray!

But according to the <README>, it seems that they use a <code drop> model with on-request access to master. <Ciro Santilli> asked at https://github.com/CovertLab/WholeCellEcoliRelease/discussions/23[rationale on GitHub discussion], and they confirmed as expected that it is to:
* prevent their <academic publishing>[publication] ideas from being stolen. Who would steal publication ideas with public proof in an issue tracker without crediting original authors? <Academia is broken>. Academia should be the most open form of knowledge sharing. But instead we get this silly competition for publication points.
* prevent noise from non-collaborators. But they only get like 2 issues a year on such a meganiche subject... Did you know that you can ignore people, and even block them if they are particularly annoying? Much more likely is that no one will ever hear about your project and that it will die with its last graduate student slave.

The project is a followup to the earlier <M. genitalium whole cell model by Covert lab> which modelled <Mycoplasma genitalium>. <E. Coli> has 8x more genes (500 vs 4k), but it is the undisputed <bacterial> <model organism> and as such has been studied much more thoroughly. It also reproduces faster than Mycoplasma (20 minutes vs a few hours), which is a huge advantage for validation/exploratory <experiments>.

The project depends on the <proprietary software>[proprietary] <optimization software> <CPLEX>, which is <freeware> for students. It is not clear what exactly it is used for, but judging from the comment in the `requirements.txt` the dependency is only partial.

This project makes <Ciro Santilli> think of <E. Coli> as an <optimization problem>. Given some external nutrient/temperature condition, which <DNA> sequence makes the cell grow the fastest? Balancing <metabolites> feels like designing a <Factorio> speedrun.

There is one major thing missing in the current model: <promoters>/<transcription factor> interactions are not modelled due to lack/low quality of experimental data: https://github.com/CovertLab/WholeCellEcoliRelease/issues/21[]. They just have a magic direct "<transcription factor> to <gene>" relationship, encoded at https://github.com/CovertLab/WholeCellEcoliRelease/blob/7e4cc9e57de76752df0f4e32eca95fb653ea64e4/reconstruction/ecoli/flat/foldChanges.tsv[reconstruction/ecoli/flat/foldChanges.tsv] in terms of the type "if this is present, such protein is expressed 10x more". <Transcription units> do not appear to be implemented at all.
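To make the "if this is present, such protein is expressed 10x more" idea more concrete, here is a minimal toy sketch of such a direct transcription factor to gene table. Everything in it is an assumption made for illustration: the column names (`tf`, `target`, `log2_fold_change`), the rows, the log2 encoding, and the rule of multiplying the effects of several active transcription factors are all hypothetical, and are not taken from the actual `foldChanges.tsv` or from how the model consumes it.

```python
import csv
import io

# Hypothetical toy table: the real file's columns and values may differ.
TOY_TSV = (
    "tf\ttarget\tlog2_fold_change\n"
    "tfA\tgeneX\t3.32\n"   # roughly a 10x increase
    "tfA\tgeneY\t-1.0\n"   # a 2x decrease
    "tfB\tgeneX\t1.0\n"    # a 2x increase
)


def load_fold_changes(text):
    """Parse the TSV text into a {(tf, target): linear multiplier} dict."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        (row["tf"], row["target"]): 2.0 ** float(row["log2_fold_change"])
        for row in reader
    }


def expression_multiplier(fold_changes, target, active_tfs):
    """Combine the fold changes of every active TF that acts on the target.

    Multiplying independent effects is a simplifying assumption of this
    sketch, not a statement about what the model does.
    """
    multiplier = 1.0
    for (tf, tgt), fold_change in fold_changes.items():
        if tgt == target and tf in active_tfs:
            multiplier *= fold_change
    return multiplier


fold_changes = load_fold_changes(TOY_TSV)
print(expression_multiplier(fold_changes, "geneX", {"tfA"}))
```

The point is only that there is no promoter or binding site anywhere in this picture: the presence of a transcription factor directly scales the expression of its target gene.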

Everything in this section refers to version https://github.com/CovertLab/WholeCellEcoliRelease/tree/7e4cc9e57de76752df0f4e32eca95fb653ea64e4[7e4cc9e57de76752df0f4e32eca95fb653ea64e4], the code drop from November 2020, and was tested on <Ubuntu> 21.04 with a docker install of `docker.pkg.github.com/covertlab/wholecellecolirelease/wcm-full` with image id 502c3e604265, unless otherwise noted.