E. Coli Whole Cell Model by Covert Lab Condition Updated 2025-07-16
reconstruction/ecoli/flat/condition/nutrient/minimal.tsv contains the nutrients in a minimal environment in which the cell survives:

    "molecule id" "lower bound (units.mmol / units.g / units.h)" "upper bound (units.mmol / units.g / units.h)"
    "ADP[c]" 3.15 3.15
    "PI[c]" 3.15 3.15
    "PROTON[c]" 3.15 3.15
    "GLC[p]" NaN 20
    "OXYGEN-MOLECULE[p]" NaN NaN
    "AMMONIUM[c]" NaN NaN
    "PI[p]" NaN NaN
    "K+[p]" NaN NaN
    "SULFATE[p]" NaN NaN
    "FE+2[p]" NaN NaN
    "CA+2[p]" NaN NaN
    "CL-[p]" NaN NaN
    "CO+2[p]" NaN NaN
    "MG+2[p]" NaN NaN
    "MN+2[p]" NaN NaN
    "NI+2[p]" NaN NaN
    "ZN+2[p]" NaN NaN
    "WATER[p]" NaN NaN
    "CARBON-DIOXIDE[p]" NaN NaN
    "CPD0-1958[p]" NaN NaN
    "L-SELENOCYSTEINE[c]" NaN NaN
    "GLC-D-LACTONE[c]" NaN NaN
    "CYTOSINE[c]" NaN NaN

If we compare that to reconstruction/ecoli/flat/condition/nutrient/minimal_plus_amino_acids.tsv, we see that it adds the 20 amino acids on top of the minimal condition:

    "L-ALPHA-ALANINE[p]" NaN NaN
    "ARG[p]" NaN NaN
    "ASN[p]" NaN NaN
    "L-ASPARTATE[p]" NaN NaN
    "CYS[p]" NaN NaN
    "GLT[p]" NaN NaN
    "GLN[p]" NaN NaN
    "GLY[p]" NaN NaN
    "HIS[p]" NaN NaN
    "ILE[p]" NaN NaN
    "LEU[p]" NaN NaN
    "LYS[p]" NaN NaN
    "MET[p]" NaN NaN
    "PHE[p]" NaN NaN
    "PRO[p]" NaN NaN
    "SER[p]" NaN NaN
    "THR[p]" NaN NaN
    "TRP[p]" NaN NaN
    "TYR[p]" NaN NaN
    "L-SELENOCYSTEINE[c]" NaN NaN
    "VAL[p]" NaN NaN

so we guess that NaN in the upper bound likely means infinite.

We can try to understand the less obvious ones:

ADP: TODO
PI: TODO
PROTON[c]: presumably a measure of pH
GLC[p]: glucose, this can be seen by comparing minimal.tsv with minimal_no_glucose.tsv
AMMONIUM: ammonium. This appears to be the primary source of nitrogen atoms for producing amino acids.
CYTOSINE[c]: hmmm, why is external cytosine needed? Weird.
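The comparison between the two nutrient files can be sketched in a few lines of Python. This is only an illustration: the two strings below are abbreviated copies of the rows quoted above, the files are assumed to be tab-separated with quoted molecule ids, and read_bounds is a hypothetical helper, not something from the model's code base.

```python
import csv
import io
import math

# Abbreviated copies of the rows quoted above (assumption: tab-separated).
MINIMAL = '''"molecule id"\t"lower bound (units.mmol / units.g / units.h)"\t"upper bound (units.mmol / units.g / units.h)"
"ADP[c]"\t3.15\t3.15
"GLC[p]"\tNaN\t20
"OXYGEN-MOLECULE[p]"\tNaN\tNaN
'''
PLUS_AA = MINIMAL + '"L-ALPHA-ALANINE[p]"\tNaN\tNaN\n"ARG[p]"\tNaN\tNaN\n'


def read_bounds(text):
    """Map molecule id -> (lower, upper) bounds; NaN stays a float NaN."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    next(reader)  # skip the header row
    return {row[0]: (float(row[1]), float(row[2])) for row in reader if row}


minimal = read_bounds(MINIMAL)
plus_aa = read_bounds(PLUS_AA)

# Molecules present only in the amino acid condition.
added = sorted(set(plus_aa) - set(minimal))
# Molecules whose upper bound is NaN, which we guess means "unlimited".
unbounded = sorted(m for m, (_, upper) in minimal.items() if math.isnan(upper))

print("added:", added)
print("NaN upper bound:", unbounded)
```

To run this against the real files, replace the strings with the contents of minimal.tsv and minimal_plus_amino_acids.tsv read from disk.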
reconstruction/ecoli/flat/condition/timeseries/ contains sequences of conditions for each time. For example:

reconstruction/ecoli/flat/condition/timeseries/000000_basal.tsv contains:

    "time (units.s)" "nutrients"
    0 "minimal"

which means just using reconstruction/ecoli/flat/condition/nutrient/minimal.tsv until infinity. That is the default one used by runSim.py, as can be seen from ./out/manual/wildtype_000000/000000/generation_000000/000000/simOut/Environment/attributes/nutrientTimeSeriesLabel which contains just 000000_basal.

reconstruction/ecoli/flat/condition/timeseries/000001_cut_glucose.tsv is more interesting and contains:

    "time (units.s)" "nutrients"
    0 "minimal"
    1200 "minimal_no_glucose"

so we see that this will shift the conditions half-way, at 1200 s, to a condition that will eventually kill the bacteria because it will run out of glucose and thus energy!
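Our reading of a timeseries file is that each row takes effect at its time and stays active until the next row. A minimal sketch of that lookup follows; nutrients_at is a hypothetical helper written for this page, and the step-function interpretation is an assumption that we have not checked against the model's code.

```python
import bisect

# Rows of 000001_cut_glucose.tsv as quoted above: (time in seconds, nutrients label).
CUT_GLUCOSE = [(0, "minimal"), (1200, "minimal_no_glucose")]


def nutrients_at(timeseries, t):
    """Return the nutrients label active at time t (seconds).

    Assumes rows are sorted by time and each row holds until the next one.
    """
    times = [row[0] for row in timeseries]
    i = bisect.bisect_right(times, t) - 1  # last row with time <= t
    if i < 0:
        raise ValueError("time precedes the first timeseries entry")
    return timeseries[i][1]


for t in (0, 600, 1200, 3000):
    print(t, nutrients_at(CUT_GLUCOSE, t))
```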
Timeseries can be selected with --variant nutrientTimeSeries X Y, see also: run variants.

We can use that variant with:

    VARIANT="condition" FIRST_VARIANT_INDEX=1 LAST_VARIANT_INDEX=1 python runscripts/manual/runSim.py

reconstruction/ecoli/flat/condition/condition_defs.tsv contains lines of form:

    "condition" "nutrients" "genotype perturbations" "doubling time (units.min)" "active TFs"
    "basal" "minimal" {} 44.0 []
    "no_oxygen" "minimal_minus_oxygen" {} 100.0 []
    "with_aa" "minimal_plus_amino_acids" {} 25.0 ["CPLX-125", "MONOMER0-162", "CPLX0-7671", "CPLX0-228", "MONOMER0-155"]

condition: refers to entries in reconstruction/ecoli/flat/condition/condition_defs.tsv
nutrients: refers to entries under reconstruction/ecoli/flat/condition/nutrient/, e.g. reconstruction/ecoli/flat/condition/nutrient/minimal.tsv or reconstruction/ecoli/flat/condition/nutrient/minimal_plus_amino_acids.tsv
genotype perturbations: there aren't any in the file, but this suggests that genotype modifications can also be incorporated here
doubling time: TODO experimental data? Because this should be a simulation output, right? Or do they cheat and fix doubling by time?
active TFs: this suggests that they are cheating on transcription factors here, as those would ideally be functions of other more basic inputs
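The rows of condition_defs.tsv quoted above can be decoded cell by cell, since every cell shown happens to be valid JSON (a quoted string, a number, {} or a list). The sketch below relies on that observation and on the file being tab-separated. Both are assumptions based only on the excerpt, and parse_condition_defs is a hypothetical helper.

```python
import json

# The rows quoted above, joined with tabs (assumption about the real delimiter).
ROWS = "\n".join([
    '"condition"\t"nutrients"\t"genotype perturbations"\t"doubling time (units.min)"\t"active TFs"',
    '"basal"\t"minimal"\t{}\t44.0\t[]',
    '"no_oxygen"\t"minimal_minus_oxygen"\t{}\t100.0\t[]',
    '"with_aa"\t"minimal_plus_amino_acids"\t{}\t25.0\t'
    '["CPLX-125", "MONOMER0-162", "CPLX0-7671", "CPLX0-228", "MONOMER0-155"]',
])


def parse_condition_defs(text):
    """Return one dict per data row, decoding each cell as JSON."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [json.loads(cell) for cell in lines[0].split("\t")]
    rows = []
    for line in lines[1:]:
        cells = [json.loads(cell) for cell in line.split("\t")]
        rows.append(dict(zip(header, cells)))
    return rows


conditions = {row["condition"]: row for row in parse_condition_defs(ROWS)}
for name, row in conditions.items():
    print(name, row["nutrients"], row["doubling time (units.min)"], len(row["active TFs"]))
```

Keying the result by the condition column makes it easy to follow the nutrients value to the matching file under reconstruction/ecoli/flat/condition/nutrient/.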
E. Coli Whole Cell Model by Covert Lab Output overview Updated 2025-07-16
Run output is placed under out/.

Some of the output data is stored as .cpickle files. To observe those files, you need the original Python classes, and therefore you have to be inside Docker, from the host it won't work.

We can list all the plots that have been produced under out/ with:

    find -name '*.png'

Plots are also available in SVG and PDF formats.

The output directory has a hierarchical structure of type:

    ./out/manual/wildtype_000000/000000/generation_000000/000000/

where:

wildtype_000000: variant conditions. wildtype is a human readable label, and 000000 is an index amongst the possible wildtype conditions. For example, we can have different simulations with different nutrients, or different DNA sequences. An example of this is shown at run variants.
000000: initial random seed for the initial cell, likely fed to NumPy's np.random.seed
generation_000000: this will increase with generations if we simulate multiple cells, which is supported by the model
000000: this will presumably contain the cell index within a generation
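The directory hierarchy described above can be decoded with a regular expression. The sketch below is only an illustration of our reading of the layout: CellPath and parse_cell_path are hypothetical names, and the pattern assumes six-digit indices and a variant label without underscores.

```python
import re
from typing import NamedTuple


class CellPath(NamedTuple):
    variant_label: str
    variant_index: int
    seed: int
    generation: int
    cell: int  # presumably the cell index within the generation


# <label>_<variant index>/<seed>/generation_<generation>/<cell>
PATTERN = re.compile(
    r"(?P<label>[^/_]+)_(?P<variant>\d{6})"
    r"/(?P<seed>\d{6})"
    r"/generation_(?P<generation>\d{6})"
    r"/(?P<cell>\d{6})"
)


def parse_cell_path(path):
    """Extract the hierarchy levels from a single-cell output path."""
    match = PATTERN.search(path)
    if match is None:
        raise ValueError(f"not a single-cell output path: {path}")
    return CellPath(
        match["label"],
        int(match["variant"]),
        int(match["seed"]),
        int(match["generation"]),
        int(match["cell"]),
    )


print(parse_cell_path("./out/manual/wildtype_000000/000000/generation_000000/000000/simOut"))
```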
We also understand that some of the top level directories contain summaries over all cells, e.g. the massFractionSummary.pdf plot exists at several levels of the hierarchy:

    ./out/manual/plotOut/massFractionSummary.pdf
    ./out/manual/wildtype_000000/plotOut/massFractionSummary.pdf
    ./out/manual/wildtype_000000/000000/plotOut/massFractionSummary.pdf
    ./out/manual/wildtype_000000/000000/generation_000000/000000/plotOut/massFractionSummary.pdf

Each of those four levels of plotOut is generated by a different one of the analysis scripts:

./out/manual/plotOut: generated by python runscripts/manual/analysisVariant.py. Contains comparisons of different variant conditions. We confirm this by looking at the results of run variants.
./out/manual/wildtype_000000/plotOut: generated by python runscripts/manual/analysisCohort.py --variant_index 0. TODO not sure how to differentiate between two different labels e.g. wildtype_000000 and somethingElse_000000. If -v is not given, it just picks the first one alphabetically. TODO not sure how to automatically generate all of those plots without inspecting the directories.
./out/manual/wildtype_000000/000000/plotOut: generated by python runscripts/manual/analysisMultigen.py --variant_index 0 --seed 0
./out/manual/wildtype_000000/000000/generation_000000/000000/plotOut: generated by python runscripts/manual/analysisSingle.py --variant_index 0 --seed 0 --generation 0 --daughter 0. Contains information about a single specific cell.
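The mapping from plotOut level to analysis command can be written down as a small helper that only builds the command strings listed above for given indices. This is a sketch: analysis_commands is a hypothetical function, it uses only the scripts and flags quoted above, and it does not run anything or solve the TODO about discovering variants automatically.

```python
def analysis_commands(label="wildtype", variant_index=0, seed=0,
                      generation=0, daughter=0, out="./out/manual"):
    """Return (plotOut directory, command) pairs, from broadest to narrowest."""
    variant_dir = f"{out}/{label}_{variant_index:06d}"
    seed_dir = f"{variant_dir}/{seed:06d}"
    cell_dir = f"{seed_dir}/generation_{generation:06d}/{daughter:06d}"
    base = "python runscripts/manual"
    v = f"--variant_index {variant_index}"
    return [
        (f"{out}/plotOut", f"{base}/analysisVariant.py"),
        (f"{variant_dir}/plotOut", f"{base}/analysisCohort.py {v}"),
        (f"{seed_dir}/plotOut", f"{base}/analysisMultigen.py {v} --seed {seed}"),
        (f"{cell_dir}/plotOut",
         f"{base}/analysisSingle.py {v} --seed {seed} "
         f"--generation {generation} --daughter {daughter}"),
    ]


for directory, command in analysis_commands():
    print(directory)
    print("    " + command)
```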