Ciro Santilli @cirosantilli 37


DNA sequencing milestone Updated 2025-07-16

Most of these are going to be Whole-genome sequencing of some model organism:

1977 by Sanger et al.: 5 kbp of the single-stranded bacteriophage ΦX174 using Sanger's radiolabelling method
1981 by Sanger et al.: 17 kbp of human mitochondrial DNA via Sanger method, known as the Cambridge Reference Sequence
2003: Human Genome Project (3 Gbp)

en.wikipedia.org/wiki/Whole_genome_sequencing#History lists them all. Basically the big "firsts" all happened in the 1990s and early 2000s.


E. Coli Whole Cell Model by Covert Lab Updated 2025-07-16


github.com/CovertLab/WholeCellEcoliRelease is a whole cell simulation model created by Covert Lab and other collaborators.

The project is written in Python, hurray!

But according to the README, it seems to use a code drop model with on-request access to master. Ciro Santilli asked about the rationale in a GitHub discussion, and they confirmed, as expected, that it is:

to prevent their publication ideas from being stolen. Who would steal publication ideas with public proof in an issue tracker without crediting original authors? Academia is broken. Academia should be the most open form of knowledge sharing. But instead we get this silly competition for publication points.
to prevent noise from non-collaborators. But they only get like 2 issues a year on such a meganiche subject... Did you know that you can ignore people, and even block them if they are particularly annoying? Much more likely is that no one will ever hear about your project and that it will die with its last graduate student slave.

The project is a followup to the earlier M. genitalium whole cell model by Covert lab, which modelled Mycoplasma genitalium. E. Coli has 8x more genes (500 vs 4k), but it is the undisputed bacterial model organism and as such has been studied much more thoroughly. It also reproduces faster than Mycoplasma (20 minutes vs a few hours), which is a huge advantage for validation/exploratory experiments.

The project has a partial dependency on the proprietary optimization software CPLEX, which is free for students. It is not clear what exactly it is used for, but according to the comment in requirements.txt the dependency is only partial.

This project makes Ciro Santilli think of E. Coli as an optimization problem. Given some external nutrient/temperature conditions, which DNA sequence makes the cell grow the fastest? Balancing metabolites feels like designing a Factorio speedrun.

There is one major thing missing in the current model: promoter/transcription factor interactions are not modelled due to the lack or low quality of experimental data: github.com/CovertLab/WholeCellEcoliRelease/issues/21. They just have a magic direct "transcription factor to gene" relationship, encoded at reconstruction/ecoli/flat/foldChanges.tsv in terms of the type "if this is present, such protein is expressed 10x more". It appears that transcription units are not implemented at all.
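To make the "if this is present, such protein is expressed 10x more" idea concrete, here is a minimal Python sketch of such a direct fold change table. The transcription factor names, gene names, numbers and the `apply_fold_changes` function are all hypothetical and made up for illustration: they are not taken from foldChanges.tsv, and this is not claimed to be how the Covert Lab code actually applies them.

```python
# Toy illustration of a direct "transcription factor -> gene" fold change table.
# All names and numbers are hypothetical, not taken from the real model.

# (transcription factor, target gene) -> multiplicative fold change
FOLD_CHANGES = {
    ("tfA", "gene1"): 10.0,  # tfA present: gene1 is expressed 10x more
    ("tfA", "gene2"): 0.5,   # tfA present: gene2 is expressed 2x less
    ("tfB", "gene2"): 4.0,   # tfB present: gene2 is expressed 4x more
}


def apply_fold_changes(baseline, active_tfs, fold_changes=FOLD_CHANGES):
    """Return expression levels after multiplying in every active TF's effect."""
    expression = dict(baseline)  # copy so the baseline is left untouched
    for (tf, gene), fold in fold_changes.items():
        if tf in active_tfs and gene in expression:
            expression[gene] *= fold
    return expression


baseline = {"gene1": 1.0, "gene2": 8.0, "gene3": 3.0}
print(apply_fold_changes(baseline, active_tfs={"tfA"}))
print(apply_fold_changes(baseline, active_tfs={"tfA", "tfB"}))
```

Note how nothing in this table knows about promoters or transcription units: the transcription factor acts on the gene directly, which is the simplification described above.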

Everything in this section refers to version 7e4cc9e57de76752df0f4e32eca95fb653ea64e4, the code drop from November 2020, and was tested on Ubuntu 21.04 with a docker install of docker.pkg.github.com/covertlab/wholecellecolirelease/wcm-full with image id 502c3e604265, unless otherwise noted.


Model protein Created 2025-06-17 Updated 2025-07-16


Ciro Santilli defines a "model protein" as a protein which has been significantly used in the history of protein science, in analogy to the term model organism.

Key characteristics of model proteins include:

they are easy to obtain and are stable
they are important to medical applications
they are small and easier to understand for early studies

Important model proteins include:

insulin: as a peptide hormone, it was small. It was also useful and widely available, even at pharmacies: The Eighth Day of Creation says you could get it at Boots, a major British pharmacy chain. As such it was a natural choice for the first protein sequencing, by Frederick Sanger, published in 1951
hemoglobin
keratin


Molecular biology technologies Updated 2025-07-16


As of 2019, the silicon industry is ending, and molecular biology technology is one of the most promising and fastest growing fields of engineering.

Figure 1. 42 years of microprocessor trend data by Karl Rupp. Source. Only transistor count increases, which also pushes core counts up. But what you gonna do when atomic limits are reached? The separation between two silicon atoms is 0.23nm and 2019 technology is at 5nm scale.

Such advances could one day lead to both biological super-AGI and immortality.

Ciro Santilli is especially excited about DNA-related technologies, because DNA is the centerpiece of biology, and it is programmable.

First, starting in the 2000s, the cost of DNA sequencing fell dramatically, reaching about 1000 USD per genome by the end of the 2010s, largely due to Illumina's technology: Figure 2. "Cost per genome vs Moore's law from 2000 to 2019".
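As a rough back-of-the-envelope comparison with Moore's law, the Python sketch below asks what a genome would cost in 2019 if the price had merely halved every two years. The starting point of about 100 million USD per genome in 2001 and the two year halving period are assumptions chosen for illustration, not numbers from this article; only the 1000 USD end point comes from the text.

```python
# Compare the fall in sequencing cost with a Moore's-law-like halving.
# ASSUMPTION: ~100 million USD per genome in 2001 is an order-of-magnitude
# starting point picked for illustration, not a number from this article.
START_YEAR = 2001
START_COST_USD = 100e6
END_YEAR = 2019
ACTUAL_END_COST_USD = 1000.0  # "about 1000 USD per genome" from the text
HALVING_PERIOD_YEARS = 2.0    # ASSUMPTION: Moore's-law-like halving period


def moore_cost(year):
    """Cost per genome if it had merely halved every HALVING_PERIOD_YEARS."""
    halvings = (year - START_YEAR) / HALVING_PERIOD_YEARS
    return START_COST_USD / 2 ** halvings


moore_end_cost = moore_cost(END_YEAR)
ratio = moore_end_cost / ACTUAL_END_COST_USD
print(f"Moore's law pace in {END_YEAR}: {moore_end_cost:,.0f} USD")
print(f"Cost from the text in {END_YEAR}: {ACTUAL_END_COST_USD:,.0f} USD")
print(f"Sequencing beat the Moore's law pace by about {ratio:,.0f}x")
```

Under these assumptions the Moore's law pace would still leave a genome costing hundreds of thousands of dollars in 2019, so the real drop was orders of magnitude faster.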

As of 2019, the consequences of this revolution are still trickling down towards medical applications, inevitably, but somewhat slowly due to the tight privacy control of medical records.

Ciro Santilli predicts that when the 100 dollar mark is reached, every person of the First world will have their genome sequenced, and then medical applications will be closer at hand than ever.

But even 100 dollars is not enough. Sequencing power is like computing power: humankind can never have enough. Sequencing is not a one per person thing. For example, as of 2019 tumors are already being sequenced to help understand and treat them, and scientists/doctors will sequence as many tumor cells as budget allows.

Then, in the 2010's, CRISPR/Cas9 gene editing started opening up the way to actually modifying the genome that we could now see through sequencing.

What's next?

Ciro believes that the next step in the revolution could be: de novo DNA synthesis.

This technology could be the key to one of the ultimate dreams of biologists: cheap programmable biology with push-button organism bootstrap!

Just imagine this: in the comfort of your own garage, you take some model organism of interest, maybe starting humble with Escherichia coli. Then you modify its DNA to your liking, and upload it to a 3D printer sized machine on your workbench, which automatically synthesizes the DNA, and injects it into a bootstrapped cell.

You then run experiments to check if the modified cell achieves your desired new properties, e.g. production of some protein, and if not, you iterate again, just like a software engineer.

Of course, even if we were able to do the bootstrap, the debugging process then becomes key, as visibility is the key limitation of biology. Maybe we need other cheap technologies to come in at that point.

This is the point where we see the beauty of evolution the brightest: evolution does not require observability. But it also implies that if your changes to the organism make it less fit, then your mutation will also likely be lost. This has to be one of the considerations when designing your organism.
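How quickly a less fit modification gets lost can be sketched with the standard deterministic haploid selection recurrence p' = p(1 - s) / (1 - s p), where p is the fraction of the population carrying your modification and s is its fitness cost. The 10% fitness cost, the 99% starting fraction and the 100 generations below are made-up illustration values, and the model ignores genetic drift, new mutations and everything else that happens in a real culture.

```python
def next_fraction(p, s):
    """One generation of haploid selection against a variant with fitness 1 - s.

    The variant reproduces at relative rate (1 - s), everything else at rate 1,
    so the mean fitness of the population is 1 - s * p.
    """
    return p * (1 - s) / (1 - s * p)


def simulate(p0, s, generations):
    """Return the variant fraction at every generation, starting from p0."""
    p = p0
    history = [p]
    for _ in range(generations):
        p = next_fraction(p, s)
        history.append(p)
    return history


# Hypothetical values: culture starts 99% engineered, with a 10% fitness cost.
history = simulate(p0=0.99, s=0.1, generations=100)
for gen in (0, 25, 50, 75, 100):
    print(f"generation {gen:3d}: engineered fraction = {history[gen]:.4f}")
```

In this toy model the engineered fraction only ever goes down, since the unmodified cells outgrow the modified ones every generation. With a 20 minute doubling time, 100 generations is only a day or two of growth.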