E.g. for E. coli K-12 MG1655: biocyc.org/group?id=:ALL-PROMOTERS&orgid=ECOLI. For some context, see E. coli K-12 MG1655 gene thrL, E. coli K-12 MG1655 gene thrA, thrB and thrC, all of which are in the same transcription unit.
NCBI taxonomy entry: www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=511145. This links to:
- Interactively browse the sequence in the browser viewer: "Reference genome: Escherichia coli str. K-12 substr. MG1655", which eventually leads to: www.ncbi.nlm.nih.gov/nuccore/556503834?report=graph. If we zoom into the start and hover over the very first gene/protein, we find the famous (just kidding) E. coli K-12 MG1655 gene thrL, at position 190-255. The second one is the much more interesting E. coli K-12 MG1655 gene thrA.
- Gene list, with a total of 4,629 as of 2021: www.ncbi.nlm.nih.gov/gene/?term=txid511145
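The position 190-255 quoted for thrL can be turned into a length and a direct sequence request. The following is a minimal sketch, not an official client: it assumes the coordinates are 1-based and inclusive (the usual NCBI convention), and it only builds a URL for the NCBI E-utilities `efetch` endpoint without sending any request. The helper names `region_length` and `efetch_region_url` are made up for illustration.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def region_length(start: int, stop: int) -> int:
    """Length of a region given 1-based, inclusive coordinates (assumed)."""
    return stop - start + 1


def efetch_region_url(uid: str, start: int, stop: int) -> str:
    """Build (but do not fetch) a URL for a FASTA sub-sequence of a nuccore record."""
    params = {
        "db": "nuccore",
        "id": uid,
        "seq_start": start,
        "seq_stop": stop,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


# thrL as given above: position 190-255 of nuccore record 556503834.
length = region_length(190, 255)
codons = length // 3
print(length, codons)  # 66 22
print(efetch_region_url("556503834", 190, 255))
```

So the region is 66 nucleotides, i.e. 22 codons. If the annotated span includes the stop codon, as it normally does for coding features, that leaves a peptide of 21 amino acids, which fits thrL being a tiny leader peptide rather than an enzyme.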
Contains the genes E. coli K-12 MG1655 gene thrL, E. coli K-12 MG1655 gene thrA, E. coli K-12 MG1655 gene thrB and E. coli K-12 MG1655 gene thrC, all of which have directly related functions.
We can find it by searching for the species in the BioCyc promoter database. This leads to: biocyc.org/group?id=:ALL-PROMOTERS&orgid=ECOLI.
That page lists several components of the promoter, which we should try to understand!
Some of the transcription factors are proteins:
After the first gene in the operon, thrL, there is a rho-independent terminator. By comparing: we understand that the presence of the threonine and isoleucine variants, L-threonyl and L-isoleucyl, makes the rho-independent termination more efficient, so the control loop is quite direct! It is not clear why it cares about isoleucine as well though.
Multiple genes coding for multiple proteins in one transcription unit, e.g. E. coli K-12 MG1655 gene thrL and E. coli K-12 MG1655 gene thrA are both part of the E. coli K-12 MG1655 operon thrLABC.