Source: /cirosantilli/e-coli-k-12-mg1655-gene-thra

= E. Coli K-12 MG1655 gene thrA
{title2=337-2799}
{title2=fused aspartate kinase/homoserine dehydrogenase 1}

<UniProt> entry: https://www.uniprot.org/uniprot/P00561[].

<NCBI> entry: https://www.ncbi.nlm.nih.gov/gene/945803[].

The second <gene> in the <E. Coli K-12 MG1655> genome. Part of the <E. Coli K-12 MG1655 operon thrLABC>.

Part of a reaction that produces <threonine>.

This <protein> is an <enzyme>{parent}. The <UniProt> entry clearly shows the <chemical reactions> that it <catalyses>. In this case, there are actually two! It can transform a <metabolite> in either of two ways:
* "L-homoserine" into "L-aspartate 4-semialdehyde"
* "L-aspartate" into "4-phospho-L-aspartate"
Also interestingly, we see that both of those reactions consume an extra molecule besides the metabolite itself, one needing <adenosine triphosphate> and the other <nADP+>.

TODO: any mention of how much faster it makes the reaction, numerically?

Since this is an <enzyme>, it would also be interesting to have a quick search for it in the <KEGG> entry starting from the organism: https://www.genome.jp/pathway/eco01100+M00022 We type in the search bar "thrA", it gives a long list, but the last entry is our "thrA". Selecting it highlights two pathways in the large <graph>, so we understand that it catalyzes two different reactions, as suggested by the protein name itself (fused blah blah). We can now hover over:
* the <edge (graph)>: it shows all the enzymes that catalyze the given reaction. Both edges actually have multiple enzymes, e.g. the L-Homoserine path is also catalyzed by another enzyme called metL.
* the <node (graph)>: they are the <metabolites>, e.g. one of the paths contains "L-homoserine" on one node and "L-aspartate 4-semialdehyde" on the other
Note that common <cofactor (biochemistry)> are omitted from the graph: we've learnt from the UniProt entry that the reactions also consume ATP or NADP+, yet neither appears here.

If we now click on the L-Homoserine edge, it takes us to: https://www.genome.jp/entry/eco:b0002+eco:b3940[]. Under "Pathway" we see an interesting looking pathway "Glycine, serine and threonine metabolism": https://www.genome.jp/pathway/eco00260+b0002 which contains a small manually selected and extremely clearly named subset of the larger graph!

But looking at the bottom of this subgraph (the UI is not great, can't Ctrl+F and enzyme names not shown, but the selected enzyme is slightly highlighted in red because it is in the URL https://www.genome.jp/pathway/eco00260+b0002[] vs https://www.genome.jp/pathway/eco00260[]) we clearly see that thrA, thrB and thrC form a sequence that directly transforms "L-aspartate 4-semialdehyde" into "Homoserine" to "O-Phospho-L-homoserine" and finally to <threonine>. This makes it crystal clear that they are not just located adjacently in the genome by chance: they are actually functionally related, and likely controlled by the same transcription factor: when you want one of them, you basically always want the three, because you must be lacking <threonine>. TODO find transcription factor!

The UniProt entry also shows an interactive browser of the <tertiary structure> of the protein. We note that there are currently two sources available: <X-ray crystallography> and <AlphaFold>. To be honest, the <AlphaFold> one looks quite off!!!

By inspecting the <FASTA> for the entire genome, or by using the <NCBI open reading frame tool>, we see that this gene lies entirely in its own <open reading frame>, so it is quite boring.
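
The 337-2799 coordinates from the title allow a quick sanity check of the reading frame. The following is a minimal Python sketch, not a definitive tool: it assumes the coordinates are 1-based and inclusive, and that the final codon of the range is a stop codon. The helper name `orf_summary` is hypothetical.

```python
def orf_summary(start: int, end: int) -> dict:
    """Summarize a gene from 1-based inclusive genome coordinates."""
    length = end - start + 1  # inclusive range
    codons, remainder = divmod(length, 3)
    return {
        "length_nt": length,
        # A clean open reading frame has a length divisible by 3.
        "in_frame": remainder == 0,
        "codons": codons,
        # Assumption: the last codon is a stop codon, so it adds no amino acid.
        "amino_acids": codons - 1,
    }


thra = orf_summary(337, 2799)
print(thra)
```

The amino acid count it reports can then be compared against the protein length listed on the UniProt entry.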

From the <FASTA> we see that the very first three <Codons> at position 337 are
``
ATG CGA GTG
``
where `ATG` is the <start codon>, which itself codes for methionine, and `CGA GTG` are the next two codons that go into the protein:
* CGA: <arginine>
* GTG: <valine>
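
The codon lookup above can be sketched in Python. This is a minimal sketch with a hand-written partial table covering only these three codons of the standard genetic code, in which `ATG` codes for methionine as well as acting as the start codon. The names `CODON_TABLE` and `translate` are hypothetical, chosen just for this illustration.

```python
# Partial codon table: only the three codons discussed here,
# taken from the standard genetic code.
CODON_TABLE = {
    "ATG": "Met",  # also the start codon
    "CGA": "Arg",
    "GTG": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate whitespace-separated codons into amino acid names."""
    return [CODON_TABLE[codon] for codon in dna.split()]


first_codons = "ATG CGA GTG"
residues = translate(first_codons)
print(residues)  # ['Met', 'Arg', 'Val']
```

A codon missing from the partial table raises a `KeyError`, which is deliberate here: it makes it obvious when the sketch is fed anything beyond the three codons it was written for.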

https://ecocyc.org/gene?orgid=ECOLI&id=ASPKINIHOMOSERDEHYDROGI-MONOMER[] mentions that the enzyme is most active as a <protein complex> with four copies of the same protein:
\Q[Aspartate kinase I / homoserine dehydrogenase I comprises a <protein dimer>[dimer] of ThrA dimers. Although the dimeric form is catalytically active, the binding equilibrium dramatically favors the tetrameric form. The aspartate kinase and homoserine dehydrogenase activities of each ThrA monomer are catalyzed by independent domains connected by a linker region.]
TODO image?