Source: cirosantilli/e-coli-k-12-mg1655

= E. Coli K-12 MG1655
{c}

<NCBI> taxonomy entry: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=511145 This links to:
* <genome>: https://www.ncbi.nlm.nih.gov/genome/?term=txid511145 From there, the following are available:
  * Download the <FASTA>: "Download sequences in FASTA format for genome, protein"

    For the genome, you get a compressed <FASTA> file which, once decompressed, has extension `.fna`, is called `GCF_000005845.2_ASM584v2_genomic.fna`, and starts with:
    ``
    >NC_000913.3 Escherichia coli str. K-12 substr. MG1655, complete genome
    AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTG
    ``

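    Running <wc (unix)> as in `wc GCF_000005845.2_ASM584v2_genomic.fna` gives 58022 lines. In <Vim> we see that each sequence line is 80 characters long, except for the final one, which is 52. Excluding the header line, that leaves 58020 full lines plus the final one, so the genome has 58020 * 80 + 52 = 4641652 bp, or about 4.6 Mbp.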

  * Interactively browse the sequence in the online viewer: "Reference genome: Escherichia coli str. K-12 substr. MG1655" which eventually leads to: https://www.ncbi.nlm.nih.gov/nuccore/556503834?report=graph

    If we zoom into the start and hover over the very first <gene>/<protein>, we find the famous (just kidding) <e. Coli K-12 MG1655 gene thrL>, at position 190-255.

    The second one is the much more interesting <e. Coli K-12 MG1655 gene thrA>.
  * Gene list, with a total of 4,629 as of 2021: https://www.ncbi.nlm.nih.gov/gene/?term=txid511145
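
The genome length estimate in the <FASTA> bullet above can be double-checked with a few lines of Python. This is only a minimal sketch: the helper name `fasta_sequence_length` is hypothetical, it assumes a single-record FASTA file, and the demo input is a made-up toy sequence, not the real genome.

```python
import io


def fasta_sequence_length(lines):
    """Count sequence characters, skipping blank lines and '>' header lines.

    Assumes a single-record FASTA; with several records this returns
    the combined length of all of them.
    """
    total = 0
    for line in lines:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


# Arithmetic from the wc/Vim inspection:
# 58022 lines = 1 header line + 58020 full lines of 80 + 1 final line of 52.
total_lines = 58022
full_lines = total_lines - 2
genome_length = full_lines * 80 + 52
print(genome_length)  # 4641652

# Toy FASTA for illustration only (hypothetical data, not the real genome).
demo = io.StringIO(">demo record\nACGT\nAC\n")
print(fasta_sequence_length(demo))  # 6
```

To run it on the actual download, pass an open file handle, as in `fasta_sequence_length(open("GCF_000005845.2_ASM584v2_genomic.fna"))`. If the file is laid out as described above, this should agree with the hand count.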

<KEGG> entry: https://www.genome.jp/pathway/eco01100+M00022

<BioCyc promoter database> query URL: https://biocyc.org/group?id=:ALL-PROMOTERS&orgid=ECOLI