x86 Paging Tutorial / CAM Updated +Created
Using the TLB makes translation faster, because the initial translation takes one access per TLB level, which means 2 on a simple 32 bit scheme, but 3 or 4 on 64 bit architectures.
The TLB is usually implemented as an expensive type of RAM called content-addressable memory (CAM). CAM implements an associative map on hardware, that is, a structure that given a key (linear address), retrieves a value.
Mappings could also be implemented on RAM addresses, but CAM mappings may required much less entries than a RAM mapping.
For example, a map in which:
  • both keys and values have 20 bits (the case of a simple paging schemes)
  • at most 4 values need to be stored at each time
could be stored in a TLB with 4 entries:
linear  physical
------  --------
00000   00001
00001   00010
00010   00011
FFFFF   00000
However, to implement this with RAM, it would be necessary to have 2^20 addresses:
linear  physical
------  --------
00000   00001
00001   00010
00010   00011
... (from 00011 to FFFFE)
FFFFF   00000
which would be even more expensive than using a TLB.
x86 Paging Tutorial / Copy-on-write Updated +Created
Besides a missing page, a very common source of page faults is copy-on-write (COW).
Page tables have extra flags that allow the OS to mark a page a read-only.
Those page faults only happen when a process tries to write to the page, and not read from it.
When Linux forks a process:
  • instead of copying all the pages, which is unnecessarily costly, it makes the page tables of the two process point to the same physical address.
  • it marks those linear addresses as read-only
  • whenever one of the processes tries to write to a page, the makes a copy of the physical memory, and updates the pages of the two process to point to the two different physical addresses
Chemiosmosis Updated +Created
How to cite a book on Wikipedia Updated +Created
A good big sample definition:
<ref name="googleStory">{{cite book |last1=Vise |first1=David |author-link1=David A. Vise |last2=Malseed |first2=Mark |author-link2=Mark Malseed |title=The Google Story |date=2008 |publisher=Delacorte Press |url=https://archive.org/details/isbn_9780385342728}}</ref>
There is also title-link to link to a wiki page. But it is incompatible with url= for Internet Archive Open Library links which is a shame.
Nested set model in SQL Updated +Created
How to implement Nested set model in SQL:
SQLite isolation levels Updated +Created
The default isolation level for SQLite is SERIALIZABLE
It does not appear possible to achieve the other two levels besides SERIALIZABLE and READ UNCOMMITED
x86 Paging Tutorial / Hardware implementation Updated +Created
Paging is implemented by the CPU hardware itself.
Paging could be implemented in software, but that would be too slow, because every single RAM memory access uses it!
Operating systems must setup and control paging by communicating to the CPU hardware. This is done mostly via:
Processes can however make requests to the OS that cause the page tables to be modified, notably:
The kernel then decides if the request will be granted or not in a controlled manner.
Node.js SQLite bindings Updated +Created
x86 Paging Tutorial / How the K-ary tree is used in x86 Updated +Created
x86's multi-level paging scheme uses a 2 level K-ary tree with 2^10 bits on each level.
Addresses are now split as:
| directory (10 bits) | table (10 bits) | offset (12 bits) |
Then:
  • the top 10 bits are used to walk the top level of the K-ary tree (level0)
    The top table is called a "directory of page tables".
    cr3 now points to the location on RAM of the page directory of the current process instead of page tables.
    Page directory entries are very similar to page table entries except that they point to the physical addresses of page tables instead of physical addresses of pages.
    Each directory entry also takes up 4 bytes, just like page entries, so that makes 4 KiB per process minimum.
    Page directory entries also contain a valid flag: if invalid, the OS does not allocate a page table for that entry, and saves memory.
    Each process has one and only one page directory associated to it (and pointed to by cr3), so it will contain at least 2^10 = 1K page directory entries, much better than the minimum 1M entries required on a single-level scheme.
  • the next 10 bits are used to walk the second level of the K-ary tree (level1)
    Second level entries are also called page tables like the single level scheme.
    Page tables are only allocated only as needed by the OS.
    Each page table has only 2^10 = 1K page table entries instead of 2^20 for the single paging scheme.
    Each process can now have up to 2^10 page tables instead of 2^20 for the single paging scheme.
  • the offset is again not used for translation, it only gives the offset within a page
One reason for using 10 bits on the first two levels (and not, say, 12 | 8 | 12 ) is that each Page Table entry is 4 bytes long. Then the 2^10 entries of Page directories and Page Tables will fit nicely into 4Kb pages. This means that it faster and simpler to allocate and deallocate pages for that purpose.
Deletionism on Wikipedia Updated +Created
Some examples by Ciro Santilli follow.
Of the tutorial-subjectivity type:
Notability constraints, which are are way too strict:
There are even a Wikis that were created to remove notability constraints: Wiki without notability requirements.
For these reasons reason why Ciro basically only contributes images to Wikipedia: because they are either all in or all out, and you can determine which one of them it is. And this allows images to be more attributable, so people can actually see that it was Ciro that created a given amazing image, thus overcoming Wikipedia's lack of reputation system a little bit as well.
Wikipedia is perfect for things like biographies, geography, or history, which have a much more defined and subjective expository order. But when it comes to "tutorials of how to actually do stuff", which is what mathematics and physics are basically about, Wikipedia has a very hard time to go beyond dry definitions which are only useful for people who already half know the stuff. But to learn from zero, newbies need tutorials with intuition and examples.
Bibliography:
Group axiom Updated +Created
Lattice Microbes Updated +Created
GPU accelerated, simulates the Craig's minimized M. genitalium, JCVI-syn3A at a particle basis of some kind.
Lab head is the cutest-looking lady ever: chemistry.illinois.edu/zan, Zaida (Zan) Luthey-Schulten.
List of systems programmers Updated +Created
Spy Updated +Created
x86 Paging Tutorial / Invalidating TLB entries Updated +Created
When the process changes, cr3 change to point to the page table of the new current process.
This creates a problem: the TLB is now filled with a bunch of cached entries for the old process.
A simple and naive solution would be to completely invalidate the TLB whenever the cr3 changes.
However, this is would not be very efficient, because it often happens that we switch back to process 1 before process 2 has completely used up the entire TLB cache entries.
Basically, the OS assigns a different ASID for each process, and then TLB entries are automatically also tagged with that ASID. This way when the process makes an access, the TLB can determine if a hit is actually for the current process, or if it is an old address coincidence with another process.
The x86 also offers the invlpg instruction which explicitly invalidates a single TLB entry. Other architectures offer even more instructions to invalidated TLB entries, such as invalidating all entries on a given range.
x86 Paging Tutorial / K-ary trees to the rescue Updated +Created
The algorithmically minded will have noticed that paging requires associative array (like Java Map of Python dict()) abstract data structure where:
  • the keys are linear pages addresses, thus of integer type
  • the values are physical page addresses, also of integer type
The single level paging scheme uses a simple array implementation of the associative array:
  • the keys are the array index
  • this implementation is very fast in time
  • but it is too inefficient in memory
and in C pseudo-code it looks like this:
linear_address[0]      = physical_address_0
linear_address[1]      = physical_address_1
linear_address[2]      = physical_address_2
...
linear_address[2^20-1] = physical_address_N
But there another simple associative array implementation that overcomes the memory problem: an (unbalanced) k-ary tree.
A K-ary tree, is just like a binary tree, but with K children instead of 2.
Using a K-ary tree instead of an array implementation has the following trade-offs:
  • it uses way less memory
  • it is slower since we have to de-reference extra pointers
In C-pseudo code, a 2-level K-ary tree with K = 2^10 looks like this:
level0[0] = &level1_0[0]
    level1_0[0]      = physical_address_0_0
    level1_0[1]      = physical_address_0_1
    ...
    level1_0[2^10-1] = physical_address_0_N
level0[1] = &level1_1[0]
    level1_1[0]      = physical_address_1_0
    level1_1[1]      = physical_address_1_1
    ...
    level1_1[2^10-1] = physical_address_1_N
...
level0[N] = &level1_N[0]
    level1_N[0]      = physical_address_N_0
    level1_N[1]      = physical_address_N_1
    ...
    level1_N[2^10-1] = physical_address_N_N
and we have the following arrays:
  • one directory, which has 2^10 elements. Each element contains a pointer to a page table array.
  • up to 2^10 pagetable arrays. Each one has 2^10 4 byte page entries.
and it still contains 2^10 * 2^10 = 2^20 possible keys.
K-ary trees can save up a lot of space, because if we only have one key, then we only need the following arrays:
  • one directory with 2^10 entries
  • one pagetable at directory[0] with 2^10 entries
  • all other directory[i] are marked as invalid, don't point to anything, and we don't allocate pagetable for them at all
x86 Paging Tutorial / Kernel vs process memory layout Updated +Created
The Linux Kernel reserves two zones of virtual memory:
  • one for kernel memory
  • one for programs
The exact split is configured by CONFIG_VMSPLIT_.... By default:
  • on 32-bit:
    • the bottom 3/4 is program space: 00000000 to BFFFFFFF
    • the top 1/4 is kernel memory: C0000000 to FFFFFFFF, like this:
      ------------------ FFFFFFFF
      Kernel
      ------------------ C0000000
      ------------------ BFFFFFFF
      
      
      Process
      
      
      ------------------ 00000000
  • on 64-bit: currently only 48-bits are actually used, split into two equally sized disjoint spaces. The Linux kernel just assigns:
    • the bottom part to processes 00000000 00000000 to 008FFFFF FFFFFFFF
    • the top part to the kernel: FFFF8000 00000000 to FFFFFFFF FFFFFFFF, like this:
      ------------------ FFFFFFFF
      Kernel
      ------------------ C0000000
      
      
      (not addressable)
      
      
      ------------------ BFFFFFFF
      Process
      ------------------ 00000000
Kernel memory is also paged.
x86 Paging Tutorial / Memory management unit Updated +Created
Paging is done by the Memory Management Unit (MMU) part of the CPU.
Like many others (e.g. x87 co-processor, APIC), this used to be by separate chip on early days.
It was later integrated into the CPU, but the term MMU still used.
NLab Updated +Created
Decent encyclopedia of mathematics. Not much motivation, mostly statements though.
Unlike Wikipedia, they have a more sane forum commenting system, e.g. a page/forum pair:

There are unlisted articles, also show them or only show them.