The exact format of table entries is fixed _by the hardware_.
Each page entry can be seen as a struct with many fields. The page table is then an array of struct.
In this simplified example, the page table entries contain only two fields:
bits  function
----- -----------------------------------------
20    physical address of the start of the page
1     present flag
so in this example the hardware designers could have chosen the size of each page table entry to be 21 bits instead of the 32 we've used so far.
All real page table entries have other fields, notably fields to set pages to read-only for Copy-on-write. This will be explained elsewhere.
It would be impractical to align things at 21 bits since memory is addressable by bytes and not bits. Therefore, even if only 21 bits are needed in this case, hardware designers would probably choose 32 to make access faster, and just reserve the remaining bits for later usage. The actual value on x86 is 32 bits.
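As a rough illustration of the simplified two-field entry stored in a 32-bit word, here is a minimal Python sketch. It assumes the x86 placement of the fields (frame number in bits 31:12, present flag in bit 0); the helper names `make_entry`, `entry_frame` and `entry_present` are made up for this example.

```python
# Sketch of the simplified entry: 20-bit frame number + 1 present flag,
# stored in a 32-bit word with the remaining bits left reserved (zero).
PRESENT = 1 << 0      # assumption: present flag in bit 0, as on x86
ADDR_SHIFT = 12       # assumption: frame number in bits 31:12, as on x86


def make_entry(frame_number, present):
    """Pack a 20-bit frame number and a present flag into a 32-bit entry."""
    assert 0 <= frame_number < (1 << 20)
    return (frame_number << ADDR_SHIFT) | (PRESENT if present else 0)


def entry_frame(entry):
    """Extract the 20-bit frame number."""
    return entry >> ADDR_SHIFT


def entry_present(entry):
    """Extract the present flag."""
    return bool(entry & PRESENT)


entry = make_entry(0x00001, True)
print(hex(entry))  # 0x1001
```

The 11 bits between the two fields are simply left at zero here, which is where real entries keep their extra flags.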
Here is a screenshot from the Intel manual image "Formats of CR3 and Paging-Structure Entries with 32-Bit Paging" showing the structure of a page table in all its glory: Figure 1. "x86 page entry format".
The fields are explained in the manual just after.
The problem with single-level paging by Ciro Santilli 35 Updated 2025-01-06 +Created 1970-01-01
The problem with a single-level paging scheme is that it would take up too much RAM: 4G / 4K = 1M entries _per_ process.
If each entry is 4 bytes long, that would make 4MB _per process_, which is too much even for a desktop computer:
ps -A | wc -l
says that I am running 244 processes right now, so that would take around 1GB of my RAM!
For this reason, x86 developers decided to use a multi-level scheme that reduces RAM usage.
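The arithmetic above can be checked with a few lines of Python. The constants are the ones used in the text (4GB address space, 4KB pages, 4 byte entries, 244 processes); the variable names are only illustrative.

```python
# Memory cost of single-level page tables, using the numbers from the text.
ADDRESS_SPACE = 2**32   # 4GB linear address space
PAGE_SIZE = 2**12       # 4KB pages
ENTRY_SIZE = 4          # bytes per page table entry
PROCESSES = 244         # process count reported by `ps -A | wc -l`

entries = ADDRESS_SPACE // PAGE_SIZE
table_bytes = entries * ENTRY_SIZE
total_bytes = table_bytes * PROCESSES

print(entries)                # 1048576
print(table_bytes // 2**20)   # 4
print(total_bytes / 2**30)    # 0.953125
```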
The downside of this system is that it has a slightly higher access time, as we need to access RAM more times for each translation.
Learned readers will ask themselves: so why use an unbalanced tree instead of a balanced one, which offers better asymptotic times (en.wikipedia.org/wiki/Self-balancing_binary_search_tree)?
Likely:
- the maximum number of entries is small enough due to memory size limitations, that we won't waste too much memory with the root directory entry
- different entries would have different levels, and thus different access times
- tree rotations would likely make caching more complicated
If either PAE or PSE is active, different paging level schemes are used:
- no PAE and no PSE: 10 | 10 | 12
- no PAE and PSE: 10 | 22
  22 is the offset within the 4MB page, since 22 bits address 4MB.
- PAE and no PSE: 2 | 9 | 9 | 12
  The design reason why 9 is used twice instead of 10 is that now entries cannot fit anymore into 32 bits, which were all filled up by 20 address bits and 12 meaningful or reserved flag bits.
  The reason is that 20 bits are not enough anymore to represent the address of page tables: 24 bits are now needed because of the 4 extra wires added to the processor.
  Therefore, the designers decided to increase entry size to 64 bits, and to make them fit into a single page table it is necessary to reduce the number of entries to 2^9 instead of 2^10.
  The starting 2 is a new page level called Page Directory Pointer Table (PDPT), since it _points_ to page directories and fills in the 32 bit linear address. PDPT entries are also 64 bits wide.
  cr3 now points to PDPTs, which must be in the first 4GB of memory and aligned on 32 byte multiples for addressing efficiency. This means that now cr3 has 27 significant bits instead of 20: 2^5 for the 32 byte multiples * 2^27 to complete the 2^32 of the first 4GB.
- PAE and PSE: 2 | 9 | 21
  Designers decided to keep a 9 bit wide field to make it fit into a single page.
  This leaves 23 bits. Leaving 2 for the PDPT to keep things uniform with the PAE case without PSE leaves 21 for the offset, meaning that pages are 2MB wide instead of 4MB.
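To make the four bit layouts above concrete, here is a small Python sketch that splits a 32-bit linear address into its fields for each scheme. The function name `split_linear`, the `SCHEMES` table and the sample address are made up for illustration.

```python
# Split a 32-bit linear address into fields for each paging scheme.
SCHEMES = {
    "no PAE, no PSE": (10, 10, 12),
    "no PAE, PSE": (10, 22),
    "PAE, no PSE": (2, 9, 9, 12),
    "PAE, PSE": (2, 9, 21),
}


def split_linear(address, widths):
    """Return the fields of `address`, most significant field first."""
    assert sum(widths) == 32
    fields = []
    shift = 32
    for width in widths:
        shift -= width
        fields.append((address >> shift) & ((1 << width) - 1))
    return fields


address = 0x00801004  # arbitrary sample address
for name, widths in SCHEMES.items():
    print(name, split_linear(address, widths))
# no PAE, no PSE [2, 1, 4]
# no PAE, PSE [2, 4100]
# PAE, no PSE [0, 4, 1, 4]
# PAE, PSE [0, 4, 4100]
```

The last field is always the offset within the page; the preceding fields are the indexes used at each level of the walk.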
Using the TLB makes translation faster, because the initial translation takes one access _per page table level_, which means 2 on a simple 32 bit scheme, but 3 or 4 on 64 bit architectures.
The TLB is usually implemented as an expensive type of RAM called content-addressable memory (CAM). CAM implements an associative map on hardware, that is, a structure that given a key (linear address), retrieves a value.
Mappings could also be implemented with RAM addresses, but CAM mappings may require far fewer entries than a RAM mapping.
For example, a map in which:
- both keys and values have 20 bits (the case of a simple paging scheme)
- at most 4 values need to be stored at each time
could be stored in a TLB with 4 entries:
linear  physical
------  --------
00000   00001
00001   00010
00010   00011
FFFFF   00000
However, to implement this with RAM, _it would be necessary to have 2^20 addresses_:
linear  physical
------  --------
00000   00001
00001   00010
00010   00011
... (from 00011 to FFFFE)
FFFFF   00000
which would be even more expensive than using a TLB.
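The associative lookup can be imitated in software with a small dictionary. The sketch below is only a model of the idea, not of real hardware: the class name `TinyTLB`, the oldest-first eviction policy and the `walk` callback standing in for the page table walk are all assumptions made for this example.

```python
from collections import OrderedDict


class TinyTLB:
    """Software model of a small associative cache of page translations."""

    def __init__(self, capacity, walk):
        self.capacity = capacity
        self.walk = walk              # fallback: the slow page table walk
        self.entries = OrderedDict()  # linear page -> physical page
        self.misses = 0

    def translate(self, linear_page):
        if linear_page in self.entries:
            return self.entries[linear_page]   # hit: no page walk needed
        self.misses += 1
        physical_page = self.walk(linear_page)
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the oldest entry
        self.entries[linear_page] = physical_page
        return physical_page


# The 4-entry mapping from the table above.
PAGE_TABLE = {0x00000: 0x00001, 0x00001: 0x00010,
              0x00010: 0x00011, 0xFFFFF: 0x00000}

tlb = TinyTLB(4, PAGE_TABLE.__getitem__)
for page in (0x00000, 0x00001, 0x00000, 0xFFFFF, 0x00000):
    tlb.translate(page)
print(tlb.misses)  # 3
```

Only 4 slots are needed to hold the 4 live translations, whereas a directly indexed array would need one slot for each of the 2^20 possible keys.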
Play with physical addresses in Linux by Ciro Santilli 35 Updated 2025-01-06 +Created 1970-01-01
- Convert virtual addresses to physical from user space with /proc/<pid>/pagemap and from kernel space with virt_to_phys
- Dump all page tables from userspace with /proc/<pid>/maps and /proc/<pid>/pagemap
- Read and write physical addresses from userspace with /dev/mem
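As a rough sketch of the first item, the following Python reads one /proc/<pid>/pagemap entry and decodes it. It relies on the documented pagemap layout (one 64-bit entry per virtual page, bit 63 = page present, bits 0-54 = page frame number when present). The function names are made up for this example, and on recent kernels the frame number reads as zero without CAP_SYS_ADMIN, in which case the sketch returns None.

```python
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
PFN_MASK = (1 << 55) - 1   # bits 0-54: page frame number (if present)
PRESENT_BIT = 1 << 63      # bit 63: page is present in RAM


def decode_pagemap_entry(entry):
    """Return (present, pfn) for a raw 64-bit pagemap entry."""
    present = bool(entry & PRESENT_BIT)
    pfn = entry & PFN_MASK if present else None
    return present, pfn


def virt_to_phys_user(pid, vaddr):
    """Translate a virtual address of process `pid`, or return None."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * 8)   # one 8-byte entry per page
        data = f.read(8)
    if len(data) != 8:
        return None
    (entry,) = struct.unpack("Q", data)    # native byte order
    present, pfn = decode_pagemap_entry(entry)
    if not present or pfn == 0:            # pfn is hidden without privileges
        return None
    return pfn * PAGE_SIZE + vaddr % PAGE_SIZE
```

A call such as `virt_to_phys_user("self", some_address)` would typically need to run as root on Linux to return a non-None result.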
The Linux Kernel reserves two zones of virtual memory:
- one for kernel memory
- one for programs
The exact split is configured by the CONFIG_VMSPLIT_... options. By default:
- on 32-bit:
  - the bottom 3/4 is program space: 00000000 to BFFFFFFF
  - the top 1/4 is kernel memory: C0000000 to FFFFFFFF
  like this:
  ------------------ FFFFFFFF
  Kernel
  ------------------ C0000000
  ------------------ BFFFFFFF
  Process
  ------------------ 00000000
- on 64-bit: currently only 48 bits are actually used, split into two equally sized disjoint spaces. The Linux kernel just assigns:
  - the bottom part to processes: 00000000 00000000 to 00007FFF FFFFFFFF
  - the top part to the kernel: FFFF8000 00000000 to FFFFFFFF FFFFFFFF
  like this:
  ------------------ FFFFFFFF FFFFFFFF
  Kernel
  ------------------ FFFF8000 00000000
  (not addressable)
  ------------------ 00007FFF FFFFFFFF
  Process
  ------------------ 00000000 00000000
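The 64-bit split can be expressed as a small classifier. This is a sketch that assumes the 48-bit layout described above, where the usable halves are the addresses whose top 16 bits are all zeros or all ones; the function name `classify` is made up for this example.

```python
# Classify a 64-bit virtual address under the 48-bit split described above.
USER_MAX = 0x00007FFFFFFFFFFF     # top of the bottom (process) half
KERNEL_MIN = 0xFFFF800000000000   # bottom of the top (kernel) half
ADDRESS_MAX = 0xFFFFFFFFFFFFFFFF


def classify(address):
    """Return which zone a 64-bit virtual address falls into."""
    assert 0 <= address <= ADDRESS_MAX
    if address <= USER_MAX:
        return "process"
    if address >= KERNEL_MIN:
        return "kernel"
    return "not addressable"


print(classify(0x0000000000400000))  # process
print(classify(0xFFFFFFFF80000000))  # kernel
print(classify(0x0000800000000000))  # not addressable
```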
Kernel memory is also paged.
In previous versions, the paging was continuous, but with HIGHMEM this changed.
There is no clear physical memory split: stackoverflow.com/questions/30471742/physical-memory-userspace-kernel-split-on-linux-x86-64
Information about ARM paging can be found at: cirosantilli.com/linux-kernel-module-cheat#arm-paging
Free:
- rutgers-pxk-416 chapter "Memory management: lecture notes". Good historical review of memory organization techniques used by older OS.
Non-free:
- bovet05 chapter "Memory addressing". Reasonable intro to x86 memory addressing. Missing some good and simple examples.