JavaScript CPU microarchitecture simulator by Ciro Santilli 35 Updated 2025-01-29 +Created 1970-01-01
Besides a missing page, a very common source of page faults is copy-on-write (COW).
Page tables have extra flags that allow the OS to mark a page a read-only.
Those page faults only happen when a process tries to write to the page, and not read from it.
When Linux forks a process:
- instead of copying all the pages, which is unnecessarily costly, it makes the page tables of the two process point to the same physical address.
- it marks those linear addresses as read-only
- whenever one of the processes tries to write to a page, the makes a copy of the physical memory, and updates the pages of the two process to point to the two different physical addresses
Composed mostly of the Virgo cluster and the Local group.
The definition does not include homonuclear molecules which is a pain.
After a translation between linear and physical address happens, it is stored on the TLB. For example, a 4 entry TLB starts in the following state:
valid linear physical
----- ------ --------
> 0 00000 00000
0 00000 00000
0 00000 00000
0 00000 00000
The
>
indicates the current entry to be replaced.And after a page linear address and after a second translation of
00003
is translated to a physical address 00005
, the TLB becomes: valid linear physical
----- ------ --------
1 00003 00005
> 0 00000 00000
0 00000 00000
0 00000 00000
00007
to 00009
it becomes: valid linear physical
----- ------ --------
1 00003 00005
1 00007 00009
> 0 00000 00000
0 00000 00000
Now if
00003
needs to be translated again, hardware first looks up the TLB and finds out its address with a single RAM access 00003 --> 00005
.Of course,
00000
is not on the TLB since no valid entry contains 00000
as a key.The algorithmically minded will have noticed that paging requires associative array (like Java
Map
of Python dict()
) abstract data structure where:- the keys are linear pages addresses, thus of integer type
- the values are physical page addresses, also of integer type
The single level paging scheme uses a simple array implementation of the associative array:and in C pseudo-code it looks like this:
- the keys are the array index
- this implementation is very fast in time
- but it is too inefficient in memory
linear_address[0] = physical_address_0
linear_address[1] = physical_address_1
linear_address[2] = physical_address_2
...
linear_address[2^20-1] = physical_address_N
But there another simple associative array implementation that overcomes the memory problem: an (unbalanced) k-ary tree.
A K-ary tree, is just like a binary tree, but with K children instead of 2.
Using a K-ary tree instead of an array implementation has the following trade-offs:
- it uses way less memory
- it is slower since we have to de-reference extra pointers
In C-pseudo code, a 2-level K-ary tree with and we have the following arrays:and it still contains
K = 2^10
looks like this:level0[0] = &level1_0[0]
level1_0[0] = physical_address_0_0
level1_0[1] = physical_address_0_1
...
level1_0[2^10-1] = physical_address_0_N
level0[1] = &level1_1[0]
level1_1[0] = physical_address_1_0
level1_1[1] = physical_address_1_1
...
level1_1[2^10-1] = physical_address_1_N
...
level0[N] = &level1_N[0]
level1_N[0] = physical_address_N_0
level1_N[1] = physical_address_N_1
...
level1_N[2^10-1] = physical_address_N_N
- one
directory
, which has2^10
elements. Each element contains a pointer to a page table array. - up to 2^10
pagetable
arrays. Each one has2^10
4 byte page entries.
2^10 * 2^10 = 2^20
possible keys.K-ary trees can save up a lot of space, because if we only have one key, then we only need the following arrays:
- one
directory
with 2^10 entries - one
pagetable
atdirectory[0]
with 2^10 entries - all other
directory[i]
are marked as invalid, don't point to anything, and we don't allocatepagetable
for them at all
FFFFF 000
points to its own physical address FFFFF 000
. This kind of translation is called an "identity mapping", and can be very convenient for OS-level debugging.Bagic jump between orbitals in the Bohr model. Analogous to the later wave function collapse in the Schrödinger equation.
Paging is implemented by the CPU hardware itself.
Paging could be implemented in software, but that would be too slow, because every single RAM memory access uses it!
Operating systems must setup and control paging by communicating to the CPU hardware. This is done mostly via:
- the CR3 register, which tells the CPU where the page table is in RAM memory
- writing the correct paging data structures to the RAM pointed to the CR3 register.Using RAM data structures is a common technique when lots of data must be transmitted to the CPU as it would cost too much to have such a large CPU register.The format of the configuration data structures is fixed _by the hardware_, but it is up to the OS to set up and manage those data structures on RAM correctly, and to tell the hardware where to find them (via
cr3
).Then some heavy caching is done to ensure that the RAM access will be fast, in particular using the TLB.Another notable example of RAM data structure used by the CPU is the IDT which sets up interrupt handlers.The OS makes it impossible for programs to change the paging setup directly without going through the OS: - CR3 cannot be modified in ring 3. The OS runs in ring 0. See also:
- the page table structures are made invisible to the process using paging itself!
Processes can however make requests to the OS that cause the page tables to be modified, notably:
- stack size changes
brk
andmmap
calls, see also: stackoverflow.com/questions/6988487/what-does-brk-system-call-do/31082353#31082353
The kernel then decides if the request will be granted or not in a controlled manner.
The new default homepage for a logged out user how shows a list of the topics with the most articles.
This is a reasonable choice for default homepage, and it immediately exposes users to this central feature of the website: the topic system.
Doing this required in particular calculating the best title for a topic, since it is possible to have different titles with the same ID, the most common way being with capitalization changes, e.g.:would both have topic ID
JavaScript
Javascript
javascript
.With this in place we also added the preferred topic title to the top topic page.
The algorithm chosen is to pick the top 10 most upvoted topics, and select the most common title from amongst them. This should make topic title vandalism quite hard. This was made in a single SQL query, and became the most complext SQL query Ciro Santilli has ever written so far: twitter.com/cirosantilli2/status/1549721815832043522
Pinned article: ourbigbook/introduction-to-the-ourbigbook-project
Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
We have two killer features:
- topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculusArticles of different users are sorted by upvote within each article page. This feature is a bit like:
- a Wikipedia where each user can have their own version of each article
- a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it. - local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
- to OurBigBook.com to get awesome multi-user features like topics and likes
- as HTML files to a static website, which you can host yourself for free on many external providers like GitHub Pages, and remain in full control
- Internal cross file references done right:
- Infinitely deep tables of contents:
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact