x86's multi-level paging scheme uses a 2 level K-ary tree with 2^10 bits on each level.
Addresses are now split as:
| directory (10 bits) | table (10 bits) | offset (12 bits) |
Then:
- the top 10 bits are used to walk the top level of the K-ary tree (
level0
)The top table is called a "directory of page tables".cr3
now points to the location on RAM of the page directory of the current process instead of page tables.Page directory entries are very similar to page table entries except that they point to the physical addresses of page tables instead of physical addresses of pages.Each directory entry also takes up 4 bytes, just like page entries, so that makes 4 KiB per process minimum.Page directory entries also contain a valid flag: if invalid, the OS does not allocate a page table for that entry, and saves memory.Each process has one and only one page directory associated to it (and pointed to bycr3
), so it will contain at least2^10 = 1K
page directory entries, much better than the minimum 1M entries required on a single-level scheme. - the next 10 bits are used to walk the second level of the K-ary tree (
level1
)Second level entries are also called page tables like the single level scheme.Page tables are only allocated only as needed by the OS.Each page table has only2^10 = 1K
page table entries instead of2^20
for the single paging scheme.Each process can now have up to2^10
page tables instead of2^20
for the single paging scheme. - the offset is again not used for translation, it only gives the offset within a page
One reason for using 10 bits on the first two levels (and not, say,
12 | 8 | 12
) is that each Page Table entry is 4 bytes long. Then the 2^10 entries of Page directories and Page Tables will fit nicely into 4Kb pages. This means that it faster and simpler to allocate and deallocate pages for that purpose.This is how the memory could look like in a single level paging scheme:
Links Data Physical address
+-----------------------+ 2^32 - 1
| |
. .
| |
+-----------------------+ page0 + 4k
| data of page 0 |
+---->+-----------------------+ page0
| | |
| . .
| | |
| +-----------------------+ pageN + 4k
| | data of page N |
| +->+-----------------------+ pageN
| | | |
| | . .
| | | |
| | +-----------------------+ CR3 + 2^20 * 4
| +--| entry[2^20-1] = pageN |
| +-----------------------+ CR3 + 2^20 - 1 * 4
| | |
| . many entires .
| | |
| +-----------------------+ CR3 + 2 * 4
| +--| entry[1] = page1 |
| | +-----------------------+ CR3 + 1 * 4
+-----| entry[0] = page0 |
| +-----------------------+ <--- CR3
| | |
| . .
| | |
| +-----------------------+ page1 + 4k
| | data of page 1 |
+->+-----------------------+ page1
| |
. .
| |
+-----------------------+ 0
Notice that:
- the CR3 register points to the first entry of the page table
- the page table is just a large array with 2^20 page table entries
- each entry is 4 bytes big, so the array takes up 4 MiB
- each page table contains the physical address a page
- each page is a 4 KiB aligned 4KiB chunk of memory that user processes may use
- we have 2^20 table entries. Since each page is 4KiB == 2^12, this covers the whole 4GiB (2^32) of 32-bit memory
Dire times require dire methods: cia-2010-covert-communication-websites/cdx-tor.sh.
First we must start the tor servers with the and then use it on a newline separated domain name list to check;This creates a directory
tor-army
command from: stackoverflow.com/questions/14321214/how-to-run-multiple-tor-processes-at-once-with-different-exit-ips/76749983#76749983tor-army 100
./cdx-tor.sh infile.txt
infile.txt.cdx/
containing:infile.txt.cdx/out00
,out01
, etc.: the suspected CDX lines from domains from each tor instance based on the simple criteria that the CDX can handle directly. We split the input domains into 100 piles, and give one selected pile per tor instance.infile.txt.cdx/out
: the final combined CDX output ofout00
,out01
, ...infile.txt.cdx/out.post
: the final output containing only domain names that match further CLI criteria that cannot be easily encoded on the CDX query. This is the cleanest domain name list you should look into at the end basically.
Since archive is so abysmal in its data access, e.g. a Google BigQuery would solve our issues in seconds, we have to come up with creative ways of getting around their IP throttling.
The CIA doesn't play fair. They're actually the exact opposite of fair. So neither shall we.
Distilled into an answer at: stackoverflow.com/questions/14321214/how-to-run-multiple-tor-processes-at-once-with-different-exit-ips/76749983#76749983
This should allow a full sweep of the 4.5M records in 2013 DNS Census virtual host cleanup in a reasonable amount of time. After JAR/SWF/CGI filtering we obtained 5.8k domains, so a reduction factor of about 1 million with likely very few losses. Not bad.
5.8k is still a bit annoying to fully go over however, so we can also try to count CDX hits to the domains and remove anything with too many hits, since the CIA websites basically have very few archives:This gives us something like:sorted by increasing hit counts, so we can go down as far as patience allows for!
cd 2013-dns-census-a-novirt-domains.txt.cdx
./cdx-tor.sh -d out.post domain-list.txt
cd out.post.cdx
cut -d' ' -f1 out | uniq -c | sort -k1 -n | awk 'match($2, /([^,]+),([^)]+)/, a) {printf("%s.%s %d\n", a[2], a[1], $1)}' > out.count
12654montana.com 1
aeronet-news.com 1
atohms.com 1
av3net.com 1
beechstreetas400.com 1
New results from a full CDX scan of 2013-dns-census-a-novirt.csv:
- 219.90.61.123 journeystravelled.com
The exact format of table entries is fixed _by the hardware_.
Each page entry can be seen as a
struct
with many fields.The page table is then an array of
struct
.On this simplified example, the page table entries contain only two fields:so in this example the hardware designers could have chosen the size of the page table to b
bits function
----- -----------------------------------------
20 physical address of the start of the page
1 present flag
21
instead of 32
as we've used so far.All real page table entries have other fields, notably fields to set pages to read-only for Copy-on-write. This will be explained elsewhere.
It would be impractical to align things at 21 bits since memory is addressable by bytes and not bits. Therefore, even in only 21 bits are needed in this case, hardware designers would probably choose 32 to make access faster, and just reserve bits the remaining bits for later usage. The actual value on x86 is 32 bits.
Here is a screenshot from the Intel manual image "Formats of CR3 and Paging-Structure Entries with 32-Bit Paging" showing the structure of a page table in all its glory: Figure 1. "x86 page entry format".
x86 page entry format
. The fields are explained in the manual just after.
Why are pages 4KiB anyways?
There is a trade-off between memory wasted in:
- page tables
- extra padding memory within pages
This can be seen with the extreme cases:
- if the page size were 1 byte:
- granularity would be great, and the OS would never have to allocate unneeded padding memory
- but the page table would have 2^32 entries, and take up the entire memory!
- if the page size were 4GiB:
- we would need to swap 4GiB to disk every time a new process becomes active
- the page size would be a single entry, so it would take almost no memory at all
x86 designers have found that 4KiB pages are a good middle ground.
Besides a missing page, a very common source of page faults is copy-on-write (COW).
Page tables have extra flags that allow the OS to mark a page a read-only.
Those page faults only happen when a process tries to write to the page, and not read from it.
When Linux forks a process:
- instead of copying all the pages, which is unnecessarily costly, it makes the page tables of the two process point to the same physical address.
- it marks those linear addresses as read-only
- whenever one of the processes tries to write to a page, the makes a copy of the physical memory, and updates the pages of the two process to point to the two different physical addresses
Information about ARM paging can be found at: cirosantilli.com/linux-kernel-module-cheat#arm-paging
After a translation between linear and physical address happens, it is stored on the TLB. For example, a 4 entry TLB starts in the following state:
valid linear physical
----- ------ --------
> 0 00000 00000
0 00000 00000
0 00000 00000
0 00000 00000
The
>
indicates the current entry to be replaced.And after a page linear address and after a second translation of
00003
is translated to a physical address 00005
, the TLB becomes: valid linear physical
----- ------ --------
1 00003 00005
> 0 00000 00000
0 00000 00000
0 00000 00000
00007
to 00009
it becomes: valid linear physical
----- ------ --------
1 00003 00005
1 00007 00009
> 0 00000 00000
0 00000 00000
Now if
00003
needs to be translated again, hardware first looks up the TLB and finds out its address with a single RAM access 00003 --> 00005
.Of course,
00000
is not on the TLB since no valid entry contains 00000
as a key.They have a tutorial at: jena.apache.org/tutorials/sparql.html
Once you've done the Apache Jena CLI tools setup we can query all users with Full Name (FN) "John Smith" directly fom the rdf/vcard.ttl Turtle RDF file with the rdf/vcard.rq SPARQL query:and that outputs:
sparql --data=rdf/vcard.ttl --query=rdf/vcard.rq
---------------------------------
| x |
=================================
| <http://somewhere/JohnSmith/> |
---------------------------------
Higher spin multiplicity means lower energy. I.e.: you want to keep all spins pointin in the same direction.
Exist because double bonds don't rotate freely. Have different properties of course, unlike enantiomer.
Mirror images.
Key exmaple: d and L amino acids. Enantiomers have identical physico-chemical properties. But their biological roles can be very different, because an enzyme might only be able to act on one of them.
TODO definition. Appears to be isomers
Example:
- the three most table polymorphs of calcium carbonate polymorphs are:
These come with pre-installed drivers, so e.g. nvidia-smi just works on them out of the box, tested on g5.xlarge which has an Nvidia A10G GPU. Good choice as a starting point for deep learning experiments.
Symmetry in Condensed Matter Physics course of the University of Oxford Updated 2025-02-26 +Created 1970-01-01
www2.physics.ox.ac.uk/students/course-materials/symmetry-in-condensed-matter-physics# Archive: web.archive.org/web/20230804204137/https://www2.physics.ox.ac.uk/students/course-materials/symmetry-in-condensed-matter-physics Lecture notes from 2019.
Professor: Radu Coldea
Condensed matter physics course of the University of Oxford Updated 2025-02-26 +Created 1970-01-01
This could refer to several more specific courses, see the tagged articles for a list.
Article likely written by him: www.theguardian.com/commentisfree/2010/feb/22/home-schooling-register-families
“Especially my father. He was doing most of it and he is a savoury, strong character. He has strong beliefs about the world and in himself, and he was helping me a lot, even when I was at university as an undergraduate.”An only child, Arran was born in 1995 in Glasgow, where his parents were studying at the time. His father has Spanish lineage, having a great grandfather who was a sailor who moved from Spain to St Vincent in the Carribean. A son later left the islands for the UK where he married an English woman. Arran’s mother is Norwegian.“My father was writing and my mother is an economist. They both worked from home which also made things easier,” Arran says.
One of the articles says his father has a PhD. TODO where did he work? What's his PhD on? Photo: www.topfoto.co.uk/asset/1357880/
www.thetimes.co.uk/article/the-everyday-genius-pxsq5c50kt9:
Neil, a political economist, attended state and private schools in Hampshire but was also taught for a period at home by his mother.
It’s strange because for most people maths is a real turn-off, yet maths is all about patterns and children of two or three love patterns. It just shows that schools are doing something seriously wrong.”
There are unlisted articles, also show them or only show them.