Ciro Santilli @cirosantilli 37

 Incoming links: Associative array

Computer science Updated 2025-07-16

A branch of mathematics that attempts to prove stuff about computers.

Unfortunately, all software engineers already know the answer to the useful theorems though (except perhaps notably for cryptography), e.g. all programmers obviously know that iehter P != NP or that this is unprovable or some other "for all practical purposes practice P != NP", even though they don't have proof.

And 99% of their time, software engineers are not dealing with mathematically formulatable problems anyways, which is sad.

The only useful "computer science" subset every programmer ever needs to know is:

for arrays: dynamic array vs linked list
for associative array: binary search tree vs hash table. See also Heap vs Binary Search Tree (BST). No need to understand the algorithmic details of the hash function, the NSA has already done that for you.
don't use Bubble sort for sorting
you can't parse HTML with regular expressions: stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 because of formal language theory

Funnily, due to the formalization of mathematics, mathematics can be seen as a branch of computer science, just like computer science can be seen as a branch of Mathematics!

 Read the full article

x86 Paging Tutorial / K-ary trees to the rescue Updated 2025-07-16

 View more

The algorithmically minded will have noticed that paging requires associative array (like Java Map of Python dict()) abstract data structure where:

the keys are linear pages addresses, thus of integer type
the values are physical page addresses, also of integer type

The single level paging scheme uses a simple array implementation of the associative array:

the keys are the array index
this implementation is very fast in time
but it is too inefficient in memory

and in C pseudo-code it looks like this:

linear_address[0]      = physical_address_0
linear_address[1]      = physical_address_1
linear_address[2]      = physical_address_2
...
linear_address[2^20-1] = physical_address_N

But there another simple associative array implementation that overcomes the memory problem: an (unbalanced) k-ary tree.

A K-ary tree, is just like a binary tree, but with K children instead of 2.

Using a K-ary tree instead of an array implementation has the following trade-offs:

it uses way less memory
it is slower since we have to de-reference extra pointers

In C-pseudo code, a 2-level K-ary tree with K = 2^10 looks like this:

level0[0] = &level1_0[0]
    level1_0[0]      = physical_address_0_0
    level1_0[1]      = physical_address_0_1
    ...
    level1_0[2^10-1] = physical_address_0_N
level0[1] = &level1_1[0]
    level1_1[0]      = physical_address_1_0
    level1_1[1]      = physical_address_1_1
    ...
    level1_1[2^10-1] = physical_address_1_N
...
level0[N] = &level1_N[0]
    level1_N[0]      = physical_address_N_0
    level1_N[1]      = physical_address_N_1
    ...
    level1_N[2^10-1] = physical_address_N_N

and we have the following arrays:

one directory, which has 2^10 elements. Each element contains a pointer to a page table array.
up to 2^10 pagetable arrays. Each one has 2^10 4 byte page entries.

and it still contains 2^10 * 2^10 = 2^20 possible keys.

K-ary trees can save up a lot of space, because if we only have one key, then we only need the following arrays:

one directory with 2^10 entries
one pagetable at directory[0] with 2^10 entries
all other directory[i] are marked as invalid, don't point to anything, and we don't allocate pagetable for them at all

 Read the full article