Bibliography:
E.g.: thecollegestore.co.uk/products/ladies-oxford-college-puffer-jacket?variant=40590030864549 Black with 5 rows, on the left chest the college name, logo and "Oxford", and on the right chest optional initials (or sometimes other identifiers/nicknames) to help distinguish the wearer from all the other people in identical clothes.
This has a white-label version: www.workweargiant.co.uk/product/result-urban-holkham-down-feel-jacket/; the generic name appears to be "Holkham Down Feel Jacket".
Circa 2020, these are likely given out by each college for free, and are widely used.
If you look 20 and wear one of those, it's almost an ID: you can get into anywhere that does not require a key card, and porters won't look at you twice!
- www.oxfordstudent.com/2019/03/25/in-opposition-to-stash/ In opposition to stash by Morgan Jones (2019), basically because university is your last chance to wear what you want before entering many professions.
University of Oxford student newspaper
They actually have two: The Oxford Student and Cherwell, as brilliantly highlighted in this first of April piece:
Related:
- www.thestudentroom.co.uk/showthread.php?t=1167619 "OxStu vs. Cherwell"
Sometimes it just feels like Ubuntu devs don't actually use Ubuntu as a desktop.
Some extremely annoying problems are introduced and just never get fixed, even though they feel so obvious!
Would never happen on Mac...
Like the U.S.' spring term.
Like the U.S.' summer term.
E-learning system prior to Canvas: weblearn.ox.ac.uk/portal. Appears fully custom and closed source?
Closed in 2023 in favour of Canvas.
activatedgeek/LeNet-5 use ONNX for inference
Note that the images must be drawn with white on black. If you use black on white, the accuracy becomes terrible. This is a very good example of brittleness in AI systems!
We can try the code adapted from thenewstack.io/tutorial-using-a-pre-trained-onnx-model-for-inferencing/ at python/onnx_cheat/infer_mnist.py, and it works pretty well:
cd python/onnx_cheat
./infer_mnist.py lenet.onnx infer_mnist_9.png
The program outputs:
9
as desired.
We can also try with images directly from Extract MNIST images, and the accuracy is great as expected:
for f in /home/ciro/git/mnist_png/out/testing/1/*.png; do echo $f; ./infer_mnist.py lenet.onnx "$f"; done
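For reference, here is a minimal sketch of what such an inference script might look like with onnxruntime. The preprocessing (grayscale, resize to 32x32, scale to [0, 1]) is an assumption made to match the training setup, not necessarily the repo's exact code:
#!/usr/bin/env python3
# Hypothetical infer_mnist.py-style sketch, not the actual repo code.
# Usage: ./infer_mnist_sketch.py lenet.onnx digit.png
import sys
import numpy as np
import onnxruntime
from PIL import Image

model_path, image_path = sys.argv[1], sys.argv[2]

# LeNet-5 expects a 1x1x32x32 float32 input. The digit must be white on
# black, otherwise the accuracy becomes terrible (see above).
img = Image.open(image_path).convert('L').resize((32, 32))
x = np.asarray(img, dtype=np.float32)[np.newaxis, np.newaxis, :, :] / 255.0

sess = onnxruntime.InferenceSession(model_path)
input_name = sess.get_inputs()[0].name
logits = sess.run(None, {input_name: x})[0]
print(np.argmax(logits))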
By default, the setup runs on CPU only, not the GPU, as can be seen by running htop. But by the magic of PyTorch, modifying the program to run on the GPU is trivial:
cat << EOF | patch
diff --git a/run.py b/run.py
index 104d363..20072d1 100644
--- a/run.py
+++ b/run.py
@@ -24,7 +24,8 @@ data_test = MNIST('./data/mnist',
 data_train_loader = DataLoader(data_train, batch_size=256, shuffle=True, num_workers=8)
 data_test_loader = DataLoader(data_test, batch_size=1024, num_workers=8)
 
-net = LeNet5()
+device = 'cuda'
+net = LeNet5().to(device)
 criterion = nn.CrossEntropyLoss()
 optimizer = optim.Adam(net.parameters(), lr=2e-3)
 
@@ -43,6 +44,8 @@ def train(epoch):
     net.train()
     loss_list, batch_list = [], []
     for i, (images, labels) in enumerate(data_train_loader):
+        labels = labels.to(device)
+        images = images.to(device)
         optimizer.zero_grad()
 
         output = net(images)
@@ -71,6 +74,8 @@ def test():
     total_correct = 0
     avg_loss = 0.0
     for i, (images, labels) in enumerate(data_test_loader):
+        labels = labels.to(device)
+        images = images.to(device)
         output = net(images)
         avg_loss += criterion(output, labels).sum()
         pred = output.detach().max(1)[1]
@@ -84,7 +89,7 @@ def train_and_test(epoch):
     train(epoch)
     test()
 
-    dummy_input = torch.randn(1, 1, 32, 32, requires_grad=True)
+    dummy_input = torch.randn(1, 1, 32, 32, requires_grad=True).to(device)
     torch.onnx.export(net, dummy_input, "lenet.onnx")
 
     onnx_model = onnx.load("lenet.onnx")
EOF
This leads to a faster runtime, with less user time, as now we are spending more time on the GPU than on the CPU:
real 1m27.829s
user 4m37.266s
sys 0m27.562s
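A slightly more defensive variant of the patch (our suggestion, not part of the original) would pick the device at runtime, so the same script still runs on machines without CUDA:
# Hedged variant of the patch's hardcoded device = 'cuda': fall back to CPU
# when CUDA is unavailable. Assumes activatedgeek/LeNet-5's lenet.py, which
# defines LeNet5, is importable.
import torch
from lenet import LeNet5

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = LeNet5().to(device)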
They can't even make this basic stuff just work!
Later on, we've also come across some stylistic hits in IP ranges with apparent slight variations of the CGI comms pattern. Since these are so rare, it is still a bit hard to classify them for sure, but they are no doubt of great interest, and as we start to notice these patterns, more tend to come up if it is a real thing:
- no .cgi, but also http on subdomain:
- no subdomain, no https, no .cgi
- live
- dead
The CGI comms websites contain the only occurrence of HTTPS, so it might open up the door for a certificate fingerprint as proposed by user joelcollinsdc at: news.ycombinator.com/item?id=36280801!
crt.sh appears to be a good way to look into this:
- backstage.musical-fortune.net
- clients.smart-travel-consultant.com
- members.it-proonline.com
- members.metanewsdaily.com
- miembros.todosperuahora.com
- secure.altworldnews.com
- secure.driversinternationalgolf.com
- secure.freshtechonline.com
- secure.globalnewsbulletin.com
- secure.negativeaperture.com
- secure.riskandrewardnews.com
- secure.theworld-news.net
- secure.topbillingsite.com
- secure.worldnewsandent.com
- ssl.beyondnetworknews.com
- ssl.newtechfrontier.com
- www.businessexchangetoday.com
- heal.conquermstoday.com
They all appear to use one of:
- Go Daddy
- Thawte DV SSL CA
- Starfield Technologies, Inc.
crt.sh/?q=globalnewsbulletin.com has a hit to: crt.sh/?id=774803. With login we can see: search.censys.io/certificates/5078bce356a8f8590205ae45350b27f58f4ac04478ed47a389a55b539065cee8. Issued by www.thawte.com/repository/index.html. No hits for certificates with same public key: search.censys.io/search?resource=certificates&q=parsed.subject_key_info.fingerprint_sha256%3A+714b4a3e8b2f555d230a92c943ced4f34b709b39ed590a6a230e520c273705af or any other "same" queries though.
Let's try another one for secure.altworldnews.com: search.censys.io/certificates/e88f8db87414401fd00728db39a7698d874dbe1ae9d88b01c675105fabf69b94. Nope, no direct mega hits here either.
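If we can get one of those certificates as a PEM file, the value Censys indexes as parsed.subject_key_info.fingerprint_sha256 can be reproduced locally to hunt for public key reuse. A minimal sketch, assuming the Python cryptography package and a hypothetical local cert.pem:
# Compute the SHA-256 fingerprint of a certificate's Subject Public Key
# Info (SPKI), which should match Censys'
# parsed.subject_key_info.fingerprint_sha256. cert.pem is a placeholder.
import hashlib
from cryptography import x509
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

with open('cert.pem', 'rb') as f:
    cert = x509.load_pem_x509_certificate(f.read())
spki_der = cert.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
print(hashlib.sha256(spki_der).hexdigest())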
2013 DNS Census virtual host cleanup
We've noticed that often when there is a hit range:
- there is only one IP for each domain
- there is a range of about 20-30 of those
and that this does not seem to be that common. Let's see if that is a reasonable fingerprint or not.
Note that although this is the most common case, we have found multiple hits that viewdns.info maps to the same IP.
First we create a table u (for "unique") that only contains domains which are the only domain for their IP. Let's see by how much that lowers the 191M total unique domains:
time sqlite3 u.sqlite 'create table t (d text, i text)'
time sqlite3 av.sqlite -cmd "attach 'u.sqlite' as u" "insert into u.t select min(d) as d, min(i) as i from t where d not like '%.%.%' group by i having count(distinct d) = 1"
The not like '%.%.%' removes subdomains from the counts so that CGI comms are still included, and the distinct in count(distinct d) is there because we have multiple entries at different timestamps for some of the hits.
Let's start with the 208.* subset to see how it goes:
time sqlite3 av.sqlite -cmd "attach 'u.sqlite' as u" "insert into u.t select min(d) as d, min(i) as i from t where i glob '208.*' and d not like '%.%.%' and (d like '%.com' or d like '%.net') group by i having count(distinct d) = 1"
OK, after we fixed bugs with the above, we are down to 4 million lines with unique domain/IP pairs, which contain all of the original hits! Almost certainly more are to be found!
This data is so valuable that we've decided to upload it to archive.org/details/2013-dns-census-a-novirt.csv. The format is:
8,chrisjmcgregor.com
11,80end.com
28,fine5.net
38,bestarabictv.com
49,xy005.com
50,cmsasoccer.com
80,museemontpellier.net
100,newtiger.com
108,lps-promptservice.com
111,bridesmaiddressesshow.com
The numbers of the first column are the IPs as a 32-bit integer representation, which is more useful to search for ranges in.
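To convert between that integer representation and dotted quads when grepping for ranges, Python's standard library does it directly (the example IP is just illustrative):
# int <-> dotted quad conversion for the first CSV column.
import ipaddress

print(int(ipaddress.IPv4Address('208.254.40.95')))  # 3506317407
print(ipaddress.IPv4Address(3506317407))            # 208.254.40.95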
To make a histogram with the distribution of the single-hostname IPs:
#!/usr/bin/env bash
bin=$((2**24))
sqlite3 2013-dns-census-a-novirt.sqlite -cmd '.mode csv' >2013-dns-census-a-novirt-hist.csv <<EOF
select i, sum(cnt) from (
select floor(i/${bin}) as i,
count(*) as cnt
from t
group by 1
union
select *, 0 as cnt from generate_series(0, 255)
)
group by i
EOF
gnuplot \
-e 'set terminal svg size 1200, 800' \
-e 'set output "2013-dns-census-a-novirt-hist.svg"' \
-e 'set datafile separator ","' \
-e 'set tics scale 0' \
-e 'unset key' \
-e 'set xrange[0:255]' \
-e 'set title "Counts of IPs with a single hostname"' \
-e 'set xlabel "IPv4 first byte"' \
-e 'set ylabel "count"' \
-e 'plot "2013-dns-census-a-novirt-hist.csv" using 1:2:1 with labels' \
;
This gives useless noise; there is basically no pattern.
Let's see if there's anything in records/mx.xz.
mx.csv is 21GB.
They do have " in the files to escape commas, so we handle the quoting with mx.py:
import csv
import sys

# Keep only the first (domain) and fourth (MX host) columns, re-emitted as CSV.
writer = csv.writer(sys.stdout)
with open('mx.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        writer.writerow([row[0], row[3]])
It would have been better done with csvkit: stackoverflow.com/questions/36287982/bash-parse-csv-with-quotes-commas-and-newlines
Then:
# uniq not amazing as there are often two or three slightly different records repeated on multiple timestamps, but down to 11 GB
python3 mx.py | uniq > mx-uniq.csv
sqlite3 mx.sqlite 'create table t(d text, m text)'
# 13 GB
time sqlite3 mx.sqlite ".import --csv --skip 1 'mx-uniq.csv' t"
# 41 GB
time sqlite3 mx.sqlite 'create index td on t(d)'
time sqlite3 mx.sqlite 'create index tm on t(m)'
time sqlite3 mx.sqlite 'create index tdm on t(d, m)'
# Remove dupes.
# Rows: 150m
time sqlite3 mx.sqlite <<EOF
delete from t
where rowid not in (
select min(rowid)
from t
group by d, m
)
EOF
# 15 GB
time sqlite3 mx.sqlite vacuum
Let's see what the hits use:
awk -F, 'NR>1{ print $2 }' ../media/cia-2010-covert-communication-websites/hits.csv | xargs -I{} sqlite3 mx.sqlite "select distinct * from t where d = '{}'"
At around 267 total hits, only 84 have MX records, and of those that do, almost all of them have exactly:
smtp.secureserver.net
mailstore1.secureserver.net
with only three exceptions:
dailynewsandsports.com|dailynewsandsports.com
inews-today.com|mail.inews-today.com
just-kidding-news.com|just-kidding-news.com
We need to count out of the totals:
sqlite3 mx.sqlite "select count(*) from t where m = 'mailstore1.secureserver.net'"
which gives ~18M, so nope, it is too much by itself...
Let's try to use that to reduce av.sqlite from the 2013 DNS Census virtual host cleanup a bit further:
time sqlite3 mx.sqlite '.mode csv' "attach 'aiddcu.sqlite' as 'av'" '.load ./ip' "select ipi2s(av.t.i), av.t.d from av.t inner join t as mx on av.t.d = mx.d and mx.m = 'mailstore1.secureserver.net' order by av.t.i asc" > avm.csv
avm stands for av with mx pruning. This leaves us with only ~500k entries. With one more fingerprint we could do a Wayback Machine CDX scan, as sketched after the check below.
Let's check that we still have most of our hits in there:
grep -f <(awk -F, 'NR>1{print $2}' /home/ciro/bak/git/media/cia-2010-covert-communication-websites/hits.csv) avm.csv
At 267 hits we got 81 matches, i.e. all of the 84 MX hits minus the three exceptions, so they are all still present.
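As for the Wayback Machine CDX scan mentioned above, a minimal sketch of what it could look like over the pruned avm.csv, using the public CDX API (no rate limiting or retries, which a real scan of ~500k domains would need):
# For each domain in avm.csv (columns: integer IP, domain), ask the
# Wayback Machine CDX API whether it has at least one capture.
import csv
import urllib.parse
import urllib.request

with open('avm.csv') as f:
    for row in csv.reader(f):
        domain = row[1]
        url = 'https://web.archive.org/cdx/search/cdx?' + \
            urllib.parse.urlencode({'url': domain, 'limit': 1})
        with urllib.request.urlopen(url) as resp:
            if resp.read().strip():
                print(domain)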
secureserver is a hosting provider; we can see their blank page e.g. at: web.archive.org/web/20110128152204/http://emmano.com/. security.stackexchange.com/questions/12610/why-did-secureserver-net-godaddy-access-my-gmail-account/12616#12616 comments:
secureserver.net is the name GoDaddy use as the reverse DNS for IP addresses used for dedicated/virtual server hosting