
= Backing up <CIA> website archives for research and posterity

This is a quick update to the article: <CIA 2010 covert communication websites>{full}

I've downloaded and uploaded copies of the archives of the CIA websites as follows:
* all <CIA 2010 covert communication websites/cqcounter>[cqcounter screenshots] for which <cqcounter> was the best source, to: https://github.com/cirosantilli/media/tree/master/cia-2010-covert-communication-websites/screenshots/cqcounter[]. That commercial website does not inspire much trust, e.g. main pages like http://cqcounter.com/site/internationalwhiskylounge.com.html are now giving an error:
  ``
  [1114: The table 'access' is full] ( 1114 : The table 'access' is full )
  ``
  so I'm glad to have saved their precious screenshots in a safer place.
* all <Wayback Machine> archives to: https://github.com/cirosantilli/cia-2010-websites-dump[]. The exports were done with https://github.com/StrawberryMaster/wayback-machine-downloader by Felipe https://x.com/opapeldetrouxa[], which is an up-to-date fork of https://github.com/hartator/wayback-machine-downloader[], and the tool seemed to work very well. I've also edited the top answer of https://superuser.com/questions/828907/how-to-download-a-website-from-the-archive-org-wayback-machine/957298#957298 to point to that better-working fork.
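
For reference, here is a minimal sketch of how one might list the captures of a domain before downloading anything, assuming the public Wayback Machine CDX endpoint and its `url`, `matchType`, `output`, `fl`, `collapse` and `limit` query parameters. The helper names `cdx_query_url`, `parse_cdx_json`, `snapshot_url` and `list_captures` are made up for illustration, and this is not necessarily how wayback-machine-downloader works internally.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public CDX search endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, limit=None):
    """Build a CDX query URL that lists captures for a domain and its subdomains."""
    params = {
        "url": domain,
        "matchType": "domain",
        "output": "json",
        "fl": "timestamp,original",
        # Collapse adjacent captures that have the same content digest.
        "collapse": "digest",
    }
    if limit is not None:
        params["limit"] = str(limit)
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_json(text):
    """Turn the CDX JSON output (first row = field names) into a list of dicts."""
    rows = json.loads(text) if text.strip() else []
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def snapshot_url(capture):
    """Rebuild the web.archive.org URL of one capture."""
    return "https://web.archive.org/web/{timestamp}/{original}".format(**capture)


def list_captures(domain, limit=None):
    """Query the CDX endpoint over the network and return the parsed captures."""
    with urlopen(cdx_query_url(domain, limit)) as response:
        return parse_cdx_json(response.read().decode("utf-8"))
```

Something like `for capture in list_captures("mynepalnews.com", limit=10): print(snapshot_url(capture))` would then print a few snapshot URLs, network permitting.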

The cqcounter screenshots don't offer much information, but having the Wayback Machine archives could actually reveal new fingerprints and other website information leaks.

We've had a very quick look, and while there was nothing mind-blowing, there were some small finds.

Starting in https://web.archive.org/web/20041215182457/http://alljohnny.com/index.html[December 2004, the "Submit your favored carlson quote" of alljohnny.com] was mind-blowingly switched to point to https://web.archive.org/web/20050328224840/https://washington.serversecured.net/~alljohnn/cgi-bin/memlog.cgi[\https://washington.serversecured.net/~alljohnn/cgi-bin/memlog.cgi], thus likely leaking the control site URL. Beauty. It previously pointed to https://web.archive.org/web/20040901162621/https://secure.alljohnny.com/cgi-bin/memlog.cgi[].
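
To spot this kind of target switch across many snapshots, one could strip the Wayback Machine prefix from each archived link and compare the original hosts. The following is only a sketch: `split_wayback_url` and `original_host` are hypothetical helper names, and the regular expression assumes the usual `https://web.archive.org/web/<timestamp>/<original URL>` layout, optionally with a modifier such as `id_` right after the timestamp.

```python
import re
from urllib.parse import urlsplit

# Assumed layout: https://web.archive.org/web/<timestamp>[modifier]/<original URL>
WAYBACK_URL = re.compile(r"^https?://web\.archive\.org/web/(\d{1,14})[a-z_]*/(.+)$")


def split_wayback_url(url):
    """Return (timestamp, original_url) for a Wayback Machine snapshot URL."""
    match = WAYBACK_URL.match(url)
    if match is None:
        raise ValueError("not a Wayback Machine snapshot URL: " + url)
    return match.group(1), match.group(2)


def original_host(url):
    """Return the hostname of the page that was archived."""
    _, original = split_wayback_url(url)
    return urlsplit(original).hostname


before = "https://web.archive.org/web/20040901162621/https://secure.alljohnny.com/cgi-bin/memlog.cgi"
after = "https://web.archive.org/web/20050328224840/https://washington.serversecured.net/~alljohnn/cgi-bin/memlog.cgi"

# A change of host between two snapshots of the same link is worth a closer look.
host_changed = original_host(before) != original_host(after)
```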

https://web.archive.org/web/20110203000411/http://mynepalnews.com/[mynepalnews.com] actually has several archives for a https://web.archive.org/web/20110204095803/http://mynepalnews.com:80/stats/[/stats] path which contains HTML reports generated by Webalizer, an analytics tool that tracks the source of incoming traffic!!! It is hard to believe that the CIA would have left that there. Particularly ridiculous is the presence of `inurl:cgi server_software` at https://web.archive.org/web/20110204095809/http://mynepalnews.com:80/stats/usage_200805.html which is almost certainly a <#Google dork> search, which we know is something that the Iranians used to find the websites. That search hits https://web.archive.org/web/20110204095351/http://mynepalnews.com:80/cgi-bin/check.cgi[/cgi-bin/check.cgi]. That page is itself of some interest, as it contains `SERVER_ADMIN = mmadev@mmadev.com`. https://web.archive.org/web/20110204095815/http://mynepalnews.com:80/stats/usage_200806.html also reveals several request IPs. Even if this is not a CIA website, there's a chance we could find the IP of the Iranian counter-intelligence in these IP lists, which is mind-blowing. There's lots of referrer spam as well. Further HTML inspection, however, seems to show a close relationship between that HTML and other confirmed hits.
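
As a rough sketch of how such reports could be mined in bulk, the snippet below pulls IPv4 addresses out of the text of a saved report and flags search strings that use common Google search operators. The function names and the operator list are illustrative assumptions, and nothing here depends on the exact Webalizer HTML layout, which I have not checked in detail.

```python
import ipaddress
import re

# Dotted quads that are not part of a longer dotted number such as a version string.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")

# Illustrative, non-exhaustive list of search operators typical of Google dorks.
DORK_OPERATORS = ("inurl:", "intitle:", "intext:", "filetype:", "site:")


def extract_ipv4(text):
    """Return the unique valid IPv4 addresses found in text, in order of appearance."""
    found = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. an octet above 255
        if candidate not in found:
            found.append(candidate)
    return found


def looks_like_dork(search_string):
    """Heuristic: does the search string contain a known search operator?"""
    lowered = search_string.lower()
    return any(operator in lowered for operator in DORK_OPERATORS)
```

For example, `looks_like_dork("inurl:cgi server_software")` is true, while an ordinary search such as `"nepal news"` is not flagged.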

https://web.archive.org/web/20101024140137/http://globaltourist.net/[globaltourist.net], if it is actually a hit, likely has https://web.archive.org/web/20031221163711/http://globaltourist.net/[a 2003 archive], which would be our earliest hit archive so far.

A fun fact is that while looking at the source code of https://web.archive.org/web/20130828122833/http://euronewsonline.net/euro_bus.php we noticed an interesting comment:
``
<!-- ImageReady Slices (enewsweather.psd) -->
``
which clarifies that the CIA likely used Adobe ImageReady to cut up the images for <CIA 2010 covert communication websites/Split header images>:
> Adobe ImageReady was a bitmap graphics editor that was shipped with Adobe Photoshop for six years. It was available for Windows, Classic Mac OS and Mac OS X from 1998 to 2007. ImageReady was designed for web development and closely interacted with Photoshop
We also understand that the tool likely outputs the layout to HTML directly, and leaks the Adobe project filenames (.psd files) in the process.
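
A quick way to hunt for the same leak across the whole dump would be to grep every saved HTML file for that comment. Here is a minimal sketch, assuming the comment always has the `ImageReady Slices (<filename>)` shape seen above; `imageready_sources` is a made-up helper name.

```python
import re

# Matches comments such as: <!-- ImageReady Slices (enewsweather.psd) -->
IMAGEREADY_COMMENT = re.compile(r"<!--\s*ImageReady Slices \(([^)]*)\)\s*-->")


def imageready_sources(html):
    """Return the project filenames leaked by ImageReady comments in an HTML string."""
    return IMAGEREADY_COMMENT.findall(html)


page = "<html><body><!-- ImageReady Slices (enewsweather.psd) --><table></table></body></html>"
leaked = imageready_sources(page)
```

Running it over each file of the Wayback Machine dump and collecting the filenames could give one more fingerprint to compare between websites.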