Internet Archive

Internet Archive Open Library

Previously called "Lending Library" it seems: help.archive.org/hc/en-us/articles/360016554912-Borrowing-From-The-Lending-Library

You can borrow online books from them for a few hours/days: help.archive.org/hc/en-us/articles/360016554912-Borrowing-From-The-Lending-Library This is the most amazing thing ever made!!! You can even link to specific pages, e.g. archive.org/details/supermenstory00murr/page/80/mode/2up
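The page link above follows a visible pattern: item identifier, then `/page/<number>`, then `/mode/2up`. A minimal sketch of building such links, assuming the pattern generalizes to other items beyond that one example; `page_url` is a hypothetical helper name, not something provided by the Internet Archive:

```python
def page_url(identifier: str, page: int) -> str:
    """Build a deep link to one page of an archive.org book viewer.

    The URL shape is inferred from a single known-good example, so treat
    it as an assumption rather than a documented contract.
    """
    return f"https://archive.org/details/{identifier}/page/{page}/mode/2up"


# Reproduces the example link from the text.
print(page_url("supermenstory00murr", 80))
```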

They seem to have a separate URL with the same content as well for some reason: openlibrary.org/, classic messy Internet Archive style.

Bastards are suing them www.theverge.com/2020/6/1/21277036/internet-archive-publishers-lawsuit-open-library-ebook-lending: Hachette, Penguin Random House, Wiley, and HarperCollins

It is quite hard to decide if an upload is from the official legal lending library, or just some illegal upload, e.g.:

archive.org/details/TheGoogleStory likely illegal
archive.org/details/isbn_9780385342728 likely legal

so the URLs are basically the same style. Some legality indicators:

Access-restricted-item: true
present in the collection: archive.org/details/internetarchivebooks?tab=about
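The two indicators above could be checked programmatically. The sketch below assumes the item metadata is fetched from the `archive.org/metadata/<identifier>` JSON endpoint and that the fields are named `access-restricted-item` and `collection`; the helper names `fetch_metadata` and `lending_indicators` are hypothetical, and neither indicator is proof of legality, only a hint:

```python
import json
import urllib.request


def fetch_metadata(identifier: str) -> dict:
    """Fetch the 'metadata' object of an item (assumed endpoint and layout)."""
    with urllib.request.urlopen(f"https://archive.org/metadata/{identifier}") as resp:
        return json.load(resp).get("metadata", {})


def lending_indicators(metadata: dict) -> dict:
    """Report the two legality hints mentioned in the text."""
    collections = metadata.get("collection", [])
    if isinstance(collections, str):
        # A single collection may be stored as a bare string rather than a list.
        collections = [collections]
    restricted = str(metadata.get("access-restricted-item", "")).lower() == "true"
    return {
        "access_restricted": restricted,
        "in_internetarchivebooks": "internetarchivebooks" in collections,
    }


# Usage (performs a network request, so it is left commented out):
# print(lending_indicators(fetch_metadata("isbn_9780385342728")))
```

The classification is kept separate from the network call so that it can be tried on any metadata dictionary without hitting the server.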

Hachette v. Internet Archive (2023)

 1  0

Wayback Machine

 1  0

cirosantilli.com/china-dictatorship/wayback-machine

 Tagged

CIA 2010 covert communication websites / Wayback Machine

Wayback Machine save screen shot

 0  0

Feature added in 2019 apparently: www.reddit.com/r/DataHoarder/comments/dj6ot5/you_can_now_save_a_screenshot_of_your_saved_pages/
github.com/ourbigbook/template/archive/refs/heads/master.zip
But TODO: how to access the screenshot afterwards?

archive.org/details/toomanyrequests_20191110 says 15 archives / minute, but apparently also 15 retrievals per minute on Wikipedia, after which you get a 5 minute blacklist. After that, you start getting some 429s, and after that, the server refuses to connect at all.
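To stay under a limit like the 15 requests per minute mentioned above, a script can space its requests out client-side. This is a minimal sketch of a fixed-interval throttle, not anything the Wayback Machine itself provides; the class name `Throttle` and its parameters are made up for illustration, and the real limits may differ from the quoted figure:

```python
import time


class Throttle:
    """Enforce a minimum delay between calls to wait()."""

    def __init__(self, max_per_minute=15, clock=time.monotonic, sleep=time.sleep):
        # 15 per minute means at least 4 seconds between consecutive requests.
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now


# Usage: call throttle.wait() right before every archive or retrieval request.
# throttle = Throttle(max_per_minute=15)
# for url in urls:
#     throttle.wait()
#     ...  # make the request here
```

The clock and sleep functions are injectable only so that the behaviour can be checked without actually waiting.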

CDX: no limits apparently, they might just throttle you? Made 10k requests in a Bash loop and it was going fine. But note that if you get blacklisted by the create/fetch request blacklist, the server fails to connect here as well.
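For reference, a single CDX request can be sketched as follows. This assumes the public CDX endpoint at `web.archive.org/cdx/search/cdx` with the `url`, `output=json` and `limit` query parameters, and that the JSON output is a list of rows whose first row holds the column names; the function names are hypothetical:

```python
import json
import urllib.parse
import urllib.request

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_url(url, limit=5):
    """Build a CDX query URL asking for JSON output."""
    query = urllib.parse.urlencode({"url": url, "output": "json", "limit": limit})
    return f"{CDX_ENDPOINT}?{query}"


def rows_to_dicts(rows):
    """Turn [header, row, row, ...] into a list of dicts keyed by column name."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def cdx_search(url, limit=5):
    """Run one CDX query over the network and return the parsed captures."""
    with urllib.request.urlopen(cdx_url(url, limit)) as resp:
        body = resp.read().decode("utf-8").strip()
    return rows_to_dicts(json.loads(body)) if body else []


# Usage (performs a network request, so it is left commented out):
# for capture in cdx_search("example.com"):
#     print(capture["timestamp"], capture["original"])
```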

Search Wayback Machine by IP

 0  0

archive.org/post/1025445/is-there-a-way-to-search-by-ip-address-not-http

Wayback Machine full text search

 0  0

List all domains from the Wayback Machine

 0  0

archive.org/post/1055220/how-to-query-for-all-the-websites-that-end-in-combr
archive.org/details/WebArchiveDomainFiles: only a random list of per-ccTLD domain files, made upon request of (presumably paying) partners. As of 2023 it only contains the Netherlands: archive.org/details/Dotnl-2016-present-domains-in-wayback-domainyear-of-last-capture

Wayback Machine pages don't show up right after you just finished archiving them

 0  0

Pages seem to take some time after they say they have "archived it" to when you can actually see what was archived.
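One way to cope with that delay is to poll until a recent enough capture shows up. The sketch below assumes the availability endpoint at `archive.org/wayback/available`, which is expected to return the closest capture under `archived_snapshots.closest` with a 14-digit `timestamp`; whether that endpoint lags in the same way as the viewer is not known. The function names, retry count and delay are arbitrary illustrative choices:

```python
import json
import time
import urllib.parse
import urllib.request


def closest_snapshot(url):
    """Return the 'closest' snapshot record for url, or None if there is none."""
    query = urllib.parse.urlencode({"url": url})
    with urllib.request.urlopen(f"https://archive.org/wayback/available?{query}") as resp:
        data = json.load(resp)
    return data.get("archived_snapshots", {}).get("closest")


def wait_for_snapshot(url, min_timestamp, attempts=10, delay=30.0,
                      lookup=closest_snapshot, sleep=time.sleep):
    """Poll until a snapshot at least as new as min_timestamp is visible.

    Timestamps are YYYYMMDDhhmmss strings, so plain string comparison
    orders them chronologically. Returns the snapshot record, or None
    if nothing recent enough appeared within the given attempts.
    """
    for attempt in range(attempts):
        snap = lookup(url)
        if snap and snap.get("available") and snap.get("timestamp", "") >= min_timestamp:
            return snap
        if attempt < attempts - 1:
            sleep(delay)
    return None


# Usage (performs network requests, so it is left commented out):
# snap = wait_for_snapshot("example.com", time.strftime("%Y%m%d%H%M%S", time.gmtime()))
```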

Their system is that bad unsurprisingly.

Archive Team

 0  0

 Articles by others on the same topic (1)

Internet Archive by Wikipedia Bot

The Internet Archive is a non-profit digital library that aims to preserve and provide access to a vast collection of digital content, including websites, books, music, software, and other media. It was founded in 1996 and is best known for its Wayback Machine, which allows users to view archived versions of websites as they appeared at different points in time.
