OurBigBook About$ Donate
 Sign in+ Sign up
by Ciro Santilli (@cirosantilli, 37)

Public domain archive

Updated 2025-05-13
  • Table of contents
    • Project Gutenberg
      • Project Gutenberg remove line breaks
    • Wikisource

Project Gutenberg

 0  0 
Public domain archive

Project Gutenberg remove line breaks

 0  0 
Project Gutenberg
ubuntuforums.org/archive/index.php/t-1132578.html
Their txt formats are so crap! Every paragraph is hard-wrapped, with a line break at the end of each line.
E.g. for:
wget -O pap.txt https://www.gutenberg.org/ebooks/1342.txt.utf-8
a good one-liner to join the wrapped lines back into paragraphs is:
perl -0777 -pe 's/(?<!\r\n)\r\n(?!\r\n)( +)?/ /g' pap.txt
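The lookarounds (?<!\r\n) and (?!\r\n) restrict the match to a single \r\n, so blank lines, which separate paragraphs, are left alone. The ( +)? swallows the indentation at the start of the next line: it is for the endlessly many quoted letters they have, which use four leading spaces per line as a quote marker.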
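
For anyone who prefers Python, the same substitution can be sketched roughly as below. This is a port of the Perl one-liner above, not something taken from the forum thread, and it assumes the file uses \r\n line endings as the Perl regex does. The names SINGLE_BREAK and unwrap, and the sample text, are made up for illustration.

```python
import re

# Same pattern as the Perl one-liner: a lone CRLF (not directly preceded or
# followed by another CRLF), plus any spaces that start the next line.
SINGLE_BREAK = re.compile(r"(?<!\r\n)\r\n(?!\r\n)( +)?")


def unwrap(text: str) -> str:
    """Join hard-wrapped lines with a space, keeping blank-line paragraph breaks."""
    return SINGLE_BREAK.sub(" ", text)


# Hypothetical sample: one wrapped paragraph, then an indented quoted letter.
sample = (
    "It is a truth\r\n"
    "universally acknowledged\r\n"
    "\r\n"
    "    My dear Sir,\r\n"
    "    I write to you"
)

print(repr(unwrap(sample)))
```

To run it on the downloaded pap.txt, the file would have to be opened with open("pap.txt", newline="", encoding="utf-8"), since Python's default text mode converts \r\n to \n on read and the pattern would then never match.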

Wikisource

 0  0 
Public domain archive Tags: Wikimedia project

 About$ Donate Content license: CC BY-SA 4.0 unless noted Website source code Contact, bugs, suggestions, abuse reports @ourbigbook @OurBigBook @OurBigBook