OurBigBook About$ Donate
 Sign in+ Sign up
by Ciro Santilli (@cirosantilli, 37)

Project Gutenberg

Updated 2025-06-17
Tags: Wikimedia Foundation project
  • Table of contents
    • Project Gutenberg remove line breaks

Project Gutenberg remove line breaks

 0  0 
Project Gutenberg
ubuntuforums.org/archive/index.php/t-1132578.html
Their txt formats are so crap: every paragraph is hard-wrapped, with a line break at the end of each line!
E.g. for:
wget -O pap.txt https://www.gutenberg.org/ebooks/1342.txt.utf-8
a good one-liner to remove the line breaks is:
perl -0777 -pe 's/(?<!\r\n)\r\n(?!\r\n)( +)?/ /g' pap.txt
The ( +)? is for the endlessly many quoted letters they have, which use four leading spaces per line as a quote marker.
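For anyone who would rather not use Perl, here is a rough Python port of the same substitution, run on a tiny made-up sample instead of the real book. The function name unwrap and the sample text are just illustrative assumptions. It assumes the file uses \r\n line endings, like the Perl version does.

```python
import re

# Same pattern as the Perl one-liner: a lone \r\n (not next to another \r\n),
# plus any leading spaces at the start of the following line.
SINGLE_BREAK = re.compile(r"(?<!\r\n)\r\n(?!\r\n)( +)?")


def unwrap(text: str) -> str:
    """Join hard-wrapped lines, leaving blank-line paragraph breaks alone."""
    return SINGLE_BREAK.sub(" ", text)


# Made-up sample: one wrapped paragraph and one indented "quoted letter".
sample = (
    "It is a truth universally acknowledged,\r\n"
    "that a single man\r\n"
    "\r\n"
    "    Dear Sir,\r\n"
    "    I write to you\r\n"
    "\r\n"
    "End\r\n"
)

print(repr(unwrap(sample)))
```

To run it on the downloaded file, open it with open("pap.txt", newline="") so that Python does not silently convert \r\n to \n on read, otherwise the pattern never matches. Like the Perl version, the very last line break of the file also gets turned into a space.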

 About$ Donate Content license: CC BY-SA 4.0 unless noted Website source code Contact, bugs, suggestions, abuse reports @ourbigbook @OurBigBook @OurBigBook