Re: combine html pages for printing?

From: Chris F.A. Johnson (cfajohnson_at_gmail.com)
Date: 09/29/04


Date: 29 Sep 2004 15:24:41 GMT

On 2004-09-28, Dances With Crows wrote:
> On Tue, 28 Sep 2004 21:19:50 -0000, Mike staggered into the Black Sun
> and said:
>> In article <2rtt2aF1eddghU1@uni-berlin.de>, Chris F.A. Johnson wrote:
>>> On 2004-09-28, Mike wrote:
>>>> I have a bunch (100+) html pages I want to easily combine into a
>>>> single 'something' for printing. Anyone know of what program easily
>>>> does this?
>>> cat
>> Then I have tags '</html><html>...' and the printer still doesn't
>> understand what to do.
>
> I think Chris was being facetious.

    Not at all.

    I just tried it. I downloaded an e-book in HTML format (1632 from
    http://www.baen.com/library/), one file per chapter, 63 chapters
    in all.

cat 0671319728_*[0-9].htm > all.html

    I loaded all.html into Firefox and read the entire book without
    any problem. I printed a few chapters of the book without any
    problem as well. (Actaully, I had to rename a few of the chapters
    first to get them in the right order.)

    There were extraneous headings between the chapters, but the
    printed pages were quite readable.
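
    If renaming is a nuisance, the ordering can probably be handled at
    concatenation time instead. A rough sketch, assuming GNU sort (for
    the -V version sort) and file names without embedded newlines; the
    function name combine_chapters is made up for illustration:

```shell
# combine_chapters OUT FILE...
# Concatenate FILEs into OUT in "version" order, so that ch2.htm
# comes before ch10.htm. Assumes GNU sort (-V) and no newlines in names.
combine_chapters() {
    out=$1
    shift
    # One name per line, sorted numerically-aware, then cat each in turn.
    printf '%s\n' "$@" | sort -V | while IFS= read -r f; do
        cat -- "$f"
    done > "$out"
}
```

    Usage would be something like: combine_chapters all.html *.htm
    (the output name must not match the glob, or it will be fed back in
    on the next run).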

> Or he didn't remember about the
> begin and end tags for the HTML document. Whatever, this is easily
> fixed with sed:
>
> echo "<html>" > start.txt
> echo "</html>" > end.txt
> cat *.html | sed -e 's/<html>//i' -e 's/<\/html>//i' > temp.html
> cat start.txt temp.html end.txt > final.html
> rm temp.html start.txt end.txt
>
> ...there may be a more efficient way of doing this, but that'll get rid
> of the spurious open and close tags. Other work (getting rid of
> redundant <head> tags?) may be more difficult. Whatever,

    Those extra tags don't affect anything.
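
    For a tool that is stricter than a browser, the sed approach quoted
    above can be done in one pass with no temporary files. This is only
    a sketch, assuming GNU sed (the I flag for case-insensitive matching
    is a GNU extension); merge_html is a hypothetical name, and the
    redundant <head> and <body> tags are left alone:

```shell
# merge_html OUT FILE...
# Strip every <html ...> and </html> tag from the inputs and wrap the
# result in a single pair. Uses | as the sed delimiter so the slash in
# </html> needs no escaping. Assumes GNU sed for the I flag.
merge_html() {
    out=$1
    shift
    {
        echo "<html>"
        cat -- "$@" | sed -e 's|<html[^>]*>||Ig' -e 's|</html>||Ig'
        echo "</html>"
    } > "$out"
}
```

    Called as, say, merge_html final.out *.html, with an output name
    that the glob does not match.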

-- 
    Chris F.A. Johnson                  http://cfaj.freeshell.org/shell
    ===================================================================
    My code (if any) in this post is copyright 2004, Chris F.A. Johnson
    and may be copied under the terms of the GNU General Public License

