Re: [opensuse] PDF File conversion






On Tuesday 2008-06-24 at 14:03 +0200, Carlos E. R. wrote:

...

If you want, you could create a sample file in OOo similar to your
newsletter, a page or two, and let us play with it to see how much we can
reduce it.

[ received via private mail; I report results here for comments ]


First quick test:

Via OO export menu, at 300 dpi  ->   947K  (1)
                    at 150 dpi  ->  1022K  (larger!)
Via OO print to ps,
    convert via ps2pdf          ->   397K  (2)


You can already see that going via ps2pdf gives a significant size reduction, more than 50%.
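
For anyone who wants to check the arithmetic, here is a minimal Python sketch using the two sizes from the quick test (the function name is only illustrative):

```python
def percent_reduction(before_k, after_k):
    """Return how much smaller after_k is than before_k, in percent."""
    return (before_k - after_k) / before_k * 100.0


# Sizes in K, taken from the quick test above.
export_300dpi_k = 947
via_ps2pdf_k = 397

print(f"{percent_reduction(export_300dpi_k, via_ps2pdf_k):.0f}% smaller")
```

That works out to roughly 58%.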

Font info (pdffonts):

(1)

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
URWBookmanL-DemiBold                 Type 1            yes no  yes    334  0
URWBookmanL-DemiBoldItal             Type 1            yes no  yes    339  0
URWBookmanL-Ligh                     Type 1            yes no  yes    349  0
EAAAAA+DejaVuSans-BoldOblique        TrueType          yes yes yes    319  0
FAAAAA+DejaVuSans-Bold               TrueType          yes yes yes    314  0
GAAAAA+DejaVuSans                    TrueType          yes yes yes    304  0
HAAAAA+Arial-BoldMT                  TrueType          yes yes yes    324  0
URWBookmanL-LighItal                 Type 1            yes no  yes    344  0
JAAAAA+DejaVuSans-Oblique            TrueType          yes yes yes    309  0
KAAAAA+Arial-BoldItalicMT            TrueType          yes yes yes    329  0


(2)

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
OWDFYY+URWBookmanL-DemiBold          Type 1C           yes yes no       8  0
VYXGFR+URWBookmanL-DemiBoldItal      Type 1C           yes yes no      10  0
STEJSX+URWBookmanL-Ligh              Type 1C           yes yes no      18  0
VWKDOV+DejaVuSans-BoldObliqueFID2HGSet1 TrueType          yes yes no      30  0
PUHKTK+DejaVuSans-BoldFID1HGSet1     TrueType          yes yes no      45  0
HIRYKA+DejaVuSansFID5HGSet1          TrueType          yes yes no      47  0
KPSHBO+OpenSymbolFID190HGSet2        TrueType          yes yes no      49  0
FHWBCU+URWBookmanL-LighItal          Type 1C           yes yes no      61  0
JVHKIS+DejaVuSans-ObliqueFID4HGSet1  TrueType          yes yes no      64  0
QEGMJO+OpenSymbolFID190HGSet1        TrueType          yes yes no      72  0


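The point to notice is the "sub" column: in (1) the Type 1 fonts (URW Bookman L) show "no", meaning that OO embeds the entire font definition instead of only the subset of glyphs used, whereas in (2) every font is subsetted. This is a bug, IMO.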
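
If you want to pick out those fonts automatically, here is a rough Python sketch that parses pdffonts output and lists the fonts that are embedded but not subsetted. It assumes the column layout shown in this message (name, type, emb, sub, uni, object ID); other pdffonts versions may print extra columns. The function names are mine, and the sample rows are copied from the listings in this message.

```python
SAMPLE = """\
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
URWBookmanL-DemiBold                 Type 1            yes no  yes    334  0
EAAAAA+DejaVuSans-BoldOblique        TrueType          yes yes yes    319  0
Times-Roman                          Type 1            no  no  no       7  0
"""


def parse_pdffonts(text):
    """Parse pdffonts output (layout as above) into a list of dicts."""
    fonts = []
    for line in text.splitlines()[2:]:  # skip the two header lines
        parts = line.split()
        if len(parts) < 7:
            continue  # not a complete row; ignore it
        fonts.append({
            "name": parts[0],
            # The type may contain spaces ("Type 1"), so count from the right.
            "type": " ".join(parts[1:-5]),
            "emb": parts[-5] == "yes",
            "sub": parts[-4] == "yes",
            "uni": parts[-3] == "yes",
        })
    return fonts


def fully_embedded(fonts):
    """Names of fonts that are embedded whole, i.e. not as a subset."""
    return [f["name"] for f in fonts if f["emb"] and not f["sub"]]


print(fully_embedded(parse_pdffonts(SAMPLE)))
```

On the sample rows only URWBookmanL-DemiBold is reported, since the DejaVu font is subsetted and Times-Roman is not embedded at all.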



The next step is changing the fonts, but that will be later.

[...]

Ok, I took a smaller piece of your file:

-rw-r--r-- 1 cer users  97K 2008-06-24 22:11 4-pages.odt

I converted it to PDF via the OO converter:

-rw-r--r-- 1 cer users 481K 2008-06-24 22:12 4-pages.pdf


and via intermediate .ps and ps2pdf:

-rw-r--r-- 1 cer users 157K 2008-06-24 22:16 4-pages-via_print_to_ps.pdf
-rw-r--r-- 1 cer users 4.3M 2008-06-24 22:15 4-pages-via_print_to_ps.ps


So, the ps2pdf converter produces a pdf file about three times smaller. The font info is:

cer@nimrodel:~/tmp/OOPDF> pdffonts 4-pages.pdf
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
URWBookmanL-DemiBold                 Type 1            yes no  yes    169  0
URWBookmanL-DemiBoldItal             Type 1            yes no  yes    174  0
URWBookmanL-Ligh                     Type 1            yes no  yes    164  0
EAAAAA+DejaVuSans-BoldOblique        TrueType          yes yes yes    159  0

cer@nimrodel:~/tmp/OOPDF> pdffonts 4-pages-via_print_to_ps.pdf
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
WOOWVU+URWBookmanL-DemiBold          Type 1C           yes yes no       8  0
UMPCZU+URWBookmanL-DemiBoldItal      Type 1C           yes yes no      10  0
ZIOPXQ+URWBookmanL-Ligh              Type 1C           yes yes no      18  0
VWKDOV+DejaVuSans-BoldObliqueFID2HGSet1 TrueType          yes yes no      30  0


The difference in size is probably because ps2pdf embeds only a subset of each font, while OO embeds the complete Type 1 fonts (plus graphic defaults).
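
The ps2pdf half of this comparison can be scripted. Below is a minimal Python sketch, assuming ps2pdf (from Ghostscript) and pdffonts are both on the PATH and that the .ps file was already produced by printing to file from OO, which is still a manual step. The helper names are mine.

```python
import os
import shutil
import subprocess


def ps2pdf_command(ps_path, pdf_path):
    """Build the command line: ps2pdf input.ps output.pdf"""
    return ["ps2pdf", ps_path, pdf_path]


def convert_and_report(ps_path, pdf_path):
    """Convert ps_path to pdf_path; return (size in bytes, pdffonts output)."""
    subprocess.run(ps2pdf_command(ps_path, pdf_path), check=True)
    fonts = subprocess.run(["pdffonts", pdf_path], check=True,
                           capture_output=True, text=True).stdout
    return os.path.getsize(pdf_path), fonts


if __name__ == "__main__":
    ps_file = "4-pages-via_print_to_ps.ps"  # file name from the listing above
    tools_present = shutil.which("ps2pdf") and shutil.which("pdffonts")
    if tools_present and os.path.exists(ps_file):
        size, fonts = convert_and_report(ps_file, "4-pages-via_print_to_ps.pdf")
        print(f"{size / 1024:.0f}K")
        print(fonts)
```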


Now, I analyze the .odt file.

The first detail I notice is that you are not using styles. The default
style uses the DejaVuSans font, but then you modify each paragraph to use a
variation of URW Bookman L. It is more efficient to first modify the default
style to use your font of choice, along with any other settings you change,
and then create additional styles for each type of paragraph you use. This
way, by simply modifying a style, you change the settings for the whole
file in one stroke, and in a consistent way.
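
To see how much direct formatting a document carries, you can look inside the .odt itself: it is a zip archive, and content.xml holds the "automatic" styles that OO generates when you format paragraphs by hand. This is a rough Python sketch, assuming the standard OpenDocument namespaces; the function name is mine.

```python
import os
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
}
STYLE = "{%s}" % NS["style"]


def direct_formatting_fonts(odt_file):
    """Map each automatic style in content.xml to the font it sets, if any."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    result = {}
    for style in root.findall("office:automatic-styles/style:style", NS):
        props = style.find("style:text-properties", NS)
        if props is None:
            continue
        font = props.get(STYLE + "font-name")
        if font:
            result[style.get(STYLE + "name")] = font
    return result


if __name__ == "__main__":
    odt = "4-pages.odt"  # file name from the listing above
    if os.path.exists(odt):
        for name, font in sorted(direct_formatting_fonts(odt).items()):
            print(name, "->", font)
```

A long list of automatic styles that all set the same font is a good hint that the font belongs in the default style instead.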


The next step is choosing fonts that do not need to be embedded. Have a look at the list here:

<http://lists.opensuse.org/opensuse/2008-04/msg01610.html>
<http://lists.opensuse.org/opensuse/2008-04/msg01616.html>

These fonts do not need to be embedded, or rather should not:

| 19. Bookman-Demi
| 20. Bookman-DemiItalic
| 21. Bookman-Light
| 22. Bookman-LightItalic

But those are not the same ones that you are using; those are the so-called printer fonts. To see them in OO you need to go to Options, Writer, Compatibility, and enable "use printer metrics". Then you can select "Bookman" instead, which might, only might, do the trick (it doesn't).

-rw-r--r-- 1 cer users 160K 2008-06-24 22:17 4-pages-mod-via_print_to_ps.pdf

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
WOOWVU+Bookman-Demi                  Type 1C           yes yes no       8  0
XQKHNR+Bookman-DemiItalic            Type 1C           yes yes no      10  0
ZIOPXQ+Bookman-Light                 Type 1C           yes yes no      19  0
OOHPAU+URWBookmanL-DemiBoldItal      Type 1C           yes yes no      21  0
VWKDOV+DejaVuSans-BoldObliqueFID2HGSet1 TrueType          yes yes no      27

You see that the DejaVu font is still used? I find it on blank lines. I
change them to Bookman, and the size becomes:

-rw-r--r-- 1 cer users 152K 2008-06-24 22:23 4-pages-mod-via_print_to_ps.pdf

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
WOOWVU+Bookman-Demi                  Type 1C           yes yes no       8  0
UMPCZU+Bookman-DemiItalic            Type 1C           yes yes no      10  0
ZIOPXQ+Bookman-Light                 Type 1C           yes yes no      19  0


See? DejaVuSans-BoldOblique uses 8 KB for some useless blank lines.

Now I do a quick change, and switch to "Times":

-rw-r--r-- 1 cer users 135K 2008-06-25 00:03 4-pages-mod-via_print_to_ps-times.pdf

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
Times-BoldItalic                     Type 1            no  no  no       8  0
Times-Roman                          Type 1            no  no  no       7  0
Times-Bold                           Type 1            no  no  no       9  0

Notice that this font is not embedded at all; it uses the internal font of
the reader. We have gone from the initial 160K down to 135K just by playing with fonts.
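
To summarise the font experiments on the ps2pdf route, here is a small Python sketch that tabulates the three sizes reported above and the change at each step. The step labels are my own shorthand.

```python
# (label, size in K) for each ps2pdf result reported above.
STEPS = [
    ("Bookman printer fonts, DejaVu on blank lines", 160),
    ("DejaVu removed from blank lines", 152),
    ("Switched to Times (not embedded)", 135),
]


def size_deltas(steps):
    """Return (label, size_k, change_from_previous_k) for each step."""
    out = []
    previous = None
    for label, size_k in steps:
        change = 0 if previous is None else size_k - previous
        out.append((label, size_k, change))
        previous = size_k
    return out


for label, size_k, change in size_deltas(STEPS):
    print(f"{label:<46} {size_k:>4}K {change:+d}K")
```

The two font changes save 8K and 17K respectively, 25K in total.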

Of course, the alignment and placement have gone by the board, but this
was intended as a test only :-)


The internal OO converter gives a much larger size:

-rw-r--r-- 1 cer users 474K 2008-06-25 00:09 4-pages-mod-2.pdf

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
NimbusRomNo9L-Medi                   Type 1            yes no  yes    158  0
NimbusRomNo9L-MediItal               Type 1            yes no  yes    163  0
NimbusRomNo9L-Regu                   Type 1            yes no  yes    168  0

What??

This is broken: it should be using "Times", not Nimbus. Another bug :-/


Oh, well... my point is proven ;-)



--
Cheers,
Carlos E. R.



