Re: coordinates of words in HTML document
From: Dances With Crows (danSPANceswitTRAPhcrows_at_gmail.com)
Date: 03/30/05
- In reply to: Vivek: "coordinates of words in HTML document"
Date: 30 Mar 2005 14:12:38 GMT
On 29 Mar 2005 22:19:24 -0800, Vivek staggered into the Black Sun and
said:
> I wish to find coordinates of all [the] words in [an] html document as
> they are rendered on the screen.
Why? This sounds like it'd be totally useless.
> I want the coordinates with reference to start of html document (say
> left bottom)
Most languages are read top-to-bottom, so the start of the document
would most naturally be at the top left unless you're trying to do some
sort of PostScript thing here.
> and not in reference with screen visible to user. The task at hand is
> to measure distances between words on display in HTML document.
Oh, come on! There are way too many variables for that to be useful.
The user can and will change the font, font size, and window size at any
time. HTML is *supposed* to be about separating content from
presentation anyway. Unless the HTML monkey did something like '<p
align="justify">', which shouldn't be done since justified type is
harder to read, the spaces between words will always be the same width.
The width of a word will be the sum of its character widths (ignoring
kerning). You can get those widths by finding the font used, then
calling XQueryFont(), which returns the font's per-character metrics.
(Check the man page for that function for all the details.)
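To make the "sum of its character widths" idea concrete, here's a rough sketch in Python. The width table and the default width are made-up numbers for illustration only; real values would have to come from the font actually in use (for example, the per-character metrics that XQueryFont() hands back). It also ignores kerning.

```python
# Hypothetical per-character advance widths in pixels. These numbers are
# invented for illustration; real ones come from the font's metrics.
CHAR_WIDTHS = {"i": 3, "l": 3, "m": 9, "w": 9, " ": 4}
DEFAULT_WIDTH = 6  # assumed width for any character not listed above


def word_width(word, widths=CHAR_WIDTHS, default=DEFAULT_WIDTH):
    """Return the width of a word as the sum of its character widths."""
    return sum(widths.get(ch, default) for ch in word)
```

Something like word_width("mill") just adds up the table entries for 'm', 'i', 'l', 'l'. Change the font or font size and the table changes, which is exactly why the results are fragile.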
> Please suggest what would be good steps for doing so.
0. Don't.
1. If you must, put some hooks into the HTML renderer of Firefox and/or
Konqueror. Have the browser render the page and feed coordinates and
font info for each line to a FIFO or stderr or something. Get
coordinates of words using font metrics.
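
If you do go the step-1 route, the post-processing side might look something like the sketch below. It assumes the (hypothetical) renderer hook gives you each line's text and origin, and that you supply a char_width function built from the font metrics. The function names and the tuple layout are my own invention, not anything Firefox or Konqueror provides, and it assumes unjustified text with a fixed space width.

```python
def word_boxes(line_text, line_x, line_y, char_width, space_width):
    """Return (word, x, y, width) for each word on one rendered line.

    line_x, line_y: origin of the line, as reported by the hypothetical hook.
    char_width: function mapping one character to its advance width.
    space_width: width of the gap between words (assumed constant).
    """
    boxes = []
    x = line_x
    # split() collapses runs of whitespace, much as HTML rendering does.
    for word in line_text.split():
        width = sum(char_width(ch) for ch in word)
        boxes.append((word, x, line_y, width))
        x += width + space_width
    return boxes


def gap_between(box_a, box_b):
    """Horizontal distance from the right edge of box_a to the left edge of box_b."""
    return box_b[1] - (box_a[1] + box_a[3])
```

The coordinates are relative to whatever origin the hook reports, so they are document-relative only if the hook reports document-relative line origins. Words on different lines would also need the y values compared, which this sketch leaves out.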
--
Matt G|There is no Darkness in Eternity/But only Light too dim for us to see
Brainbench MVP for Linux Admin /  mail: TRAP + SPAN don't belong
http://www.brainbench.com     /   Hire me!
-----------------------------/    http://crow202.dyndns.org/~mhgraham/resume