Re: OCR and linux; text and equations



On Tue, 31 Jan 2006 19:32:55 +0100, Horacio staggered into the Black Sun
and said:
> I've scanned some book pages, including text and equations. ABBYY
> [FineReader] works much better than Kooka "recognizing" text, but
> anyway, [neither one can] "recognize" or "scan" equations.

This is pretty normal. Math equations can be substantially more complex
than English text. Even simple math has a lot more characters to
recognize (English letters, bars, upper- and lowercase Greek letters,
arrows, infinity, fractions, set notation, maybe Hebrew letters...) than
English text does. And instead of everything running left-to-right, the
layout is two-dimensional: a symbol can mean drastically different
things depending on whether it's above or below a fraction bar. And then
there are infinite series and integrals, with limits stacked above and
below the operator.
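
To make the layout point concrete, here is a small LaTeX sketch. It is
purely an illustration added for this post, not output from any OCR
engine. The same glyphs show up in each line, and only their position
changes the meaning:

```latex
% Illustration only: identical glyphs, different meaning by position.
\[ x^2 \qquad x_2 \qquad 2x \]          % power, index, product
\[ \frac{a}{b} \qquad \frac{b}{a} \]    % a over b versus b over a
% Limits sit above and below the operator, not on the baseline:
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} \qquad \int_0^1 x^2 \, dx \]
```

An engine that reads strictly left-to-right along one baseline would
flatten all of these into nearly the same string of characters.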

> What I'm looking for is an OCR program for Linux but better than Kooka

Kooka is a frontend for ocrad or gocr. gocr is better than ocrad. But
no matter what, Linux OCR = teh suck. Sorry to say it but it does. Use
FineReader if you have it; it'll give you much better quality than any
Linux engine.

> and optionally the possibility of "recognising" equations

This is a difficult problem, and I don't think there's anything out
there that does that right now. If you have math to enter, you'll
probably have better luck doing it by hand with (La)TeX or LyX. If
you're really ambitious, you can try to write something, but that'd be a
long run for a short slide unless you had thousands of pages of math to
OCR. (And you'd still have to proof it by hand afterwards!)
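
For a sense of what entering math by hand looks like, here is a minimal
LaTeX sketch. The two formulas are just sample content picked for
illustration (a well-known series and a simple integral), not anything
from the scanned book, and the amsmath package is assumed to be
installed, as it is in most TeX distributions:

```latex
\documentclass{article}
\usepackage{amsmath}  % provides the align* environment
\begin{document}
An infinite series and a definite integral, typed by hand:
\begin{align*}
  \sum_{n=1}^{\infty} \frac{1}{n^{2}} &= \frac{\pi^{2}}{6}, \\
  \int_{0}^{1} x^{2}\,dx &= \frac{1}{3}.
\end{align*}
\end{document}
```

Running it through pdflatex should give typeset output; LyX produces
much the same markup behind a graphical equation editor.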

--
Matt G|There is no Darkness in Eternity/But only Light too dim for us to see
Brainbench MVP for Linux Admin / mail: TRAP + SPAN don't belong
http://www.brainbench.com / "He is a rhythmic movement of the
-----------------------------/ penguins, is Tux." --MegaHAL