Re: detecting file types
- From: Måns Rullgård <mru@xxxxxxxxxxxxx>
- Date: Wed, 08 Feb 2006 20:59:22 +0000
Grant Edwards <grante@xxxxxxxx> writes:
On 2006-02-08, dagecko <dagecko@xxxxxxx> wrote:
I read it quickly, but it was not what I was looking for. I need
a way to detect, from a program written in C/C++, whether a file
is a binary one or a plain ASCII one.
1) Do a bitwise OR of all the bytes in the file.
2) If bit 7 is set in the result, it's not an ASCII file.
3) If bit 7 is not set in the result, it _might_ be an ASCII
file. Or it might be a binary file that doesn't have any
bytes with bit 7 set.
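The three steps above could be sketched in C roughly as follows. The function names or_of_bytes and file_has_high_bit and the 4096-byte buffer are illustrative choices, not anything from the thread, and the caller is assumed to have opened the file in binary mode ("rb").

```c
#include <stddef.h>
#include <stdio.h>

/* OR together every byte in buf. Bit 7 of the result is set
 * if and only if at least one byte is >= 0x80. */
unsigned char or_of_bytes(const unsigned char *buf, size_t len)
{
    unsigned char acc = 0;
    for (size_t i = 0; i < len; i++)
        acc |= buf[i];
    return acc;
}

/* Read the whole stream in chunks.
 * Returns  1 if some byte has bit 7 set (not 7-bit ASCII),
 *          0 if no byte does (it _might_ be ASCII),
 *         -1 on a read error. */
int file_has_high_bit(FILE *fp)
{
    unsigned char buf[4096];   /* buffer size is an arbitrary choice */
    unsigned char acc = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        acc |= or_of_bytes(buf, n);
    if (ferror(fp))
        return -1;
    return (acc & 0x80) ? 1 : 0;
}
```

A result of 0 only rules out bytes above 0x7F; as step 3 says, it does not prove the file is text.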
If you know what language the ASCII is supposed to be, you
could look at the frequency distributions of individual
characters to give you a better idea if a file is really ASCII
or if it's a degenerate binary file.
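As a starting point for the frequency idea, a byte histogram might look like the sketch below. The names byte_histogram and most_common_byte are hypothetical, and the sketch only gathers the counts; deciding what distribution counts as "plausible text" for a given language is left open, since no thresholds were given here.

```c
#include <stddef.h>

/* Count how often each byte value occurs in buf.
 * counts must have room for 256 entries. */
void byte_histogram(const unsigned char *buf, size_t len, size_t counts[256])
{
    for (int i = 0; i < 256; i++)
        counts[i] = 0;
    for (size_t i = 0; i < len; i++)
        counts[buf[i]]++;
}

/* Return the most frequent byte value; ties go to the lowest value. */
int most_common_byte(const size_t counts[256])
{
    int best = 0;
    for (int i = 1; i < 256; i++)
        if (counts[i] > counts[best])
            best = i;
    return best;
}
```

For ordinary English prose one would expect the space character and a handful of lowercase letters to dominate the counts; a file whose most common byte is something like 0x00 deserves a closer look.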
Looking for non-printable characters < 0x20 (other than the usual
tab, newline and carriage return) is also a good idea.
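A minimal sketch of that check, assuming tab, newline, carriage return and form feed are the control characters you are willing to accept in a text file. That whitelist and the names is_text_control and count_odd_controls are my own choices for illustration; adjust them to taste.

```c
#include <stddef.h>

/* Assumed whitelist: control characters commonly found in text files. */
int is_text_control(unsigned char c)
{
    return c == '\t' || c == '\n' || c == '\r' || c == '\f';
}

/* Count bytes below 0x20 that are not on the whitelist.
 * A non-zero result suggests the data is not plain text. */
size_t count_odd_controls(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] < 0x20 && !is_text_control(buf[i]))
            count++;
    return count;
}
```

Combined with the bit 7 test, this catches many binary files that happen to contain only bytes below 0x80, since those usually include NUL or other stray control bytes.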
--
Måns Rullgård
mru@xxxxxxxxxxxxx