I am writing an application that takes a file as an input.
I want to reject any binary files that the user specifies.
Is there any way to detect that a file contains binary data?
It depends on the system. On some systems, you can't open a
binary file in text mode, so the open would fail. This isn't
the case for Unix or Windows, however. Other than that, you can
look at the first n bytes (for some appropriate n), and use a
heuristic to guess: historically, the presence of nul bytes, for
example, or for that matter, on most systems, the presence of
any bytes in the ranges 0x00-0x06 or 0x0E-0x1F generally means
binary; so may (often) a byte in the range 0x80-0x9F. This
depends on the text encoding used, however: the sequence 0xC3,
0x89 is a capital E with an acute accent in UTF-8, even though
0x89 falls in that last range, and nul bytes are likely to be
pretty frequent if UTF-16 or UTF-32 is used.
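
As a rough sketch of that heuristic (in Python, since the question doesn't name a language), something like the following could work. The names `looks_binary` and `sample_looks_binary` and the 4096-byte sample size are my own choices, not anything standard. The byte-order-mark check is an extra assumption on my part, added so that UTF-16/UTF-32 files that start with a BOM aren't flagged because of their nul bytes; files in those encodings without a BOM will still be reported as binary. I've left out the 0x80-0x9F test because, as noted, it misfires on UTF-8.

```python
import codecs

# Control bytes that rarely occur in text: 0x00-0x06 and 0x0E-0x1F.
# Bell, backspace, tab, LF, VT, FF and CR (0x07-0x0D) are allowed.
SUSPICIOUS = frozenset(range(0x00, 0x07)) | frozenset(range(0x0E, 0x20))

# Assumption: a leading BOM means UTF-16/UTF-32 text, where nul bytes are normal.
WIDE_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def sample_looks_binary(sample):
    """Guess whether a bytes sample comes from a binary file."""
    if sample.startswith(WIDE_BOMS):
        return False
    return any(byte in SUSPICIOUS for byte in sample)


def looks_binary(path, sample_size=4096):
    """Apply the heuristic to the first sample_size bytes of a file."""
    with open(path, "rb") as f:
        return sample_looks_binary(f.read(sample_size))
```

You'd call `looks_binary(filename)` before processing the user's file and refuse it if the result is true. It's only a guess either way: a text file containing, say, escape sequences (0x1B) will be rejected, and a binary file whose first bytes all happen to be printable will get through.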