Questions You Should Know about Scanner Online

13 Apr.,2024

 

Ensure? No. A simple reason: Images, layout information, fonts, and all sorts of other "simple" data can nonetheless be malicious, and can lead to arbitrary code execution if the parser for them has an exploitable bug (a.k.a. a vulnerability). This is not academic; lots of exploits, including some quite famous ones, were carried out through image or font parsers.

Similarly, any scanner that you could use to theoretically validate the contents of a PDF could, itself, be vulnerable. After all, it too is parsing the file, and there's nothing that says security tools can't contain vulnerabilities themselves. In fact, adding a security tool always increases the attack surface - the amount of space where a vulnerability could exist - and there is no way to guarantee that the tool, even if not itself vulnerable, will reliably detect malicious data without passing it on to other code.

You could, in theory, have a PDF reader that handles only the most common and trusted formats; it wouldn't be able to open everything (not even every book), but it could open most files (probably all from most publishers, etc.). It still wouldn't be totally safe, since even common and trusted code can have vulnerabilities that lurk undetected for over a decade. I don't know of any PDF reader that has this feature (and specific product recommendations are out of scope for this site anyhow), but you might be able to find one if you look.
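To make the "only common and trusted formats" idea concrete, here is a minimal sketch of an allowlist check on PDF stream filters (the decoders a reader would have to run). The allowlist itself and the function name `disallowed_filters` are assumptions made up for illustration. The check only sees `/Filter` entries that appear in plain text in the file, so anything hidden inside compressed object streams is missed. Names written with `#xx` hex escapes are not decoded, so they simply fail the allowlist.

```python
import re

# Hypothetical allowlist: the stream filters this imaginary reader agrees to decode.
ALLOWED_FILTERS = {b"FlateDecode", b"DCTDecode", b"ASCIIHexDecode", b"ASCII85Decode"}

# Matches "/Filter /Name" or "/Filter [ /A /B ]" in uncompressed dictionaries.
_FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[^\s/<>\[\]()]+)")
_NAME_RE = re.compile(rb"/([^\s/<>\[\]()]+)")


def disallowed_filters(data: bytes) -> set:
    """Return the filter names found in raw PDF bytes that are not on the allowlist."""
    found = set()
    for match in _FILTER_RE.finditer(data):
        for name in _NAME_RE.findall(match.group(1)):
            found.add(name)
    return {name.decode("latin-1") for name in found - ALLOWED_FILTERS}
```

A reader built this way would refuse to open a file whenever the returned set is non-empty, trading compatibility for a smaller set of decoders exposed to untrusted input.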

Another option would be a PDF validator. As mentioned above, this does add attack surface (the validator itself). In theory, though, a validator could apply strict validation without attempting to render the fonts, images, layout, or whatever else, which reduces the risk somewhat. It would probably (not guaranteed, but probably) throw out anything that isn't safe without being at risk itself, unless the validator was software somebody specifically targeted or was rather shoddily written.
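As a rough illustration of validating without rendering, the sketch below only counts PDF name tokens that are commonly associated with active or embedded content, working on the raw bytes. The list of names and the function names are assumptions chosen for the example, not an authoritative ruleset. Because it decodes `#xx` escapes across the whole file, it can produce false positives. Because it does not decompress streams, it can also miss things. Treat it as triage, not as proof that a file is safe.

```python
import re

# Illustrative, non-exhaustive list of name tokens tied to active or embedded content.
RISKY_NAMES = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile")


def _decode_name_escapes(data: bytes) -> bytes:
    """Undo #xx hex escapes, which PDF permits inside name tokens (e.g. /J#61vaScript)."""
    return re.sub(rb"#([0-9A-Fa-f]{2})", lambda m: bytes([int(m.group(1), 16)]), data)


def find_risky_names(data: bytes) -> dict:
    """Count risky name tokens in raw PDF bytes without parsing or rendering anything."""
    if not data.startswith(b"%PDF-"):
        raise ValueError("not a PDF: missing %PDF- header at start of file")
    decoded = _decode_name_escapes(data)
    counts = {}
    for name in RISKY_NAMES:
        # Require the token to end here, so /JS does not match a longer name like /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9_.\-])"
        hits = len(re.findall(pattern, decoded))
        if hits:
            counts[name.decode("ascii")] = hits
    return counts
```

An empty result means only that none of these particular tokens were visible in the uncompressed bytes; a strict validator would reject on any hit and would still need further checks for everything this scan cannot see.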

One way to mitigate all these risks is to handle the PDFs in a sandbox: a low-privilege process with minimal and strictly controlled access to the rest of the system. Sandboxing is quite common, including for PDFs. Adobe Reader was one of the first really popular desktop programs that I know of to include a sandbox (other than browsers; Adobe adapted the one Chrome was already using), and sandboxing is used for approximately all apps on mobile devices and most apps from the desktop Windows Store and macOS App Store. Mind you, sandboxes aren't a perfect solution: they don't restrict everything, and even the things they do try to restrict might still be possible if the sandbox itself is buggy in the right way (as pretty much all complex software is). Still, a sandbox adds defense in depth.
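A real sandbox relies on OS-level isolation, which is well beyond a short snippet. The POSIX-only sketch below shows just one small piece of the low-privilege idea: running a parser in a separate child process with caps on CPU time, address space, and file writes. The function name `run_confined` and the default limits are assumptions for illustration. The child can still read files and use the network, so this is not a sandbox on its own.

```python
import resource
import subprocess
import sys


def _cap(kind, value):
    """Lower one resource limit, never requesting more than the current hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def run_confined(argv, timeout_s=10, cpu_s=5, mem_bytes=1024 ** 3):
    """Run argv in a child process with CPU, memory, and file-size caps (POSIX only)."""
    def apply_limits():
        # Executed in the child between fork and exec, so the parent is unaffected.
        _cap(resource.RLIMIT_CPU, cpu_s)       # seconds of CPU time
        _cap(resource.RLIMIT_AS, mem_bytes)    # total address space
        _cap(resource.RLIMIT_FSIZE, 0)         # child cannot write data to regular files

    return subprocess.run(
        argv,
        preexec_fn=apply_limits,
        stdin=subprocess.DEVNULL,
        capture_output=True,   # output comes back over pipes, which FSIZE does not limit
        timeout=timeout_s,     # wall-clock limit enforced by the parent
        check=False,
    )


if __name__ == "__main__":
    # Stand-in for a PDF-parsing command; any real tool's command line would go here.
    result = run_confined([sys.executable, "-c", "print('parsed')"])
    print(result.returncode, result.stdout)
```

The design point is the same as in the text: the risky parsing happens in a process that can do less damage if it is compromised, and the parent only consumes its output.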

Is there a way to scan a pdf to ensure it doesn't contain anything that could be a virus?