Elsevier
No point discussing this if neither of us is going to prove it one way or the other.
Bitmaps are actually a key part of what I was thinking about, so it seems you agree with me there. There’s also the issue of using the wrong paper size. IIRC Windows usually defaults to Letter for printing, even in places where A4 is the only common size and no one has heard of Letter, and most people don’t realise their prints are cropped or resized. This would still apply when printing to PDF.
You’re pushing it through one system that converts a PDF file into printer instructions, and then through another system that converts printer instructions into a PDF file. Each step probably has to make adjustments to the data it’s pushing through.
Without looking deeply into the systems involved, I have to assume it’s not a lossless process.
These conversions maintain high quality, but they are not lossless.
As a trivial example, if you use the wrong paper size (like Letter instead of A4) then it might crop parts of the page or add borders or resize everything. Again I’ll admit, in 99% of cases it doesn’t matter, but it might matter if, say, an embedded picture was meant to be exactly to scale.
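To put a rough number on the Letter vs A4 case, here is a minimal sketch. It assumes the print dialog does a uniform "fit to page" scale; the function names are made up for illustration, and real drivers may also add margins or centre the page differently.

```python
# Page sizes in PostScript points (1 pt = 1/72 inch).
MM_PER_INCH = 25.4
PT_PER_INCH = 72

def mm_to_pt(mm):
    """Convert millimetres to points."""
    return mm / MM_PER_INCH * PT_PER_INCH

A4 = (mm_to_pt(210), mm_to_pt(297))             # 210 x 297 mm
LETTER = (8.5 * PT_PER_INCH, 11 * PT_PER_INCH)  # 8.5 x 11 in

def fit_scale(source, target):
    """Uniform scale factor that makes `source` fit entirely inside `target`."""
    return min(target[0] / source[0], target[1] / source[1])

# Case 1: the A4 page is shrunk to fit on Letter.
scale = fit_scale(A4, LETTER)

# Case 2: the A4 page is printed at 100% and the overflow is cropped.
cropped_height_mm = (A4[1] - LETTER[1]) / PT_PER_INCH * MM_PER_INCH

print(f"fit-to-page scale: {scale:.4f}")
print(f"height lost at 100%: {cropped_height_mm:.1f} mm")
```

Under those assumptions everything shrinks by roughly 6%, so a picture drawn to be exactly 100 mm wide would come out at about 94 mm. At 100% scale, about 17.6 mm of the page height falls off the sheet instead.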
Those printer instructions are usually PostScript, and PostScript is the basis of PDF.
You are thinking that the printing process will rasterize the PDF and then essentially OCR/vector map it back. It’s (usually) not that complicated.
I don’t understand the “that’s not how PDFs work” criticism.
Removing data from the original file is the whole point of the exercise! Of course unique tokens can be hidden in plain sight in images, letter spacing, etc. If we want to make sure that is removed, we need to degrade the quality of the PDF so that this information is lost in said lossy conversion.
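A toy sketch of that idea, in case it helps. It assumes the token is hidden in the least significant bit of grayscale pixel values, which is only one hypothetical scheme; the helper names and pixel values are made up, and a deliberately robust watermark could survive this kind of degradation.

```python
def embed(pixels, bits):
    """Hide one bit in the least significant bit of each leading pixel."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out

def extract(pixels, n):
    """Read back the lowest bit of the first n pixels."""
    return [p & 1 for p in pixels[:n]]

def quantize(pixels, step=8):
    """Crude lossy step: round each value down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]

pixels = [200, 201, 198, 203, 199, 202, 200, 197]  # made-up grayscale values
token = [1, 0, 1, 1, 0, 0, 1, 0]                   # made-up tracking token

marked = embed(pixels, token)        # each pixel changes by at most 1
recovered = extract(marked, len(token))

degraded = quantize(marked)          # visually similar, low bits wiped
after_loss = extract(degraded, len(token))

print(recovered == token)   # token survives a lossless copy
print(after_loss == token)  # token does not survive the lossy step
```

The marked image differs from the original by at most one grey level per pixel, so nobody would notice it by eye, but any step that throws away the low bits also throws away the token.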