CVE Alert: CVE-2026-27628 - py-pdf - pypdf - https://www.redpacketsecurity.com/cve-alert-cve-2026-27628-py-pdf-pypdf/
#OSINT #ThreatIntel #CyberSecurity #cve-2026-27628 #py-pdf #pypdf
That said and celebrated ;), there are things that #Censor does not yet redact well.
By default, the upstream library #MuPDF (with its #Python bindings in #PyMuPDF) supports redaction of text, vector graphics and images only. Testing on a variety of PDF files (thanks to #pypdf, #qpdf, #ghostscript, and their issue reporters, as well as @pdfarranger for their hint) showed me that some vector graphics are not properly redacted, and I have reported an upstream issue for that.
Also, form fields (widgets), signatures and links may be incompletely redacted.
You can find an updated list of “What is redacted? What not?” here: https://codeberg.org/censor/Censor/issues/120

> **Warning**
> The following description is **not** valid for Censor until version 0.4.0. I recommend updating to [version 0.5.0](https://codeberg.org/censor/Censor/releases/tag/v0.5.0) for secure redaction.

## Elements under redaction rectangles

- [x] Text:
  - characters are removed when ...
Q for other programmers - do you ever, out of caution, do things to prevent issues that probably won't happen? I'm processing PDFs with #pypdf, don't think I have to worry about a Bobby Tables situation but I'm still hitting it with (pseudo code)
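import re

if not base:
    base = "file"
if not ext:
    ext = ".pdf"
# Replace every run of characters outside A-Z, a-z, 0-9, "_" and "-" with a space
# (the replacement string was missing in the original; a space is assumed here).
base = re.sub(r"[^A-Za-z0-9_-]+", " ", base).strip()
# Collapse runs of spaces and underscores into a single space.
base = re.sub(r"[ _]+", " ", base)
# Cap the name at 60 characters and drop any trailing whitespace left by the cut.
if len(base) > 60:
    base = base[:60].rstrip()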
Is this a waste of time?
While building an automated résumé parser, I ran into an interesting problem when parsing 15,000 résumés with PyPDF. The résumés used a two-column design, but text extraction did not preserve the layout. To avoid this problem, use simple fonts, avoid multi-column designs, and keep contact information at the top. #HồSơ #TựĐộngHóa #PyPDF #ATS #Resume #Automation #PDF #Parsing #SaaS #ỨngDụng #TưVấn #LờiKhuyên #HồSơXinh #TìmViệc #IT #CôngNghệ #Vietnam #JobSearch #ResumeTips #SaaSTips
I want to write a program to extract a list of clickable links from a PDF page.
#pypdf can list the link positions/sizes and target URLs. But in a PDF document, links are annotations, which are separate data from the document text.
To get the display text of a clickable link in a PDF, is the easiest way to convert the full page to PNG, crop it to the link's bounding box, and run that through OCR? Or am I missing something more reasonable?
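One alternative to OCR, sketched here under some assumptions: collect the positioned text fragments that pypdf hands to a `visitor_text` callback during `extract_text`, then keep the fragments whose origin falls inside each link annotation's `/Rect`. The helper names `text_in_rect` and `links_with_text` and the `tol` padding are hypothetical. The sketch only looks at the text matrix `tm` and ignores the current transformation matrix `cm`, so it may miss text on pages that are scaled or rotated, and a fragment that starts before the link but runs into it will not be matched.

```python
from typing import Iterable, List, Tuple

Fragment = Tuple[float, float, str]  # (x, y, text) in PDF user-space units


def text_in_rect(fragments: Iterable[Fragment], rect, tol: float = 2.0) -> str:
    """Join the fragments whose origin lies inside rect, padded by tol."""
    x0, y0, x1, y1 = (float(v) for v in rect)
    # The corners of /Rect are not guaranteed to be ordered, so normalise them.
    left, right = min(x0, x1) - tol, max(x0, x1) + tol
    bottom, top = min(y0, y1) - tol, max(y0, y1) + tol
    hits = [t for x, y, t in fragments if left <= x <= right and bottom <= y <= top]
    # Collapse whitespace so line breaks inside the link do not leak through.
    return " ".join("".join(hits).split())


def links_with_text(pdf_path: str, page_index: int = 0) -> List[Tuple[str, str]]:
    """Return (uri, display_text) pairs for the URI links on one page."""
    from pypdf import PdfReader  # imported lazily; only this function needs pypdf

    page = PdfReader(pdf_path).pages[page_index]
    fragments: List[Fragment] = []

    def visitor(text, cm, tm, font_dict, font_size):
        # tm[4], tm[5] is the origin of the fragment in text space.
        # Assumption: cm is the identity, which does not hold for every PDF.
        fragments.append((float(tm[4]), float(tm[5]), text))

    page.extract_text(visitor_text=visitor)

    results: List[Tuple[str, str]] = []
    if "/Annots" not in page:
        return results
    for ref in page["/Annots"]:
        annot = ref.get_object()
        if annot["/Subtype"] != "/Link" or "/A" not in annot:
            continue  # not a link, or a link with a destination instead of an action
        action = annot["/A"]
        if "/URI" in action:
            label = text_in_rect(fragments, annot["/Rect"])
            results.append((str(action["/URI"]), label))
    return results
```

Calling `links_with_text("example.pdf")` on a hypothetical file would give a list of `(uri, display_text)` tuples. Rendering to PNG and running OCR is probably still the fallback for scanned pages or for links drawn over images, where there is no text layer to match against.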