The new bugfix release 3.0.5 of #Apache #PDFBox JBIG2 ImageIO plugin is available http://pdfbox.apache.org/download.html
Apache PDFBox | Download

The Apache PDFBox™ library is an open source Java tool for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. Apache PDFBox also includes several command-line utilities. Apache PDFBox is published under the Apache License v2.0.

GraphCompose: how I dragged ECS from game dev and snapshot tests from the front end into PDF generation in Java

TL;DR: I built a library for generating PDFs in Java in which: The document is described semantically (modules, sections, paragraphs, tables, layers) rather than through moveTo/lineTo/showText. Layout and rendering are two separate passes: geometry is resolved once and then drawn, so a document can be tested before a single byte of PDF has been written. Under the hood is an ECS architecture in the style of game engines: Entity / Component / System. Document entities live in an EntityManager, components are attached and detached, and systems (LayoutSystem, PaginationSystem, RenderingSystem) operate on them. Testing has three levels: unit tests → layout snapshots (like Jest for React) → visual regression via PNG diff. On a simple invoice the library takes 2.45 ms (iText 5: 1.57 ms, JasperReports: 4.45 ms). In a stress test: 50 threads, 5000 documents, 0 errors, ~2000 docs/sec. This article is about the idea and the engineering decisions that turned out to be non-trivial. If you are curious how declarative UI, ECS, and snapshot tests fit into a single PDF library, come on in.
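
The two-pass idea from the TL;DR above can be sketched in plain Java. This is only an illustration under stated assumptions, not GraphCompose's actual API: the post names EntityManager, but the component records (Text, Height, Placement), the method signatures, and the snapshot format below are all hypothetical. The layout pass resolves geometry and writes nothing, so a textual snapshot of the layout can be compared against an expected string before any rendering happens.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "layout first, render second" on top of a tiny ECS store.
// None of these names are taken from the GraphCompose code base.
final class TwoPassSketch {
    // Components are plain data attached to an entity id.
    record Text(String value) {}
    record Height(double points) {}
    record Placement(int page, double y) {}

    // Minimal entity store: entity id -> (component type -> component instance).
    static final class EntityManager {
        private final List<Map<Class<?>, Object>> entities = new ArrayList<>();

        int create(Object... components) {
            Map<Class<?>, Object> bag = new HashMap<>();
            for (Object c : components) {
                bag.put(c.getClass(), c);
            }
            entities.add(bag);
            return entities.size() - 1;
        }

        void attach(int id, Object component) {
            entities.get(id).put(component.getClass(), component);
        }

        <T> T get(int id, Class<T> type) {
            return type.cast(entities.get(id).get(type));
        }

        int size() {
            return entities.size();
        }
    }

    // Pass 1: layout and pagination. Assumes every entity has a Height component.
    // It only attaches Placement components; no output is produced here.
    static void layout(EntityManager em, double pageHeight) {
        int page = 0;
        double y = 0;
        for (int id = 0; id < em.size(); id++) {
            double h = em.get(id, Height.class).points();
            if (y > 0 && y + h > pageHeight) { // block does not fit: start a new page
                page++;
                y = 0;
            }
            em.attach(id, new Placement(page, y));
            y += h;
        }
    }

    // Layout snapshot: a stable text dump of resolved geometry (call after layout).
    static String snapshot(EntityManager em) {
        List<String> lines = new ArrayList<>();
        for (int id = 0; id < em.size(); id++) {
            Placement p = em.get(id, Placement.class);
            lines.add(id + ":p" + p.page() + "@" + p.y());
        }
        return String.join("\n", lines);
    }

    // Pass 2: rendering. It only reads resolved placements; a real renderer
    // would emit PDF drawing operators instead of strings.
    static List<String> render(EntityManager em) {
        List<String> ops = new ArrayList<>();
        for (int id = 0; id < em.size(); id++) {
            Placement p = em.get(id, Placement.class);
            ops.add("page " + p.page() + " y=" + p.y() + " text=" + em.get(id, Text.class).value());
        }
        return ops;
    }

    public static void main(String[] args) {
        EntityManager em = new EntityManager();
        em.create(new Text("Header"), new Height(40));
        em.create(new Text("Body"), new Height(40));
        em.create(new Text("Footer"), new Height(40));
        layout(em, 100);
        System.out.println(snapshot(em)); // geometry can be asserted on before rendering
        render(em).forEach(System.out::println);
    }
}
```

With three 40-point blocks on a 100-point page, the third block is pushed to the second page. A snapshot test in this style would store the snapshot string once and fail whenever a later change moves a block.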

https://habr.com/ru/articles/1030796/

#open_source #java #pdf #pdfbox #graphcompose #layout_engine #document_generation #pagination #backend #visual_regression

How it all started: a problem that drove me up the wall. In the Java world there have historically been three camps for PDF generation: Low-level drawing libraries: iText, PDFBox. Fast and powerful, but you are literally writing on paper...

Habr

The new bugfix release 2.0.36 of #Apache #PDFBox is available https://pdfbox.apache.org/download.html

The new bugfix release 3.0.7 of #Apache #PDFBox is available https://pdfbox.apache.org/download.html

ExtractPDF4J 2.0 has launched with the ability to extract tables from both text-based and scanned PDFs (with OCR). It supports multiple strategies: Stream, Lattice, OCR, and a HybridParser that automatically picks the best method. It integrates a CLI for CI/CD, annotation-based configuration, Spring Boot, and Docker. Full Javadoc documentation makes it easy to integrate into Java projects. Suited to FinTech and document-processing automation. #Java #OpenSource #PDF #OCR #DocumentAI #Automation #FinTech #BackendEngineering #PDFBox #Tesseract

https://www.reddit.com/r/programming/comments/1q5

I never realized how easy it is to create #PDF documents with #Java using #Apache #PDFBox. Printing with Java is just as easy. The #JDK has a wonderful PrintingService integrated already! Why did I always put off using it? And with a bit of AI one can easily create nice-looking documents!

#programming

@mattjohns The PDFBox command-line tool is good for this kind of thing (PDFMerger) https://pdfbox.apache.org/3.0/commandline.html

But, yeah, Windows should let you arrange PDF pages (without AI) like macOS Preview

#pdf #windows #PDFBox

Apache PDFBox | Command-Line Tools

The new bugfix release 3.0.6 of #Apache #PDFBox is available https://pdfbox.apache.org/download.html

The new bugfix release 2.0.35 of #Apache #PDFBox is available https://pdfbox.apache.org/download.html

The new bugfix release 3.0.5 of #Apache #PDFBox is available https://pdfbox.apache.org/download.html