Some devs write PDFs.
Others generate them from AI prompts — fully offline, entirely in Java.
Here’s how with Quarkus, LangChain4j, Ollama, and iText:
https://myfear.substack.com/p/ai-whitepaper-generator-quarkus-langchain4j-itext
On 14th February 2000 the first public version of the iText PDF library was released to the open-source community. A quarter of a century later, Apryse is proudly celebrating with the release of iText Suite 9.1 on iText’s 25th anniversary, which is also Valentine’s Day! This release brings significantly expanded SVG and CSS support, huge performance increases, GraalVM for pdfHTML, and a whole lot of love!
Defusing AGPL-3 With Batch Processing
Preamble: this is an intentionally incendiary post, you could even define it as trolling. I have been thinking whether to publish this for a while, as it is on the border of being the type of unkind post I wouldn’t want to be associated with. But I decided to post this for two reasons: the first is that clearly a lot of people already know what I’m talking about, so it’s not like I’m revealing some tightly-held secret, and the second is that hopefully it will make FLOSS users and activists reflect a bit on the position they take about proprietary software piracy.
Disclaimer: as usual, the content of this blog is solely my personal output. My employers, current, past, and future, do not get to review my writing, and as such this does not represent their position or opinions. I’m also not a lawyer so you should not be taking the content of this post as a legal opinion.
You may already be familiar with AGPL-3.0, the GNU Affero General Public License. This is a variation of GNU’s favourite license, designed explicitly for the purpose of making copyleft apply to Web Apps. To over-simplify a bit how this works, GPL (both v2 and v3) requires you to provide corresponding source code to the users who received a copy of the program. But when the program is a web application, only the administrator who set it up received a copy; the user accessing the application is only interacting with it. As such, the Affero variant introduces a new clause that states:
Notwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software. This Corresponding Source shall include the Corresponding Source for any work covered by version 3 of the GNU General Public License that is incorporated pursuant to the following paragraph.
together with some more language to make it compatible with GPL-3.
The “network interaction” part was definitely intended to refer to web applications, but it does work for a few other cases — back when I had a more radical stance on Free Software (currently, it’s more nuanced), I have actually released rbot plugins under AGPL-3, and made sure that they included an info command that would point at the repository with the source code. As it turns out this might not have actually satisfied the letter of the license, even though I was following the spirit of it, but that’s another problem.
Much as the history of AGPL-3 seems to be a star-studded cast of innovative heroes, there have been other similar attempts to make a stronger license apply to software that implements a network service. NoX-Wizard, an Ultima OnLine server emulator to which I contributed close to 20 years ago now, used a modified GPL-2 license for this — which made it incompatible with GPL-2 in the first place. The primary breakthrough for AGPL-3 was the wording to make it explicitly compatible with GPL-3.
On the other hand, aside from the original intention, AGPL-3 had possibly unintentional side effects, which were already covered by Armin Ronacher over ten years ago: the somewhat significant restrictions applied by AGPL-3 are very much asymmetric for projects where a commercial entity has full copyright control — just like GPL would be for embedded libraries.
A second alternative usage of AGPL-3 has been using it as a way to keep your software away from Google (and possibly a few other companies), since their stance on this license is that it is toxic. No matter whether your software can meaningfully be used over the network, releasing it under AGPL-3 has meant for a long time that Google will refuse to touch it at all. Make of that what you wish.
Much as I am not a fan of AGPL-3 at this point, there are still a few cases for which I not only see the point, but I think it is actually a perfectly acceptable choice. Slicer software for 3D printing is one of these: without AGPL-3, you would totally see a number of services turning the software into an online-only, pay-for-slicing service. So kudos to whoever decided to use this particular license for what is probably the great ancestor of all the commonly used slicers.
But none of this answers the question of “can I take an AGPL-3 project, use it, and yet not be required to release any line of code?” The answer is “it depends, but possibly”, and I’m about to get to this with a practical example.
iText, whose Wikipedia page appears heavily edited by the people involved in the project, is a commercial library that is dual-licensed under AGPL-3 and a proprietary license. This software first came to my attention around the time I started looking at my banks’ PDFs. This is because the library applies metadata to generated PDF files that explicitly notes the license it is being used under. Most people would never notice that, and even if they did notice it, only a handful of people would ever understand what it meant, and even fewer would care.
As I keep adding documents generated by different services to my hacky tool, I keep finding more and more of them generated by the AGPL-3 version of iText and its sibling iTextSharp — and I realized why this is the case: for the most part, iText’s license is not as relevant as it might sound from their website or Wikipedia entry, particularly when using it for generating invoices, bills, or statements.
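The license marker described above can be spotted with a rough heuristic. This is a minimal sketch, not the author's tool: it assumes the `/Producer` entry of the PDF's Info dictionary is stored uncompressed and as plain bytes, and that an AGPL build of iText mentions both "iText" and "AGPL" in it. The function names and the sample producer string are made up for illustration; real strings vary by iText version.

```python
from typing import Optional


def producer_snippet(pdf_bytes: bytes, window: int = 200) -> Optional[bytes]:
    """Return the bytes following the first /Producer key, or None if absent.

    Assumption: the Info dictionary is not inside a compressed object
    stream; if it is, this heuristic simply finds nothing.
    """
    idx = pdf_bytes.find(b"/Producer")
    if idx < 0:
        return None
    return pdf_bytes[idx:idx + window]


def looks_like_agpl_itext(pdf_bytes: bytes) -> bool:
    """Heuristic: does the producer metadata mention both iText and AGPL?"""
    snippet = producer_snippet(pdf_bytes)
    if snippet is None:
        return False
    lowered = snippet.lower()
    return b"itext" in lowered and b"agpl" in lowered


# Hypothetical Info dictionary fragment, for illustration only.
sample = b"<< /Producer (iText 5.x \\(AGPL-version\\)) /Title (Statement) >>"
print(looks_like_agpl_itext(sample))
```

A proper check would parse the file with a PDF library rather than scanning raw bytes; the scan is only enough to show what kind of marker to look for.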
The AGPL-3 license explicitly calls out the interaction over the network between the user and a covered program. It does not apply to the output of the program. Services that do generate PDFs for invoices, monthly bills, or bank statements do not do this interactively. Indeed in many cases it is a strong requirement that the file is not generated on the fly, but rather served from an already-generated, immutable file whose signature and checksum remain stable.
The only real use case for which the network interaction could be relevant is the on-the-fly generation of PDFs such as reports, but even that is going to be tenuous, as you could have a separate network service generate the PDF, and the web application serve it off a temporary shared storage system.
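The split between generating and serving can be sketched roughly as follows. This is an illustration of the architecture only, under stated assumptions: `render_statement` is a hypothetical stand-in for whatever PDF generator is in use, the directory layout and function names are invented, and the checksum map stands in for whatever integrity record a real service would keep.

```python
import hashlib
import tempfile
from pathlib import Path


def render_statement(customer_id: str) -> bytes:
    """Hypothetical stand-in for the call into the PDF generator."""
    return f"%PDF-1.4 statement for {customer_id}\n".encode()


def run_batch(customer_ids, out_dir: Path) -> dict:
    """Offline batch step: write each statement once and record its SHA-256."""
    checksums = {}
    for cid in customer_ids:
        data = render_statement(cid)
        (out_dir / f"{cid}.pdf").write_bytes(data)
        checksums[cid] = hashlib.sha256(data).hexdigest()
    return checksums


def serve_statement(customer_id: str, out_dir: Path, checksums: dict) -> bytes:
    """Web-facing step: reads the stored file and never calls the generator."""
    data = (out_dir / f"{customer_id}.pdf").read_bytes()
    if hashlib.sha256(data).hexdigest() != checksums[customer_id]:
        raise ValueError("stored statement changed since generation")
    return data


storage = Path(tempfile.mkdtemp())
sums = run_batch(["alice", "bob"], storage)
pdf = serve_statement("alice", storage, sums)
```

The point of the shape is that the user-facing code path only ever touches files on shared storage, while the generator runs in a separate, non-interactive step.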
Technically, you could probably use a similar setup to put a 3D printing slicer online: you would want to have the web application take a set of parameters, and then run the slicing as a background batch job, informing the users when the files are ready for them to collect. You wouldn’t be interacting with the software over the network at all, so you wouldn’t need to release any of your web application as AGPL-3. On the other hand, given the complexity of the UI, the multitude of the parameters, and so on, it does not sound likely that this would become at all useful for a commercial application.
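The submit-then-collect flow for the slicer case might look something like the sketch below. Everything in it is hypothetical: `SliceQueue`, `SliceJob`, and `fake_slicer` are illustrative names, the queue is in-memory, and a real deployment would run the worker as a separate process and notify users through some other channel.

```python
import itertools
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class SliceJob:
    job_id: int
    params: dict
    status: str = "queued"
    result: Optional[bytes] = None


class SliceQueue:
    """In-memory job queue: the web app submits, a batch worker drains."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = deque()
        self.jobs = {}

    def submit(self, params: dict) -> int:
        """Called by the web application: record parameters, return a job id."""
        job = SliceJob(next(self._ids), dict(params))
        self.jobs[job.job_id] = job
        self._pending.append(job)
        return job.job_id

    def run_pending(self, slicer) -> None:
        """Called by the batch worker: process every queued job in order."""
        while self._pending:
            job = self._pending.popleft()
            job.result = slicer(job.params)
            job.status = "ready"


def fake_slicer(params: dict) -> bytes:
    """Hypothetical stand-in for invoking the real slicer."""
    return f"; layer_height={params['layer_height']}\n".encode()


queue = SliceQueue()
job_id = queue.submit({"layer_height": 0.2})
queue.run_pending(fake_slicer)
```

The user only ever sees the job id and, later, the finished file; the slicer itself runs out of band.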
What’s the lesson here? Well, I’m not quite sure. If anything, it should make you think twice when choosing a license, to check that it actually accomplishes what you set out to do. Personally, it tells me that if you want to apply specific restrictions in the hope of getting commercial actors to act against their best interest, you’re probably just slowing them down a little, and making it more complicated for the consumers that will have to use the service in the first place.
This is why I’ve been thinking how to make unpaper easier to integrate: the tool is licensed under GPL-2, but it is used in more permissively licensed software by executing it at arm’s length, rather than linking it directly into them. Making it easier to use in that mode might be considered defeating the spirit of GPL-2, but the reality is that making it harder to use is just a disservice to the users, and does not really bring in any more contributions.
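Executing a tool at arm's length, as opposed to linking it, can be sketched like this. It assumes `unpaper` is installed and on the `PATH`, and uses only the basic `unpaper <input> <output>` invocation; the helper names and file names are hypothetical.

```python
import subprocess
import sys


def unpaper_command(input_path, output_path, executable="unpaper"):
    """Build the argument list for a basic `unpaper <input> <output>` call."""
    return [executable, str(input_path), str(output_path)]


def run_at_arms_length(argv):
    """Run a tool as a separate process and return its standard output.

    Raises subprocess.CalledProcessError if the tool exits with an error.
    """
    completed = subprocess.run(argv, check=True, capture_output=True, text=True)
    return completed.stdout


# The same helper works for any external program; here the Python
# interpreter itself is used so the example runs without unpaper installed.
print(run_at_arms_length([sys.executable, "-c", "print('ok')"]).strip())
```

With unpaper available, the call would be `run_at_arms_length(unpaper_command("scan.pgm", "clean.pgm"))`: the calling program only exchanges files and an exit status with the tool, and never loads its code.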
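This year's @COSCUP 2024 legal track, "Open Licensing Kaleidoscope", has two talks on infringement disputes. The first introduces the core obligations of GPL-2.0 and the related infringement disputes, and is suitable for anyone who wants to understand the topic. The second addresses the copyright-troll behaviour seen recently in Taiwan, explaining the relevant provisions of AGPL-3.0 and their reasonable limits, and is suitable for an intermediate audience. Anyone interested in these two talks is welcome to attend.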
(Chinese)
Typical Infringement Disputes of GPL
14:50-15:20
Florence T.M. Ko 葛冬梅
https://coscup.org/2024/en/session/H7DTXT
(Chinese)
iText & QT -- Copyright or Copywrong? Copyleft or Copytroll?
15:25-15:55
Lucien C.H. Lin 林誠夏
https://coscup.org/2024/en/session/TRHD9K

@julientaq @spaceninja Do you happen to know how well those might support PDF/UA or WCAG 2 standards for accessible PDFs? Anything that doesn't is a non-starter for me.
iText and PrinceXML both claim to support PDF/UA.
On the WeasyPrint site, they also claim to be able to produce PDF/UA compliant files. I wonder if anyone knows of examples I might look at?
I don't see any mention of accessibility in vivliostyle or paged.js.
Bruno Lowagie, creator of #iText: From #FOSS to COSS https://codecodeship.com/blog/2023-04-07-bruno-lowagie
Bruno, first of all, thanks for agreeing to this interview. In 2004, iText was 4 years old, and in widespread use. Can you tell us what factored into the decision to commercialize your library? iText started as a project that I maintained after hours ...