Voting is underway for #ApacheTika 3.3.0! Please give it a try and let us know if there are any surprises!

https://lists.apache.org/thread/pq4zjvqf3w5zbm5yoyg14qvr2kpd2by3

On #ApacheTika, we're moving entirely to JSON for configuration in 4.x.

If you use tika-server and are interested in runtime configuration, please take a look and offer feedback:

https://lists.apache.org/thread/jlt8jv47t8tm58dlrnxsrfodxm2d6o0z

Please repost for reach.

โš ๏ธ CRITICAL XXE bug (CVE-2025-66516, CVSS 10.0) in Apache Tika (tika-core, tika-pdf-module, tika-parsers). Exploitation via crafted PDFs can lead to file disclosure & RCE. Upgrade to 3.2.2+ ASAP! https://radar.offseq.com/threat/critical-xxe-bug-cve-2025-66516-cvss-100-hits-apac-d08561e7 #OffSeq #ApacheTika #XXE #Security
๐Ÿšจ CVE-2025-66516 CRITICAL: XXE in Apache Tika core (v1.13โ€“3.2.1), tika-pdf-module, tika-parsers. Exploitable via crafted PDF XFA files โ€” risks data exfil & DoS. Patch to 3.2.2+ now! https://radar.offseq.com/threat/cve-2025-66516-cwe-611-improper-restriction-of-xml-fa601313 #OffSeq #ApacheTika #XXE #Vuln
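
For anyone triaging a dependency list, here is a minimal sketch of the version check implied by the posts above. It assumes only the range quoted there for Tika core (1.13 through 3.2.1, fixed in 3.2.2). The helper names `parse_version` and `is_affected` are hypothetical, and the other modules may have different affected ranges, so check the advisory itself before relying on this.

```python
from typing import Tuple

# Range taken from the posts above (Tika core): 1.13 <= version < 3.2.2.
FIRST_AFFECTED = (1, 13)
FIRST_FIXED = (3, 2, 2)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.2.1' into (3, 2, 1). A suffix such as '-SNAPSHOT' is ignored."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def is_affected(version: str) -> bool:
    """Return True if the version falls inside the range quoted in the posts."""
    return FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED


if __name__ == "__main__":
    for candidate in ("1.12", "1.13", "2.9.2", "3.2.1", "3.2.2", "3.3.0"):
        print(candidate, "affected" if is_affected(candidate) else "not in quoted range")
```

Python compares tuples element by element, which avoids the string-comparison trap where "1.9" sorts after "1.13".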

RE: https://mastodon.social/@tallison/115452030199746498

Please join me tomorrow, November 13 at noon EST to chat #ApacheTika.

Please DM me for the connection info.

LOL... given that I'm going to be a remote presenter, I taped my Digital Preservation Bake-off talk last night in case I have Wi-Fi problems during the session.

I really wish conferences would require 3 or 4 videos of the talk before I'm allowed to speak.

#ipres2025 #digipresBakeoff #ApacheTika

In belated celebration of World Digital Preservation Day, I'm throwing a "What's new with Apache Tika/Office hours" meetup at noon on November 13 EST.

This is intended for anyone interested in files, from search to digital preservation to file forensics and reverse engineering.

https://www.meetup.com/apache-tika-community/events/311746184

#wdpd2025 #ApacheTika

Apache Tika -- What's New/Office Hours, Thu, Nov 13, 2025, 12:00 PM | Meetup

This will be an expansion of my presentation at the Digital Preservation Bake Off (Tools Demonstration) #iPres2025 and a late entry to celebrate World Digital Preservation


If I hosted an #ApacheTika demo/office hours on Thursday, Nov 6 at noon EST, would that time work?

#wdpd2025

@mutanthumb

Maybe I should throw a demo/office hours for #ApacheTika on #wdpd2025?

@mutanthumb

Yes, #ApacheTika will extract what the PDF claims to be.

These are some of the fields that I'll focus on in the #digipresBakeoff #ipres2025 #ipresBakeOff

These include PDF/A and PDF/X. hasMarkedContent suggests PDF/UA.
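
A rough sketch of how those claims could be read back out of extracted metadata, assuming the metadata arrives as a flat dict (for example, parsed from JSON output). The key names `pdfaid:part`, `pdfaid:conformance`, `pdfxid:version`, and `pdf:hasMarkedContent` are my assumptions about what the fields are called, and `claimed_pdf_flavors` is a hypothetical helper, so check the keys against real output first. This only reports what the file alleges; it validates nothing.

```python
from typing import Dict, List


def claimed_pdf_flavors(metadata: Dict[str, str]) -> List[str]:
    """Summarize which PDF flavors a file claims, based on extracted metadata.

    Key names are assumptions; adjust them to match the actual output.
    """
    claims: List[str] = []

    # PDF/A claim: a part number plus an optional conformance level, e.g. 1 + B.
    part = metadata.get("pdfaid:part")
    if part:
        level = metadata.get("pdfaid:conformance", "").lower()
        claims.append("PDF/A-" + part + level)

    # PDF/X claim: reported as-is.
    pdfx = metadata.get("pdfxid:version")
    if pdfx:
        claims.append(pdfx)

    # Marked content is only a hint toward PDF/UA, not a conformance claim.
    if str(metadata.get("pdf:hasMarkedContent", "")).lower() == "true":
        claims.append("possibly PDF/UA (marked content present)")

    return claims


if __name__ == "__main__":
    sample = {
        "pdfaid:part": "1",
        "pdfaid:conformance": "B",
        "pdf:hasMarkedContent": "true",
    }
    print(claimed_pdf_flavors(sample))
```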