Voting is underway for #ApacheTika 3.3.0! Please give it a try and let us know if there are any surprises!
https://lists.apache.org/thread/pq4zjvqf3w5zbm5yoyg14qvr2kpd2by3
On #ApacheTika, we're moving entirely to JSON for configuration in 4.x.
If you use tika-server and are interested in runtime configuration, please take a look and offer feedback:
https://lists.apache.org/thread/jlt8jv47t8tm58dlrnxsrfodxm2d6o0z
Please repost for reach.
RE: https://mastodon.social/@tallison/115452030199746498
Please join me tomorrow, November 13 at noon EST to chat #ApacheTika.
Please DM me for the connection info.
LOL... given that I'm going to be a remote presenter, I taped my Digital Preservation Bake-off talk last night in case I have Wi-Fi problems during the session.
I really wish conferences would require 3 or 4 videos of the talk before I'm allowed to speak.
In belated celebration of World Digital Preservation Day, I'm throwing a "What's new with Apache Tika/Office hours" meetup at noon EST on November 13.
It's intended for anyone interested in files, from search to digital preservation to file forensics and reverse engineering.
https://www.meetup.com/apache-tika-community/events/311746184
If I hosted an #ApacheTika demo/office hours on Thursday, Nov 6 at noon EST, would that time work?
Maybe I should throw a demo/office hours for #ApacheTika on #wdpd2025?
Yes, #ApacheTika will extract what the PDF alleges it is.
These are some of the fields that I'll focus on in the #digipresBakeoff #ipres2025 #ipresBakeOff
These include PDF/A and PDF/X; hasMarkedContent suggests PDF/UA.
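
A minimal sketch of how one might summarize those claims, assuming the extracted metadata has already been parsed into a plain dict (for example from JSON output). The key names used here (`pdfaid:part`, `pdfaid:conformance`, `pdfxid:version`, `pdf:hasMarkedContent`) are assumptions for illustration and may not match what your Tika version emits, and `claimed_pdf_standards` is a hypothetical helper, not part of Tika. It reports only what the file alleges; it validates nothing.

```python
def claimed_pdf_standards(metadata):
    """List the standards a PDF *claims* to follow, per its metadata.

    `metadata` is a plain dict of extracted fields. All key names below
    are assumed for illustration; check them against real output.
    """
    claims = []

    # PDF/A: an assumed part number ("1", "2", ...) plus optional level ("A", "B", ...)
    part = metadata.get("pdfaid:part")
    if part:
        level = str(metadata.get("pdfaid:conformance", "")).lower()
        claims.append("PDF/A-" + str(part) + level)

    # PDF/X: assumed to hold the full version string, e.g. "PDF/X-4"
    pdfx = metadata.get("pdfxid:version")
    if pdfx:
        claims.append(str(pdfx))

    # Marked content is only a hint toward PDF/UA, not a claim of conformance
    if str(metadata.get("pdf:hasMarkedContent", "")).lower() == "true":
        claims.append("possibly PDF/UA (has marked content)")

    return claims


if __name__ == "__main__":
    sample = {
        "pdfaid:part": "1",
        "pdfaid:conformance": "B",
        "pdf:hasMarkedContent": "true",
    }
    print(claimed_pdf_standards(sample))
```

Treat the result as a starting point for a validator, since a file can claim a standard it doesn't meet.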