archive DataHoarders

markdown formatted

A nicely presented DataHoarders archive of the Epstein files has been created.

The archive is accessible online; the links are listed under Sources.

Even if the content is less interesting to you, the way the frontend and backend are built is quite interesting. I have interests in both backend and frontend programming and in networking, so I think this is a treasure trove from both perspectives.

YMMV

When you glance through the Wikipedia page on Jeffrey Epstein you will find interesting tidbits about his nature, rise, and fall. Read it multiple times and you will know more than you may want to about this man, who was enabled by different forces to flourish in his behaviour. Go in with a neutral mind and read the sources; go there if you want to know more.

The Wikipedia article on Epstein is LONG and the amount of data is massive. Don't expect to get through it in just a few minutes.

The article lists 305 references.

When you go to this DataHoarders media archive you will find a pleasant presentation of the visual and printed data as released by the US DOJ.

Quotes from the archive creators:

Hey! We are two college students and we just want to share the technical part of our project because you might appreciate it. The DOJ released the Epstein files and we decided to host the entire thing ourselves and build a proper interface on top of it. Here is what the archive actually looks like.

354GB total. 160GB of raw data from the original files and 194GB of our own processed data. Around 600,000 PDF files which actually contain roughly 1,400,000 individual pages inside them since many PDFs bundle multiple pages together when you scroll down. All 3,200 videos have been converted to HLS with adaptive bitrate streaming so quality adjusts automatically to your connection the same way Netflix does it.

For the videos we ran a full audio extraction pipeline, converting video to audio MP4 and then audio to text, generating SRT subtitle files for every single video that contains spoken content. This means you can search for a word that was spoken in any video and find the exact moment it was said.

For the PDFs we converted every single page to PNG and ran OCR across all 1,400,000 pages. We then used Go to run AI agents that analyze and summarize the OCR output across the documents. The search engine works through tags associated to each specific file, built on top of all that processed data.

The frontend is React Native, infrastructure runs through Cloudflare.

We also added the possibility for a user to make an anonymous account to like, add a comment and reply to others or make your own investigation post on our platform.

We are not stopping here. There is still a lot to do and we are pushing updates constantly.

Z
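To get a feel for the "search a spoken word, find the exact moment" idea the creators describe, here is a minimal sketch in Python. It is not their code: the sample subtitle text, the `Cue` type, and the function names `parse_srt` and `find_word` are all made up for illustration, and a real archive would search an index rather than scan every file.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    """One subtitle cue (hypothetical helper type for this sketch)."""
    index: int
    start: str  # "HH:MM:SS,mmm"
    end: str
    text: str


# SRT timing line, e.g. "00:00:04,000 --> 00:00:06,250"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT text into cues; blocks that do not look like cues are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue
        match = TIMING.search(lines[1])
        if not lines[0].strip().isdigit() or match is None:
            continue
        text = " ".join(line.strip() for line in lines[2:])
        cues.append(Cue(int(lines[0]), match.group(1), match.group(2), text))
    return cues


def find_word(cues: List[Cue], word: str) -> List[Cue]:
    """Return the cues whose text contains `word` as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [cue for cue in cues if pattern.search(cue.text)]


# Invented sample data, not taken from the archive.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
Good morning, everyone.

2
00:00:04,000 --> 00:00:06,250
The meeting starts now.

3
00:00:07,000 --> 00:00:09,000
Thanks for joining the meeting.
"""

if __name__ == "__main__":
    for cue in find_word(parse_srt(SAMPLE_SRT), "meeting"):
        print(cue.start, cue.text)
```

The start timestamp of each matching cue is what a player would seek to, which is why per-video SRT files make spoken-word search possible at all.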

Naturally, ffmpeg and curl are a crucial tool combo for making all this conversion, fetching, and serving work smoothly, but I don't need to tell you that. Many more tools are used; go in, read, and learn!
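For those curious what "HLS with adaptive bitrate" boils down to on disk: each video gets several renditions plus one master playlist that lists them, and the player picks a rendition based on the connection. Below is a rough Python sketch that writes such a master playlist. The bitrate ladder, the `Rendition` type, and the `360p/index.m3u8` directory layout are my own assumptions for illustration; the archive's actual ladder and file layout are not stated anywhere in the sources.

```python
from typing import Iterable, NamedTuple


class Rendition(NamedTuple):
    """One encoded quality level (hypothetical values, see LADDER below)."""
    name: str
    width: int
    height: int
    bandwidth: int  # peak bits per second advertised to the player


# Assumed ladder for illustration only; not the archive's real settings.
LADDER = [
    Rendition("1080p", 1920, 1080, 5_000_000),
    Rendition("360p", 640, 360, 800_000),
    Rendition("720p", 1280, 720, 2_800_000),
]


def master_playlist(renditions: Iterable[Rendition]) -> str:
    """Build an HLS master playlist that points at one media playlist per rendition."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    # List renditions from lowest to highest bandwidth for readability.
    for r in sorted(renditions, key=lambda r: r.bandwidth):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth},RESOLUTION={r.width}x{r.height}"
        )
        # Assumed layout: each rendition lives in its own folder.
        lines.append(f"{r.name}/index.m3u8")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(master_playlist(LADDER), end="")
```

The segments and per-rendition playlists themselves would come out of an encoder such as ffmpeg; this sketch only covers the small text file that ties the renditions together.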

Sources:

https://exposingepstein.com/home

https://en.wikipedia.org/wiki/Jeffrey_Epstein

https://www.reddit.com/r/DataHoarder/comments/1shx4po/we_scraped_processed_and_now_host_the_entire_doj/

#programming #database #video #HLS #pdf #recoding #streaming #json #backend #frontend #react #srt #subtitles #FFMPEG

ExposingEpstein

How to fix: Jellyfin Library Scan Hangs

Just recently I wrote about How to install Jellyfin on a RaspberryPi. It all started promising until – after adding a couple of movies – the Library Scan suddenly hung. Unfortunately I didn’t dig too deep into all kinds of logs (which I certainly forgot but which might have given a hint).

[…]

https://www.locked.de/how-to-fix-jellyfin-library-scan-hangs/

#Handbreak #Jellyfin #RaspberryPi #recoding #Video


The IT Blog
[R(ae)cord] It's a really addictive and fun song! 🎧 | aespa ‘Trick or Trick’ Recording Behind the Scenes

YouTube
[R(ae)cord] aespa 에스파 ‘Drama’ Recording Behind The Scenes

YouTube
When water temperatures change, the molecular motors of cephalopods do too

Cephalopods are a large family of marine animals that includes octopuses, cuttlefish and squid. They live in every ocean, from warm, shallow tropical waters to near-freezing, abyssal depths. More remarkably, report two scientists at University of California San Diego in a new study, at least some cephalopods possess the ability to recode protein motors within cells to adapt to different water temperatures.

Phys.org