Technical question: how would you design a searchable store of 200,000 PDFs on 128GB media? Suggestions wanted on tools, data structures, and search architecture. Goal: it must remain accessible on commodity computers 20 years from now.

#Verbatim #PDF #DataArchiving #OfflineSearch #TechSolutions #VietnamTech #DatabaseDesign

http://Archive.org

Internet Archive: Digital Library of Free & Borrowable Texts, Movies, Music & Wayback Machine

Hiring: a job scraping 300,000 PDF book titles from AbeBooks and locating the files via the Wayback Machine/Anna's Archive. The 4TB of data in total will be stored on 128GB optical discs (Verbatim/Panasonic) so that it stays readable for 100 years. Budget: $700 (materials not included).

#TuyểnDụng #Scraping #LưuTrữDữLiệu #PDF #AbeBooks
#Hiring #DataScraping #DataArchiving #PDF

https://www.reddit.com/r/programming/comments/1o4te1o/hiring_scrape_300000_pdfs_and_archive_to_128_gb/

Why 7-Zip is the Go-To Software for High-Ratio Compression and Encryption

7-Zip is a free, open-source file archiver, widely recognized for its high compression ratio and its versatility in handling various archive formats.

GTech Booster

Starting 1 September, you can once again drop in online every Monday morning between 10 and 11 with all your questions about #DataArchiving, #DataManagement, or the @DANS_knaw_nwo services.
https://dans.knaw.nl/en/agenda/23892/
Open Hour SSH: live Q&A on Monday - DANS

DANS

Terence Eden: You don’t need an API key to archive Twitter Data. “Apparently there’s no need for IP laws any more, so here’s a way to archive high-fidelity Twitter data without signing up for an expensive API key. This is perfect for academics wishing to preserve Tweets, journalists wanting to download evidence, or simply embedding content without leaking user data back to Twitter.”

https://rbfirehose.com/2025/04/16/terence-eden-you-dont-need-an-api-key-to-archive-twitter-data/

Terence Eden: You don’t need an API key to archive Twitter Data | ResearchBuzz: Firehose

ResearchBuzz: Firehose | Individual posts from ResearchBuzz

If I had to pick one YouTube channel to archive, it would be the PBS CrashCourse channel.

Actually someone should probably back that channel up before the republicans dismantle PBS. I might try to do that myself soon once I get the time. #youtube #archive #dataarchiving

Interested in #OpenScience & the state of #DataArchiving in ancient #genomics research (maybe a bit niche, but still 😉)?

My colleagues Anders Bergström, Tina Warinner, & I, all working in #aDNA, have published an invited commentary piece describing the challenges in archiving both sequencing data & #metadata. We propose some solutions (some already active!) to improve adherence to #FAIR in #palaeogenomics. We hope some may serve as a model for the wider #ArchaeologicalSciences field.

https://rdcu.be/d21Kn

#TheMetalDogArticleList #ArsTechnica

Music industry’s 1990s hard drives, like all HDDs, are dying

The music industry traded tape for hard drives and got a hard-earned lesson.

arstechnica.com/gadgets/2024...

#MusicIndustry #HardDrives #DataArchiving #Entropy #Storage #Failure #DataLoss

Ars Technica

Why baking and data archiving really aren’t all that different - Lab Horizons

Proper documentation in science is vitally important, but did you know how closely related its challenges are to those of baking? Read about data archiving best practices and all the ingredients you need to make it go well.

Lab Horizons - Exploring the Bright Future of Science in a Digital World