🚀 Running #OCR at scale with a #Vision #LLM for $0.49/hour

Just deployed dots.ocr (3B parameter Vision LLM by RedNote) on a single #RTX A6000 (48GB VRAM) via #RunPod. The results are great:

https://github.com/rednote-hilab/dots.ocr

#ai #opensource

📄 The Setup
- Upload any #PDF → server converts each page to an image (PyMuPDF)
- Images are sent in parallel to #vLLM (continuous batching)
- The Vision LLM reads each page and returns clean Markdown
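The per-page flow above can be sketched roughly like this (the model name, prompt, and request shape are my assumptions based on vLLM's OpenAI-compatible chat API, not taken from the dots.ocr repo):

```python
import base64

# Assumed model name and prompt -- adjust to whatever the deployment uses.
MODEL = "rednote-hilab/dots.ocr"
PROMPT = "Transcribe this page to clean Markdown."

def page_to_request(png_bytes: bytes) -> dict:
    """Build one chat-completion payload for a single rendered PDF page."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                # The page image goes in as a base64 data URL...
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
                # ...followed by the transcription instruction.
                {"type": "text", "text": PROMPT},
            ],
        }],
    }
```

Each page rendered by PyMuPDF gets wrapped like this and POSTed to the vLLM server; firing the requests concurrently lets vLLM's continuous batching handle the parallelism.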

🧵 👇

So, #tech question targeting #cloudGaming as well as #selfHosting / #selfHosted:

I've been looking around at GPU-on-demand providers and came across a number of decent offerings, currently favouring #RunPod (https://runpod.io).

However, storage is, as always, the killing blow to any self-run, cloud-hosted game streaming setup. I can live with as little as 250 GB, but my budget frame is around $15/month max including GPU hours (we're talking 15-20 h/month at most). Beyond that you end up in GFN territory, which I'm not willing to pay for.

Talking to some friends last night got me onto an interesting track, though: what if you run a RunPod pod with only session storage (which AFAIK grows dynamically at no cost but is lost once the system shuts down) and back up / restore that session storage on the fly to a provider like #Wasabi?

I checked a few providers in that regard, and Wasabi seems to be the only one with no egress fees as long as you play fair. The big question: does "download 250 GB in one go about 4-6 times a month" still count as fair?
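A quick back-of-envelope check helps here. As I understand Wasabi's fair-use guideline (an assumption; check their current policy), monthly egress shouldn't exceed the amount of data you actively store, and repeated full restores blow past that quickly:

```python
# Back-of-envelope egress check. Wasabi's fair-use guideline is roughly
# "monthly downloads should not exceed active storage" -- treat this as
# an assumption; the exact policy wording may change.
def monthly_egress_gb(stored_gb: float, full_restores: int) -> float:
    """Total monthly egress when the whole dataset is restored each time."""
    return stored_gb * full_restores

for restores in (4, 6):
    egress = monthly_egress_gb(250, restores)
    print(f"{restores} full restores/month: {egress:.0f} GB egress vs 250 GB stored")
```

At 4-6 full restores that's 1,000-1,500 GB of egress against 250 GB stored, i.e. 4-6x the stored volume, so by that guideline it probably doesn't qualify as playing fair.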

Are there alternatives?

I'm all ears...

And no, "get a gaming PC" doesn't cut it: for one, I want to be able to stream my games via Moonlight and Parsec, and for another, the price of even a low-end gaming rig for this purpose would cover the cost of a cloud-hosted environment for the next decade.


Runpod, an AI cloud startup launched from Reddit in 2022, has today hit $120M ARR and 500k developers. The platform offers "near on-prem" GPUs with security, serverless scaling, simple APIs, no long-term contracts, and H100 support. Thanks to the community! #AI #Cloud #GPU #Startup #Runpod #Technology

https://www.reddit.com/r/LocalLLaMA/comments/1qib2ks/runpod_hits_120m_arr_four_years_after_launching/

Why 90% of LLM Fine-Tuning Fails & How RunPod Fixes It

…A practical breakdown of data, cost and control

Medium
On #RunPod, an #RTX5090 running #QwenImageEdit2509 in #ComfyUI takes about 40 seconds to generate one image; on #Replicate, the official #QwenImageEditPlus generates one in roughly 8 seconds.

Running Kyutai Unmute on a Runpod L40s server with a single GPU. The project can run directly on iOS devices via an OpenAI-compatible WebRTC connection. #KyutaiUnmute #Runpod #LLaMA #ArtificialIntelligence #AI

https://www.reddit.com/r/LocalLLaMA/comments/1okvyud/run_kyutai_unmute_on_a_runpod_l40s_singlegpu/

🚀 Tired of burning weekends fixing infra? RunPod 2025 makes GPU deploys boring (in the best way). Pods, endpoints & MCP turn ideas into live projects faster than ever. ⚡

👉 Read the full guide:
https://medium.com/@rogt.x1997/pods-endpoints-and-a-smoother-future-the-hidden-simplicity-of-runpod-f9bace9e1a8c

#RunPod #AIInfra #AIBuilders

Pods, Endpoints and a Smoother Future: The Hidden Simplicity of RunPod

…Why infra finally feels boring (in the best possible way)

Medium

🚨 Trained a GPT-style model for just $0.80 in 90 mins.
🤯 No GPU farm. No million-dollar lab. Just LoRA + RunPod magic.
This changes everything for indie devs, students & lean startups.
👇 Read the future of fine-tuning here:
https://medium.com/@rogt.x1997/lora-runpod-the-0-80-ai-revolution-you-cant-afford-to-ignore-c14c2ed857a9
#LoRA #AIRevolution #RunPod
LoRA + RunPod: The $0.80 AI Revolution You Can’t Afford to Ignore

In 2025, the most disruptive force in AI isn’t a billion-dollar foundation model. It’s LoRA — Low-Rank Adaptation — teamed up with RunPod. Together, they’re reshaping fine-tuning from a luxury…

Medium

💻 Ever wondered how startups are training 70B parameter models for under $10?

This is your backstage pass to the AI cloud revolution:
• 64 H100s
• 75% cost savings
• 240K tokens per dollar
⚙️ RunPod is quietly powering the next wave of GenAI breakthroughs.
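Taking the quoted rate at face value, "240K tokens per dollar" is easy to sanity-check against a token budget (the 1B-token workload below is a hypothetical example, not a figure from the article):

```python
TOKENS_PER_DOLLAR = 240_000  # figure quoted in the post

def training_cost_usd(total_tokens: float) -> float:
    """Implied training cost for a given token budget at the quoted rate."""
    return total_tokens / TOKENS_PER_DOLLAR

print(training_cost_usd(1_000_000_000))  # hypothetical 1B-token run
```

That works out to roughly $4.2K for a 1B-token run, which gives a feel for the scale the headline numbers imply.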

🔥 Read the full case study now:
👉 https://medium.com/@rogt.x1997/why-64-h100s-on-runpod-beat-hyperscalers-and-how-one-startup-slashed-65-of-their-ai-costs-ba251302015e
#LLM #RunPod #GPUCloud #GenAI #TokenEconomy #Mistral

Why 64 H100s on RunPod Beat Hyperscalers-And How One Startup Slashed 65% of Their AI Costs…

In the high-stakes world of generative AI, training a language model isn’t just a technical task — it’s an economic and architectural challenge. Massive models like LLaMA 3 or DeepSeek R1 demand…

Medium