Show HN: Gemini can now natively embed video, so I built sub-second video search

Gemini Embedding 2 can project raw video directly into a 768-dimensional vector space alongside text. No transcription, no frame captioning, no intermediate text. A query like "green car cutting me off" is directly comparable to a 30-second video clip at the vector level.

I used this to build a CLI that indexes hours of footage into ChromaDB, then searches it with natural language and auto-trims the matching clip. Demo video on the GitHub README.
Indexing costs ~$2.50/hr of footage. Still-frame detection skips idle chunks, so security camera / sentry mode footage is much cheaper.

https://github.com/ssrajadh/sentrysearch

GitHub - ssrajadh/sentrysearch: Semantic search over videos using Gemini Embedding 2.

Semantic search over videos using Gemini Embedding 2. - ssrajadh/sentrysearch

GitHub
Where is the Exit to this dystopia?
In the matrix the exit was pay phones, which perhaps explains why our overlords are removing them