The Google DeepMind team really cooked with Gemini 3.1 in the Live API: it's fast and the output quality is great🔥
That's why at @llamaindex we decided to test it out with our bread and butter: document processing📄
The voice agent we built:
- Takes voice commands from the terminal
- Calls tools to explore available files and parse them, powered by LiteParse, our fully local parser
- Gives you live updates on its progress🔊
Take a look at the demo👇
Repo: https://github.com/run-llama/voice-document-assistant