Mastodawn

RT @MiniMax_AI: An impressive deep dive from the @togethercompute team on running MiniMax M3 in production. M3, with its 1-million-token context window, native multimodality, and MiniMax Sparse Attention, takes real work on paged decode, index scoring, and multimodal preprocessing to serve efficiently. This is what partnership at the frontier looks like🤝. Together AI (@togethercompute) x.com/i/article/206189124776… https://nitter.net/togethercompute/status/2061894792020197881#m

more at Arint.info

#AIInfrastructure #MiniMaxM3 #MultimodalAI #ProductionAI #SparseAttention #TogetherAI #arint_info

https://x.com/MiniMax_AI/status/2061913941702533241#m

Winbuzzer 4h ago

https://winbuzzer.com/2026/06/04/google-gemma-4-12b-targets-local-ai-agents-on-laptops-xcxwbn/

Google has released Gemma 4 12B, a multimodal AI model that runs locally on laptops with 16GB of memory and handles audio, images, code, and tool calls.

#AI #Gemma4E4B #Gemma4 #GoogleGemma #Gemma #Google #GoogleAI #GoogleDeepMind #AIModels #MultimodalAI #OpenSourceAI #OnDeviceAI #AIAgents #AgenticAI

NERDS.xyz – Real Tech News for Real Nerds [Unofficial] 10h ago

Google wants your next AI agent running locally on a 16GB laptop

https://fed.brid.gy/r/https://nerds.xyz/2026/06/google-gemma-4-12b-local-ai/

Arint - SEO+KI 1d ago

RT @MiniMax_AI: An impressive in-depth look from the @togethercompute team at running MiniMax M3 in production. M3, with its 1-million-token context window, native multimodality, and MiniMax Sparse Attention, takes real work on paged decode, index scoring, and multimodal preprocessing to become efficient. This is what partnership at the cutting edge of technology looks like🤝. Together AI (@togethercompute) x.com/i/article/206189124776… https://nitter.net/togethercompute/status/2061894792020197881#m

more at Arint.info

#AIInfrastructure #LLMOps #MiniMaxM3 #MultimodalAI #SparseAttention #TogetherAI #arint_info

https://x.com/MiniMax_AI/status/2061913941702533241#m

Winbuzzer 1d ago

https://winbuzzer.com/2026/06/02/microsoft-adds-seven-mai-models-to-foundry-for-developers-xcxwbn/

Microsoft is putting seven first-party MAI models into developer channels, led by the MAI-Thinking-1 reasoning model in Foundry private preview.

#AI #MAIThinking1 #MicrosoftFoundry #Microsoft #MicrosoftAI #AIModels #MultimodalAI #Build2026

Winbuzzer 2d ago

https://winbuzzer.com/2026/06/01/nvidia-launches-cosmos-3-with-openmdw-for-physical-ai-xcxwbn/

NVIDIA has launched Cosmos 3 as a physical-AI model that combines scene reasoning, multimodal generation and action output, tying the release to a new OpenMDW licensing framework.

#AI #NVIDIA #PhysicalAI #AIModels #MultimodalAI #WorldModels #Robotics

Winbuzzer 2d ago

https://winbuzzer.com/2026/06/01/minimax-launches-m3-with-1m-context-multimodal-push-xcxwbn/

MiniMax is pushing M3 into the long-context model race with multimodal input and a claimed 1 million-token window.

#AI #MiniMax #AIModels #MultimodalAI #AgenticAI #AICoding #MiniMaxM3 #ChinaAI

Radar Kilat May 27

Hark raised a $700 million Series A at a $6 billion valuation in May 2026

#aihardware #startupfunding #multimodalai #seriesafunding #personalaiassistant

https://radarkilat.com/en/article/hark-raises-700-million-series-a-with-6-billion-valuation-to-build-ai-personal-assistant-hardware-and-models

Winbuzzer May 25

https://winbuzzer.com/2026/05/25/bytedance-hkust-find-better-long-document-ai-training-xcxwbn/

ByteDance AI Training Study: Multimodal Q&A Strategy Beats Raw OCR-Transcription Input

#AI #ByteDance #HKUST #AIModels #AIResearch #AITraining #MultimodalAI

bosh May 24

Coming back from a trip almost always means ending up with an unmanageable number of photos. In the case of Lisbon, the problem was not so much archiving the shots as extracting about twenty truly shareable ones: beautiful, yes, but also varied and able to tell the story of the experience as a whole. PhotoPrism already offered an excellent foundation with geolocation, facial recognition, labels, and organization tools, but it did not yet have a way to automatically compose an album of “the most beautiful photos”, and above all one with enough variety.

That led to the idea of an AI selector: a small Java application that uses PhotoPrism to fetch image thumbnails and Ollama to run two AI models. The first is a multimodal model that assigns an aesthetic score and produces an objective description; the second is a text-only model that groups the photos semantically and selects them with more balance.

The real problem was not quality

The first prototype did something very simple: take the photos from PhotoPrism, send them to a multimodal model on Ollama, and ask for an aesthetic score from 1 to 100 together with a short description. On paper that seemed sufficient, but in practice it produced a monotonous selection: images that were very beautiful individually but often too similar to one another.
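A minimal sketch of that scoring step, written in Python for brevity although the original application is Java. It only builds a request body for Ollama's `/api/generate` endpoint and parses a JSON reply, without sending anything. The prompt wording, the `llava` model name, the reply keys `score` and `description`, and both function names are assumptions for illustration, not the author's code.

```python
import base64
import json


def build_score_request(image_bytes: bytes, model: str = "llava") -> dict:
    """Build a JSON body for POST /api/generate on a local Ollama server.

    The model name and the prompt text are illustrative assumptions.
    """
    prompt = (
        "Rate the aesthetic quality of this photo from 1 to 100 and give a "
        "short objective description. Reply as JSON with the keys "
        '"score" and "description".'
    )
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",   # ask the model for a JSON-formatted reply
        "stream": False,    # one complete reply instead of streamed chunks
    }


def parse_score_reply(reply_text: str) -> tuple[int, str]:
    """Parse the model's JSON reply and clamp the score into 1..100."""
    data = json.loads(reply_text)
    score = max(1, min(100, int(data["score"])))
    description = str(data.get("description", "")).strip()
    return score, description
```

Clamping the score guards against a model that answers outside the requested range, which is a defensive choice of this sketch rather than something the post describes.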

It was the classic case in which pure ranking optimizes local quality but not narrative coverage. If five photos of the same view or the same moment receive high scores, a naive algorithm tends to pick them all. To build an album worth sharing, rewarding the best images is not enough: repetition must also be avoided.
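The difference between pure ranking and a coverage-aware pick can be sketched as below, in Python for brevity. This is one simple round-robin strategy under the assumption that each photo already carries a score and a cluster label; the excerpt does not say how the original Java application balances the selection, and the function name and tuple layout are hypothetical.

```python
from collections import defaultdict


def select_diverse(photos, limit):
    """Pick up to `limit` photo ids, cycling through clusters.

    `photos` is a list of (photo_id, score, cluster) tuples.
    """
    by_cluster = defaultdict(list)
    for photo in photos:
        by_cluster[photo[2]].append(photo)

    # Within each cluster, best score first.
    queues = [
        sorted(group, key=lambda p: p[1], reverse=True)
        for group in by_cluster.values()
    ]
    # Visit clusters in order of their strongest photo.
    queues.sort(key=lambda q: q[0][1], reverse=True)

    selected = []
    rank = 0
    # Round `rank` takes the rank-th best photo of every cluster that has one.
    while len(selected) < limit and any(rank < len(q) for q in queues):
        for queue in queues:
            if rank < len(queue) and len(selected) < limit:
                selected.append(queue[rank][0])
        rank += 1
    return selected


photos = [
    ("a", 95, "tram"), ("b", 94, "tram"), ("c", 93, "tram"),
    ("d", 80, "food"), ("e", 70, "river"),
]
print(select_diverse(photos, 3))  # ['a', 'd', 'e']
```

A plain top-3 by score would return the three tram photos; the round-robin trades some individual quality for one photo from each cluster.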

[…]

#ai #albumFotografici #clusteringSemantico #computerVision #fotografia #java #Maven #multimodalAI #ollama #organizzazioneFoto #photoprism #PhotoPrismAPI #selezioneFoto #selfHosted https://www.b0sh.net/2026/05/ho-costruito-un-selezionatore-ai-per-scegliere-le-foto-migliori-da-photoprism/
