Mastodawn

AniMUL-v1, a 30B-parameter model, is fine-tuned from Qwen3-Omni to classify animal species from audio. It uses NatureLM data with 26 million audio-text pairs and was trained on 8x B200 for ~912 hours. Strong results: 75% exact match, a 61% increase over the base model. Try it at animul.ai! #AI #MachineLearning #AudioClassification #SpeciesIdentification #TríTuệNhânTạo #PhânLoạiÂmThanh #ĐộngVật học #MôHìnhNgônNgữ

https://www.reddit.com/r/LocalLLaMA/comments/1qtf8hk/animulv1_a_30b_mo

AI Daily Post Dec 17

A new open‑source audio dataset on Hugging Face is raising the bar for speech recognition and audio classification. It covers diverse accents, real‑world noise, and precise timing, helping multimodal AI models get more robust. Dive in to see how MRSAudio can boost your projects! #MRSAudio #HuggingFace #SpeechRecognition #AudioClassification

🔗 https://aidailypost.com/news/audio-dataset-valuable-listening-models-tackles-noise-accents-timing

Larry O'Brien May 11, 2022

In an ML model for ID'ing bird calls, you _don't_ want to slice the training data into small (e.g., 5-sec) slices. All bird recordings are polluted with background, non-target birds. You need to go multi-model, with the 1st model taking a long (e.g., 60-sec) clip and saying "ROI for target species at seconds :31-:36", and then your 2nd model is your fine-tuned discriminator. #ML #AudioClassification
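
A rough sketch of the two-stage idea in the post above, in plain Python. Everything here is an assumption for illustration: `find_roi`, `classify_regions`, the per-frame scores, the threshold, and the toy discriminator are hypothetical stand-ins, not the author's models or any library's API. Stage 1 is reduced to thresholding per-frame "target present" scores over a long clip; stage 2 is any callable that labels the cropped region.

```python
from typing import Callable, List, Sequence, Tuple

Window = Tuple[float, float]  # (start_sec, end_sec)


def find_roi(frame_scores: Sequence[float], frame_sec: float,
             threshold: float) -> List[Window]:
    """Stage 1 stand-in: merge consecutive frames scoring >= threshold
    into (start_sec, end_sec) regions of interest."""
    regions: List[Window] = []
    start = None
    for i, score in enumerate(frame_scores):
        if score >= threshold and start is None:
            start = i  # a region opens at this frame
        elif score < threshold and start is not None:
            regions.append((start * frame_sec, i * frame_sec))
            start = None  # the region closed at the previous frame
    if start is not None:  # region runs to the end of the clip
        regions.append((start * frame_sec, len(frame_scores) * frame_sec))
    return regions


def classify_regions(samples: Sequence[float], sample_rate: int,
                     regions: Sequence[Window],
                     discriminator: Callable[[Sequence[float]], str]
                     ) -> List[Tuple[Window, str]]:
    """Stage 2: crop each region from the long clip and hand it to the
    (hypothetical) fine-tuned discriminator."""
    results = []
    for start, end in regions:
        clip = samples[int(start * sample_rate):int(end * sample_rate)]
        results.append(((start, end), discriminator(clip)))
    return results


# Toy data: a 60-second clip at 10 samples/sec, with the "call" at 31-36 s.
SAMPLE_RATE = 10
samples = [1.0 if 310 <= n < 360 else 0.0 for n in range(60 * SAMPLE_RATE)]

# Pretend stage-1 output: one score per 1-second frame.
scores = [0.9 if 31 <= s < 36 else 0.1 for s in range(60)]


def toy_discriminator(clip: Sequence[float]) -> str:
    """Placeholder for a real classifier: labels by mean amplitude."""
    mean_amp = sum(abs(x) for x in clip) / max(len(clip), 1)
    return "target" if mean_amp > 0.5 else "other"


regions = find_roi(scores, frame_sec=1.0, threshold=0.5)
labels = classify_regions(samples, SAMPLE_RATE, regions, toy_discriminator)
print(regions)  # [(31.0, 36.0)]
print(labels)   # [((31.0, 36.0), 'target')]
```

In a real setup both stages would be learned models; the point of the split is that the discriminator only ever sees the region stage 1 flagged, not arbitrary 5-second slices full of non-target birds.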