How AI Is Trained: Without Formulas, but with Cats

In this article, with no fluff, truisms, academese, or formulas, we'll work out what fundamentally separates machine learning (ML) from pre-AI programming, and then generative AI from classical ML models. We'll cover the types of generative models, their architectures, and where they are applied. Along the way we'll touch on an important question: where the line runs between classical programming and the probabilistic creativity on which modern neural networks are built. The article is aimed primarily at people taking their first steps in AI, but if you're a junior ML engineer, an architect of AI applications, a startup founder, or simply want to understand what actually happens under the hood of ChatGPT and Midjourney, you'll most likely find something useful here.

https://habr.com/ru/articles/919296/

#machine_learning #artificial_intelligence #generative_models #generative_art #ml #popular_science #neural_network_training #paradigms #selfsupervised

How AI Is Trained: Without Formulas, but with Cats

The four cats that ML stands on. What is machine learning, and how does it "learn" at all? How is that different from ordinary programming with if, for, and "everything works until...

Habr
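
To make the boundary the article draws a bit more concrete, here is a toy, purely illustrative contrast (my own example, not taken from the article) between a hand-written rule and a model that induces a similar rule from labeled examples; the feature names and thresholds are made up.

# Hypothetical toy example: hand-coded rule vs. a rule learned from data.
from sklearn.linear_model import LogisticRegression

# Classical programming: the decision rule is written by hand.
def is_cat_rule(weight_kg, ear_pointiness):
    return weight_kg < 8 and ear_pointiness > 0.5

# Machine learning: the rule is estimated from labeled examples.
X = [[3.5, 0.9], [4.2, 0.8], [25.0, 0.2], [30.0, 0.1]]  # [weight_kg, ear_pointiness]
y = [1, 1, 0, 0]                                        # 1 = cat, 0 = not a cat
model = LogisticRegression(max_iter=1000).fit(X, y)

print(is_cat_rule(4.0, 0.7))          # True: decided by the programmer's thresholds
print(model.predict([[4.0, 0.7]]))    # [1]: decided by parameters fit to the data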

📢 What if AI models could teach themselves — no labels, no limits?
Meet the 5 foundation models rewriting the rules of machine learning in 2025. From DINO v2 to AZR, these systems learn by reconstructing, contrasting, and evolving — just like humans.

📊 Zero supervision. 90%+ accuracy. Real-world impact.
🧠 This article is a must-read for data scientists, tech leaders, and AI enthusiasts.

🔗 Read now:
https://medium.com/@rogt.x1997/top-5-foundation-models-that-learn-without-labels-and-beat-supervised-ai-8669ccb5cb0f

#AI #FoundationModels #SelfSupervised #MachineL

Top 5 Foundation Models That Learn Without Labels — And Beat Supervised AI

The machines have learned to teach themselves — and it’s changing the future of artificial intelligence. Imagine an AI system learning to recognize faces by playing a complex game of hide-and-seek…

Medium
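
The "learning by reconstructing" part of that pitch boils down to a very simple recipe. Here is a bare-bones sketch of a masked-reconstruction objective in PyTorch; this is a generic toy, not the training code of any of the five models in the article.

# Minimal masked-reconstruction objective: hide random features, learn to fill them in.
import torch
import torch.nn as nn

dim = 64
net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, dim))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(32, dim)                       # stand-in for a batch of unlabeled data
    mask = (torch.rand_like(x) < 0.5).float()      # keep ~50% of the features
    x_hat = net(x * mask)                          # reconstruct from the visible part
    loss = ((x_hat - x) ** 2 * (1 - mask)).mean()  # score only the hidden positions
    opt.zero_grad()
    loss.backward()
    opt.step()

The target comes from the data itself, which is all that "no labels" means in practice.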

Character Matching. Level: Hard

Intro. For anyone familiar with convolutions, character matching doesn't look like a particularly hard problem. Kaggle even hosts competitions with a similar task and a labeled dataset of characters from The Simpsons. The key word here, though, is "labeled". What do you do when the dataset is unlabeled, every image contains several characters, and labeling all of it by hand is the last thing you want to do? That's where segmentation algorithms and contrastive learning come to the rescue, but first things first. The data. We worked with the British Museum's collection of engravings. All of the engravings are in the public domain, so we scraped them (strictly for research purposes) for further processing. That left us with a dataset of about 25 thousand engravings. And that's just the engravings; we haven't even started counting characters. Given how fond 18th- and 19th-century engravers were of crowd scenes, we can safely say there will be far more characters than engravings.

https://habr.com/ru/articles/868742/

#image_segmentation #image_classification #selfsupervised #computer_vision #detection
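
The contrastive step mentioned in the post can be sketched roughly like this (my own simplification, not the authors' code): two augmented crops of the same segmented character should get nearby embeddings, while crops of different characters are pushed apart. A SimCLR-style NT-Xent loss does exactly that:

# Simplified NT-Xent (SimCLR-style) contrastive loss; a sketch of the idea only.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """z1[i] and z2[i] are embeddings of two augmented crops of the same character."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)               # (2N, d)
    sim = z @ z.t() / temperature                # pairwise cosine similarities
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float('-inf'))  # drop self-pairs
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])  # the other view is the positive
    return F.cross_entropy(sim, targets)

z1 = torch.randn(8, 128)   # e.g., encoder outputs for 8 segmented character crops
z2 = torch.randn(8, 128)   # the same 8 crops under different augmentations
loss = nt_xent(z1, z2)

With such an encoder, matching then amounts to nearest-neighbour search in the embedding space.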


Update on my joint work with @JulianTachella on "#Learning to Reconstruct Signals From Binary Measurements" on arXiv. #RandomProjection #onebit #SelfSupervisedLearning https://arxiv.org/abs/2303.08691

We brought several improvements to the proofs and the bounds, allowing us to determine from how many binarized (random) projections alone one can learn, up to a controlled identification error, a low-complexity space (with small box dimension). Moreover, a practical #selfsupervised scheme, SSBM, run on real image datasets, learns a reconstruction algorithm from those same binary observations (without access to the original images, and on par with supervised alternatives), implicitly confirming that the binary measurements encode a good estimate of the image set.

Learning to Reconstruct Signals From Binary Measurements

Recent advances in unsupervised learning have highlighted the possibility of learning to reconstruct signals from noisy and incomplete linear measurements alone. These methods play a key role in medical and scientific imaging and sensing, where ground truth data is often scarce or difficult to obtain. However, in practice, measurements are not only noisy and incomplete but also quantized. Here we explore the extreme case of learning from binary observations and provide necessary and sufficient conditions on the number of measurements required for identifying a set of signals from incomplete binary data. Our results are complementary to existing bounds on signal recovery from binary measurements. Furthermore, we introduce a novel self-supervised learning approach, which we name SSBM, that only requires binary data for training. We demonstrate in a series of experiments with real datasets that SSBM performs on par with supervised learning and outperforms sparse reconstruction methods with a fixed wavelet basis by a large margin.

arXiv.org
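
The overall idea can be caricatured in a few lines of PyTorch. To be clear, this is not the SSBM objective from the paper (the actual construction is more careful, and plain measurement consistency alone cannot resolve the nullspace of the operator); it only shows the flavor of training a reconstructor when binary measurements are the only data you have:

# Conceptual sketch only: fit a reconstructor to be consistent with observed bits.
import torch
import torch.nn as nn

n, m = 128, 64                        # signal dimension, number of measurements
A = torch.randn(m, n) / m ** 0.5      # fixed random measurement operator
net = nn.Sequential(nn.Linear(m, 256), nn.ReLU(), nn.Linear(256, n))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(32, n)                       # ground truth, never seen by the loss
    y = torch.sign(x @ A.t())                    # the only observed data: +/-1 bits
    x_hat = net(y)                               # reconstruct from the bits
    y_hat = torch.tanh(5.0 * (x_hat @ A.t()))    # differentiable surrogate for sign()
    loss = ((y_hat - y) ** 2).mean()             # consistency with the observed bits
    opt.zero_grad()
    loss.backward()
    opt.step()

For the real method, including how it avoids the trivial solutions this toy would admit, see the paper and the authors' code linked in the next post.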

From Julián Tachella @JulianTachella, posted on "Chi":

📰""Learning to reconstruct signals from binary measurements alone"📰

We present theory + a #selfsupervised approach for learning to reconstruct signals from incomplete (!) and binary (!) measurements, using the binary data itself. See the first figure and its alt-text.

https://arxiv.org/abs/2303.08691
with @lowrankjack
---

The theory characterizes

- the best approximation of a set of signals from incomplete binary observations
- its sample complexity

and complements existing theory for signal recovery from binary measurements.

See the third figure and its alt-text.
---

The proposed self-supervised algorithm achieves performance on par with supervised learning and outperforms standard reconstruction techniques (such as binary iterative hard thresholding).

See the second figure and its alt-text.
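
For readers who have not met the baseline named above: binary iterative hard thresholding (BIHT) alternates a sign-consistency step with hard thresholding to a fixed sparsity. Below is its usual textbook form; step size and normalization conventions vary, and this is not necessarily the exact configuration used in the paper's comparison.

# Binary iterative hard thresholding (BIHT), textbook form; conventions vary.
import numpy as np

def biht(y, A, k, n_iters=100, tau=1.0):
    """Recover a k-sparse x (up to scale) from y = sign(A x)."""
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(n_iters):
        g = x + (tau / m) * A.T @ (y - np.sign(A @ x))  # push toward sign consistency
        g[np.argsort(np.abs(g))[:-k]] = 0.0             # keep only the k largest entries
        x = g
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x                  # 1-bit data loses the scale

rng = np.random.default_rng(0)
n, m, k = 256, 512, 8
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_hat = biht(np.sign(A @ x_true), A, k)                 # toy usage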

---

Code based on the deepinverse library is available at https://github.com/tachella/ssbm

Check out the paper for more details!

#SelfSupervisedLearning #CompressiveSensing #Quantization #InverseProblem #1bitcamera


Check out our new work on weighted #generative neural #network models! 🧠🤖

Recently, we’ve seen tons of work using generative models to elucidate candidate principles of neural connectivity.

Our v2: Inspired by redundancy reduction, our new model can generate both the topology & weights of the #connectome

We envision this will help us understand #brain development and also impact structural #selfsupervised learning in ANNs in the future! 📚🧑‍🏫

https://www.biorxiv.org/content/10.1101/2023.06.23.546237v1

#neuroscience #AI #ANN
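
For context, the classic flavor of a generative network model looks roughly like the sketch below: edges are added one by one with probabilities that trade off wiring cost against a topological affinity term. This is purely illustrative and is not the weighted, redundancy-reduction model from the preprint, which additionally generates connection weights.

# Toy generative network model: P(edge) ~ distance**eta * (degree affinity)**gamma.
import numpy as np

def generative_network(coords, n_edges, eta=-2.0, gamma=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(coords)
    D = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)   # wiring cost
    np.fill_diagonal(D, np.inf)                                      # no self-connections
    A = np.zeros((n, n))
    iu = np.triu_indices(n, k=1)                                     # candidate edges
    for _ in range(n_edges):
        deg = A.sum(axis=1)
        affinity = (deg[:, None] + deg[None, :] + 1.0) ** gamma      # simple topology term
        score = (D ** eta) * affinity
        score[A > 0] = 0.0                                           # no duplicate edges
        probs = score[iu] / score[iu].sum()
        e = rng.choice(len(probs), p=probs)
        i, j = iu[0][e], iu[1][e]
        A[i, j] = A[j, i] = 1.0
    return A

A = generative_network(np.random.default_rng(1).random((30, 2)), n_edges=60)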

date: 2023-03-31 09:48:02
by: AICareer (🚀Jobs-Internships-Scholarships)

PhD Studentship – Self-supervised ML from Multiple Sensory Data
at University of Birmingham
Check the details here: https://ai-jobs.org/job/phd-studentship-self-supervised-machine-learning-from-multiple-sensory-data/

#phd #phdposition #phdscholarship #selfsupervised #machinelearning #ai #artificialintelligence #university #universityofbirmingham

🐦🔗: https://twitter.com/twitter/statuses/1641739119708590080
#PhdPosition

PhD Studentship - Self-supervised Machine Learning from Multiple Sensory Data - AI Jobs

Funding: The position offered is for three and a half years of full-time study. The current (2022-23) value of the award is: stipend £17,668 pa; tuition fee £4,596 pa. Awards are usually incremented on 1 October each following year. Eligibility: First or Upper Second Class Honours undergraduate degree and/or postgraduate degree with Distinction (or an international equivalent). Machine […]

AI Jobs

For the last highlight of 2022: our latest contribution on exploiting high-level structure in #Speech for #SelfSupervised learning, presented at #NeurIPS2022.

Joint work with @tiagoCuervoG@twitter.com, Adrian Łancucki, Paweł Rychlikowski and Jan Chorowski.

Check out the nice summary 👇

https://twitter.com/tiagoCuervoG/status/1608119507519959040?t=nnqXLZCej5KRUIpQGEUYLA&s=19

Santiago Cuervo on Twitter

“A bit overdue, but still glad to introduce our work to wrap up the year: Variable-rate hierarchical CPC leads to acoustic unit discovery in speech ↕️ ⌛💬🧠 presented at #NeurIPS2022. #Speech #AI #DL #RL #SignalProcessing #SelfSupervised 📜: https://t.co/ISHU2jF9eX 🧵👇(1/n)”

Twitter
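
The CPC objective this line of work builds on can be written down in a handful of lines. The sketch below is the plain, fixed-rate version (generic InfoNCE over future latent frames), not the variable-rate hierarchical model from the paper.

# Stripped-down CPC / InfoNCE step: a context vector predicts a future latent frame;
# other samples in the batch serve as negatives.
import torch
import torch.nn as nn
import torch.nn.functional as F

B, T, d = 16, 50, 256
z = torch.randn(B, T, d)              # latent frames from an encoder (stand-in)
gru = nn.GRU(d, d, batch_first=True)  # autoregressive context network
W = nn.Linear(d, d, bias=False)       # prediction head for a +k step ahead

c, _ = gru(z)                         # context at every time step
k, t = 3, 20                          # predict frame t+k from context at t
pred = W(c[:, t])                     # (B, d) predicted future latents
target = z[:, t + k]                  # (B, d) true future latents
logits = pred @ target.t()            # (B, B): each row scores all candidates
labels = torch.arange(B)              # the matching sample is the positive
loss = F.cross_entropy(logits, labels)
loss.backward()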

For those sticking around to the very end at 5.25pm 😂 , I'll be presenting our team's work on Sentinel-1 SAR-based landslide change detection using self-supervised techniques! https://agu.confex.com/agu/fm22/meetingapp.cgi/Paper/1162298

P.S. We've open sourced our data pipeline and model code at https://gitlab.com/frontierdevelopmentlab/2022-us-sarchangedetection/deepslide

#AGU22 #SAR #SelfSupervised #DeepLearning #Sentinel1 #FrontierDevelopmentLab #FDL22 #SAR4ML #eochat

DeepSlide: Self-supervised learning on SAR data for change detection

In light of the upcoming NASA-ISRO NISAR satellite mission, which will further ...

AGU - Fall Meeting 2022
@tyrell_turing #SelfSupervised @cogneurophys imo better models of the brain