Gasper Begus

@begus
Asst. Professor @UCBerkeley.
Interpretable DL & language (teaching #AI how to speak) 
PI @BerkeleySCLab, College Principal at Bowles, Linguistics Lead @ProjectCETI, Prev. @Harvard @UW
Website: https://www.gasperbegus.com
YouTube: https://www.youtube.com/channel/UCH16qHjOg8rbcAc_mtZo7Vw

One of the most puzzling questions in Vedic Sanskrit and Avestan is how word-final -ah turns into an -o.

Here's a new proposal that explains the super complex development of fricatives in Indo-Iranian, with a special focus on Avestan and Old Persian.

Also a cool fact: Ahura Mazda is written in Assyrian as as-sa-ra ma-za-áš.

Paper (it's been great to cite works from the 19th century, 1882, 1888, 1889 ...):

https://ling.auf.net/lingbuzz/007911

The development of Indo-Iranian voiced fricatives - lingbuzz/007911

The development of voiced sibilants is a long-standing puzzle in Indo-Iranian historical phonology. In Vedic, all voiced sibilants are lost from the system, but the details of this loss are complex an…

Linguistics is an unsung hero!

My reflections on:
- why studying ancient languages was important for my research in ML
- why linguistics is awesome
- how my education fully recapitulates the evolution of linguistics

And what I learned from that.

Post:
https://open.substack.com/pub/begus/p/linguistics-is-an-unsung-hero

Linguistics is an unsung hero

And how I got from linguistics to machine learning


It was wonderful to give a talk at U Arizona.

Key takeaways:

- Amazing linguistics department

- Heat like I've never experienced before

- Cacti very sharp

- My talk

There are many ways to model language - with rules, exemplars, FSA, or Bayesian approaches. In this talk, I propose a new way to model language in a fully unsupervised way from raw speech: as a dependency between latent space and generated data in generative AI models called Generative Adversarial Nets (GANs).
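The core idea — treating language as a dependency between the latent space and the generated output — can be illustrated with a toy sketch. This is not the actual GAN model from the talk; the "generator" below is an invented numpy stand-in whose latent dimensions transparently control acoustic properties, the way individual latent variables in a trained GAN come to encode phonetic features:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generator(z, sr=16000, dur=0.2):
    """Hypothetical stand-in for a GAN generator: maps a latent vector z
    to a raw waveform. Here z[0] controls the amplitude of an initial
    noise burst (a fricative-like cue) and z[1] shifts the vowel pitch."""
    n = int(sr * dur)
    t = np.arange(n) / sr
    burst_len = n // 4
    wave = np.zeros(n)
    wave[:burst_len] = z[0] * rng.standard_normal(burst_len)  # "fricative"
    f0 = 100 + 50 * z[1]                                      # latent-controlled pitch
    wave[burst_len:] = 0.5 * np.sin(2 * np.pi * f0 * t[burst_len:])  # "vowel"
    return wave

def burst_energy(wave, sr=16000, dur=0.2):
    """RMS energy of the initial burst portion of the waveform."""
    n = int(sr * dur)
    return float(np.sqrt(np.mean(wave[: n // 4] ** 2)))

# Probe the latent-to-output dependency: raise z[0], measure the burst.
quiet = toy_generator(np.array([0.1, 0.0]))
loud = toy_generator(np.array([1.0, 0.0]))
print(burst_energy(quiet), burst_energy(loud))
```

In a real trained GAN, this probing runs in the other direction: one intervenes on latent variables and observes which phonetic properties of the generated speech change, which is what makes the learned representations interpretable.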

A new Indo-European language is discovered!

It’s an Anatolian language that appears to be related to Luwian.

Its name is Kalašma.

I didn’t think another Indo-European language would be found within our lifetimes. This is so exciting. Can’t wait to read what the tablets are saying.

Read more:

https://www.uni-wuerzburg.de/en/news-and-events/news/detail/news/new-indo-european-language-discovered/

New Indo-European Language Discovered

An excavation in Turkey has brought to light an unknown Indo-European language. Professor Daniel Schwemer, an expert for the ancient near east from Würzburg, is involved in investigating the discovery.

I'm on Substack.

How can we get a better understanding of how we're similar and different from animals and machines?

Some big picture thoughts about the research directions of our lab.

https://open.substack.com/pub/begus/p/artificial-and-biological-intelligence?r=2sxjr4&utm_campaign=post&utm_medium=web

Artificial and Biological Intelligence

Humans, Animals, and Machines


We're proposing a model that is perhaps the closest approximation of human language acquisition with deep learning.

We model information exchange through articulation and raw sound perception.

Advantages:
-Communicative intent
-Raw speech
-Imitation and imagination
-Embodiment

The model can be used for behavioral, neural, articulatory, and evolutionary simulations of human speech. Code on GitHub.
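The information-exchange component can be caricatured in a few lines. This is a toy numpy sketch, not the CiwaGAN code: the discrete "articulatory" tone codes and the FFT-peak decoder are invented stand-ins for the model's articulatory generator and auditory Discriminator, but the round trip — message, raw sound, decoded message — is the same shape:

```python
import numpy as np

sr, n = 16000, 1600
t = np.arange(n) / sr
# Hypothetical "articulatory" targets: each message selects a tone frequency (Hz).
codes = [300.0, 600.0, 1200.0, 2400.0]

def speak(message):
    """Speaker: encode a discrete message into raw sound
    (a crude stand-in for articulatory control)."""
    return np.sin(2 * np.pi * codes[message] * t)

def listen(wave):
    """Listener: decode the message from raw sound via its spectrum
    (a crude stand-in for the auditory apparatus)."""
    spectrum = np.abs(np.fft.rfft(wave))
    peak_hz = np.fft.rfftfreq(n, 1 / sr)[int(np.argmax(spectrum))]
    return int(np.argmin([abs(peak_hz - c) for c in codes]))

# Round trip: every message survives the acoustic channel.
decoded = [listen(speak(m)) for m in range(4)]
print(decoded)  # → [0, 1, 2, 3]
```

In CiwaGAN proper, both ends are learned without supervision, and the channel is articulatorily generated speech rather than pure tones.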

https://arxiv.org/abs/2309.07861

CiwaGAN: Articulatory information exchange

Humans encode information into sounds by controlling articulators and decode information from sounds using the auditory apparatus. This paper introduces CiwaGAN, a model of human spoken language acquisition that combines unsupervised articulatory modeling with an unsupervised model of information exchange through the auditory modality. While prior research includes unsupervised articulatory modeling and information exchange separately, our model is the first to combine the two components. The paper also proposes an improved articulatory model with more interpretable internal representations. The proposed CiwaGAN model is the most realistic approximation of human spoken language acquisition using deep learning. As such, it is useful for cognitively plausible simulations of the human speech act.

Just gave a talk on Generative AI and speech at IIT Guwahati's School of Data Science and AI. It was great to meet the people at the institute and learn about their research.

Any acoustic property can be compared in raw form (untransformed) using this technique.

How does this work? We train a GAN model with Generator (production) and Discriminator (perception) on raw speech. We then send the exact same syllable that was used in the brain experiment (cABR) to the Discriminator (or force the Generator to output similar syllables).

To our knowledge, this is the most similar signal between the brain and artificial neural networks that requires no transformations.

We propose a new technique for comparing the human brain and deep neural networks.

Key points:

- No transformations required between signals
- Unsupervised learning on raw speech
- Production and perception
- Trained on two languages (English and Spanish)

The proposed technique has a very similar underlying mechanism to actual brain imaging experiments: averaging of activity (biological and artificial) in the time domain.
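The averaging step can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's pipeline: the stimulus is a synthetic tone and the "convolutional layer" is a bank of random 1-D kernels standing in for trained filters. What it shows is the key move — averaging unit activity across channels in the time domain, mirroring how EEG averages over neural populations, so that the result is a single time series directly comparable to a brain response:

```python
import numpy as np

rng = np.random.default_rng(1)
sr, n = 16000, 1600  # 100 ms of "speech" at 16 kHz

# Stimulus: a vowel-like tone (stand-in for the syllable used in cABR work).
t = np.arange(n) / sr
stimulus = np.sin(2 * np.pi * 120 * t)

# "Intermediate convolutional layer": 64 channels, each the 1-D convolution
# of the stimulus with a random kernel (hypothetical stand-in for trained filters).
kernels = rng.standard_normal((64, 33)) / 33
channel_responses = np.stack(
    [np.convolve(stimulus, k, mode="same") for k in kernels]
)

# Average activity across units in the time domain: one waveform that can be
# laid directly over the averaged brain response, with no transformation.
artificial_response = channel_responses.mean(axis=0)
print(artificial_response.shape)  # → (1600,)
```

With a trained network, `artificial_response` is the red trace that gets compared, peak by peak, against the blue cABR trace.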

We found one of the most similar signals between the brain and artificial neural networks.

Blue is the brain wave when humans listen to a vowel. Red is the artificial neural network's response to the exact same vowel.

The two signals are raw (no transformations needed)

Paper out 🧠🤖

https://www.nature.com/articles/s41598-023-33384-9

Listen to examples 🔊:
https://www.youtube.com/watch?v=io9fOpn1NuE

With Alan Zhou and Christina Zhao

Encoding of speech in convolutional layers and the brain stem based on language experience - Scientific Reports

Comparing artificial neural networks with outputs of neuroimaging techniques has recently seen substantial advances in (computer) vision and text-based language models. Here, we propose a framework to compare biological and artificial neural computations of spoken language representations and propose several new challenges to this paradigm. The proposed technique is based on a similar principle that underlies electroencephalography (EEG): averaging of neural (artificial or biological) activity across neurons in the time domain, and allows to compare encoding of any acoustic property in the brain and in intermediate convolutional layers of an artificial neural network. Our approach allows a direct comparison of responses to a phonetic property in the brain and in deep neural networks that requires no linear transformations between the signals. We argue that the brain stem response (cABR) and the response in intermediate convolutional layers to the exact same stimulus are highly similar without applying any transformations, and we quantify this observation. The proposed technique not only reveals similarities, but also allows for analysis of the encoding of actual acoustic properties in the two signals: we compare peak latency (i) in cABR relative to the stimulus in the brain stem and in (ii) intermediate convolutional layers relative to the input/output in deep convolutional networks. We also examine and compare the effect of prior language exposure on the peak latency in cABR and in intermediate convolutional layers. Substantial similarities in peak latency encoding between the human brain and intermediate convolutional networks emerge based on results from eight trained networks (including a replication experiment). The proposed technique can be used to compare encoding between the human brain and intermediate convolutional layers for any acoustic property and for other neuroimaging techniques.
