📰 "Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction"
https://arxiv.org/abs/2603.29529 #Cond-Mat.Dis-Nn #Mechanics #Q-Bio.Bm #Matrix #Cs.Lg

We investigate the parameter space of transformer models trained on protein sequence data using a statistical mechanics framework, sampling the loss landscape at varying temperatures with Langevin dynamics to characterize the low-loss manifold and to understand the mechanisms behind the superior performance of transformers in protein structure prediction. We find that, unlike feedforward networks, the transformer's loss shows no first-order-like transition, which produces a range of intermediate temperatures with good learning properties. We show that the parameters of most layers are highly conserved at these temperatures when the embedding dimension is optimal, and we provide an operational way to find this dimension. Finally, we show that the attention matrix is more predictive of the protein's contact maps at higher temperatures and at embedding dimensions larger than those optimal for learning.
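For readers unfamiliar with the sampling scheme: Langevin dynamics at temperature T adds Gaussian noise of variance 2ηT to each gradient step, so T = 0 recovers plain gradient descent and larger T explores higher-loss regions of the landscape. A minimal sketch on a stand-in linear-regression loss (my toy construction, not the paper's transformer):

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(theta, X, y):
    # Toy quadratic loss standing in for the transformer's training loss.
    return 0.5 * np.mean((X @ theta - y) ** 2)

def grad(theta, X, y):
    return X.T @ (X @ theta - y) / len(y)

def langevin_sample(theta, X, y, T, eta=1e-2, steps=2000):
    """One Langevin chain at temperature T:
    theta <- theta - eta * grad + sqrt(2 * eta * T) * noise."""
    for _ in range(steps):
        theta = (theta - eta * grad(theta, X, y)
                 + np.sqrt(2 * eta * T) * rng.standard_normal(theta.shape))
    return theta

X = rng.standard_normal((200, 5))
true_theta = rng.standard_normal(5)
y = X @ true_theta
samples = {T: langevin_sample(np.zeros(5), X, y, T) for T in (0.0, 0.01, 1.0)}
# T = 0 converges to the loss minimum; higher T samples higher-loss regions.
for T, th in samples.items():
    print(T, loss(th, X, y))
```

The interesting regime in the paper is the intermediate-T window, where the chain still finds low loss but is not frozen into a single minimum.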

arXiv.org
📰 "Which Similarity-Sensitive Entropy (Sentropy)?"
https://arxiv.org/abs/2511.03849 #Mechanical #Q-Bio.Pe #Math.It #Matrix #Cs.It #Cs.Lg

Shannon entropy is not the only entropy relevant to machine-learning datasets, nor possibly even the most important one. Traditional entropies such as Shannon entropy capture the information carried by elements' frequencies, but not the richer information encoded by their similarities and differences. Capturing the latter requires similarity-sensitive entropy ("sentropy"). Sentropy can be measured with either the recently developed Leinster-Cobbold-Reeve (LCR) framework or the newer Vendi score (VS), which raises the practical question of which one to use. Here we address this question theoretically and numerically, using 53 large and well-known imaging and tabular datasets. We find that LCR and VS values can differ by orders of magnitude and are complementary, except in limiting cases. We show that both LCR and VS results depend on how similarities are scaled, and we introduce the notion of "half-distance" to parameterize this dependence. We prove that VS provides an upper bound on LCR for all non-negative values of the Rényi-Hill order parameter, as well as for negative values in the special case where the similarity matrix is full rank. We conclude that VS is preferable only when a dataset's elements can be usefully interpreted as linear combinations of a more fundamental set of "ur-elements", or when the system the dataset describes has a quantum-mechanical character. In the broader case where one simply wishes to capture the rich information encoded by elements' similarities and differences as well as their frequencies, we propose that LCR be favored; nevertheless, for certain half-distances the two methods can complement each other.
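For concreteness, both quantities are computed from a similarity matrix Z and a frequency vector p. A minimal sketch at order q = 1, following the standard published definitions of LCR diversity and the (weighted) Vendi score; the example data, the exponential similarity kernel, and its scale are my arbitrary choices, which is exactly the scaling sensitivity the abstract flags:

```python
import numpy as np

def lcr_diversity(Z, p, q=1.0):
    """Leinster-Cobbold-Reeve similarity-sensitive diversity of order q."""
    Zp = Z @ p
    if np.isclose(q, 1.0):
        return np.exp(-np.sum(p * np.log(Zp)))
    return np.sum(p * Zp ** (q - 1)) ** (1.0 / (1.0 - q))

def vendi_score(Z, p):
    """Vendi score: exp of the von Neumann entropy of the similarity
    matrix weighted by the frequencies p."""
    K = np.sqrt(p)[:, None] * Z * np.sqrt(p)[None, :]
    lam = np.linalg.eigvalsh(K)
    lam = lam[lam > 1e-12]
    return np.exp(-np.sum(lam * np.log(lam)))

# Three 1-D points, two of them nearly identical: the "effective number
# of distinct elements" should come out close to 2, not 3.
X = np.array([[0.0], [0.05], [3.0]])
Z = np.exp(-np.abs(X - X.T))       # similarities from distances; scale matters
p = np.ones(len(X)) / len(X)
print(lcr_diversity(Z, p), vendi_score(Z, p))
```

Both values land between 1 and 3, with VS above LCR, consistent with the upper-bound result the abstract states for non-negative order.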


Is there a general theory of object-oriented programming languages that improves on "A Theory of Objects" by Abadi and Cardelli?

#programming #cs #oop

Back home, years ago, when I started teaching computer science stuff "for real", a senior faculty member told me that I cannot mention average case analysis in the data structures course. "Too complicated, none of our students will get it!"

I said "But without average case analysis, hash tables are O(n) so ... do I not teach hash tables?" Memory becomes a bit fuzzy here, but I think the answer was something along the lines of "just tell them it's magic".

I learned a lot of things in those early years, but the most important one was that senior faculty with lots of publications, awards, grad students, text books, etc. can be just as wrong about stuff as anyone else.

I ignored the advice and hopefully managed to explain to "our 'dumb' CS students" why hash tables are a good idea. I never bothered anyone with detailed proofs, but to take away the very *idea* of why hash tables work, that's not okay. Everyone programming anything needs at least that much.
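The idea really is small enough to teach. With separate chaining, a lookup costs the length of one chain: O(n) in the worst case (everything in one bucket), but about n/m on average when keys spread evenly over m buckets. A minimal sketch (my illustration, all names hypothetical):

```python
class ChainedHashTable:
    """Tiny hash table with separate chaining, enough to show the
    average-case idea behind O(1) lookups."""

    def __init__(self, num_buckets=64):
        self.buckets = [[] for _ in range(num_buckets)]

    def insert(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value      # overwrite existing key
                return
        bucket.append([key, value])

    def get(self, key):
        # Cost = length of this one chain, not the whole table.
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable(num_buckets=64)
for i in range(128):
    table.insert(f"key{i}", i)

# The average chain length is exactly the load factor n/m = 128/64 = 2,
# no matter how the hash function scatters the keys.
average = sum(len(b) for b in table.buckets) / len(table.buckets)
print(average)
```

Students don't need the probabilistic proof to see that, as long as the hash spreads keys out, a lookup touches about two entries instead of 128.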

Note that I didn't say "all senior faculty are this that or the other thing" which is definitely not true. Also this predates my time in Baltimore for anyone keeping track, so ... I always taught hash tables "properly" there and nobody ever told me not to. 😄

(Proudly self-plagiarized off of a December 2022 thread from one of my previous accounts.)

#academia #cs #teaching #datastructures

https://www.wacoca.com/life/373456/ [New Seibu rolling stock] Debuting in 2028: the restaurant train designed by Kazuyo Sejima will be named "vies". What does it spell backwards? An advance-invitation campaign with a promise for two years from now | Travel & Outings, Tetsudo Channel #546 #Ch546 #CS #HighEndRestaurant #SkyPerfecTV #Channel #CabView #Railway #RailwayColumn #TetsudoChannel #RailwayNews #Train
📰 "Identifying Connectivity Distributions from Neural Dynamics Using Flows"
https://arxiv.org/abs/2603.26506 #Q-Bio.Nc #Dynamics #Matrix #Cs.Lg

Connectivity structure shapes neural computation, but inferring this structure from population recordings is degenerate: multiple connectivity structures can generate identical dynamics. Recent work uses low-rank recurrent neural networks (lrRNNs) to infer low-dimensional latent dynamics and connectivity structure from observed activity, enabling a mechanistic interpretation of the dynamics. However, standard approaches for training lrRNNs can recover spurious structures irrelevant to the underlying dynamics. We first characterize the identifiability of connectivity structures in lrRNNs and determine conditions under which a unique solution exists. Then, to find such solutions, we develop an inference framework based on maximum entropy and continuous normalizing flows (CNFs), trained via flow matching. Instead of estimating a single connectivity matrix, our method learns the maximally unbiased distribution over connection weights consistent with observed dynamics. This approach captures complex yet necessary distributions such as heavy-tailed connectivity found in empirical data. We validate our method on synthetic datasets with connectivity structures that generate multistable attractors, limit cycles, and ring attractors, and demonstrate its applicability in recordings from rat frontal cortex during decision-making. Our framework shifts circuit inference from recovering connectivity to identifying which connectivity structures are computationally required, and which are artifacts of underconstrained inference.
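The degeneracy the abstract describes is easy to see in the simplest case. A toy sketch (my construction, not the paper's code): a rank-one recurrent network x' = -x + (m nᵀ/N) tanh(x), where the latent dynamics are bistable. Rescaling m and n in opposite directions, or adding components orthogonal to the dynamics, leaves the observed attractors unchanged, which is why a single "recovered" connectivity matrix is underdetermined:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
# Rank-one connectivity J = m n^T / N. The overlap between n and m sets
# the effective gain; overlap > 1 with tanh gives two stable fixed points.
m = rng.standard_normal(N)
n = 2.5 * m + 0.5 * rng.standard_normal(N)
J = np.outer(m, n) / N

def simulate(x, steps=3000, dt=0.05):
    """Euler-integrate x' = -x + J tanh(x)."""
    for _ in range(steps):
        x = x + dt * (-x + J @ np.tanh(x))
    return x

# Start on either side of the unstable origin.
x_plus = simulate(0.1 * m)
x_minus = simulate(-0.1 * m)

# The latent variable kappa = m.x / N settles into one of two attractors.
kappa_plus = m @ x_plus / N
kappa_minus = m @ x_minus / N
print(kappa_plus, kappa_minus)
```

Infinitely many (m, n) pairs produce these same two attractors (e.g. m/2 and 2n), so, as the paper argues, the well-posed target is a distribution over connectivities consistent with the dynamics, not a point estimate.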


CS/AI degree programs in Germany with no German required: options few people know about

In 2023, JetBrains launched a scholarship program at a private German university, with full coverage of tuition, housing, and living expenses for CS/AI students. The university is almost unknown among Russian-speaking applicants. Below is a look at what this place is, what programs it offers, and whether it is worth taking seriously.

https://habr.com/ru/articles/1016406/

#Germany #education #CS #AI #machine_learning #university #scholarship #JetBrains #bachelors #masters


If I were an AI Doomer, I would review the causes of the last AI Winter (e.g., Minsky and Papert's *Perceptrons*) and try to replicate them today, instead of all the name-calling, shouting, and arm-flailing.

Knuth already likes LLMs, so you will need to find someone else. Margaret Boden could have been a prime candidate, but she's sadly passed away.

Just saying.

#ai #cs

It's funny: I was asking an Anthropic model to draw a timeline of computer science, and it's pretty good, except that Pascal and Pouzin were not cited. #networks #CS

"The network has never been neutral."
— Louis Pouzin

https://computerhistory.org/profile/louis-pouzin/

Louis Pouzin

For the pioneering design and implementation of packet communication networks that led the way to the internet.

CHM