Regardless of expectations for AI systems, interpretability studies look promising for understanding the mathematical associations between concepts by breaking a model down to its most atomic representations. This paper explores some of those associations around trust. It would be interesting to see whether there is a correlation among the embeddings of human trust models, the persona vectors of various models, and how easily those models can be jailbroken.

https://arxiv.org/abs/2603.05839

#AI #LLM

Evaluating LLM Alignment With Human Trust Models

Trust plays a pivotal role in enabling effective cooperation, reducing uncertainty, and guiding decision-making in both human interactions and multi-agent systems. Although it is significant, there is limited understanding of how large language models (LLMs) internally conceptualize and reason about trust. This work presents a white-box analysis of trust representation in EleutherAI/gpt-j-6B, using contrastive prompting to generate embedding vectors within the activation space of the LLM for dyadic trust and related interpersonal relationship attributes. We first identified trust-related concepts from five established human trust models. We then determined a threshold for significant conceptual alignment by computing pairwise cosine similarities across 60 general emotional concepts. Then we measured the cosine similarities between the LLM's internal representation of trust and the derived trust-related concepts. Our results show that the internal trust representation of EleutherAI/gpt-j-6B aligns most closely with the Castelfranchi socio-cognitive model, followed by the Marsh Model. These findings indicate that LLMs encode socio-cognitive constructs in their activation space in ways that support meaningful comparative analyses, inform theories of social cognition, and support the design of human-AI collaborative systems.
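
For anyone who wants to picture the pipeline the abstract describes, here is a rough sketch in plain Python. It is not the paper's code. The mean-difference recipe for the contrast vector and the "mean plus one standard deviation" threshold rule are my assumptions, since the abstract does not spell out either, and the toy 3-dimensional vectors and all function names are hypothetical stand-ins for real model activations.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrast_vector(positive, negative):
    """Assumed recipe: mean activation on 'positive' prompts minus mean on 'negative' prompts."""
    dim = len(positive[0])
    pos_mean = [mean(row[i] for row in positive) for i in range(dim)]
    neg_mean = [mean(row[i] for row in negative) for i in range(dim)]
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def alignment_threshold(baseline):
    """Assumed rule: mean + 1 sample std of all pairwise similarities among baseline concepts."""
    sims = [cosine(baseline[a], baseline[b]) for a, b in combinations(baseline, 2)]
    return mean(sims) + stdev(sims)


def aligned_concepts(trust, candidates, threshold):
    """Return (name, similarity) pairs above the threshold, most similar first."""
    scored = [(name, cosine(trust, vec)) for name, vec in candidates.items()]
    return sorted((s for s in scored if s[1] > threshold), key=lambda s: s[1], reverse=True)


# Toy vectors standing in for activations extracted from a model (hypothetical data).
baseline = {"joy": [1.0, 0.0, 0.0], "fear": [0.0, 1.0, 0.0], "calm": [1.0, 1.0, 0.0]}
trust = contrast_vector(
    positive=[[2.0, 1.0, 1.0], [2.0, 1.0, 3.0]],
    negative=[[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]],
)
candidates = {"competence": [1.0, 0.0, 2.0], "risk": [0.0, 1.0, 0.0]}
threshold = alignment_threshold(baseline)
print(aligned_concepts(trust, candidates, threshold))
```

With real activations the baseline would hold the 60 general emotional concepts and the candidates would be the concepts drawn from each human trust model, so the share of a model's concepts that clear the threshold gives a simple way to compare models.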

William Whitlow Feb 26

@shaedrich

Possibly. You could say that the end of a tool is decided upon by the designer. In which case this would indeed be the end of most of these systems.

That leads to a follow-up question: what should the end of AI systems be? The current answer seems disappointing given the amount of resources poured into them.

William Whitlow Feb 26

@shaedrich

I wonder what it will take to address the gap between what is opinion and what is knowledge.

I am musing on the idea that, because journalists and the public are not always able to discern this difference, AI evangelists are being given a global pulpit to turn their opinions into fact through uncritical adoption and funding.

William Whitlow Feb 26

@frankel

FWIW: While I don't disagree that AI coding assistance is reducing skill, the way this study is being presented is misleading. I had already seen the study, but clicked on it hoping this was a new one. The sample size is small, and the study focuses on new skill acquisition rather than overall developer skill.

This claim may be true, but a more in-depth study needs to be conducted to fully judge the overall impact.

William Whitlow Feb 26

@MostlyBlindGamer @fastfinge

I certainly agree with this. That doesn't make clear the direction and manner in which the industry will progress, though.

William Whitlow Feb 26

@danielmunoz @strypey

From the looks of this, the microapps here are ones that were vibe-coded with Claude within an hour or two. It seems like anyone with a Claude or ChatGPT subscription would be able to build their own personal version. It will be interesting to see how many more niche apps like this begin to crowd app stores, simply because it takes no more than a basic idea to develop a working prototype.

William Whitlow Feb 26

@MostlyBlindGamer @fastfinge I have taken the time to play around with Codex and Claude Code. In my experience Claude Code is a little more inclined to think about frameworks and tech stacks. Now, I often supply long prompts trying to encapsulate the details from a programmer's perspective. I agree with everyone else that this only highlights how tech skills are still non-negotiable, even if a model can handle the grunt work of writing. This simply turns more developers into project managers.

William Whitlow Feb 25

Interesting paper, updated today, engaging Heidegger's philosophy with contemporary machine learning techniques. I hope to take more time to engage with this paper over the next couple of days. Still, it is encouraging to see such consideration connecting AGI concerns with philosophical principles. It explores how contemporary design principles lead more toward tool use than toward AGI.
https://arxiv.org/abs/2602.19028

#AI #ML #AGI #philosophy

The Metaphysics We Train: A Heideggerian Reading of Machine Learning

This paper offers a phenomenological reading of contemporary machine learning through Heideggerian concepts, aimed at enriching practitioners' reflexive understanding of their own practice. We argue that this philosophical lens reveals three insights invisible to purely technical analysis. First, the algorithmic Entwurf (projection) is distinctive in being automated, opaque, and emergent--a metaphysics that operates without explicit articulation or debate, crystallizing implicitly through gradient descent rather than theoretical argument. Second, even sophisticated technical advances remain within the regime of Gestell (Enframing), improving calculation without questioning the primacy of calculation itself. Third, AI's lack of existential structure, specifically the absence of Care (Sorge), is genuinely explanatory: it illuminates why AI systems have no internal resources for questioning their own optimization imperatives, and why they optimize without the anxiety (Angst) that signals, in human agents, the friction between calculative absorption and authentic existence. We conclude by exploring the pedagogical value of this perspective, arguing that data science education should cultivate not only technical competence but ontological literacy--the capacity to recognize what worldviews our tools enact and when calculation itself may be the wrong mode of engagement.

William Whitlow Feb 21

@Blahster @Szescstopni @EricLawton

No, if you have a source I would be fascinated to read about it.

William Whitlow Feb 21

@Szescstopni @EricLawton

I suppose the logical consistency of this would then involve complex legal proceedings equivalent to withdrawing life support in order to close a computer program...

At that point, would AI data centers be approaching equivalence with hospitals from a critical infrastructure perspective?

Website	https://www.williamcwhitlow.com
Github	https://github.com/wwhitlow