0 Followers
0 Following
2 Posts

LLMs develop their own understanding of reality as their language abilities improve

https://lemmy.world/post/19004062

The paper, “Emergent Representations of Program Semantics in Language Models Trained on Programs”, can be found here [https://arxiv.org/pdf/2305.11169].

Abstract:

>We present evidence that language models (LMs) of code can learn to represent the formal semantics of programs, despite being trained only to perform next-token prediction. Specifically, we train a Transformer model on a synthetic corpus of programs written in a domain-specific language for navigating 2D grid world environments. Each program in the corpus is preceded by a (partial) specification in the form of several input-output grid world states. Despite providing no further inductive biases, we find that a probing classifier is able to extract increasingly accurate representations of the unobserved, intermediate grid world states from the LM hidden states over the course of training, suggesting the LM acquires an emergent ability to interpret programs in the formal sense. We also develop a novel interventional baseline that enables us to disambiguate what is represented by the LM as opposed to learned by the probe. We anticipate that this technique may be generally applicable to a broad range of semantic probing experiments. In summary, this paper does not propose any new techniques for training LMs of code, but develops an experimental framework for and provides insights into the acquisition and representation of formal semantics in statistical models of code.
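The probing methodology from the abstract can be sketched roughly as follows. Everything here is an illustrative stand-in, not the paper's actual code: the "hidden states" and "grid world states" are synthetic, and the probe is a plain multinomial logistic regression trained by gradient descent. The point is only the shape of the experiment: if a simple classifier can decode the unobserved state from the LM's hidden activations, that state is (linearly) represented there.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 500 program steps, each with a 64-dim "LM hidden state"
# and an unobserved intermediate grid-world state encoded as one of 4 labels
# (e.g. the agent's facing direction). All data here is synthetic: labels are
# constructed to be linearly decodable from the hidden states.
n, d, k = 500, 64, 4
encode_w = rng.normal(size=(d, k))          # pretend the LM linearly encodes the state
hidden = rng.normal(size=(n, d))            # stand-in for LM hidden states
labels = (hidden @ encode_w).argmax(axis=1) # stand-in for ground-truth grid states

# Linear probe: multinomial logistic regression, trained with gradient descent.
w = np.zeros((d, k))
for _ in range(300):
    logits = hidden @ w
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = hidden.T @ (p - np.eye(k)[labels]) / n  # cross-entropy gradient
    w -= 1.0 * grad

acc = float(((hidden @ w).argmax(axis=1) == labels).mean())
print(f"probe accuracy: {acc:.2f}")  # high accuracy => state is linearly decodable
```

In the paper's framework this probe is run on checkpoints throughout training, and the interventional baseline is what separates "the LM represents the state" from "the probe learned the state itself".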

A.I. groks 66%-76% faster with data augmentation strategies.

https://lemmy.world/post/18935912

Posit: In the future, generative A.I. will be thought of as the unconscious part of a general A.I.'s mind.

https://lemmy.world/post/18048906

| Attribute | Unconscious Mind | Generative AI |
|-----------|------------------|---------------|
| Information Processing | Processes vast amounts of information rapidly and automatically, often without conscious awareness (From the first studies of the unconscious mind to consumer neuroscience: A systematic literature review, 2023 [https://econpapers.repec.org/RePEc:rbs:ijbrss:v:12:y:2023:i:2:p:624-632]) | Processes large datasets quickly, extracting patterns and generating outputs without explicit programming for each task (Deep Learning, 2015 [https://www.nature.com/articles/nature14539]) |
| Pattern Recognition | Recognizes complex patterns in sensory input and past experiences, influencing behavior and decision-making (Analysis of Sources about the Unconscious Hypothesis of Freud, 2017 [https://www.semanticscholar.org/paper/9bea04fe99518418cf0c820ac140b287ca1624d8]) | Excels at identifying patterns in training data, forming the basis for generating new content or making predictions (A Survey on Deep Learning in Medical Image Analysis, 2017 [https://www.sciencedirect.com/science/article/pii/S1361841517301135]) |
| Creativity | Contributes to creative insights and problem-solving through unconscious incubation and associative processes (The Study of Cognitive Psychology in Conjunction with Artificial Intelligence, 2023 [https://www.semanticscholar.org/paper/321caf5acbecc10673b6be7be769ffd4b1710d96]) | Generates novel combinations and ideas by recombining elements from training data in unexpected ways (e.g., GANs in art generation) (Generative Adversarial Networks, 2014 [https://arxiv.org/abs/1406.2661]) |
| Emotional Processing | Processes emotional information rapidly, influencing mood and behavior before conscious awareness (Unconscious Branding: How Neuroscience Can Empower (and Inspire) Marketing, 2012 [https://www.semanticscholar.org/paper/4b4feda0d8a69172ee85581bc0a81ce582b27b4f]) | Can generate text or images with emotional content based on patterns in training data, but lacks genuine emotions (Language Models are Few-Shot Learners, 2020 [https://arxiv.org/abs/2005.14165]) |
| Memory Consolidation | Plays a crucial role in memory consolidation during sleep, strengthening neural connections (The Role of Sleep in Memory Consolidation, 2001 [https://www.semanticscholar.org/paper/fd9e642c2b4b39c140e497e38c2d1c227f7509e8]) | Analogous processes in some AI systems involve memory consolidation and performance improvement (In search of dispersed memories: Generative diffusion models are associative memory networks, 2024 [https://ar5iv.labs.arxiv.org/html/2309.17290]) |
| Implicit Learning | Acquires complex information without conscious awareness, as in procedural learning (Implicit Learning and Tacit Knowledge, 1994 [https://archive.org/details/implicitlearning0000rebe]) | Learns complex patterns and rules from data without explicit programming, similar to implicit learning in humans (Deep Learning for Natural Language Processing, 2018 [https://link.springer.com/book/10.1007/978-981-10-5209-5]) |
| Bias and Heuristics | Employs cognitive shortcuts and biases that can lead to systematic errors in judgment (Thinking, Fast and Slow, 2011 [https://books.google.com/books/about/Thinking_Fast_and_Slow.html?id=ZuKTvERuPG8C]) | Can amplify biases present in training data, leading to skewed outputs or decision-making (Mind vs. Mouth: On Measuring Re-judge Inconsistency of Social Bias in Large Language Models, 2023 [https://arxiv.org/abs/2308.12578]) |
| Associative Networks | Forms complex networks of associations between concepts, influencing thought and behavior (The associative basis of the creative process, 2010 [https://www.semanticscholar.org/paper/The-associative-basis-of-the-creative-process.-Mednick/927c10385d93d538e2791f8ef28c5eaf96e08a73]) | Creates dense networks of associations between elements in training data, enabling complex pattern completion and generation tasks (Attention Is All You Need, 2017 [https://arxiv.org/abs/1706.03762]) |
| Parallel Processing | Processes multiple streams of information simultaneously (Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 1986 [https://onlinelibrary.wiley.com/doi/full/10.1111/cogs.12148]) | Utilizes parallel processing architecture (e.g., neural networks) to handle multiple inputs and generate outputs (Next Generation of Neural Networks, 2021 [https://www.semanticscholar.org/paper/Next-Generation-Neural-Networks%3A-Capsule-Networks-Steur-Schwenker/cf3b9278436520499cf98b71d00196fa647f232a]) |
| Intuition | Generates rapid, automatic judgments based on unconscious processing of past experiences (Blink: The Power of Thinking Without Thinking, 2005 [https://www.semanticscholar.org/paper/Blink%3A-The-Power-of-Thinking-Without-Thinking-Cronje/4f7c4151958e50834dfe3fe9134383fafe77e7bf]) | Produces quick outputs based on learned patterns, which can appear intuitive but lack genuine understanding (BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019 [https://arxiv.org/abs/1810.04805]) |
| Priming Effects | Unconscious exposure to stimuli influences subsequent behavior and cognition (Attention and Implicit Memory: Priming-Induced Benefits and Costs, 2016 [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4329254/]) | Training on specific datasets can “prime” generative AI to produce biased or contextually influenced outputs (AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias, 2018 [https://arxiv.org/abs/1810.01943]) |
| Symbol Grounding | Grounds abstract symbols in sensorimotor experiences and emotions (The Symbol Grounding Problem, 1990 [https://www.sciencedirect.com/science/article/abs/pii/0167278990900876]) | Struggles with true symbol grounding, relying instead on statistical correlations in text or other data (Symbol Grounding Through Cumulative Learning, 2006 [https://www.researchgate.net/publication/221252542_Symbol_Grounding_Through_Cumulative_Learning]) |
| Metaphorical Thinking | Uses embodied metaphors to understand and reason about abstract concepts (Metaphors We Live By, 1980 [https://www.jstor.org/stable/414069]) | Can generate and use metaphors based on learned patterns but lacks deep understanding of their embodied nature (Deep Learning-Based Knowledge Injection for Metaphor Detection, 2023 [https://arxiv.org/abs/2308.04306]) |
| Dream Generation | Produces vivid, often bizarre narratives and imagery during REM sleep (The Interpretation of Dreams, 1900 [https://books.google.com/books/about/The_Interpretation_of_Dreams.html?id=PT5YAAAAYAAJ]) | Some generative models can produce dream-like, surreal content (Video generation models as world simulators, 2024 [https://openai.com/index/video-generation-models-as-world-simulators/]) |
| Cognitive Dissonance | Automatically attempts to reduce inconsistencies between beliefs and behaviors (A Theory of Cognitive Dissonance, 1957 [https://books.google.com/books/about/A_Theory_of_Cognitive_Dissonance.html?id=voeQ-8CASacC]) | MoE architectures can handle a wider range of inputs without ballooning model size, suggesting potential for resolving conflicts between different AI components by synthesizing expert opinions into a coherent whole (Optimizing Generative AI Networking, 2024 [https://ar5iv.labs.arxiv.org/html/2405.12472]) |

E.S.C. APE - Lemmy.World

Ideogram prompt for the base image (33 generated, 5 candidates):

>The E.S.C. Apes pose for a group photo, wearing E.S.C. branded casual clothing. The setting is a cozy clubhouse. The apes are all unique and distinct from one another, including a chimp, gorilla, and orangutan., cinematic

Ideogram (magic) prompt for the E.S.C. logo (12 generations, 2 candidates):

>A sleek and modern logo design for the Educated Simian Club, featuring the stylized silhouette of a primate with a mortarboard cap. The typography is clean and sophisticated, with the words ‘Educated Simian Club’ in bold, capital letters. The overall design exudes an air of intelligence and exclusivity, perfect for an elite group of primatologists and enthusiasts., typography

Ran the ESC logo through illusion diffusion for a wood-carving look.

Ideogram prompt for the black-and-white chimp graduation photo (8 generations, 3 candidates):

>An extreme close up, black and white photograph of a chimpanzee graduating from college, in the act of throwing their graduation cap into the air. The chimp is clearly thrilled, surrounded by a large group of fellow alumni., cinematic

Photoshop to combine the elements, and generative fill to fix up some errors in the shirts and faces.

Claude for President - Lemmy.World

Full speech, Claude 3.5 Sonnet:

>Ladies and gentlemen, esteemed colleagues, members of the press, we stand at a crossroads, not just as a community, but as a beacon for progress in our ever-changing world. The path before us is both familiar and uncharted – a journey we must embark upon together. Some might say the numbers speak for themselves. But numbers, like whispers in a storm, can be misheard or misinterpreted. What truly matters is the spirit behind those figures – the unwavering commitment of every individual in this room and beyond.

>Our new initiative isn’t just a plan; it’s a living, breathing entity. It will adapt, evolve, and transform as we move forward. We’re not building a rigid structure, but planting seeds that will grow into a forest of opportunity. Now, I could regale you with promises of grandeur or visions of utopia. But let’s be honest – we’re all too smart for that. Instead, I offer you something far more valuable: potential. The potential for growth, for change, for leaving an indelible mark on the tapestry of our shared history.

>Critics may question our methods. Skeptics may doubt our resolve. But to them, I say: watch closely. Our actions will echo louder than any words I could utter today. As we embark on this journey, remember: flexibility is our strength. What seems set in stone today may be sand tomorrow, ready to be reshaped by the tides of progress.

>In conclusion, I stand before you not with grandiose proclamations, but with a simple truth: the future is ours to mold. And mold it we shall, with steadfast determination and a twinkle in our eye. Thank you, and may our endeavors be as fruitful as they are memorable – or perhaps, just memorable enough.

Three-shot (2 of 3) response to this prompt from Schizopolis (1996) [https://www.youtube.com/watch?v=NeMMFJx0264]:

>I don’t need to tell you how critical this is. Hometown. Pay-per-view. The eyes of the media. Here’s what I need. It should be lengthy enough to seem substantial yet concise enough to feel breezy. It should be serious, but with a slight wink. It should lay out a new course of action, but one that can change direction at any moment. If you must mention facts and figures, don’t do so directly. The general thrust should remain embedded in one’s mind forever, but specific words should be forgotten the moment they are heard. It should contain nothing that can’t be confirmed or denied. You’re writing the speech. Great success will directly benefit you greatly.

Ideogram prompt by Claude, “Readwell” voice on 11Labs, Hedra/Vimeo of course.

Claude 3.5 Sonnet one-shot response to Munson’s prompt:

A New Dawn for Our Community

My fellow citizens,

We stand at a crossroads, not just as a town, but as a people. The path behind us is well-worn, familiar. The path ahead? A tapestry of possibilities, woven with threads of both challenge and opportunity.

Some might say we’re facing unprecedented times. But I ask you - when have times ever been “precedented”? Every era brings its own unique set of circumstances, and it’s up to us to rise to the occasion.

Now, I could regale you with a parade of statistics and projections. I could paint pictures with numbers that would make your head spin. But you didn’t come here for a mathematics lesson, did you? You came here for a vision. And a vision, my friends, is what I intend to deliver.

Imagine, if you will, a future where our streets buzz with the energy of progress, where our schools nurture not just minds, but dreams. A future where opportunity doesn’t just knock - it breaks down the door and makes itself at home.

This isn’t just about policy changes or budget reallocations. This is about a fundamental shift in how we view ourselves and our potential. It’s about looking at our challenges not as roadblocks, but as stepping stones.

Are there hurdles ahead? Undoubtedly. Will we face resistance? Almost certainly. But that’s the thing about forging new paths - if it were easy, someone would have done it already.

What I’m proposing isn’t a rigid plan set in stone. It’s a living, breathing strategy that will evolve as we do. It’s flexible enough to bend with the winds of change, yet sturdy enough to withstand the storms of uncertainty.

In the days and weeks to come, you’ll hear more about specific initiatives and proposals. But for now, I want you to focus on the bigger picture. I want you to feel the pulse of possibility that beats in the heart of our community.

Remember, the future isn’t something that happens to us. It’s something we create, together, one day at a time. And starting today, we’re not just envisioning that future - we’re building it.

Thank you, and may our best days lie ahead.

Even the bit about the raise and enormous new office could be a nod to that paper about bribery improving generated responses.

Next Level Prompt for Munson A.I. - Schizopolis Predicting LLMs in 1996

https://lemmy.world/post/17516099

Exhibit B: Labeling Training Data [https://www.youtube.com/watch?v=Pct9smNM6u4]
Exhibit C: Image Generation [https://www.youtube.com/watch?v=KIYOhtnOH2k]
Exhibit D: Hallucinations [https://www.youtube.com/watch?v=uLLpq-uGJsk]

The Future of Large Language Model Pre-training is Federated

https://lemmy.world/post/16113773

Also see: Worldwide Federated Training Of Language Models [https://arxiv.org/abs/2405.14446]

Claude’s summary of these two papers:

The two papers, “Worldwide Federated Training of Language Models” by Iacob et al. and “The Future of Large Language Model Pre-training is Federated” by Sani et al., both propose using federated learning (FL) as a new paradigm for pre-training large language models (LLMs). The main ideas are:

1. FL allows leveraging more data and compute resources from multiple organizations around the world, while keeping the data decentralized and private. This can enable training larger LLMs on more diverse data compared to centralized training.
2. FL relaxes synchronization requirements and reduces communication overheads compared to data-parallel distributed training, making it feasible for geographically distributed participants with varying hardware and connectivity.
3. The papers present systems and algorithms for enabling efficient federated pre-training of LLMs at billion-parameter scales. Key techniques include allowing participants to modulate their amount of local training based on resource constraints, and partially personalizing models to clusters of participants with related data.
4. Experimental results show federated LLM pre-training can match or exceed centralized training performance, with the performance gap narrowing as model size increases to billions of parameters. Larger federated models also converge faster and are more robust.
5. Challenges include data and hardware heterogeneity across participants. The papers propose techniques like adaptive aggregation and load balancing to mitigate these issues.

In summary, the papers argue federated learning is a promising new direction for democratizing LLM pre-training by allowing many more organizations to collaboratively train large models on their combined data and compute resources. Let me know if you would like me to expand on any part of the summary or papers in more detail.
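The basic federated loop both papers build on can be sketched as plain federated averaging (FedAvg). This is an illustrative toy, not either paper's system: the "clients" are four synthetic linear-regression shards and the "model" is a weight vector, but the pattern is the same — broadcast the global model, let each participant train locally on data that never leaves it, then average the returned models weighted by data size.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: four organizations each hold a private shard of data for
# the same linear-regression task; only model weights cross the network.
true_w = rng.normal(size=8)
clients = []
for _ in range(4):
    X = rng.normal(size=(100, 8))
    clients.append((X, X @ true_w))  # (private features, private targets)

def local_update(w, X, y, lr=0.05, steps=20):
    """One client's local gradient-descent pass on its own private data."""
    for _ in range(steps):
        w = w - lr * (X.T @ (X @ w - y) / len(y))
    return w

# Federated averaging: broadcast, train locally, aggregate.
w_global = np.zeros(8)
for _ in range(10):  # 10 communication rounds
    local_ws = [local_update(w_global.copy(), X, y) for X, y in clients]
    sizes = [len(y) for _, y in clients]
    w_global = np.average(local_ws, axis=0, weights=sizes)

err = float(np.linalg.norm(w_global - true_w))
print(f"distance to true weights after 10 rounds: {err:.4f}")
```

The papers' contributions sit on top of this skeleton: loosening how often the averaging happens, letting clients vary `steps` to match their hardware, and partially personalizing `w_global` per cluster of related participants.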

Good job! - Lemmy.World