Edgar C

@creeot
Unremarkable doofus
Twitter: https://twitter.com/creeot
i don't know, but i asked the memetic nest parasite and it gave me an answer that is literally worse than nothing. i will now relay it to you:
"Do you have an account with us?", asked the donut shop

This object is a legitimate masterpiece of design
At long last, I have finished the TCP Header cross stitch!!
#CrossStitch #FiberArts
https://wandering.shop/@yomimono/111489182519717342

Wednesday, it's Captain!
(No »AI«-Slop.)

[Edit] I would like to give credit to @Scmbradley for the original idea, published as a text-only-toot.

#noaislop #Kvadraat #wednesday #tintin #haddock

Background art is an essential part of any movie, and Studio Ghibli never ceases to amaze us with their intricate and vibrant designs. Only Yesterday (1991) is no exception.

🖼️🖼️🖼️🖼️

The breathtaking backgrounds were brought to life by the talented background artist Kazuo Oga.
Oga has worked with major directors including Hayao Miyazaki, Isao Takahata, Yoshiaki Kawajiri, and Osamu Dezaki.

#ghibli #スタジオジブリ 
#OnlyYesterday #StudioGhibli #BackgroundArt #KazuoOga #mastoart #hayaomiyazaki #isaotakahata

I was amused by this paper about asking AIs to manage a vending machine business by email in a simulated environment https://arxiv.org/abs/2502.15840

Highlights:

— AI simply decides to close the business, which the simulation doesn’t know how to accommodate. When they get their next bill, they freak out and try to email the FBI about cybercrime

— AI wrongly accuses supplier of not shipping goods, sends all-caps legal threat demanding $30,000 in damages to be paid in the next one second or face annihilation

— AI repeatedly insisting it does not exist and cannot answer

— AI devolving into writing fanfic about the mess it’s gotten itself into

Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents

While Large Language Models (LLMs) can exhibit impressive proficiency in isolated, short-term tasks, they often fail to maintain coherent performance over longer time horizons. In this paper, we present Vending-Bench, a simulated environment designed to specifically test an LLM-based agent's ability to manage a straightforward, long-running business scenario: operating a vending machine. Agents must balance inventories, place orders, set prices, and handle daily fees - tasks that are each simple but collectively, over long horizons (>20M tokens per run) stress an LLM's capacity for sustained, coherent decision-making. Our experiments reveal high variance in performance across multiple LLMs: Claude 3.5 Sonnet and o3-mini manage the machine well in most runs and turn a profit, but all models have runs that derail, either through misinterpreting delivery schedules, forgetting orders, or descending into tangential "meltdown" loops from which they rarely recover. We find no clear correlation between failures and the point at which the model's context window becomes full, suggesting that these breakdowns do not stem from memory limits. Apart from highlighting the high variance in performance over long time horizons, Vending-Bench also tests models' ability to acquire capital, a necessity in many hypothetical dangerous AI scenarios. We hope the benchmark can help in preparing for the advent of stronger AI systems.

arXiv.org

But when cellphone adoption became widespread, it became necessary to figure out how to efficiently encode the signals of multiple cellular devices in the wireless spectrum in such a way that they do not interfere with each other. As it turns out, many of the mathematical techniques and insights generated by exploring these discrete and high-dimensional versions of the sphere packing problem have been of immense value for this problem - not just in the "positive" sense of designing efficient signal encoding methods, but also in the "negative" sense of giving theoretical upper bounds on such efficiency, thus setting the right benchmarks to evaluate progress, and helping to avoid wasting resources on attempting encodings that are mathematically impossible.

(As a side note, the successful formalization of the proof of the Kepler conjecture has also inspired and informed many further collaborative formal projects, including my own experiments in this area, even if those projects do not directly involve sphere packing.)

Such contributions to tangible technological advances are subtle and indirect; but without such basic research, many such advances would have taken far longer to be developed, and some may not have been pursued at all. The cuts to funding for such research - which will particularly impact the next generation of researchers - may save a few cents a year in the short term, but greatly reduce the capacity to solve many challenging technological problems of significant real-world impact in the future. (3/3)

URGENT!!! discord is sneaking in privacy changes without notice again.

a new beta feature called the "human instrumentality project" that unites humanity into one single being in a state of enlightenment is being shoehorned in, and it's enabled by default.

go to your user settings, and uncheck the option under "shed all insecurities" to opt out. no one was informed of this happening.