the 16 byte DOS demo from #revision2026 running on real hardware

holy… HOW

@domi 16.. byte?

@RafiX c40448ab260345ae260305c1e802ebf2

that’s it. that’s the demo.

@domi @RafiX its such a thing

; si, di might be 100, FFFE? https://www.pcjs.org/documents/misc/DOS_COM_Startup_Registers.pdf
les ax,word [si] ; (ax, es) = (ds:[si], ds:[si+2])
loop: dec ax
stosw
add ax,[es:di-82];52h
add ax,[es:di]
shr ax, 2
jmp loop
@mothcompute I would have assumed that ES is b800, as it needs to write into text memory...? So yes, the values it reads from the stack need to be somewhat correct to overlap with the VGA text memory. Someone needs to inspect this!
@root42 nope, es is initialized to AB48 (from the instruction bytes of dec ax and stosw), whose 64k range still covers the visible text memory. it relies on the com starting with si = 100h; the stack isn't used
mothcompute

```
;si=0x100 https://www.pcjs.org/documents/misc/DOS_COM_Startup_Registers.pdf
s: les ax,word [si] ; es = [s+2] -> 0xAB48
l: dec ax
stosw ;es:[di]=ax
add ax,[es:di-82]
add ax,[es:di]
shr ax, 2
jmp l
```

merpstodon!
@mothcompute @root42 The other clever trick I see here, if I understand the code correctly (I'm not really an assembly guy), is that the code sums three values (AX and two words from memory) and divides the result by 4: this way, no matter what you find in memory at the start, you'll sooner or later arrive at 0, and when you decrement ax you restart from 65535, converging to the same pattern no matter what.
@prestissimo yes. And the 82 backwards gives the motion to the upper left. It’s a little bit like the classic fire effect where you average and decrement values from the line below.
Edit: I would have assumed the offset should be 162 (2 bytes per character, so one row plus one cell), but it seems they take the character about half a line to the left…
@prestissimo I have tried it out with offset 162, and it works, too. Looks slightly different, but maybe that's what they were originally going for...? not sure!
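
To make the 82 vs. 162 comparison concrete, here is a small Python sketch that converts a byte offset into screen cells. It assumes an 80-column text mode with 2 bytes per cell; `neighbour` is a hypothetical helper name, not something from the demo. The offset is measured from the cell about to be written, since `add ax,[es:di-82]` runs after `stosw` has already advanced DI.

```python
def neighbour(byte_offset, cols=80):
    """Translate a backwards byte offset into (cells, rows up, columns left).

    Assumes 2 bytes per text cell and `cols` cells per row.
    """
    cells_back = byte_offset // 2
    rows_up, cols_left = divmod(cells_back, cols)
    return cells_back, rows_up, cols_left

# Offset used by the demo: 41 cells back, i.e. same row, about half a line left.
print(neighbour(82))
# Offset tried in the thread: 81 cells back, i.e. one row up and one cell left.
print(neighbour(162))
```

So 162 would sample the diagonal neighbour in the row above, while 82 samples a cell roughly half a screen width away.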

@domi @RafiX If I remember correctly someone did a search of the space of 8 byte demos some time ago with a limited machine code set (it had to set a graphics mode and have a jmp of some kind at the end).

Can't find it now :(

Puppy Farmer by Rift & Digital Sounds System

demotool for Windows, released in may 2012

pouët.net
Insane.

I mean, sure, it's just a PRNG writing into video RAM... but still very cool to get it done that short.

My attempt at decompiling (correct modulo the mistakes):
uint16_t *base = 0xAB480;
base[0x7FFF] = 0x04C3;
for (uint15_t d = 0; ; ++d) {
    base[d] = ((uint16_t)(base[d - 1] + base[d - 0x29] + base[d]) >> 2) - 1;
}
Basically a base-65536 linear shift register of length 32768. Part of it covers the 2048 entries of the video memory area at segment 0xB800; the rest is other memory that's safe to write to, where the writes simply do nothing in the usual case.

(I took the liberty of making d a uint15_t to get the wraparound where needed)
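
One way to check the decompilation is to run a register-level model of the assembly next to the recurrence and compare the resulting memory. The Python sketch below does that under stated assumptions: AX starts at 0x04C4 and DI at 0xFFFE, every address in the segment behaves like ordinary RAM, and the initial memory contents are random stand-ins. The function names are made up for this sketch.

```python
import random

WORDS = 0x8000  # a 64 KiB segment viewed as 16-bit words

def run_registers(mem, steps):
    """Model the loop instruction by instruction (AX, DI, memory at ES)."""
    mem = list(mem)
    ax, di = 0x04C4, 0xFFFE          # assumed start values
    for _ in range(steps):
        ax = (ax - 1) & 0xFFFF                               # dec ax
        mem[di >> 1] = ax                                    # stosw (DI stays even)
        di = (di + 2) & 0xFFFF
        ax = (ax + mem[((di - 82) & 0xFFFF) >> 1]) & 0xFFFF  # add ax,[es:di-82]
        ax = (ax + mem[di >> 1]) & 0xFFFF                    # add ax,[es:di]
        ax >>= 2                                             # shr ax,2
    return mem

def run_recurrence(mem, steps):
    """Model the decompiled form: one initial store, then the recurrence."""
    mem = list(mem)
    mem[0x7FFF] = 0x04C3
    d = 0
    for _ in range(steps - 1):
        s = (mem[(d - 1) % WORDS] + mem[(d - 0x29) % WORDS] + mem[d]) & 0xFFFF
        mem[d] = ((s >> 2) - 1) & 0xFFFF
        d = (d + 1) % WORDS
    return mem

rng = random.Random(1)
start = [rng.randrange(0x10000) for _ in range(WORDS)]
same = run_registers(start, 100_000) == run_recurrence(start, 100_000)
print("models agree:", same)
```

The two only agree when every address acts like plain RAM; with unmapped regions the register version would carry AX through while the recurrence would read back bus garbage.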

The cool part is of course where the pointer into video RAM (in the original assembly the value of the ES register) comes from:
0x0000000000000100: C4 04 les ax, ptr [si]
0x0000000000000102: 48 dec ax
0x0000000000000103: AB stosw word ptr es:[di], ax
As SI starts at 0100, this reads AX from the bytes C4 04 (i.e. 04C4) and ES from the two following bytes 48 AB (i.e. AB48), so the segment comes from the program's own machine code, which happens to have the right bytes there. The STOSW is critical: its opcode AB supplies the high byte of ES. Segment AB48 is technically part of the graphics mode video RAM A000-AFFF; however, as the code writes an entire 64k block from there, it also covers all segments up to BB47, which in particular includes both the monochrome text mode video RAM from B000 to B7FF and the "interesting part" of the color text mode video RAM from B800 to BFFF.
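
The address arithmetic can be replayed in a few lines of Python. This is only a sketch: it takes the 16 bytes posted earlier in the thread, assumes the visible page is 80x25 cells at B8000, and uses a hypothetical `covers` helper.

```python
code = bytes.fromhex("c40448ab260345ae260305c1e802ebf2")

# LES AX,[SI] with SI = 0x100 reads the first four bytes of the program itself.
ax = int.from_bytes(code[0:2], "little")   # bytes C4 04
es = int.from_bytes(code[2:4], "little")   # bytes 48 AB

first = es << 4            # linear address of ES:0000
last = first + 0xFFFF      # linear address of ES:FFFF

def covers(lo, hi):
    """True if the linear range lo..hi lies inside the 64 KiB window at ES."""
    return first <= lo and hi <= last

mono_text = (0xB0000, 0xB7FFF)
visible_color_page = (0xB8000, 0xB8000 + 80 * 25 * 2 - 1)  # assumed 80x25 mode
whole_color_window = (0xB8000, 0xBFFFF)

print(hex(ax), hex(es), hex(first), hex(last))
print(covers(*mono_text), covers(*visible_color_page), covers(*whole_color_window))
```

The window ends at BB47F, so it contains the visible 80x25 page but not the whole B800-BFFF range.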
After having run it, I now also realize this modified linear shift register makes for an interestingly biased PRNG.

The dynamics are as follows:
base[d] = ((base[d - 1] + base[d - 0x29] + base[d]) >> 2) - 1;
This adds up three values modulo 65536, but divides by four, then subtracts one. So the output value is always in the range from -1 (FFFF) to 3FFE (16382) - and if only such values are used as input (i.e. if one loop already ran), it'll never wrap (provided we interpret FFFF as -1).
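
The output range can be checked by brute force, since the result depends only on the 16-bit sum. A minimal Python sketch, assuming the sum wraps modulo 65536 before the shift:

```python
# Every possible 16-bit sum, pushed through "shift right by 2, then subtract 1".
outputs = {((s >> 2) - 1) & 0xFFFF for s in range(0x10000)}

largest_regular = max(v for v in outputs if v != 0xFFFF)
print(len(outputs), hex(largest_regular), 0xFFFF in outputs)
```

The set is exactly 0000..3FFE plus the single wrapped value FFFF.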

This means that you can almost never get the blink attribute. Which is good, as the blink attribute sucks. Also, the background color can basically never have a red component, which gives it a blue/green-ish tint. The exception is FFFF, which is generated when base[d-1], base[d-0x29] and base[d] were all zero. But as the character code FF on codepage 437 is a non-breaking space, it won't actually blink, but just be a grey box.

So once this thing is stable, the values tend to decrease. Until they reach 0. Once three zeros are the input of this formula, it'll generate a FFFF (a grey square). If one or more FFFFs are the input of this formula, and the rest is zeroes, it'll always generate 3FFE (a white small square inside a cyan full square).
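
These special cases can be replayed with a short Python sketch. It assumes the usual colour text cell layout (low byte is the character, high byte is the attribute with bit 7 as blink, bits 6-4 as background and bits 3-0 as foreground); `step` and `decode` are names invented here.

```python
def step(prev, diag, cur):
    """One update: sum three words mod 65536, shift right by 2, subtract 1."""
    s = (prev + diag + cur) & 0xFFFF
    return ((s >> 2) - 1) & 0xFFFF

def decode(cell):
    """Split a text-mode word into character and attribute fields."""
    attr = cell >> 8
    return {
        "char": cell & 0xFF,
        "fg": attr & 0x0F,
        "bg": (attr >> 4) & 0x07,   # bit 2 of this field is the red component
        "blink": bool(attr & 0x80),
    }

print(hex(step(0, 0, 0)))            # three zeros
print(hex(step(0xFFFF, 0, 0)))       # one FFFF, rest zeros
print(decode(0x3FFE))                # white small square on cyan
print(decode(0xFFFF))                # blank character on light grey

# No regular output (0000..3FFE) sets the blink bit or a red background.
clean = all(not decode(v)["blink"] and decode(v)["bg"] & 4 == 0
            for v in range(0x3FFF))
```

On hardware where bit 7 is configured as bright background instead of blink, the FFFF cell would decode differently, which this sketch ignores.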

But as not everything goes to 0 at the same time, some leftover small values tend to be there when the wrap to FFFF happens. This makes it a pretty weird PRNG.

Which brings me to the significance of the 0x29 (which was in the asm code doubled, i.e. a 0x52).

0x29 is 41. I.e. one more than half the screen width of 80 columns.

So some of its effect creates a frequency component of slightly more than half a screen as its "wavelength". Which explains why there's usually 2 copies of each detail on the screen, and also why stuff tends to be angled from top left to bottom right.

As for why it tends to move slowly to the left - that I cannot quite tell. It must be an interaction of this shift register with period 32768 of the memory access, however, I do not quite see why.

BTW, there is one interesting edge case of this demo: when the memory area from A000 to AFFF isn't mapped. Then the wraparound will not function the same way anymore. This is lost in my decompilation: the real code carries the previous value along in the AX register even through a long stretch of open bus (all reads return FFFF) from A0000 to AFFFF. So it will look different on e.g. a 512k PC without EGA, but still work. Verified in 86box. It is more random and chaotic than on a "full" PC with the full 1 MiB of RAM, but it still works on less.
@domi @RafiX category MD5 hashes you can execute
@lritter @RafiX md5sum*, sha256 is a whopping 2x larger!!

@domi @lritter @RafiX oh, executable MD5 hashes with a known and meaningful input would be kind of possible, maybe

reminds me of https://shells.aachen.ccc.de/~spq/md5.gif

@lritter @domi @RafiX Makes me wonder if a checksum which hashes to itself (that *has* to exist…) could also be a demo.
@lanodan @domi @RafiX i occasionally think about this problem, without results.
@lritter @domi @RafiX "Hello, I'm going to need all the LLM-datacenters for some serious number crunching"