Mastodawn

Alice Averlong🏳️‍⚧️Dec 12, 2023

why do I keep hacking 16bit DOS games? do I hate myself?

32bit programs are SO MUCH EASIER to RE, because when you see an address, you know what it means. 0x12345678 always means 0x12345678!

Alice Averlong🏳️‍⚧️Dec 12, 2023

16bit games are full of MOV AX, 1234h and it's like, WHAT'S DS AT THIS POINT? WHICH 1234?

Alice Averlong🏳️‍⚧️Dec 12, 2023

there's 65536 possible memory addresses it could be!

Alice Averlong🏳️‍⚧️Dec 12, 2023

not to mention that there's more than one way to address a given part of memory.

in 32bit and 64bit code, if you see 0x12345678, you know that some code that writes to 0x12335662 doesn't change it.

not so in 16bit games. you have plenty of ways to refer to the same address.

Alice Averlong🏳️‍⚧️Dec 12, 2023

This is why 16bit x86 is SO much more annoying than 8-bit computers.
with 8-bit computers, you have 16-bit addresses, because 256 bytes is rarely enough memory. So they work by having some addresses which are longer. simple, right? so instead of an 8bit number, you have a 16bit number.

Alice Averlong🏳️‍⚧️Dec 12, 2023

16bit x86 does this as well. 16 bits of address space is only 64KB, and that's just not enough. So you expand it to 24bits or 32bits, for "long addresses", right? same as you use in 8bit computers?

NOPE

Alice Averlong🏳️‍⚧️Dec 12, 2023

segmented addressing, the solution they use, is not as simple as just adding some more bits. a 16bit segment and a 16bit offset.

so that's just a weird way of explaining a 32bit number, right?
NOPE

Alice Averlong🏳️‍⚧️Dec 12, 2023

no, you combine 16bits and 16bits and get... 20 bits.

it's a 20bit address.

Daniel Erenrich Dec 13, 2023

Alice Averlong🏳️‍⚧️

so what, they ignore all but the bottom 4 bits of the segment?

NO THAT WOULD MAKE SENSE

Alice Averlong🏳️‍⚧️Dec 12, 2023

instead the full 16bit segment is used, but it's turned into a 20bit address by shifting it 4 bits over and adding in the offset.

Alice Averlong🏳️‍⚧️Dec 12, 2023

So it's the TOP 4 bits that are important, not the bottom 4.

Okay that's fine, but wait, I said adding. Not "replacing".

Yes, all 16bits are used. So the address 0000:0000 is (linear) 0x0, and 0001:0000 is (linear) 0x10
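
A minimal sketch of that shift-and-add in Python. The function name `linear_address` is made up for illustration, and wraparound past 1 MB is not modelled here:

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode translation: segment shifted left 4 bits, plus the offset."""
    segment &= 0xFFFF  # both halves are 16-bit register values
    offset &= 0xFFFF
    return (segment << 4) + offset  # added together, not concatenated


print(hex(linear_address(0x0000, 0x0000)))  # 0x0
print(hex(linear_address(0x0001, 0x0000)))  # 0x10
print(hex(linear_address(0x1234, 0x5678)))  # 0x179b8
```

The last line is the "WHICH 1234?" problem in miniature: offset 0x5678 only means something once you know the segment it is paired with.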

Alice Averlong🏳️‍⚧️Dec 12, 2023

which also means that 0001:0000 and 0000:0010 are both linear 0x10.

So you can get pointer aliasing even though both pointers HAVE DIFFERENT VALUES
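
To get a feel for how bad the aliasing is, here is a rough Python sketch that lists every segment:offset pair landing on one linear address. The helper name `aliases` is hypothetical, and it assumes no wraparound past the top of the 1 MB space:

```python
def aliases(linear: int):
    """Yield every (segment, offset) pair that resolves to `linear`."""
    highest = min(linear >> 4, 0xFFFF)             # largest segment, smallest offset
    lowest = max(0, (linear - 0xFFFF + 15) >> 4)   # smallest segment, offset <= 0xFFFF
    for segment in range(lowest, highest + 1):
        yield segment, linear - (segment << 4)


print([(hex(s), hex(o)) for s, o in aliases(0x10)])
# [('0x0', '0x10'), ('0x1', '0x0')]
print(len(list(aliases(0x179B8))))  # 4096
```

Anywhere above the first 64 KB, each byte has 4096 different segment:offset spellings, so comparing two far pointers for equality means normalizing them first.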

Alice Averlong🏳️‍⚧️Dec 12, 2023

And if that wasn't bad enough, there's also the A20 gate nonsense. Now, the A20 gate was added with the 286, for backwards compatibility with how the 8086/8088 worked, which is that memory wrapped.

Alice Averlong🏳️‍⚧️Dec 12, 2023

so not only are 0001:0000 and 0000:0010 the same address, so is FFFF:0020!
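
The wraparound can be sketched like this in Python. This is a simplified model, not a description of any particular chipset: `physical_address` and its `a20_enabled` flag are illustrative names, and the gate is assumed to simply force address bit 20 to zero when disabled:

```python
A20_BIT = 1 << 20


def physical_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Segment:offset to physical address, with an optional A20 mask."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)  # can reach 0x10FFEF
    if not a20_enabled:
        linear &= ~A20_BIT  # bit 20 dropped, so the address wraps to low memory
    return linear


print(hex(physical_address(0xFFFF, 0x0020, a20_enabled=False)))  # 0x10
print(hex(physical_address(0xFFFF, 0x0020, a20_enabled=True)))   # 0x100010
```

With the gate off, FFFF:0020 joins 0001:0000 and 0000:0010 at linear 0x10. With it on, the same pointer reaches just past the 1 MB mark instead.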

Alice Averlong🏳️‍⚧️Dec 12, 2023

but don't worry, for the 286 they wanted to add more than 1 megabyte of RAM, which is the max you can address with a 20bit address, so they added the ability to disable address wrapping.

on the keyboard controller.

Alice Averlong🏳️‍⚧️Dec 12, 2023

so now your memory wrapping changes based on what you write to the keyboard controller.

BRILLIANT IDEA

Alice Averlong🏳️‍⚧️Dec 12, 2023

anyway my favorite part of this A20 line thing is that it was supported by Intel chips up until Haswell, in 2013.

So in 2012 your 8 gigabyte of RAM PC booted up with every other megabyte of RAM mirrored to each other.
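
A small Python sketch of why it is every other megabyte, assuming the gate does nothing more than force physical address bit 20 to zero (the name `mask_a20` is made up for this example):

```python
MB = 1 << 20
A20_BIT = 1 << 20


def mask_a20(address: int) -> int:
    """Physical address as seen with the A20 gate disabled."""
    return address & ~A20_BIT  # only bit 20 is cleared; higher bits pass through


for megabyte in range(6):
    print(megabyte, "->", mask_a20(megabyte * MB) // MB)
# 0 -> 0
# 1 -> 0
# 2 -> 2
# 3 -> 2
# 4 -> 4
# 5 -> 4
```

Bit 20 is the bit that distinguishes an odd megabyte from the even one below it, so under this model each odd megabyte shows up as a mirror of its even neighbour.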

prozacchiwawa Dec 12, 2023

@foone i've wondered forever how things would have been different if we'd bought heavily into segmentation in the 286 and early 386 eras, once freed from basic real mode. i think part of why it failed was we'd just come to think of segmentation as a painful way of doing linear addressing by then.

iris 🟡Dec 13, 2023

@foone if i'm interpreting this right, this means haswell was the last to include it?

if so, that means my PC with 32GB of RAM still does this nonsense

mxk Dec 13, 2023

@Iristallite @foone no, up to, excluding Haswell.
But honestly, too much of this early x86 magic still exists in current CPUs. Even without A20 gate, your EFI (and depending on your boot path your bootloader/kernel again) still has to interact with those segment registers to switch the CPU into 32bit protected mode and then into 64bit mode.
Intel just this year published a white paper, that would end this madness https://www.intel.com/content/www/us/en/developer/articles/technical/envisioning-future-simplified-architecture.html

Envisioning a Simplified Intel® Architecture

Intel investigated architectural enhancements and modifications for a 64-bit mode-only architecture.

Intel

Felix "tmbinc" Domke Dec 13, 2023

@foone My favorite part is how the CPU in the (original) Xbox implemented this, so all you had to do is to ground A20M# and it would boot from FEFFFFF0 instead of FFFFFFF0, which would be decoded - in the southbridge - to the external NOR flash (instead of the s3kr1t boot-ROM), so you'd skip all that (wanna-be-)secure boot.

All that because of a hack that MS-DOS did, that was highly irrelevant already even in old DOS times.

Microsoft hacked itself.

farhaven 🇪🇺Dec 13, 2023

@felix Huh, that explains why almost all of the mod chips for the classic XBox are more-or-less "a bit of eeprom plus a wire".

Rairii Dec 13, 2023

@felix @foone i still love how the earliest dumped xbox alpha ii recovery (december 2000) booted into a slightly modified dos that ran a .com file to boot nt, for some reason (booting from the install cd used the standard NT el torito boot sector)

(recoveries before that used stock windows 2000, recoveries after that used XBOXROM.BIN which would later be the file written to nor flash, which I think is the reason it has an earlier version of the bootrom code at the end: it always did, and alpha ii actually used it iirc)

the alpha devkits were of course Just PCs, just like the xbox one alpha devkits were a decade afterwards

Simon Zerafa Dec 13, 2023

@foone

They should have standardized on Unreal Mode from the 80286 onwards 😉

https://en.wikipedia.org/wiki/Unreal_mode

Unreal mode - Wikipedia

Neia masks Dec 12, 2023

@foone O_O

The Penguin of Evil Dec 12, 2023

@foone That's a PC specific hack. The CPU didn't bother. It's in much later processors because the caches moved so the emulation had to move to the CPU to keep PC compatibility madness happy.

On segments; You can have segments in 32bit mode, in fact Linux used them extensively for thread local storage and the kernel equivalent thereof, as well as user space addressing.

Another funny thing about the 286 is that it has MMU features that can only be replicated on relatively modern 64bit x86 parts!

William D. Jones Dec 12, 2023

@etchedpixels @foone I vaguely remember that the 486 has an A20M# pin, which tells it whether address wraparound is enabled for its internal cache.

I guess at some point, x86 grew features to handle the translation entirely internally ("had to move to the CPU")?

The Penguin of Evil Dec 13, 2023

@cr1901 @foone 386SL it's port 0xEE/EF internally on the CPU for example as the pieces got more integrated and the keyboard controller ended up in the SoC (or in many cases became an emulation trap for the USB controller). Modern systems don't even have an emulated AT style keyboard controller quite often.

Colin Cogle 🔵Dec 13, 2023

@foone That’s it, the world is switching to ARM. Or how about #MIPS? That’s a nice clean learning architecture.

Alice Averlong🏳️‍⚧️Dec 13, 2023

@colincogle unfortunately I'm a retrocomputerist, and the past never changes. So I will always be mad at x86

root42 Dec 12, 2023

@foone I can live with segments. But A20 really is a hack to maintain backwards compatibility to the PC. They could‘ve just said „screw it“ and had a few programs fail. I mean HIMEM.SYS disables the wrap anyway to make the HMA accessible.
But to be fair the segment model was also a hack to be compatible to the 8080 and be able to use things like the 8257 DMA controller, which also supported only 64KB like the 8080.

KeyJ Dec 13, 2023

@root42 @foone The DMA malarkey has nothing to do with x86 Real Mode though; it's just a side effect of IBM reusing an 8-bit system design. Absolutely not Intel's fault!

root42 Dec 13, 2023

@KeyJ @foone but the 8086 didn’t have any custom ICs to go along with it. They were reused from the 8080. My armchair theory is that the segment model helped in the case of the DMA controller.

KeyJ Dec 13, 2023

@root42 @foone Fair point about the unavailability of 8086-specific companion chips.

I still maintain that the DMA controller's limitations have nothing to do with x86's segmentation or vice-versa.
Sure, a DMA page could be accessed quite conveniently by setting a segment register to a multiple of 0x1000, but that would have been just as easy with any other CPU-side memory model. It's really the DMA itself that's the limiting factor here, and no amount of CPU quirks would have amended that.
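
As a rough illustration of the "multiple of 0x1000" point, here is a Python sketch that assumes 64 KB DMA pages. All names (`PAGE_SIZE`, `page_segment`, `fits_in_one_page`) are invented for the example:

```python
PAGE_SIZE = 0x10000  # 64 KB, the assumed DMA page size


def linear_address(segment: int, offset: int) -> int:
    """Real-mode translation: segment shifted left 4 bits, plus the offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


def page_segment(page: int) -> int:
    """Segment whose base is the start of the given 64 KB page."""
    return page * 0x1000  # 0x1000 << 4 == 0x10000


def fits_in_one_page(start: int, length: int) -> bool:
    """True if a buffer of `length` bytes at `start` stays inside one page."""
    return start // PAGE_SIZE == (start + length - 1) // PAGE_SIZE


print(hex(linear_address(page_segment(3), 0x0000)))  # 0x30000
print(hex(linear_address(page_segment(3), 0xFFFF)))  # 0x3ffff
print(fits_in_one_page(0x3FF00, 0x100))  # True
print(fits_in_one_page(0x3FF00, 0x101))  # False
```

The segment only makes the page convenient to point at from the CPU side. The boundary check is the same whatever memory model the CPU uses.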

lick here for more info Dec 12, 2023

@foone i guess that could make some minimal sense, allowing doing pointer math only on the lower part?

but god, what an absolute footgun

Aaron Sawdey, Ph.D.Dec 12, 2023

@foone this is a little like indexed addressing modes and how an out of order core might not be able to resolve everything to identify load/store conflicts early in the pipeline.

Loving your rant because I always also thought that x86 segmented addressing was so stupid. The 386 was the first one that was nicely usable for *nix type OSes.

Orc Dec 13, 2023

@foone It's like the 370, but somehow worse.

duhanebel Dec 13, 2023

@foone yeah that’s a part I never understood.. did they just think “you know what? let’s just make it harder”