Mastodawn

Alice Averlong🏳️‍⚧️Dec 12, 2023

why do I keep hacking 16bit DOS games? do I hate myself?

32bit programs are SO MUCH EASIER to RE, because when you see an address, you know what it means. 0x12345678 always means 0x12345678!

Alice Averlong🏳️‍⚧️Dec 12, 2023

16bit games are full of MOV AX, [1234h] and it's like, WHAT'S DS AT THIS POINT? WHICH 1234?

Alice Averlong🏳️‍⚧️Dec 12, 2023

there's 65536 possible memory addresses it could be!

Alice Averlong🏳️‍⚧️Dec 12, 2023

not to mention that there's more than one way to address a given part of memory.

in 32bit and 64bit code, if you see 0x12345678, you know that some code that writes to 0x12335662 doesn't change it.

not so in 16bit games. you have plenty of ways to refer to the same address.

Alice Averlong🏳️‍⚧️Dec 12, 2023

This is why 16bit x86 is SO much more annoying than 8-bit computers.
with 8-bit computers, you have 16-bit addresses, because 256 bytes is rarely enough memory. So they work by having some addresses which are longer. simple, right? so instead of an 8bit number, you have a 16bit number.

Alice Averlong🏳️‍⚧️Dec 12, 2023

16bit x86 does this as well. 16 bits of address only covers 64KB of RAM, and that's just not enough. So you expand it to 24bits or 32bits, for "long addresses", right? same as you use in 8bit computers?

NOPE

Alice Averlong🏳️‍⚧️Dec 12, 2023

segmented addressing, the solution they use, is not as simple as just adding some more bits. a 16bit segment and a 16bit offset.

so that's just a weird way of explaining a 32bit number, right?
NOPE

Alice Averlong🏳️‍⚧️Dec 12, 2023

no, you combine 16bits and 16bits and get... 20 bits.

it's a 20bit address.

Alice Averlong🏳️‍⚧️Dec 12, 2023

so what, they ignore all but the bottom 4 bits of the segment?

NO THAT WOULD MAKE SENSE

Alice Averlong🏳️‍⚧️Dec 12, 2023

instead the full 16bit segment is used, but it's turned into a 20bit address by shifting it left by 4 bits (multiplying by 16) and adding in the offset.

Alice Averlong🏳️‍⚧️Dec 12, 2023

So it's the TOP 4 bits that are important, not the bottom 4.

Okay that's fine, but wait, I said adding. Not "replacing".

Yes, all 16bits are used. So the address 0000:0000 is (linear) 0x0, and 0001:0000 is (linear) 0x10
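
A minimal sketch of that shift-and-add rule, in case seeing it as code helps. The function name `linear_address` is made up for illustration, and it ignores any wrapping at the 1MB boundary.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode translation: shift the 16-bit segment left 4 bits, then add the 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset


print(hex(linear_address(0x0000, 0x0000)))  # 0x0
print(hex(linear_address(0x0001, 0x0000)))  # 0x10
```

Note that the segment is added in, not spliced on top, so every one of its 16 bits affects the result.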

Alice Averlong🏳️‍⚧️Dec 12, 2023

which also means that 0001:0000 and 0000:0010 are both linear 0x10.

So you can get pointer aliasing even though both pointers HAVE DIFFERENT VALUES
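
To get a feel for how bad the aliasing is, here is a rough sketch that brute-forces every segment:offset pair landing on a given linear address. `aliases` is a hypothetical helper, and it assumes no wraparound past the top of memory.

```python
def aliases(linear: int) -> list[tuple[int, int]]:
    """Return every (segment, offset) pair whose shift-and-add result equals `linear`."""
    pairs = []
    for segment in range(0x10000):
        offset = linear - (segment << 4)
        # The offset register is only 16 bits wide, so anything outside that is unreachable.
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


print(aliases(0x10))           # [(0, 16), (1, 0)]
print(len(aliases(0x12345)))   # 4096
```

Addresses near the bottom of memory have only a handful of aliases, but most addresses can be written 4096 different ways.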

Alice Averlong🏳️‍⚧️Dec 12, 2023

And if that wasn't bad enough, there's also the A20 gate nonsense. Now, the A20 gate was added with the 286 (in the IBM PC/AT), for backwards compatibility with how the 8086/8088 worked, which is that addresses past the 1MB mark wrapped around to zero.
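
A small sketch of what the wrap means in practice, modelling the A20 gate as a simple mask on address bit 20. `physical_address` and its `a20_enabled` flag are illustrative names, not anything from real hardware or firmware.

```python
A20_BIT = 1 << 20  # the 21st address line, first one past the 8086's 20


def physical_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Shift-and-add translation, with bit 20 forced to zero when the A20 gate is off."""
    linear = (segment << 4) + offset  # can reach 0x10FFEF, which needs 21 bits
    if not a20_enabled:
        linear &= ~A20_BIT  # behaves like an 8086/8088: wraps back to low memory
    return linear


print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=False)))  # 0x0
print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=True)))   # 0x100000
```

So with the gate off, FFFF:0010 is yet another alias for linear 0x0, and with it on, the same pointer reaches just past 1MB.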

root42 Dec 12, 2023

@foone I can live with segments. But A20 really is a hack to maintain backwards compatibility with the PC. They could've just said "screw it" and had a few programs fail. I mean HIMEM.SYS disables the wrap anyway to make the HMA accessible.
But to be fair the segment model was also a hack to be compatible with the 8080 and be able to use things like the 8257 DMA controller, which also supported only 64KB like the 8080.

KeyJ Dec 13, 2023

@root42 @foone The DMA malarkey has nothing to do with x86 Real Mode though; it's just a side effect of IBM reusing an 8-bit system design. Absolutely not Intel's fault!

root42

@KeyJ @foone but the 8086 didn't have any custom ICs to go along with it. They were reused from the 8080. My armchair theory is that the segment model helped in the case of the DMA controller.

KeyJ Dec 13, 2023

@root42 @foone Fair point about the unavailability of 8086-specific companion chips.

I still maintain that the DMA controller's limitations have nothing to do with x86's segmentation or vice-versa.
Sure, a DMA page could be accessed quite conveniently by setting a segment register to a multiple of 0x1000, but that would have been just as easy with any other CPU-side memory model. It's really the DMA itself that's the limiting factor here, and no amount of CPU quirks would have amended that.
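
For illustration, a sketch of the "multiple of 0x1000" point: such a segment starts exactly on a 64KB boundary, so its offsets 0000 through FFFF cover one whole DMA page. `segment_for_dma_page` is a hypothetical helper, and the 16-page limit assumes a 1MB address space split into 64KB pages.

```python
PAGE_SIZE = 0x10000  # 64KB, the most a 16-bit DMA address can span


def segment_for_dma_page(page: int) -> int:
    """Segment value whose 64KB window lines up exactly with the given 64KB page."""
    if not 0 <= page <= 0xF:  # assumption: 16 pages fit in the 1MB real-mode space
        raise ValueError("page must be in 0..15")
    return page * 0x1000


segment = segment_for_dma_page(3)
print(hex(segment))                  # 0x3000
print(hex(segment << 4))             # 0x30000, start of page 3
print(hex((segment << 4) + 0xFFFF))  # 0x3ffff, last byte of page 3
```

The convenience is all on the CPU side; the 64KB ceiling itself comes from the DMA controller's address width.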