I'm working on a new FPGA project. this one is rather complex.
I'm at the "blink an LED" stage of bringing it up.
wrote some temporary verilog to validate the bus interface using a single 16-bit register. here goes nothing
oops. logic analyzer time. guess i should have expected it.
ok so i unplugged the board and powered up the computer -- and the error stays. looks like i broke something. 😩
gotta take this step by step. I should have checked this at the start but first I will force the FPGA into the unprogrammed state (-CRESET low) then check each pin to make sure it's in a valid state.
ok when *not* in reset, the FPGA is pulling the DMA line BURST_L low constantly. this is bad, and explains the 00011320 error i saw earlier.
seems that i typo'd the wiring between the top level verilog module and the module that handles the micro channel bus. it's a floating connection and it seems to mostly just sit at a logic 0.
the other problem (01290200) is more concerning and will need a logic analyzer.
yes, i made a dedicated interposer/extender board just to help with the logic analyzer connections. it's called the Fing Longer (a reference to Futurama).
hmm, so during a read operation, the data is never driven onto the bus (the output stays pulled up to FFFF). looks like the MADE24 line is staying low? that's weird. let me try making the logic ignore it.
oh! now the card is putting data onto the bus! the "test1" channel is the data direction for the 74lvc4245 buffers showing that they are transferring data from the card to the host PC.
wow, it actually works, i'm able to write a value to the simple register and read it back. this is a HUGE step forward.

i'll need to figure out what is up with the MADE24 line. could be that the pin doesn't actually do that. the HDD pinout is one that i reverse engineered a while back, so it might be a mistake.

this could also explain the damage to the PC, perhaps the card tried to write to the data bus when it was not supposed to and damaged the output drivers of some other chip.

huh, the "MADE24" line is controlled by bit 7 of register 96. I wonder what that is.
lol, this is the active high CHRESET! i was wondering why that line seemed to be missing.
moving on to the Teensy interface. i had to choose the IO pins carefully so i can make a 16-bit parallel IO port.

got the Teensy interface up and running. i'm using direct IO port access on the Teensy 4.1. take a look at core_pins.h in the Teensy header files. basically you can read from GPIOx_PSR and write to GPIOx_DR.
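
a minimal sketch of the masking involved, modeled in Python so it runs on a desktop. it assumes the 16 data lines land in contiguous bits 16..31 of a single 32-bit port register; that bit position and the helper names are placeholders for illustration, not the actual pin assignment. on the Teensy the same mask and shift would be applied to the value read from GPIOx_PSR or written to GPIOx_DR.

```python
# assumption: the 16 data bits sit in bits 16..31 of one 32-bit GPIO port
DATA_SHIFT = 16
DATA_MASK = 0xFFFF << DATA_SHIFT


def read_word(psr):
    """Extract the 16-bit bus value from a pad status register reading."""
    return (psr & DATA_MASK) >> DATA_SHIFT


def write_word(dr, value):
    """Return a new data register value with the 16 bus bits replaced.

    Bits outside DATA_MASK are preserved so other pins on the port
    are not disturbed.
    """
    return (dr & ~DATA_MASK) | ((value & 0xFFFF) << DATA_SHIFT)


print(hex(read_word(0xABCD1234)))
print(hex(write_word(0x0000FFFF, 0x1234)))
```

keeping all 16 bits in one register means a bus read or write is a single port access plus a shift, instead of 16 separate pin operations.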

i also had to add a short delay to create some setup time for the FPGA--the Teensy 4.1 is a hair too fast lol

bidirectional registers now work! i can write a command from the PC to the Teensy, and i can write a response from the Teensy and read it from the PC. there are also status flags showing when new data is available. it may not seem like much, but this is huge progress.
excellent progress today. I've been able to implement the "Get Diagnostic Status" command. it transfers the command block and handles the returning status block as well as the flags and interrupts. best of all, it works on real hardware using my diagnostic program!

OK why does pin 1 start halfway down the edge of this chip???

my best guess is that the die is rotated to a 45 degree angle. anyway i want to dump the contents so i can analyze the drive firmware.

no 28-pin TSOP socket, oh well
now i'm knee deep in Ghidra listings. this code probably runs the entire hard drive, not just the host interface.
this sort of reverse engineering is very much like solving a challenging puzzle. you push and push until you can deduce something based on what you already know, then you pivot, taking that new knowledge and pushing on that until you learn even more.

so last night I identified the power on self test routines by inspection. it's not too hard to identify a checksum routine or a memory test routine. this helped me fill in the memory map.

also, the POR test function stores the results at a particular memory location, and the codes match up with the POR error codes in the DBA-ESDI spec! the next step is to search the whole ROM for any instructions that read this memory location--this should identify the functions that generate the status block.
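
a rough sketch of that kind of search, assuming the firmware refers to the location with a 16-bit absolute address embedded in the instruction bytes. the byte order, the address width, and the function name are all assumptions here (the CPU isn't named above), and a raw byte match will also turn up false positives that still need checking in the disassembly.

```python
def find_address_refs(rom, addr, little_endian=True):
    """Return every ROM offset where the 16-bit address appears as raw bytes."""
    needle = addr.to_bytes(2, "little" if little_endian else "big")
    hits = []
    pos = rom.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = rom.find(needle, pos + 1)  # step by 1 so overlapping hits count
    return hits


# tiny fake ROM: the address 0x1234 appears twice in little-endian order
fake_rom = bytes([0x00, 0x34, 0x12, 0xFF, 0x34, 0x12])
print(find_address_refs(fake_rom, 0x1234))
```

each hit is only a candidate; the byte before it has to decode as a load or compare for it to be a real read of that location.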

drive firmware is turning into a bit of a slog so i switched over to the IBM BIOS. having a spec is nice, but the code will cover a bunch of corner cases.
managed to reverse engineer enough that I was able to read the defect map out of one of the original hard drives. sounds easy but the process uses DMA.

working through a nasty timing hazard with the mailbox flags on the command port. sometimes you write data and the "data available" flag never gets set.

so now i am digging through the logic that yosys generated to see if it even makes sense.

having good test programs is important. here's the status interface register dropping values. the Teensy program is just writing an incrementing number, and the diagnostics program is checking for gaps.
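
the checking side of a test like this can be sketched in a few lines. this is a Python model, not the actual diagnostics program, and it assumes the counter is 16 bits wide and wraps around; the function name is made up for illustration.

```python
def find_gaps(values, modulus=0x10000):
    """Return (previous, current, missing_count) for every break in the sequence.

    Assumes the writer increments by one each time and wraps at `modulus`.
    """
    gaps = []
    for prev, cur in zip(values, values[1:]):
        expected = (prev + 1) % modulus
        if cur != expected:
            gaps.append((prev, cur, (cur - expected) % modulus))
    return gaps


print(find_gaps([1, 2, 3, 5, 6]))
```

an incrementing pattern is handy because a single dropped value shows up immediately and the size of the gap says how many writes were lost.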

got that all sorted out. it was a synchronization issue with the flags between the two interfaces.

this is the "seek" command successfully completing! this is a *major* step since it requires 4 working mailboxes and interrupts.

another important step today--i got the data port and data port mailbox flags working. it can also detect 8-bit vs 16-bit transfers. getting very close to working PIO transfers.
nice! I managed to get PIO data transfers working well enough for the buffer test routine to pass.
ok this is fantastic--I've managed to transfer my first actual sector! it's just using PIO and the data is not from a real filesystem, but this is another big step forward!

DMA on Micro Channel is really hard. i'm running a bunch of simulations first, making adjustments to the logic as needed.

so many moving parts.

wow, got four bytes to transfer successfully over DMA! not sure why it got stuck after that.

just ran the same test again and it transferred the whole sector over DMA!!

so at least read transfers are working partially. writes just hang the machine after transferring half a sector. it's probably time for the logic analyzer.

not sure why I always end up in front of a logic analyzer, but here we are.
several issues. the first one: during a host-to-device write, the card holds the arbitration bus too long. it should release immediately after the second ARB/GNT pulse.
had a theory and it reproduces in simulation. the transfer request flag isn't getting cleared soon enough. la_dma_selected is what can clear this flag and it is changed on the falling edge of cmd, which is too late to catch the ARB/GNT pulse.
yes, that solves the crashing problem. but data isn't getting transferred correctly, so I've got more work to do.
weirdly enough, it works the second try!!! something on the host was prematurely turning off DMA. maybe a bug in difdiag.
so the interrupt_detected flag is supposed to be set in the irq14 handler, and it is *supposed* to be set only when DMA is done. but somehow interrupt_detected is set without the IRQ handler ever being called! then the DMA operation is torn down prematurely.
using the logic analyzer, i proved that the irq14 handler never gets called. the only code that *ever* sets the interrupt_detected flag exists in this handler. it's declared volatile so it can't be cached in a register.
I wrote the flag value out to an unused IO port, 0x4F, so I can see it on the logic analyzer. a neat trick!

so i don't know how this flag is getting set. my hack is to preemptively clear the flag right before starting DMA, and so far, it seems to be working.

i think this code was "working" with the real ESDI drive because that one uses burst mode DMA and it finishes up very quickly, before the irq14wait routine can exit early.

decided to look at the real drive. and guess what--it's not using burst mode. the POS registers have it turned off by default. it's also slow to read the data from the spinning disk, so IBM must have figured that it wasn't really necessary.
now I'm reading up on accessing SD cards from the Teensy 4.1. looks like SdFat is the library? could it be so easy?
turns out it's easy but I had to reformat the SD card using the official sdcard.org utility. anyway, I've read the first sector from a real disk image!
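
for checking an image on the desktop before it goes on the card, a small sketch like this is enough. it assumes 512-byte sectors and that the first sector ends with the conventional 0x55 0xAA boot marker; the helper names are hypothetical and this is not the Teensy/SdFat code.

```python
import io

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def read_sector(image, lba, sector_size=SECTOR_SIZE):
    """Return one sector from an open binary file-like object."""
    image.seek(lba * sector_size)
    data = image.read(sector_size)
    if len(data) != sector_size:
        raise ValueError(f"short read at LBA {lba}")
    return data


def has_boot_signature(sector):
    """True if the sector ends with the conventional 0x55 0xAA marker."""
    return sector[-2:] == b"\x55\xaa"


# stand-in for a real image file: two sectors, only the first one marked
fake_image = io.BytesIO(bytes(510) + b"\x55\xaa" + bytes(512))
print(has_boot_signature(read_sector(fake_image, 0)))
```

with a real image, `fake_image` would be replaced by `open(path, "rb")`.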
the drive now gives the POS ID. let's try to boot!
hmm 01048200 is a drive select acknowledgement error.

the BIOS runs faster than the DIFDIAG utility, and so it seems like it is hitting a timing problem that i didn't hit before.

my drive code seems to randomly hang up and not respond correctly.

it's occasionally getting a spurious end-of-interrupt command which is really odd and points to an issue with the mailboxes (again, sigh).

but it's SO DARN CLOSE. it's transferring sectors from the IML region in the disk image.

figured out one problem. the disk boot routines slam the drive with an ATN and the first command word in 5.5us. the Teensy code takes too long to see the ATN and clears the command register full flag, which drops the first word. oops.
so it *almost* boots now. in fact it successfully loads the IML sectors from the hidden partition on the drive, and no longer throws an I999... error code!

my drive doesn't implement this weird feature called pseudo RBAs--it's a way to artificially limit the maximum possible block address, presumably so they can hide the partition data.

i suspect the BIOS checks this, so i'll have to implement it. ugh. that means i need to figure out this incomprehensible diagram.
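
going only by the description above (an artificial cap on the highest block address), the core idea might be modeled like this. this is a guess at the behavior and not taken from the DBA-ESDI spec; the names and the inclusive-limit convention are assumptions.

```python
def reported_max_rba(true_max, pseudo_max=None):
    """Highest block address the drive admits to.

    assumption: when a pseudo limit is set, the drive reports the
    smaller of the two values; with no limit it reports the real one.
    """
    if pseudo_max is None:
        return true_max
    return min(true_max, pseudo_max)


def rba_in_range(rba, true_max, pseudo_max=None):
    """True if a requested block address falls inside the visible area."""
    return 0 <= rba <= reported_max_rba(true_max, pseudo_max)


print(reported_max_rba(100000, 90000))
print(rba_in_range(95000, 100000, 90000))
```

blocks above the pseudo limit still exist on the media; they just fall outside what normal commands are told about.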

holy crap it's booting I can't believe it sdfadfsdfsdfsffasdf
well, it's working well enough to run qbasic. right now the drive is read-only.
i think i need to dig into the 01290200 cache error that has been coming up. i'm concerned that an issue with my DBA-ESDI card has caused it, but i'm not sure.
looks like the cache is inside the CPU. i can't find any cache chips on the motherboard.

@tubetime “It’s working well enough to run QBASIC”

Love it 😂

@tubetime

it doesn't look too bad, you just have the choice to save it back to the config area or not with set max-rba (which it probably never does out of the factory). did you clone a disk with the config area?

@bitsavers there's no easy way to image that. it'd require reverse engineering the drive firmware.
@tubetime "No command can not access" so.. uh.. commands can access? 🤯
@vvuk the document is full of typos and errors.

@tubetime Your project is an ESDI drive emulator, right? (specific drive type, but ESDI interface) Would it work in another computer that had an ESDI controller and understood the IBM drive?

(I assume so, but then you mentioned microchannel which confused me -- I'm assuming the "creaky old IBM laptop" interface is ESDI?)

@tubetime just saw your comments about this being DBA-ESDI. Time to read about the difference...
@tubetime Your recent posts make me believe you’re a time traveller - perhaps the great-great-grandson of an IBM engineer who as a child found an old notebook complaining about this problem your grandpa just couldn’t solve and that he was fired for and altered the arc of his life. You studied your whole life as an engineer for this moment to come back to today when the hardware was still available, fix the problem, then go back and help your grandpa. Let us know how it turns out.
@amart not too far off! my grandfather worked there. i once reverse engineered a prototype floppy drive he worked on, and got it working again. https://twitter.com/TubeTimeUS/status/1617703291483467776
@tubetime What’s the end goal here, boot the laptop using a SD card that’s emulating a Micro Channel hard drive interface, via Teensy?

@tubetime I know it’s not the point of your project, but didn’t this laptop have PCMCIA? Would one of these work as a boot drive?

Digigear SD SDHC SDXC to PCMCIA PC Card, Adapter Supports, ATA Flash Memory https://a.co/d/agF7LO9

@dschwarz nope, the memory card slot is pre-PCMCIA
@tubetime I've used that library before, yes it really is easy. You probably won't set any throughput records, but it was great for writing diagnostic logs that were later read back and uploaded.
@tubetime Yes, it really is that easy :)
@tubetime When you find yourself asking, "How did this ever work?" and it turns out the answer is "It didn't."
@tubetime how does that even work? After the first iteration isn't the refresh bit always set, so it's stuck in an infinite loop if it never IRQs?
@SteveSyfuhs oh sorry REFRESH_BIT is a macro that grabs an IO port bit that toggles with the DRAM refresh signal every 15us.
@tubetime ahhhh that makes way more sense now
@tubetime it’s been so long since this experiment started I forget what you’re trying to do!
@tubetime
Ok, I'm a bit confused. You have an FPGA you are working on, but also a hard drive you are trying to read and another CPU you have a white paper for? The drive controller chip?
It looks like you are writing notes for yourself.
I'm guessing you are an EE? Or are you a CE? Us normal CS guys don't ever get to the logic analyzer.
@Netux it's a solid state hard drive replacement that I'm designing. EE but I do some digital stuff too
@tubetime
arb_gnt_l doesn't stay high very long, does it?
@bitsavers arbitration cycles are long enough to allow the open drain bus to settle and that's it. sometimes they are longer: the planar hides negative priorities this way like refresh and nmi
@tubetime I implemented PS/2 DMA in an emulator good enough to get a Sound Blaster working a while ago and god help us all
@luigithirty nice, you didn't even have to get burst mode working 😉