one thing i've been struggling with all evening is how to restart the DMA buffer without inducing some sort of sync-killing delay or jitter

We have a picture. It isn't perfect - there's some flickery bits - I don't know if dot crawl is the right word. Not sure what's going on there.

Driving sync signals with a single image buffer was untenable - it's really difficult to have seamless, continuous DMA on the Pico. Every time you restart DMA there's a discontinuity, and if the thing you're DMA'ing represents every single clock cycle in a video field, inserting extra time messes up your picture.

So separate PIO programs generate hsync and vsync, and we restart our now image-only DMA in vsync, just like the OG Glyphblaster does, which we already know works.

I think the crawling may be that the color output is not perfectly aligned with the dot clock.

I'm starting to think a pico is not the appropriate thing to build a video card with.

all my fun ideas always end up with me concluding i should use an FPGA.

FPGAs are like the crabs of electronics projects. everything wants to turn into an FPGA if you give it enough time.

I do have a de10 nano FPGA board. It's in my MiSTer. Which I never use!
Right. This isn't completely baffling or anything
Oh, wait, I remember this level from ZZT.
I think the elf needs food badly

okay this isn't that confusing

for example, these are the snake pins. you can choose from left snakes or right snakes

you know, there's a very good reason NOT to use an FPGA

when you actually use the proper tools for a job, it stops becoming a hack.

you're no longer a hacker, you're some kind of engineer or something. ugh.

Well, I did an FPGA thing. Can check that off the bucket list.

Okay, let's see if I can get a 14.31818MHz clock out of this thing.

We have a 50MHz clock source, on a pin called FPGA_CLK1_50.

How do we get 14.31818MHz out of that?

By attaching a PLL to it, hopefully.
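
For the curious, here's roughly the arithmetic the PLL has to pull off. This is a sketch assuming a plain integer PLL model, where the output is the reference times a multiplier M, divided by a pre-divider N and a post-divider C. Those names are mine; the Quartus wizard picks its own values and may well use a fractional mode instead. 14.31818MHz is 4x the NTSC colorburst frequency, which is exactly 315/22 MHz:

```latex
f_{out} = f_{ref}\cdot\frac{M}{N\cdot C},
\qquad
\frac{f_{out}}{f_{ref}} = \frac{315/22\ \text{MHz}}{50\ \text{MHz}} = \frac{315}{1100} = \frac{63}{220}
```

So any integer combination with M = 63 and N·C = 220 lands exactly on it. For example N = 5 and C = 44 would put the VCO at 50MHz · 63 / 5 = 630MHz. Whether that sits inside the part's legal VCO range is something to check against the datasheet, not something I'm asserting here.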

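module de10_clock_test (
    input  wire FPGA_CLK1_50,   // 50MHz board oscillator
    output wire GPIO_1_0        // PLL output, routed out to a pin for the scope
);

    wire clk_osc;       // 14.31818MHz clock from the PLL
    wire pll_locked;    // PLL lock indicator (unused for now)

    cga_pll pll_inst (
        .refclk   (FPGA_CLK1_50),
        .rst      (1'b0),
        .outclk_0 (clk_osc),
        .locked   (pll_locked)
    );

    assign GPIO_1_0 = clk_osc;
endmodule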

pll_inst is a PLL instantiation. But notice it says nothing about like dividers or anything. We have to go into something confusingly called the IP (Incendiary Pickle) Catalog to actually configure the PLL.

This is as far as I get because doing this causes Quartus to hang.

Oh wait I just had to wait seventeen minutes. cool.
i bet there are people reading this that know how to actually use this stuff and are having themselves a sensible chuckle at my expense
when you build the PLL you get a QIP file full of nonsense

Okay, if I did all this right, i should have a 14.31818MHz clock on GPIO pin 1.

Let's find out! To the workbench!

Hot damn!

looks just like the Pico clock - maybe it is my probe, lol.

So, the cool thing about FPGAs, is that this clock signal now just ... lives inside the vast and mysterious gate goo within the chip. We connected it to GPIO1, but we can connect it to just about anything, internally or externally, and we can trigger other logic on it.

It's like real legit design shit. It makes the Pico's PIO mode feel like a bit of a toy.

So remember that digital logic simulation I made of the CGA card in Digital?

Digital can export your entire simulation to Verilog.

In theory, I just need to wire up the OSC pin to this 14.31818MHz PLL clock, and wire the simulation's output pins to some GPIOs, route them out through a 244, and I'll have a picture on screen.

surely it can't be that simple?

that would be a neat party trick but i wouldn't really learn anything and i'm sure the resulting Verilog would be spaghetti.

let's actually build this thing intentionally. Generating a clock was fun but i'm actually going to let the Pico continue to do that, and we'll treat a GPIO on the FPGA as the OSC input pin.

We'll divide it by 8, feed it to my Verilog MC6845 pre-configured for 80-column text mode, and wire up the HSYNC and VSYNC outputs to two more GPIOs.

We should get a 15.7kHz HSYNC and a 59.9Hz VSYNC out of that.
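
Here's where those numbers come from, as a quick worked sketch. I'm assuming the usual CGA 80-column timing: 114 character times per scanline and 262 scanlines per field. Those counts come from the standard 6845 register values, not from anything shown above.

```latex
\begin{aligned}
f_{char}  &= \frac{14.31818\ \text{MHz}}{8} \approx 1.78977\ \text{MHz} \\
f_{hsync} &= \frac{f_{char}}{114} \approx 15.70\ \text{kHz} \\
f_{vsync} &= \frac{f_{hsync}}{262} \approx 59.92\ \text{Hz}
\end{aligned}
```

Note the vertical rate is in Hz, not kHz: one field is 262 whole scanlines.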

This is incredibly cool.

This is a spot-on, 15.7kHz horizontal sync pulse generated by the Cyclone V FPGA running my (very simplified) Motorola 6845 Verilog implementation!

The DE-10 Nano has four user switches. The CGA outputs 4 bit color. Very convenient. Since I don't have any video memory yet, I'm just going to assign each switch to a color component, and AND it with display enable from the 6845. This should give us a well defined display rectangle on screen.

That looks something like this

wire display_en = crtc_de;

assign CGA_R = display_en ? SW[0] : 1'b0;
assign CGA_G = display_en ? SW[1] : 1'b0;
assign CGA_B = display_en ? SW[2] : 1'b0;
assign CGA_I = display_en ? SW[3] : 1'b0;

Well, it took me a bit to figure out that the 47pF capacitors right at the D-SUB connector are not optional, specifically on HSYNC. This explains why the RGB2HDMI would sync to things the real monitor wouldn't.

But hey look, FPGA CGA video!

I didn't properly reset the green switch which is why we got cyan instead of blue. lol. But they do all work.

The FPGA is being fed the 14.31818MHz clock from the Pico to simulate the ISA bus' OSC pin as an input to some future card. I'm going to simulate the ISA bus with Picos until I have some idea how I'm going to connect all this business to an actual PC's ISA bus.

It's a bit silly to talk about building an FPGA-based CGA card without mentioning the Graphics Gremlin, by @tubetime

https://github.com/schlae/graphics-gremlin

He's already made such a thing.

I'm interested in writing my own Verilog implementation, and of course I want to stick a Pico 2 W on it so we can stream video to it, but it seems sensible to fork the Gremlin instead of re-inventing the entire wheel.

There are some downsides, apparently this is a somewhat expensive card to build, just looking through forum discussions. But it's been five years, maybe there are better/cheaper FPGA options available today?

The Gremlin uses a Lattice iCE40 HX FPGA that costs about $20. That's not outrageous.

It has 7680 LUTs.

The Cyclone V I'm using has about 110,000 logic elements, but then again this FPGA is massive overkill. It can emulate a Nintendo 64, after all.

A lot has changed in the last 5 years since the Gremlin was specc'd out.

This is a Tang Nano 9k. It's about $22 on AliExpress.

It has 8640 LUTs, onboard SPI flash, and an HDMI port (!)

It could be socketed so you could take it out and use it for something else if you got bored with your ISA GlyphBlaster.

There's even a Tang Nano 20K now that has 8MB (yes MB) of SDRAM on-board, and an SD-card reader. It's $45.

That's like your whole ass GlyphBlaster right there, just add bus buffers and a de-9 port.

Turns out the 244 buffer I pulled out of my cheapo amazon special Box o' Components was just bad, or maybe I inadvertently killed it.

In any case, I replaced it with one from the Chameleon and most of my signal issues cleared right up.

Translating my Digital simulation into Verilog proceeds slowly. It's nice that I can copy and paste parts of the sim and export them to Verilog in isolation, then it's a matter of simplifying things and making connections back to the main file.

I've got the font ROM in place and the chargen serializer. To test it, I hardcoded the font ROM's upper 8 address bits (the character code), then I AND the serializer output directly with the color switches.
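
If "serializer" sounds mysterious, it's just a parallel-load shift register. Here's a minimal sketch of the idea, not my actual module: the module name, port names, and the single-cycle load strobe are all made up for illustration, and I'm assuming the leftmost pixel lives in bit 7 of each font row.

```verilog
// Hypothetical glyph-row serializer. Names are illustrative, not from the real design.
module glyph_serializer (
    input  wire       clk,       // dot clock
    input  wire       load,      // high for one dot clock at the start of each character cell
    input  wire [7:0] row_bits,  // one row of the current glyph, straight from the font ROM
    output wire       pixel      // current dot, bit 7 (assumed leftmost) first
);
    reg [7:0] shift;

    always @(posedge clk) begin
        if (load)
            shift <= row_bits;            // parallel-load a fresh glyph row
        else
            shift <= {shift[6:0], 1'b0};  // shift left so the next dot moves into bit 7
    end

    assign pixel = shift[7];
endmodule
```

ANDing with the switches is then one line per color, something like assign CGA_R = pixel & SW[0];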

What does that all mean? We have smiley faces!

Now I get to translate this hot mess into Verilog.

Yay.

the nice thing is i can just export things to GPIO pins, and then just look at the signal in the simulation and on my scope, and just visually check that they're doing the same thing
Good enough, ship it
I made it worse
The final version should probably not do this

I found a pretty decent way of checking my work against the digital simulation.

We can just run the card in a test harness in iVerilog, and export a CSV of signal states, so we can do a direct comparison against a CSV dumped from Digital.
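
Roughly what that harness looks like, as a self-contained sketch. The dut_standin counter is a placeholder so the example runs on its own; the real thing instantiates the card and dumps many more columns. The file name and column names are made up too.

```verilog
`timescale 1ns/1ps

// Stand-in for the real card: a 3-bit counter, only here so the example is self-contained.
module dut_standin (
    input  wire       clk,
    input  wire       reset_n,
    output reg  [2:0] count
);
    always @(posedge clk or negedge reset_n) begin
        if (!reset_n) count <= 3'd0;
        else          count <= count + 3'd1;
    end
endmodule

module csv_dump_tb;
    reg        clk     = 1'b0;
    reg        reset_n = 1'b0;
    wire [2:0] count;
    integer    fd;

    dut_standin dut (.clk(clk), .reset_n(reset_n), .count(count));

    // Half period of 35ns is a rounded stand-in for the ~34.9ns of a 14.31818MHz clock
    always #35 clk = ~clk;

    initial begin
        fd = $fopen("signals.csv", "w");
        $fdisplay(fd, "time_ns,reset_n,count");  // CSV header row
        #100  reset_n = 1'b1;                    // release reset
        #2010;                                   // run for a while, ending between clock edges
        $fclose(fd);
        $finish;
    end

    // Sample on the falling edge so values have settled after the rising edge
    always @(negedge clk)
        $fdisplay(fd, "%0d,%b,%0d", $time, reset_n, count);
endmodule
```

With Icarus that's something like iverilog -o csv_dump_tb csv_dump_tb.v followed by vvp csv_dump_tb, and then you diff signals.csv against the dump from Digital. The two tools won't necessarily agree on time units or column order, so expect to massage one side first.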

This is a whole lot easier than using the oscilloscope lol.

Then you find yourself in the frustrating situation where all your signals match but your video card still emits tutti frutti.
solution: MOAR SIGNALS

It helps not to wire your DRAM address pins backwards.

Still major issues to iron out.

We're getting there. Still have an issue with alignment of glyphs and attributes.

I'm not sure how, since my latch signals for characters and attributes match the simulation. But baffling bugs are the best kind of bugs.

it's still kind of weird seeing like, readable text on a monitor, and knowing this isn't IBM or Tseng Labs or Cirrus Logic it's just GloriousCow.

I made a video card?? you can just do that? isn't it illegal?

There's a lot of ways I could just change the way the card works that would probably be absolutely fine and would be less of a pain in the ass to implement and debug, but there's something pleasing about matching the schematics.

I took a peek at the Verilog source for the Graphics Gremlin and I think I know why it may have trouble with 8088MPH and Area 5150. I'd love to offer a hand at fixing it, but I'm not paying $200 for one from Germany.

Anyone out there have a Graphics Gremlin they'd be willing to loan me for the cause?

#retrocomputing

Text mode working, with proper glyph/attribute alignment. I had my Johnson Counter wired up wrong.

(That's what she said)

so i have a few reflections on this endeavor so far - I don't want to sound like i'm just bragging, even if I am a little bit.

Making your own FPGA video card may sound like some sort of omega-level nerd activity, but this wasn't fundamentally all that difficult. The CGA is made almost entirely of off-the-shelf 74-series logic chips, a ROM, some DRAMs, the Motorola MC6845, and a delicious sprinkling of resistors and capacitors.

If you know boolean logic, you can read the schematic for it, and you can translate that into Verilog. Verilog isn't that scary. It's just another programming language. If you can code in C and boolean logic doesn't make you cry, you can code in Verilog.

There's probably some stuff you always wanted to learn but maybe the thought scares you a bit. It feels too big.

I bet you, if you can find the time, you'll be looking back before you know it, knowing how to do that thing and wondering what you were scared about.

i might as well show some Verilog, and then people who know what they are doing can make fun of me

the CGA has a ton of flip-flops. like a metric ass-ton of flip-flops. So I made this module for a 74ls174, which is a 6-bit D-flop with clear.

module ttl_74174 (
    input  wire       clk,
    input  wire       clr_n,   // active-low asynchronous clear
    input  wire [5:0] d,
    output reg  [5:0] q
);

    always @(posedge clk or negedge clr_n) begin
        if (!clr_n) begin
            q <= 6'b000000;    // clear all six flops
        end else begin
            q <= d;            // latch the inputs on the rising clock edge
        end
    end

endmodule

what's a D-type flip-flop you ask? well, the D stands for Deez.

the top bit just defines the inputs and outputs. you can group things into busses, like we grouped all six of the inputs and outputs into d and q, which each have six wires indicated by [5:0]

i'm still not really sure when to use a reg and when to use a wire.
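
a rule of thumb that seems to hold up: if a signal is driven by an assign (or by another module's output), it's a wire. if it's assigned inside an always block, it has to be a reg, even when no flip-flop comes out of it. here's a tiny sketch showing all three cases. the module and signal names are made up for the example.

```verilog
// Illustrative only: module and signal names are invented for this example.
module reg_vs_wire (
    input  wire clk,
    input  wire a,
    input  wire b,
    output wire and_out,  // driven by a continuous assign, so it is a wire
    output reg  and_reg,  // assigned in a clocked always block: reg, synthesizes to a flip-flop
    output reg  or_comb   // assigned in always @*: still declared reg, but purely combinational
);
    assign and_out = a & b;

    always @(posedge clk)
        and_reg <= a & b;

    always @*
        or_comb = a | b;
endmodule
```

so "reg" is really about how the signal gets assigned, not a promise that there's a register in the hardware.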

anyway, a D-type flipflop latches its deez when it is clocked.

so with always @(posedge clk we are saying "do this shit every time the clk input has a rising edge" and we latch q with d via q <= d.

very simple. you didn't even need to go to college for this. you're welcome

you could literally just model every single chip like this, and just connect them up the way they are actually connected. that would work. it wouldn't be much fun to read though.

the CGA generates a ton of clocks via something called a Johnson Counter which is just a funny name for people feeding outputs of a flip-flop back into the next inputs of a flip-flop in stages.
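
in its textbook form a Johnson counter is tiny. this is a generic 4-stage sketch for illustration only, not the CGA's actual wiring, and the module name is made up:

```verilog
// Generic 4-stage Johnson counter, for illustration only. This is not the CGA's wiring.
module johnson4 (
    input  wire       clk,
    input  wire       reset_n,
    output reg  [3:0] q
);
    always @(posedge clk or negedge reset_n) begin
        if (!reset_n)
            q <= 4'b0000;
        else
            q <= {q[2:0], ~q[3]};  // shift left, feed the inverted last stage back into the first
    end
endmodule
```

from reset it walks 0000, 0001, 0011, 0111, 1111, 1110, 1100, 1000 and repeats, so each output is a divide-by-8 square wave, and each one is shifted a clock later than its neighbor. that's why it's handy for making a family of related clocks.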

this is what happens when you wire up the literal chip connections:

assign u4_d = {
    u4_q[4],            // 6D
    u4_q[3],            // 5D
    u4_q[2],            // 4D
    u4_q[1],            // 3D
    u4_q[0],            // 2D
    ~u5_q[1]            // 1D
};

assign u5_d = {
    ~u5_q[1],           // 6D
    ~u5_q[4],           // 5D
    ~u5_q[3],           // 4D
    u4_q[3] ^ ~u5_q[1], // 3D
    u5_q[0],            // 2D
    u4_q[5]             // 1D
};

ttl_74174 u4 (
    .clk   (clk_14),
    .clr_n (reset_n),
    .d     (u4_d),
    .q     (u4_q)
);

ttl_74174 u5 (
    .clk   (clk_14),
    .clr_n (reset_n),
    .d     (u5_d),
    .q     (u5_q)
);

assign lclk_n    = u4_q[0];
assign lclk      = ~lclk_n;
assign clk_q2_u4 = u4_q[1];
assign clk_q4    = u4_q[3];
assign clk_q5    = u4_q[4];
assign clk_q6    = u4_q[5];
assign clk_q1    = u5_q[0];
assign clk_q2    = u5_q[1];
assign clk_q2_n  = ~clk_q2;
to be fair it's not much more readable on the schematic
I guess in reflection, there are things that are hard because they're legitimately hard, like brain science, or rocket surgery, and then there are things like this, that are hard simply by virtue of being impenetrably tedious. you just have to be weird enough to have fun doing it anyway
@gloriouscow It's just like D&D. You have to grind until you level up, then it's easy ;)
@gloriouscow I am very impressed. I still think video cards are some kind of magic. Theoretically I know it's "just" matching computer behavior to what the monitor expects but there's so much tedious wiring and clocking between the two. It may be the dullest magic but it's still magic.
@gloriouscow isn't that quite straightforward? A wire for anything that can be described combinatorially and a reg for, well, a register, i.e. a flip flop.
@gloriouscow VHDL doesn't know this distinction, so you have to nudge the synthesis tool towards what you want to have
@gloriouscow 100%. And doing it wrong is a great way to learn how to do it right. 😆

@gloriouscow It's been a while, but I have to agree. The chip programming languages aren't really any harder than software programming languages.

@gloriouscow text mode. Weird flex. :)
@gloriouscow maybe @tubetime can help?
Or @furan?
@root42 @tubetime I asked him before, he doesn't have any, the only one selling them is in Germany, it's $80 just for shipping