It is legitimately bugging me that I seem to be the only person who wants to be able to buy an ARM or RISC-V or MIPS or SPARC CPU, implemented with an inexpensive FPGA, with a standard parallel memory bus.

I feel stupid for wanting to be able to take such a thing, stick two or four 8/16 bit wide SRAMs and ROMs on it, and have a play like it was Real Hardware™.

I don’t want the whole system on one chip. There’s no fun there for me.
I just want 24 or 32 address pins, 16 or 32 data pins, one or more interrupts, a R/W pin or /RD /WR pair, that sort of pinout. No multiplexing, no cache bollocks, no uarts or sdram controllers or any of that mess. Just. A. Bloody. Processor.
It can be ARM. RISC-V looks neat. MIPS is cool. SPARC has virtues. I’m not particular. I just feel like it should exist.

@mos_8502 The only real limitation here is that it's hard to get inexpensive FPGAs in a package with so many physical pins, especially a more hobbyist friendly package (not a dense BGA, etc). Given that constraint, most pre-existing designs are tuned for lower pin count buses and more integration (there's plenty of space in the fabric for peripherals too).

PicoRV32 will fit in 1-2K CLBs and supports a very simple 32bit memory interface and a simple custom IRQ interface.
https://github.com/YosysHQ/picorv32

@mos_8502 Of the approachable Lattice ICE40 parts, if you want 44+ IOs (to allow 24A + 16D + control/cs/irq lines), and a non-BGA package, that's an ICE40HX with ~3K CLBs in a 20x20 LQFP144 package. ~$10 for singles at Mouser or Digikey. Not a lot of elbow room, but a minimal RV32I would be doable. There are a couple smaller parts but they only have ~1K LEs which is going to be very tight.

The more powerful ECP5 family has some 98IO 3/6/11K CLB 144LQFP options for ~$15/20/40 in singles.

@swetland I feel like everyone in this thread is trying to be helpful, but has missed that I am complaining that there’s no ready to go purchasable product like this. It could be a BGA part, because what I mean is it would be on a PCB module with pins or castellated holes.
@mos_8502 @swetland What would be an attractive price for something like this? I wonder how many people like you just want to purchase that piece and play with it.
@poleguy @swetland I don't know, depends on the arch, how hard it is to build a system around, that sort of thing. I'd pay a solid $60USD for a reasonably fast (cycle time in line with available SRAM chips) RISC family CPU that was as basic as I describe, that didn't have a multiplexed bus.

@mos_8502 I guess I was just misunderstanding your complaint as a "I wish this was doable" (to which I feel like the answer is "it is, but not in a terribly optimal way") rather than "I wish somebody had built it and was offering one for sale".

I suspect it's a niche product -- wanting a more modern RISC-y CPU but with no integrated peripherals, a traditional parallel bus (but wide enough that many would find it awkward to work with), and packaged as a DIP64 footprint and/or castellated PCB.

@mos_8502 Even so, you did get me thinking about it, and while I've got too many projects and don't need another, building a little SBC around a 144LQFP iCE40HX4 with CPU/IRQ/UART/address-decode-glue, a pair of 44pin SOIC 8Mbit 16bit SRAMs, and an iCE40UP5K for an A/V peripheral (VERA or the like), is a kind of intriguing thought. Not sure if inexpensive 16bit wide flash is available, etc, as I haven't dug that deep.
@mos_8502 For these kind of soft core systems folks tend to either do small CPUs that can get by with just onboard memory (32K-256K or so in a small FPGA) or go for somewhat beefier designs to deal with SDRAM or SPI-RAM/FLASH with faster narrower interfaces requiring more complexity in the memory controller and probably some amount of cache for reasonable performance. So I can see the appeal to exploring something in the middle.
@swetland I think the iCE40LP8K in 81-ball BGA with 63 I/O would probably do the job, wouldn't it? Assuming PCBWay or JLCPCB did the batch assembly?
@mos_8502 Yeah, I'd think so. I've been thinking about it from the standpoint of something that could be accomplished probably in 4 layers and hand-assembled for 3-5 piece prototyping, in line with my personal comfort zone for such things.
@swetland Ah, yeah, I don't really do a lot of SMD work -- mostly small repairs, and I'm slowly expanding into one-sided SMD assembly with 0805-sized parts and bigger. Very newbie friendly stuff, hot air tool and SMD reflow hot plate.
@mos_8502 I've found I can get a long way with an inexpensive stereo microscope, decent soldering iron, and liberal use of flux. 0603s or 0402s are not too scary, and SOIC, TSOP, LQFP, or QFN packages are doable in such a setup with a bit of practice and patience. I definitely should play with stencils and hotplates at some point, or maybe just lean into assembly overseas now that it's pretty affordable.
@swetland I am loving the Kester 2235 flux for all soldering I've been doing lately.
@mos_8502 The tradeoff in exchange for the complexity (and for some, the enjoyable challenge) of building those memory interfaces, is you can manage to build 50-100MHz cores that are pretty performant on FPGA platforms that are still pretty inexpensive ($50-200 dev boards), enabling some 90s-era equivalent retro platforms, which is definitely fun too. Like the stuff I've been enjoying on MiSTer FPGA, and some 50MHz RV32 stuff I've done on ULX3S and the like.
@mos_8502 Related, for anyone interested in playing with slightly larger FPGA systems, the very nice ULX3S board is available now in the large part ($235) or pre-order for early January in the smaller (but still very useful) version ($135), is fully supported by the open source FPGA toolchains, and has an SDRAM that is well understood and less daunting than DDR3, etc on higher end boards...
https://www.crowdsupply.com/radiona/ulx3s

@mos_8502 Agreed, and trying to interface to it with some ports or what nots isn’t fun at all, nor effective.

That said, I’m generally more into accelerators for existing systems (cheap!) than whole new systems. A dirt cheap IIGS accelerator would be good.

@mos_8502 What's the threshold for "inexpensive"? I use the Xilinx XC7A15T, which costs about $35 in quantity one. It's a very capable part. Like some of the ice40 chips have done, it's really an XC7A50T, which has 3x the resources, and the Xilinx software restricts what you can use, but I published an open source tool that "fixes" that. Compared to ice40, it's faster and has a lot more I/O pins. 185 in the $35 part (FTG256).
@mos_8502 Isn't it hard nowadays to get an SRAM card?
@tocisz SRAM chips are available at Mouser and Digi-Key.

@mos_8502 Ah, yes. That's for sure. But I was wondering whether there is some standard for static RAM cards, like the SDRAM modules that exist for dynamic RAM. Those are easily available, ready to plug in and use.

I think not, but maybe I'm just not aware of it.

@mos_8502 RISC-V is probably your best chance for FPGA implementation (there are a bunch of open source cores). But all the architectures you mention are 32 bit (or 64 bit), so external pin count is rapidly a limitation on “inexpensive” :-/

Maybe the original 8088 approach of multiplexing the data bus and address bus could bridge the gap? It’s (much!) slower, and needs external address/data latching. But could fit in 16 data/address pins + control pins 🤔

@ewenmcneill Let's say it's a hypothetical RISC-V variant, which is 32-bit internally, but has a 16-bit data bus (like the old 68K CPUs) and a 24-bit external address bus (32 internally, top bits always 0). That's 40 pins right there. You'd also need:

/IRQ
/NMI
/RES
CLK (input or output, whichever is more convenient)
R/W or a /RD and /WR pair

So let's call it 46 I/O pins needed, and a RISC core that knew how to handle spreading a 32-bit read or write over 2 cycles.
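A minimal sketch of that pin budget and the two-cycle access, just to check the arithmetic. The names here (`PIN_BUDGET`, `split_write32`, `join_read32`) are made up for illustration, and it assumes byte addressing, aligned accesses, and little-endian ordering (low halfword at the lower address), none of which the post pins down:

```python
# Hypothetical pin budget for the CPU module described above.
PIN_BUDGET = {
    "address": 24,   # external address bus (upper 8 internal bits always 0)
    "data": 16,      # 16-bit external data bus
    "irq": 1,        # /IRQ
    "nmi": 1,        # /NMI
    "res": 1,        # /RES
    "clk": 1,        # CLK, input or output
    "rd_wr": 2,      # /RD and /WR pair; a single R/W pin would make this 1
}


def total_pins(budget):
    """Sum the I/O pins the module would need."""
    return sum(budget.values())


def split_write32(addr, value):
    """Split one aligned 32-bit write into two 16-bit bus cycles.

    Returns a list of (24-bit address, 16-bit data) pairs, low halfword first
    (an assumed little-endian layout).
    """
    if addr % 4 != 0:
        raise ValueError("this sketch only handles aligned 32-bit accesses")
    low = value & 0xFFFF
    high = (value >> 16) & 0xFFFF
    return [(addr & 0xFFFFFF, low), ((addr + 2) & 0xFFFFFF, high)]


def join_read32(low, high):
    """Reassemble a 32-bit value from the two halfwords read back."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)


if __name__ == "__main__":
    print(total_pins(PIN_BUDGET))
    for cycle_addr, cycle_data in split_write32(0x001000, 0xDEADBEEF):
        print(f"addr={cycle_addr:06X} data={cycle_data:04X}")
```

With the /RD and /WR pair the budget comes to 46 pins; with a single R/W pin it would be 45.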

@ewenmcneill If it's on a small PCB with components only on the top side, it can have Pi Pico style castellated holes for board mounting. Make it a square-ish board, with a notch on a corner to signify pin 1. Say you need a power and ground on each side, so that's 13 castellated holes per side, 2.54mm pitch, that's ~30mm per side, call it 35mm square just to be safe.

Use a BGA package FPGA, give it the regulator and capacitor support bits it needs on the module, so it can talk 3.3V bus.
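A rough way to sanity-check the module size. The function names are hypothetical, and the sketch assumes one power and one ground hole per side and holes evenly split across four sides:

```python
import math

PITCH_MM = 2.54  # standard 0.1 inch pitch, as on Pi Pico style modules


def holes_per_side(signal_pins, power_pins_per_side=2, sides=4):
    """Castellated holes needed on each edge, rounding up to whole holes."""
    total = signal_pins + power_pins_per_side * sides
    return math.ceil(total / sides)


def hole_span_mm(holes, pitch_mm=PITCH_MM):
    """Distance between the centers of the first and last hole on one edge."""
    return (holes - 1) * pitch_mm


if __name__ == "__main__":
    # 13 holes per side, as in the post
    print(round(hole_span_mm(13), 2))
    # the full 46-signal budget plus power and ground on each side
    n = holes_per_side(46)
    print(n, round(hole_span_mm(n), 2))
```

Thirteen holes span about 30.5mm center to center. Fitting all 46 signals plus 8 power pins works out to 14 per side and about 33mm, which still fits inside a 35mm square, though with little margin at the corners.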

@mos_8502 using a BGA-style FPGA and a four sided (square) castellated “module” is a good idea.

But even ~46 I/Os is taking you to the top end of the iCE40 range, or the lower end of the ECP5 range (and it might need ECP5 for logic). Looks like an ECP5 12K LUT with 90-200 I/Os is US$12-20 (depending on package/quantity). So the full module, with support chips, etc, is probably still ~US$100. Which I guess is still plausible.

@ewenmcneill
I mean, the iCE40LP8K in 81-ball BGA is like $12 and has 60+ I/O lines. Could it do the job?

@mos_8502 8K 4-LUTs is fairly small for a RISC-V 32 bit CPU, but people have definitely put microcontroller class RISC-V CPUs into them, which is possibly what you want anyway.

The ECP5 LFE5U-12F-6BG256C is also about US$12-15, and has 1.5x more LUTs and more I/O pins. Which would leave more room for eg memory bus logic.

Both are 1.4-1.5V core though, so non-trivial power management.

(Link is NZ$)

https://nz.mouser.com/c/?marcom=171422561

@ewenmcneill The ECP5 datasheet says the I/O can take 3.3V levels, though. Putting a 1.xV regulator and some caps on the module doesn't seem like much of a burden.
@ewenmcneill For speed, I mean, anywhere from 8-30MHz would be fine? This is hobby computing we're talking about.
@mos_8502 definitely if “1980s era” clock speeds are fine the problem is a lot easier for an FPGA! (The big issue with crowded FPGAs and clock speeds is it becomes difficult / impossible to put all the right things next to each other, so you end up with long internal chains which are slow to propagate signals. Hence “crowding” being a performance issue, and crowding also slows down synthesis layout planning too.)
@ewenmcneill The two big "retro graphics on FPGA" chips are VERA (which is used in the Commander X16, and a variant is being made for the Sentinel 65X) and the Xosera (which is also very cool). The VERA at least apparently tops out at around an 8MHz bus speed on an iCE40UP5K.

@mos_8502 yes, it’s pretty common for FPGAs to support 3.3V I/O, but have a lower (ie faster, cooler) internal logic. They’ve got voltage converters at the edge I/Os (you just give them 2+ supply voltages, in the right order — VCC core and VCC IO).

IIRC both iCE40 and ECP5 are like that.

@ewenmcneill The LFE5U-12 series would seem to have sufficient I/O to expose all 32 address and data lines, CPU clock, /RD, /WR, /IRQ, and /NMI lines. That's about all that's needed, really. I assume RISC-V starts up from some specific address, so maybe you'd want it to have two clock inputs or a clock divider, so it could run slow to start from slow ROM until it copies all that to fast SRAM and swaps to fast mode?

@mos_8502 people have run FPGA based CPU cores from even SPI flash directly (execute in place), without extra clocks. I think they just stall the memory bus for extended periods, so it’s quite slow.

But yes, starting with “long memory fetch cycles” from ROM and then copying into “faster fetch cycles” RAM is pretty common.
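As a sketch of why stalling the bus can stand in for a second clock: the number of wait states follows from the CPU clock and the memory's access time. The 55ns SRAM and 150ns ROM figures below are illustrative assumptions, not parts named in the thread, and the model ignores setup/hold time and address-decode delay unless you fold them into `margin_ns`:

```python
import math


def wait_states(clock_mhz, access_ns, margin_ns=0.0):
    """Extra bus cycles to insert so one access covers the memory's access time.

    Assumes a zero-wait-state access takes exactly one clock period.
    """
    period_ns = 1000.0 / clock_mhz
    cycles_needed = math.ceil((access_ns + margin_ns) / period_ns)
    return max(cycles_needed - 1, 0)


if __name__ == "__main__":
    for clock in (8, 25):
        sram = wait_states(clock, 55)    # assumed 55ns SRAM
        rom = wait_states(clock, 150)    # assumed 150ns ROM
        print(f"{clock}MHz: SRAM {sram} wait state(s), ROM {rom} wait state(s)")
```

Under these assumptions, at 8MHz the SRAM needs no wait states and the ROM needs one; at 25MHz the SRAM needs one and the ROM needs three. So a single clock plus a per-region wait-state count would cover the "slow ROM, fast SRAM" case.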

@mos_8502 it’s funny. Just this morning I was contemplating a RISC-V or MMIX variant of the 68k as a sort of drop-in replacement for the 68k computer I’m building using $2 10x10 cm boards from JLC.
@guidoism I think RISC-V is the best long term bet, but MIPS and SPARC also have good compiler support.
@mos_8502 not quite sure what you're missing: is it the digital design to put on a sufficiently large FPGA, or is it the sufficiently cheap FPGA that has this many pins?
@funkylab It's the "ready to go" purchasable CPU. I'm complaining about the nonexistence of a product to buy.
@mos_8502 ah! so you would have a design for such an FPGA, you simply don't have an FPGA on a board with configuration memory, decoupling caps and maybe core voltage regulators that you could plug your memories into? (or were you rather thinking of a somewhat large DIP device that you could plug into your own motherboard?)

@funkylab Let's say it's a hypothetical RISC-V variant, which is 32-bit internally, but has a 16-bit data bus (like the old 68K CPUs) and a 24-bit external address bus (32 internally, top bits always 0). That's 40 pins right there. You'd also need:

/IRQ
/NMI
/RES
CLK (input or output, whichever is more convenient)
R/W or a /RD and /WR pair

So let's call it 46 I/O pins needed, and a RISC core that knew how to handle spreading a 32-bit read or write over 2 cycles.

@funkylab If it's on a small PCB with components only on the top side, it can have Pi Pico style castellated holes for board mounting. Make it a square-ish board, with a notch on a corner to signify pin 1. Say you need a power and ground on each side, so that's 13 castellated holes per side, 2.54mm pitch, that's ~30mm per side, call it 35mm square just to be safe.

Use a BGA package FPGA, give it the regulator and capacitor support bits it needs on the module, so it can talk 3.3V bus.

@funkylab Sort of like an Upduino, only specialized to act as a drop-in CPU part, rather than an FPGA dev board.
@funkylab Such a part could be specced such that the same exact hardware could be sold as a drop-in CPU with RISC-V, MIPS, 68K, whatever. A generic, low-performance hobbyist CPU part that is engineered to a cost and intended for that type of use.

@mos_8502 yeah, the problem really will be spec'ing this for low price; can't do it with less than ca. 48 GPIOs, and there's simply cost to having enough silicon die to connect that many pads/balls/pins, so the cheapest FPGA families probably won't do. and then we're pretty quickly in >11€/piece territory for the FPGA… add in 1€ for config memory, 1€ for power, 1€ for passives, plus PCB, plus assembly.

Honestly, I could see me spinning a cheap 4-layer board for such an endeavor, but even if …

@mos_8502 … if every enthusiast for custom processors (which will mostly be very advanced retro computing people, since people who just want to design their own CPU will be happier with an eval board) bought, like 4, of these, the upfront development/prototype costs, and the effort to build a programming / test rig even after JLCPCB assembled the boards for free:
I don't think you could push the price much:
Assume you'd sold 1000 of them (which is rather optimistic), and you'd only have 1000€…

@mos_8502 upfront R&D, prototype costs,

then your price would have to be (taxes+per piece fees)+(R&D cost)/1000 above per-piece costs.

costs: 11€ per FPGA (cheapest I can find on LCSC), 4€ other components, 5€ board and assembly costs = 20€ per piece

Say you need another 45s per piece for programming, a minimal functional test, and putting a sticker saying "MIPS" on it, and you do that for 30 €/hr (somewhat minimum cost for labor from POV of an employer), incl sticker you're down +50ct;

@mos_8502 so we're at 20€/pc, and we haven't done any sales yet … usually you'd sell at > 2.5× cost (rule of thumb), so we're at 50€ :( :( :(
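For anyone who wants to play with the numbers, here is the same back-of-envelope estimate as a small script. The function names are hypothetical, the defaults are the figures quoted in the posts above, and the sticker cost and taxes are left out for simplicity:

```python
def per_piece_cost(fpga=11.0, other_parts=4.0, board_and_assembly=5.0,
                   handling_seconds=45, labor_per_hour=30.0):
    """Per-unit cost in EUR: parts, board, and programming/test labor."""
    labor = handling_seconds / 3600 * labor_per_hour
    return fpga + other_parts + board_and_assembly + labor


def retail_price(cost, markup=2.5):
    """Rule-of-thumb sale price: a fixed multiple of cost."""
    return cost * markup


def rnd_share(rnd_total, units_sold):
    """Upfront R&D and prototype cost spread across every unit sold."""
    return rnd_total / units_sold


if __name__ == "__main__":
    cost = per_piece_cost()
    print(f"cost per piece: {cost:.2f} EUR")
    print(f"price at 2.5x:  {retail_price(cost):.2f} EUR")
    print(f"R&D per unit:   {rnd_share(1000.0, 1000):.2f} EUR")
```

With those inputs the cost lands a little over 20€ and the 2.5× price at roughly 51€, in line with the ~50€ figure above. Changing `fpga` or `units_sold` shows how sensitive the result is to the FPGA choice and to the size of the run.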
@mos_8502 @funkylab There is the breadboard-friendly Cmod, but it's not a drop-in replacement. https://digilent.com/reference/programmable-logic/cmod-a7/start

@hennichodernich @funkylab Yeah, there's also the Upduino. Not quite what I'm after.
@mos_8502 You are definitely not the only one, “There are dozens of us. Dozens!” I think we are actually at a pretty good time in history to make this happen ourselves. Our options are: 1. FPGAs with lots of pins, 2. A "macro"-processor using KiCAD and JLCPCB, and 3. "Cheap" 130nm process shuttles à la @matthewvenn first to make parts for the "macro" processor and eventually building out a whole processor.
@mos_8502 This is something I think about a lot but I'm still pretty ignorant and in the studying phase right now. The DE-10 nano is a good starting place for the FPGA side since there's already a lot of retro cpu development going on for the MiSTer project and has 80 pins available for RAM, ROM, etc.
@mos_8502 I've started playing around with Tiny Tapeout to get some experience designing parts for my "macro”-processor. I ported the 74181 ALU and it barely took up any space but what we are lacking for that particular way of getting onto the shuttle are pins.
@mos_8502 the costs for getting dedicated space for a whole cpu on the shuttle outside of Tiny Tapeout I think are in the $10,000 range. Not sure if that includes packaging. But it's not completely outside of the realm of possibility for a dedicated group of enthusiasts. I would probably go for a highly configurable CPU maybe taking inspiration from Koopman's WISC-16 architecture allowing us to use a common base for multiple ISAs. I personally want the MMIX instruction set.

@guidoism I think for the scale we’re talking, unless we got sponsored, FPGA would be the way to go. I’d love an inexpensive, simple to use, CPU-only RISC-V or MIPS or SPARC open hardware ASIC, but that’s a lot more money than I think could be raised.

So treat the FPGA as a poor man’s ASIC. Put the programming pins of the config flash on the external pins, make it as minimal as possible, expose a plain static memory bus.

@mos_8502 I know. I know. I’m not ready to give up on this dream though. I know if it happens it’s going to be many many years away and will depend on the shuttles getting even cheaper (I don’t know if they are already at the floor) maybe @matthewvenn can comment.

In the meantime I’m going to get a part onto a Tiny Tapeout run in order to get comfortable with the process. I’m going to concentrate on getting a WISC-style “macro”-processor on $2 10x10 cm PCB that can be used for experimentation.

@guidoism @mos_8502 but then yes, go to ASIC if you're keen to learn about that or just for kicks. Definitely can be done for a few hundred bucks on #TinyTapeout - there are already a ton of cool riscv processors taped out on TT02,03,04 & 05 https://tinytapeout.com/runs/

@mos_8502 @guidoism I would definitely recommend doing it with an FPGA - at least to start.