because of the rotates/funnel shifts thing, i’m wondering which kinds of operations are best to expose in an IR in general

eg arith in mlir doesn’t have rotates, and this results in decomposing to bit twiddling ops and hoping you can put it back together into eg a funnel shift at the ir level, and then just lower to a rotate
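(a sketch of what that decomposition looks like, assuming 32-bit values — the function names are mine, not arith’s; the `& 31` masks are the awkward bit a backend pattern-matcher has to see through to recover a single rotate instruction:)

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """The op we'd like to keep as one IR node: rotate-left on i32."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotl32_decomposed(x, n):
    """The same thing as generic shl / lshr / or.

    The `(32 - n) & 31` mask is what makes n == 0 well-defined
    (a raw shift by 32 would be UB/poison at the machine level);
    for n == 0 both halves are just x, and x | x == x."""
    hi = (x << (n & 31)) & MASK32        # shl
    lo = (x & MASK32) >> ((32 - n) & 31)  # lshr with masked amount
    return hi | lo                        # or

# quick sanity check that the two agree everywhere that matters
import random
for _ in range(1000):
    x, n = random.getrandbits(32), random.randrange(32)
    assert rotl32(x, n) == rotl32_decomposed(x, n)
```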

i guess that with things like mlir we have this weird problem where irs have expanded breadthwise considerably, but there are still dialects that like, represent “generic” things

eg arith exists to represent generic arithmetic operations, so it makes sense to eg not have rotates cause those are Computer Things

but like, we need to lower to stuff on a real computer at some point

like ideally i think you want to represent Computer Things in a way that will be representable on many computers idk
@barrelshifter I asked a question on the MLIR discourse a few years ago now that was essentially just "why is MLIR only used for frontend IRs and transforms, where are the people working on backend things" and the general vibe seems to be "that's still LLVM's job". I think I specifically asked if anyone knew of any work lifting LLVM into MLIR (so adding a dialect modelling the MC stuff and lowering from LLVM / porting target codegen stuff). I think maybe we're in a kind of Nash equilibrium
@barrelshifter although target codegen could be much nicer, it's not painful enough / there is enough momentum that things at that level have stayed in LLVM territory instead of (what I assumed to be the ultimate aim) MLIR becoming LLVM to a great extent. There is so much low level stuff that is still very generic, and so much optimisation is only exposed when you get to that level, it seems like the potential for building a good still-sorta-target-independent framework is pretty huge
@barrelshifter since I asked that question I've worked at places on NPU stuff where we did in fact do target codegen all in MLIR, with an ELF dialect and honest-to-goodness target instruction encoding as the output. But we had to build everything from scratch each time in each place.
@barrelshifter the dream would be having a dialect stack that went all the way down, LLVM, MC, ELF + x86_64/AArch64/RISCV with all the relevant transformations in passes at each level. If someone releases an ISA extension to some platform that adds an instruction with semantics almost identical to an instruction in some other platform that I've already done lots of work to get hit by MC lowering or some optimisation, I would love to take advantage of that by actually reusing code 🙏
@barrelshifter so I think I'm saying "put a rotate/funnel shift op directly in the IR, but with complications" 😅
@barrelshifter we need more dialects *closer* to the machine than LLVM IR IMHO
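(for concreteness, a sketch of the semantics such an op would carry, modelled on LLVM’s llvm.fshl intrinsic — assuming i32; a rotate then falls out as the funnel shift with the same value on both inputs:)

```python
MASK32 = 0xFFFFFFFF

def fshl32(a, b, n):
    """Funnel shift left, modelled on llvm.fshl.i32: concatenate a:b
    (a in the high half), shift left by n mod 32, keep the high 32 bits.
    Taking the shift amount mod the width means no input is poison."""
    n %= 32
    if n == 0:
        return a & MASK32
    return (((a & MASK32) << n) | ((b & MASK32) >> (32 - n))) & MASK32

def rotl32(x, n):
    # a rotate is just the funnel shift fed the same value on both sides
    return fshl32(x, x, n)
```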

@andrew_wja i think mlir folks are generally restricted by “not replacing llvm”

if we want to do this the right way, every backend needs to be represented in mlir

and reasonable abstractions to that goal will fall out

@barrelshifter I would go as far as saying, shouldn't fully replacing LLVM be (one of?) the headline goals of MLIR-the-project. It just feels right 😄

When you finally get that new tool that properly addresses something you've had to do by stringing things together ad-hoc for ages, that seems like the obvious point where you start to spend some amount of time you previously spent just making things work on making things *nice*

@barrelshifter there is some really exciting stuff in the PLDI proceedings this year about building generic lowering/transform machinery for MLIR using semantic descriptions of what ops do, and an SMT solver to decide "yes this op from IR A lowers to this op sequence in IR B" which, once you have that capability, means you only need to describe your target well enough for the compiler to start writing itself
@barrelshifter a bunch of progressively more generic dialects spanning the target <-> LLVM divide helps a huge amount with that kind of thing since the semantic gap between ops in "neighbouring" dialects is small. Your rotate/shift encoding example seems like exactly one of these instances: you know more than you can encode in arith, and you would ideally like to not throw useful info away, or use a really awkward or fragile encoding and rely on some later reconstruction to recover what you meant
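(a toy stand-in for that kind of check — exhaustive enumeration over i8 instead of an SMT solver, and the op names/semantic descriptions here are made up for illustration, not from the paper:)

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def rotl(x, n):
    """Semantics of the 'IR A' op: rotate-left on i8."""
    n %= WIDTH
    return ((x << n) | (x >> (WIDTH - n))) & MASK

def lowered(x, n):
    """Candidate 'IR B' lowering: shl / lshr / or with masked amounts."""
    return (((x << (n % WIDTH)) & MASK)
            | ((x & MASK) >> ((WIDTH - n) % WIDTH)))

def lowering_is_correct(op, seq):
    """Decide 'op in IR A lowers to this op sequence in IR B' by brute
    force over all i8 inputs; a real system hands the two semantic
    descriptions to an SMT solver and asks the same question."""
    return all(op(x, n) == seq(x, n)
               for x in range(1 << WIDTH) for n in range(WIDTH))
```

once the check itself is mechanical like this, enumerating candidate op sequences and asking the solver is what lets "the compiler start writing itself"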