Mastodawn

Josh Grams

@evilpaul_atebit It'd be cool if it showed you which level you're on while you're in the level? Unless that is somewhere and I'm missing it.

I only did the first 8 levels but doing World 2 level 4 in three instructions was very satisfying.

@evilpaul_atebit This gives me 100% but is still showing a handful of red pixels...?

sub #31,r0,r0
sub.mi #-1,r0,r0
mulu r0,r0,r0
sub #31,r1,r1
sub.mi #-1,r1,r1
mulu r1,r1,r1
add r0,r1,r0
sub r0,#981,r0
mov.lt #6,r0
mov.ge #3,r0

@evilpaul_atebit Oh wait, I guess to get 100% you just have to get it to the point where the error rounds down to zero? 64 squared is 4096, half a percent is 20.48 pixels. And I got it down to 16. Ha, nice.
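
A quick check of that arithmetic, as a sketch only. The function names and the exact scoring formula are assumptions; the thread only establishes that the image is 64×64 and that the displayed error is currently rounded to the nearest whole percent.

```python
WIDTH = HEIGHT = 64
TOTAL_PIXELS = WIDTH * HEIGHT  # 4096


def error_percent(wrong_pixels: int) -> float:
    """Share of mismatched pixels, in percent."""
    return 100.0 * wrong_pixels / TOTAL_PIXELS


def displayed_score(wrong_pixels: int) -> int:
    # Assumption: score = 100 minus the error rounded to the nearest percent.
    return 100 - int(error_percent(wrong_pixels) + 0.5)


half_percent_in_pixels = TOTAL_PIXELS * 0.005  # 20.48

print(error_percent(16))    # 0.390625
print(displayed_score(16))  # 100
print(displayed_score(21))  # 99
```

So under this assumed formula anything up to 20 wrong pixels would still show as 100%.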

@evilpaul_atebit I have no idea how you're supposed to get the circles pixel-correct, but otherwise... fun.

jcreed May 14, 2024

@JoshGrams @evilpaul_atebit aha, pixel-perfect!

Josh Grams May 14, 2024

@jcreed @evilpaul_atebit Huh! Doubling the resolution? I wouldn't have thought that would make a difference. Very cool!

Looks like anything from 4091 to 4106 works... interesting that it's asymmetrical around 4096, but I guess that makes sense, since it's a curve.
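
One way to see where that range could come from, as a sketch. It assumes the doubled coordinates are `2*x - 63` and `2*y - 63` and that a pixel counts as inside when the sum of squares is strictly less than the threshold (as in the `sub ...; mov.lt` pattern earlier in the thread); the actual program being discussed is not shown here. Under those assumptions the sum of squares only takes certain values, and every threshold between two neighbouring values draws the same circle.

```python
def attainable_sums(size: int = 64) -> list[int]:
    """All values dx*dx + dy*dy can take on a size x size grid of doubled offsets."""
    offsets = [2 * c - (size - 1) for c in range(size)]  # -63, -61, ..., 61, 63
    return sorted({dx * dx + dy * dy for dx in offsets for dy in offsets})


def equivalent_thresholds(threshold: int, size: int = 64) -> range:
    """Thresholds t for which `sum < t` selects the same pixels as `sum < threshold`."""
    sums = attainable_sums(size)
    below = max(s for s in sums if s < threshold)         # largest value still inside
    at_or_above = min(s for s in sums if s >= threshold)  # smallest value outside
    return range(below + 1, at_or_above + 1)


same_circle = equivalent_thresholds(4096)
print(same_circle.start, same_circle[-1])  # 4091 4106
```

The nearest attainable values on either side of 4096 are 4090 (57² + 29²) and 4106 (59² + 25²), which is why the range is lopsided.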

jcreed May 14, 2024

@JoshGrams @evilpaul_atebit it's that it lets you put the center of the circle *between* pixels, at (31.5, 31.5), which means it's actually symmetric when I interchange pixel coords 0 and 63, and 1 and 62, and so on.
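
A small sketch of that symmetry argument. The helper names are made up for illustration, and the doubled form `2*x - 63` is an assumption about how a center of 31.5 would be expressed with integers.

```python
SIZE = 64


def distance_integer_center(x: int) -> int:
    """Distance from a center sitting on pixel 31."""
    return abs(x - 31)


def distance_half_center_doubled(x: int) -> int:
    """Twice the distance from a center at 31.5, kept as an integer."""
    return abs(2 * x - 63)


def is_mirror_symmetric(distance) -> bool:
    """True if swapping pixel x with pixel 63 - x never changes the distance."""
    return all(distance(x) == distance(SIZE - 1 - x) for x in range(SIZE))


print(is_mirror_symmetric(distance_integer_center))       # False
print(is_mirror_symmetric(distance_half_center_doubled))  # True
```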

Josh Grams May 14, 2024

@jcreed @evilpaul_atebit Ah, I see.

And 512 and 2048 work for the next one, which is kinda doing my head in because neither of those are squares, but... cool.

Josh Grams May 14, 2024

@jcreed @evilpaul_atebit So mine was off because I was still doubling the zero at the middle from the diamond. I'm actually surprised now that that worked as well as it did.
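
To illustrate the "doubled zero", here is a sketch of the two-instruction fold from the first attempt (`sub #31,r0,r0` then `sub.mi #-1,r0,r0`). The instruction semantics are an assumption inferred from the thread: `sub a,b,dst` is taken to compute `a - b`, and `.mi` to run only when the previous result was negative.

```python
def fold(x: int) -> int:
    """Emulate `sub #31,r0,r0` followed by `sub.mi #-1,r0,r0` (assumed semantics)."""
    value = 31 - x          # sub #31,r0,r0
    if value < 0:           # .mi: previous result was negative
        value = -1 - value  # sub.mi #-1,r0,r0
    return value


def doubled_distance(x: int) -> int:
    """Twice the true distance from a center at 31.5."""
    return abs(2 * x - 63)


print([fold(x) for x in (30, 31, 32, 33)])              # [1, 0, 0, 1]
print([doubled_distance(x) for x in (30, 31, 32, 33)])  # [3, 1, 1, 3]
```

The fold is symmetric, but both middle pixels get distance 0 instead of 0.5, so every distance is half a pixel short; that is fine for a diamond but slightly distorts a circle once the values are squared.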

Paul Grenfell May 15, 2024

@jcreed @JoshGrams Interesting. Someone else had solved it also in 10 instructions, but differently. Oh.. and SIMD would shave more instructions here so I'm getting more and more keen to add that ;)

@evilpaul_atebit @JoshGrams haha me too...

it suddenly occurred to me that mixing "SIMD register specs" and ordinary registers could be confusing or erroneous.
mov rH, r0
add rH, #1, r0
seem of dubious meaning (unless the last is interpreted as r0 := r4+r5+r6+r7+#1?? 😨 )

not sure what to do about it. maybe just require that all registers have to be of the same type for an instruction? does that mean the concrete syntax for them should be different?? guess it's up to you to make these calls :)

Paul Grenfell May 15, 2024

@jcreed @JoshGrams I was thinking of just saying that all regs have to be of the same type. Feels a little awkward and unsatisfying..

@evilpaul_atebit @JoshGrams yeah, dunno. brainstorming some alternatives (for something that's meant to be equivalent to
mov r0, r4
mov r1, r5
mov r2, r6
mov r3, r7)
I can think of
mov.4 R0, R1 ; R0 = r0,r1,r2,r3, and R1 = r4,r5,r6,r7
mov.4 R0, R4 ; but maybe you'd have to forbid overlapping register groups?
mov r0:4 r1:4
mov r0-3, r4-7
mov r0123, r4567

@evilpaul_atebit @JoshGrams (& I want to triple down on the sentiment that --- it's your game, and please feel very free to ignore any of my suggestions, they're offered in the spirit of hoping they help but no big deal if not :)

Josh Grams May 15, 2024

@evilpaul_atebit I've been mostly ignoring this part of the conversation, but... idk. Multi-register operations feel like something that is moving away from the spirit of assembly language.

Deciding how to handle "I need to do the same thing multiple times" is very much a part of all the asm writing that I've done: do you cut-and-paste? Just re-type it? Write a preprocessor macro? Make it a subroutine? Are you optimizing for overall code size, or for execution speed? etc.

So I think if it were me I'd be leaning toward adding a macro facility or a call stack. But I've always felt that code re-use and editing/re-factoring/tinkering with existing code is a major part of my coding time but is unfortunately left out of coding games, so I'm biased.

And as @jcreed says, I don't want to tell you what's best for your game.

@JoshGrams @evilpaul_atebit huh! tbh I'm stoked that you have different aesthetic feelings on this issue..., it means we're in an interesting space. for me, having played a bunch of "assembly programming games", SIMD is something I haven't seen explored, and it seems like a sweet spot of "kind of awkward but potentially powerful" to me that might lead to fun surprises while optimizing.

@JoshGrams @evilpaul_atebit but if these kinds of instructions are "not in the spirit of [the kind of asm you want to program in]", that's totally legit!

dunno, I think macros could be an interesting feature in general, but my gut sense is they would be a Very Big Change relative to this particular game

@jcreed @JoshGrams Well.. yes.. this is a super interesting discussion and it's great to see different views!

The "hardware model" that I have for this thing is super limited: "no branches" and "no local memory", so that rules out functions and loops. I did this partly to make it close to modern shader arch (where branches and loops hurt performance so you try to avoid them), partly to make the compile/execution code faster/easier and partly because it feels closer to the hardware that was available in the 70s/80s timeline that I'm targeting.

@jcreed @JoshGrams I'm leaning heavily towards the SIMD thing because it's something I've not really played with, and it seems like an interesting area to get really clever with size optimizations, and that's what some of the game's goals are geared towards. I do worry that it might be too complex/scary though.. so I might introduce it later in the game.

I also have to admit that macros are interesting too..

@jcreed @JoshGrams I took a quick hack at implementing SIMD last night. It sort of worked.. but I went for rH/rL type pairings and didn't like it.

I think I'll go for something like:

mov r01,r10 ; swap r0 and r1

add r01,#1,r01; add 1 to r0 and r1

mov r000,r123 ; copy r0 to r1,r2 and r3

etc...

@jcreed @JoshGrams Overlapping ranges are fine here. Flags.. hmm.. I guess I'll use a combination of all operations: zero flag is set if any of the values are zero.
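
A minimal sketch of how the digit-group syntax and the "any lane is zero" flag could be emulated. Everything here is hypothetical: the function names, the broadcasting of immediates, and the operand order (sources first, destination last, as in the scalar examples in this thread) are assumptions, not the game's implementation.

```python
def parse_group(operand: str) -> list[int]:
    """'r01' -> [0, 1]; 'r000' -> [0, 0, 0]; one digit per lane."""
    assert operand.startswith("r")
    return [int(digit) for digit in operand[1:]]


def read_lanes(regs: list[int], operand: str, width: int) -> list[int]:
    if operand.startswith("#"):
        return [int(operand[1:])] * width  # assumption: immediates broadcast
    group = parse_group(operand)
    assert len(group) == width, "all register groups must have the same width"
    return [regs[index] for index in group]


def execute(regs: list[int], op: str, *operands: str) -> bool:
    """Run one grouped instruction in place; return the combined zero flag."""
    *sources, dest = operands
    dest_group = parse_group(dest)
    # Read every input before writing anything, so overlapping groups are safe.
    inputs = [read_lanes(regs, source, len(dest_group)) for source in sources]
    if op == "mov":
        results = inputs[0]
    elif op == "add":
        results = [a + b for a, b in zip(*inputs)]
    else:
        raise ValueError(f"unsupported op: {op}")
    for index, value in zip(dest_group, results):
        regs[index] = value
    return any(value == 0 for value in results)  # zero flag: any lane is zero


regs = [10, 20, 0, 0, 0, 0, 0, 0]
execute(regs, "mov", "r01", "r10")        # swap r0 and r1
execute(regs, "add", "r01", "#1", "r01")  # add 1 to r0 and r1
execute(regs, "mov", "r000", "r123")      # copy r0 to r1, r2 and r3
print(regs)  # [21, 21, 21, 21, 0, 0, 0, 0]
```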

@jcreed @JoshGrams But anyway... after my quick hack I realised that I need to go back and refactor some compiler stuff to make it more robust to the types of modifications that I'm going to need to make. I'll mull on my decisions while I do that..

@evilpaul_atebit @jcreed yeah, the SIMD stuff I've tried is all a separate register file, so it has multiple flags registers too.

@evilpaul_atebit @jcreed "went for rH/rL type pairings and didn't like it" Ha, yeah. Thinking about it more, a lot of my bias against SIMD is that the little real-world stuff always seemed super clunky and not something you'd want to do by hand, and probably only really saved time/space if you were doing linear algebra or data processing at scale.

Load exactly 4 values from an aligned memory address, do the same thing to all 4 of them: a speed optimization that wasted space and processing units if your use-case didn't line up perfectly exactly with the capabilities.

@evilpaul_atebit @jcreed But yeah, moving in the more flexible direction that you two are talking about (even if I highly doubt it's something you could do with real hardware?) seems like a fun idea for the game

@evilpaul_atebit @jcreed this is making me really want to go build a programming game, heh

@JoshGrams @jcreed Haha.. the more the merrier :)

Josh Grams May 15, 2024

@jcreed @evilpaul_atebit Re: overlapping register groups: I think depending on how you implemented it in hardware, overlapping groups would work fine, you'd capture all the outputs before the new values propagated?

In a software emulation of it you'd have to be careful.
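
A sketch of the pitfall, using a grouped `mov` whose source and destination groups overlap. Both helper names are hypothetical; the "latched" version mimics hardware that captures all outputs before any register is updated.

```python
def mov_group_naive(regs: list[int], src: list[int], dst: list[int]) -> None:
    """Copy lane by lane; later lanes may read values already overwritten."""
    for s, d in zip(src, dst):
        regs[d] = regs[s]


def mov_group_latched(regs: list[int], src: list[int], dst: list[int]) -> None:
    """Capture every source value first, then write all destinations."""
    values = [regs[s] for s in src]
    for d, value in zip(dst, values):
        regs[d] = value


naive = [10, 20]
mov_group_naive(naive, src=[0, 1], dst=[1, 0])
print(naive)    # [10, 10], the swap is lost

latched = [10, 20]
mov_group_latched(latched, src=[0, 1], dst=[1, 0])
print(latched)  # [20, 10]
```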

@JoshGrams Ahh... yes, the error rounds to nearest - it should round down. I'll fix that, thanks.

@JoshGrams @evilpaul_atebit I went to bed thinking any fewer than 5 was obviously impossible and woke up and banged out the 3-instruction version in like 2 minutes. so satisfying, yes.

@JoshGrams @evilpaul_atebit
Was "fewer than 22 instructions" meant to be "fewer than 12 instructions"?

@jcreed @evilpaul_atebit Yeah, I wondered that too, because my first working try was 10 instructions.

@JoshGrams @evilpaul_atebit yeah, I was gonna say, I bet I could shave one more instruction off my solution

@jcreed @evilpaul_atebit Yeah, I'm still parsing yours. What I did was... for the stripes, since you need the 2 bit and the colors are 2 apart:

add r1,#1,r2
and r2,#2,r2
add r2,#5,r2 ; 5 or 7

Then center the x coordinate:

sub #31,r0,r0
sub.mi #-1,r0,r0

Triangularize the y coordinate:

sub #31,r1,r7
sub.mi #63,r1,r1

Stick them together and choose blue or stripes:

sub r0,r1,r0
mov.gt #1,r0
mov.le r2,r0
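
For the stripe part only, here is a sketch of what the first three instructions compute, assuming `r1` holds the y coordinate and that 5 and 7 are the two stripe colors. The function name is made up for illustration.

```python
def stripe_color(y: int) -> int:
    """Emulate `add r1,#1,r2`, `and r2,#2,r2`, `add r2,#5,r2`."""
    value = y + 1   # add r1,#1,r2
    value &= 2      # and r2,#2,r2 -> 0 or 2
    return value + 5  # add r2,#5,r2 -> 5 or 7


print([stripe_color(y) for y in range(8)])  # [5, 7, 7, 5, 5, 7, 7, 5]
```

The `+ 1` shifts the two-pixel-wide bands by one row, so the pattern starts with a single row of color 5.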

@JoshGrams @evilpaul_atebit nice! I feel like par could be lowered on several other puzzles as well, e.g.

@JoshGrams @evilpaul_atebit what a great little game, though. What I wouldn't do for some SIMD instructions...

@jcreed @JoshGrams Thanks for the feedback! Do you have suggestions for how SIMD might work here?

@evilpaul_atebit @JoshGrams good question, I haven't thought too deeply about it, but just to throw an idea up: what if you could say something like rL to mean all registers r0...r3 uniformly, and rH to mean all registers r4...r7 uniformly, so that
"add rL, rH, rH" would mean
add r0, r4, r4
add r1, r5, r5
add r2, r6, r6
add r3, r7, r7
?
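
As a sketch of that idea, a grouped instruction could be treated as shorthand that expands into four scalar instructions. The `expand` helper and the `GROUPS` table are hypothetical, and leaving non-group operands unchanged in every lane is just one possible choice.

```python
GROUPS = {
    "rL": ["r0", "r1", "r2", "r3"],
    "rH": ["r4", "r5", "r6", "r7"],
}


def expand(instruction: str) -> list[str]:
    """Expand an instruction that uses rL/rH into one instruction per lane."""
    op, rest = instruction.split(None, 1)
    operands = [operand.strip() for operand in rest.split(",")]
    if not any(operand in GROUPS for operand in operands):
        return [instruction]  # plain scalar instruction, nothing to do
    expanded = []
    for lane in range(4):
        lane_operands = [GROUPS[o][lane] if o in GROUPS else o for o in operands]
        expanded.append(f"{op} " + ", ".join(lane_operands))
    return expanded


for line in expand("add rL, rH, rH"):
    print(line)
# add r0, r4, r4
# add r1, r5, r5
# add r2, r6, r6
# add r3, r7, r7
```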

@evilpaul_atebit @JoshGrams I'm thinking this would allow tightening up of identical preprocessing that I do to the x and y coordinates, (and maybe it'd be fun to only introduce the instructions later in the game so that people can go back and revise their earlier naive sol'ns) but surely there's bound to be other sneaky uses of it I haven't thought of

@evilpaul_atebit oh man and if you haven't already made plans to do it, giving access to some character/font/texture lookup data (presumably only in later levels) so that you can do the kind of "text mode" animations that are already in the background of the title screen would be super fun

@jcreed I was thinking about this again recently, but I deliberately steered away from addressing modes other than register and direct because memory access "doesn't make sense" in my mental model of the hardware.

I can see how I could have pre-defined lookup tables (font, etc.) and something like `fetch index,dst` without breaking the model. Hmmm.. that could work

Maybe even a way (on later levels..) for the user to create the lookup tables

@evilpaul_atebit interesting, makes sense! good game design --- especially for this genre, which benefits I think from the architecture being "cute" and "small" enough to kind of fit in your head --- does require a lot of very careful saying "no" to the endless stream of possible additions

@jcreed My challenge now is to make enough interesting levels while keeping things fresh. I expect I'll need to introduce new features to do this

@evilpaul_atebit yep, that sure is the challenge... but I really do think you've got a great start here, best of luck! happy to take a crack at any future betas :)

@jcreed Thanks. I'll go away and think about my next steps now. Maybe I'll have another iteration in another month or two

@evilpaul_atebit oh, one more little idea of an option for how "texture memory" might be handled: you could assign one of the registers to be the index into texture memory, and another be the "value", i.e. if r7 is the "index" and rtex is the "value", then maybe like

and #7, r0, r0
and #7, r1, r1
asl r1, #3, r1
add r0, r1, r7
add #4160, r7, r7 ; 4160 = 65 'A' * 64 pixels per character?
mov rtex, r0

would draw a screen full of 8x8 'A' characters?
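
The index computation in that snippet can be sketched as follows. The memory layout is an assumption taken from the comment in the snippet: each 8x8 glyph occupies 64 consecutive entries, stored row by row, at an offset of character code times 64. The function name is hypothetical.

```python
PIXELS_PER_CHAR = 8 * 8  # 64


def texel_index(x: int, y: int, char: str) -> int:
    """Index into the assumed font memory for screen pixel (x, y)."""
    column = x & 7     # and #7, r0, r0
    row = y & 7        # and #7, r1, r1
    offset = (row << 3) + column  # asl r1, #3, r1 ; add r0, r1, r7
    return ord(char) * PIXELS_PER_CHAR + offset


print(texel_index(0, 0, "A"))  # 4160
print(texel_index(7, 7, "A"))  # 4223
print(texel_index(8, 8, "A"))  # 4160
```

Because only the low three bits of each coordinate are used, the glyph repeats every 8 pixels in both directions.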

@jcreed Oh damn it.. I want to go and try this now but I have more important stuff to do first :D

@evilpaul_atebit gamedev is fun innit :)

@jcreed ...and `and #7,rL,rL` would shave off an instruction..

@evilpaul_atebit actually although I proposed above having a single combined register for x and y, on second thought it seems much more in line with your existing design to just use (x=r0, y=r1) (or alternatively x=r4,y=r5? for simd?) for indexing into the "texture image". that way just
mov rtex, r0
would copy the input image to output, and maybe you could have puzzles about transforming various input images in various ways?

Paul Grenfell May 12, 2024

@jcreed Think I will go for `FETCH [Rn|#],[Rn|#],[Rn]` so that you can choose the x/y source registers. I really like the idea of puzzles about transforming input images.
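
A rough sketch of what an image-transform puzzle built on such a fetch could look like. This is purely illustrative: `run_per_pixel`, the `fetch(x, y)` callback, and the mirror example are assumptions, not the game's design.

```python
def run_per_pixel(image: list[list[int]], program) -> list[list[int]]:
    """Run `program` once per output pixel, giving it read access to the input."""
    size = len(image)

    def fetch(x: int, y: int) -> int:
        return image[y][x]

    return [[program(x, y, fetch, size) for x in range(size)] for y in range(size)]


def mirror_horizontally(x: int, y: int, fetch, size: int) -> int:
    """Output pixel (x, y) reads the input pixel mirrored across the vertical axis."""
    return fetch(size - 1 - x, y)


source = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
mirrored = run_per_pixel(source, mirror_horizontally)
print(mirrored[0])  # [3, 2, 1, 0]
```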

jcreed May 12, 2024

@evilpaul_atebit cool, that certainly works. excited to see where this goes...

@jcreed @JoshGrams I think I'll implement that and see what kind of fun it unlocks.. thanks!

@jcreed @JoshGrams I'm very, very happy to see you two solving these in fewer instructions than I did!

@evilpaul_atebit @JoshGrams I'm stoked to be showing up early enough in the adoption curve of a game to still have a chance to make clever solutions. The steam-friend leaderboards for TIS-100 were appalling for me, because I made the mistake of being friends with really clever old-school graphics hackers hahaha

@jcreed @JoshGrams Smart! 👍

@jcreed @evilpaul_atebit Ah, yeah, you have an add r3,#0,r3 that's just to set flags: you could maybe drop that with some rearrangement of code/registers.

Ooh, and you're putting the stripe directly in the output register, which feels like it could save an instruction. Now I'm going to have to see if I can get 9 instructions.

@jcreed @evilpaul_atebit Yup. Sweet.

@JoshGrams @jcreed Very nice! New world record ;)

Josh Grams May 12, 2024

@jcreed @evilpaul_atebit I'm still slightly annoyed that these first two diamond ones are 33 pixel diamonds, so you can't get 100% with a diagonal grid and a bitmask, because that's fun and elegant.