@pkhuong @fclc Briefly, to scratch the personal itch of insufficient documentation.
@mbr @fclc I should tighten up the description at some point, but “here be dragons” is the correct conclusion - the semantics are only easily describable when used as just-mul or just-add (though perhaps if I study it long enough, I’ll find a good description of the combined semantics).
@raph RE the Tenstorrent mention, there’s 32-wide SIMD in addition to the matrix unit, in case that’s useful for your workloads (see https://www.corsix.org/content/tt-wh-part6).
@steve Am skiing in Austria at the moment. Is good.
@fclc Crays come in a size other than large?
@fclc Puns will continue until morale improves?
@steve @dotstdy @pervognsen Or if you’re looking for an excuse to use avx512, VFIXUPIMMPS.
@moonchild @harold And then arm64 came along, with sane and tasteful flag behaviour, but the infinite monkeys wanted to efficiently emulate x86 on arm64, and so rmif was invented.
@moonchild @harold bt/btc/similar modify CF, preserve ZF, and leave the other flags undefined, so a valid implementation could preserve all but CF.
@steve So many sparse FLOP/s!