Hunter (@huntermbown)
Reports that ZMLX improved decode performance by about +8% on GLM 4.7 Flash 4bit, and that the improvement also works in the ExoLabs environment.
Hunter (@huntermbown)
A report of a +8% decode performance improvement on GLM 4.7 Flash 4bit. The improvement appears to come from ZMLX, and the post mentions that it has been confirmed to work with @exolabs as well.
Hunter (@huntermbown)
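An update that decoding performance on GLM 4.7 Flash 4bit with ZMLX improved by about +8%, and that the improvement also works in the ExoLabs environment. It is notable performance news for low-bit (4bit) quantized model optimization and for compatibility with real-world environments.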
TD4 4-bit Sound
Over on my other blog, I spent a fair bit of time looking at the TD4 4-bit CPU. One of the things I wanted to do with my NAND Oscillators and Logic Sequencer PCB was hook up the address/select pins to something else. And with three select pins, allowing the choice between 8 notes, what better to connect it to than a 4-bit CPU?
https://makertube.net/w/aroDZYM2BHYpoB9QLJvHnk
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
If you are new to microcontrollers, see the Getting Started pages.
Parts list
The Setup
The most obvious thing, to my mind, is to hook up three of the four outputs to the three selection pins of the NAND sequencer, so that is what this post explores.
The NAND PCB needs the jumpers removing, which disconnects the pot-driven oscillators. Then the three select/address lines can be connected to three of the four resistors supporting the OUTPUT LEDs of the TD4, as shown above.
It is also possible to use the POWER header pins to power the NAND PCB.
Any of the variants of TD4 I’ve built could be used, but I’ve shown above where they would need to be connected on the original. In the end I actually soldered four header pins to the appropriate side of the resistors on my own PCB version of the TD4 as shown below. A bit crude, but it does the job.
Connecting these over to the NAND sequencer and hooking up power gives me the following.
The Code
The simplest way to create a sequence is a set of OUT xx instructions where the least significant 3 bits (so values 0 to 7) map onto the eight possible notes played by the NAND sequencer.
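As a rough sketch of that mapping in Python: only the low three bits of the 4-bit OUT value reach the sequencer, so the top bit has no effect on the note. The pin names S2..S0 and the function names are hypothetical, chosen just for illustration.

```python
def note_index(out_value):
    """Note number (0 to 7) selected by a 4-bit OUT value."""
    return out_value & 0b0111  # the top bit is not wired to the sequencer


def select_pins(out_value):
    """Levels on the three select pins as (S2, S1, S0); pin names are hypothetical."""
    n = note_index(out_value)
    return ((n >> 2) & 1, (n >> 1) & 1, n & 1)


if __name__ == "__main__":
    for value in range(16):
        print(f"OUT {value:04b} -> note {note_index(value)} pins {select_pins(value)}")
```

Running it lists all sixteen OUT values and shows each note appearing twice, once with the top bit clear and once with it set.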
This is the simple LED OUTPUT code from Part 3 of my series, but this continually toggles between the lowest and highest notes.
0000 OUT 0001 # 1000 1101
A counter can be used to play all 8 notes. Note that in this code B will go from 0 to 15 (b0000 to b1111) but only the last three bits select notes. This means that the sequence will count from b000 to b111 twice for each pass through this loop with the top bit being ignored.
0000 ADD B,0001 # 1000 1010
There are only two speeds though, 1Hz and 10Hz, so the above, which has three instructions, has a tempo of 20 bpm (1 note every 3 seconds) or 200 bpm (approx 3 notes every second). The tempo can be slowed down in steps of 1 second or 1/10 second by moving the JMP an instruction further down and back-filling with other instructions (ADD A,0 or b00000000 is a good one, and is essentially equivalent to a NOP).
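The counter wrap-around and the tempo arithmetic can be checked with a minimal Python sketch. It assumes a three-instruction loop that increments B, outputs it and jumps back, which is a guess at the loop's shape rather than a copy of the full listing, and the function names are illustrative only.

```python
def counter_notes(passes):
    """Notes selected while a 4-bit register counts up once per loop pass."""
    b = 0
    notes = []
    for _ in range(passes):
        b = (b + 1) & 0xF        # a 4-bit register wraps from 15 back to 0
        notes.append(b & 0b111)  # only the low three bits select a note
    return notes


def tempo_bpm(clock_hz, instructions_per_note):
    """Notes per minute when each note costs a fixed number of clock ticks."""
    return 60.0 * clock_hz / instructions_per_note


if __name__ == "__main__":
    print(counter_notes(16))
    for clock_hz in (1, 10):
        for length in (3, 4, 5):
            print(f"{clock_hz} Hz, {length} instructions: {tempo_bpm(clock_hz, length)} bpm")
```

Each padding instruction added to the loop stretches every note by one clock tick, so a four-instruction loop drops to 15 bpm at 1Hz or 150 bpm at 10Hz.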
The following code uses the INPUT as a counter in a loop to provide a partly configurable tempo.
0000 IN A # 0000 0100 A = INPUT
This is still only cycling through each note individually, but that is kind of what an 8-step sequencer would do.
To get more creative with the programmability of the sequencer requires a series of OUT instructions and NOPs between them, for example:
0000 OUT 0000 # 0000 1101 OUTPUT = 0000 # Play note 000
This last programme is the one running in the video at the start of this post.
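The binary column in the listings above looks like the instruction byte written with each 4-bit half bit-reversed, immediate first and opcode second. The following Python sketch reproduces that column under that assumption. The opcode values are taken from the commonly published TD4 instruction set rather than from this post, so treat both the table and the bit order as assumptions.

```python
# Assumed TD4 opcodes (upper four bits of each instruction byte).
OPCODES = {
    "ADD A": 0b0000,
    "IN A": 0b0010,
    "ADD B": 0b0101,
    "OUT": 0b1011,
}


def reverse4(value):
    """Reverse the bit order of a 4-bit value."""
    return int(format(value & 0xF, "04b")[::-1], 2)


def binary_column(mnemonic, immediate=0):
    """Format an instruction the way the listings show it after the '#'."""
    opcode = OPCODES[mnemonic]
    return f"{reverse4(immediate):04b} {reverse4(opcode):04b}"


if __name__ == "__main__":
    print(binary_column("OUT", 0b0001))
    print(binary_column("ADD B", 0b0001))
    print(binary_column("IN A"))
    print(binary_column("OUT", 0b0000))
```

The four calls correspond to the first lines of the four listings in this post, which makes it easy to compare the output against them.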
Closing Thoughts
I appear to have made a sound card for a 4-bit CPU 🙂
One thing I am quite keen to do is connect up the sequencer’s select pins to the TD4’s address lines, as I’d like to be able to have some incidental (accidental?) music that appears as a result of the CPU just running any other normal programme.
To do this I’d need to either hook into the output of the PC register or the input to the HC154 ROM decoder.
In fact, it would be really interesting to be able to hook up any sets of four signals – so the INPUT selector, or even the control decoding logic – just to see what it sounds like as the CPU is running normal code. That might require a special build of the CPU though.
I also have an address line spare of course, so it would also be interesting to use that to select between two NAND sequencers to give me a 16 step sequence.
Kevin
TD4 Sequencer and NAND Oscillators

NVIDIA announces a breakthrough in compressing models from 16-bit down to 4-bit while retaining up to 99.4% of accuracy, with almost no data loss. The technology promises to shrink AI model size, speed up processing and save energy, paving the way for compact, high-performance AI on mobile and edge devices. #NVIDIA #AI #MachineLearning #AICompression #4bit #AIHiệuSuấtCao #TríTuệNhânTạo #NénMôHình
Scott (@scottstts)
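A question about a problem that occurs when using the mlx 4-bit version of the GLM 4.7 Flash model (mlx-community/GLM-4.7-Flash-4bit) in LM Studio. The author reports that the mlx runtime is up to date and asks @lmstudio and @awnihannun whether anyone else is seeing the same problem.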
Ivan Fioravanti ᯅ (@ivanfioravanti)
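An opinion on model quantization: a short comment sharing the experience that 4-bit (4bit) quantization loses quality through excessive compression, whereas 5-bit (5bit) quantization gives much better results. A practical observation on the trade-off between model size and precision.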
Q*Satoshi (@AiXsatoshi)
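Explains that this test used an extremely long prompt, so processing took a long time on the Mac, but that this is not much of a problem in everyday use. Shares the observation that, if you can wait for prompt processing, a Mac Studio can run larger models (roughly 5x or more).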
Q*Satoshi (@AiXsatoshi)
Notes that both environments used the MiniMax-M2.1 model at 4-bit (4bit), indicating a lightweight local inference setup built on a quantized (4-bit) model.