Using C-Reduce to track down problems (both bugs in your own program and possible compiler bugs)
Last week I saw "You can use C-Reduce for any language (bernsteinbear.com)", the original article
#Computer #Murmuring #Programming #Software #bug #c #creduce #language #programming #reduce
You can use C-Reduce for any language | Max Bernstein
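📌 Summary: C-Reduce is a tool, originally developed by Regehr and his team, for shrinking the code needed to reproduce C compiler bugs. Although it was designed for C, it works just as well for other languages as long as a few conditions are met. The user only has to supply a reproducible failure condition and one or more source files that may be modified, and C-Reduce shrinks the code automatically. The article uses RustPython as an example and shows C-Reduce cutting a file by nearly 50% in a short time. The whole process is fast and effective, which makes it a good fit for developers who need to report software bugs. The --not-c flag turns off C-Reduce's C-specific passes, which suits code written in languages other than C.
C-Reduce is a tool by Regehr and friends for minimizing C compiler bug reproducers. Imagine if you had a 10,000 line long C file that triggered a Clang bug. You don’t want to send a massive blob to the compiler developers because that’s unhelpful, but you also don’t want to cut it down to size by hand. The good news is that C-Reduce can do that for you. The bad news is that everyone thinks it only works for C.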
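To make "shrinks the code automatically" concrete, here is a toy line-deletion reducer in Python. It is only a sketch of the general idea and is much simpler than what C-Reduce actually does; the names `reduce_lines` and `is_interesting` are made up for illustration. It keeps deleting chunks of lines for as long as the caller's predicate still calls the result interesting.

```python
from typing import Callable, List


def reduce_lines(
    lines: List[str],
    is_interesting: Callable[[List[str]], bool],
) -> List[str]:
    """Greedily delete chunks of lines while the input stays interesting.

    Toy illustration only, not C-Reduce's algorithm.
    """
    assert is_interesting(lines), "the starting input must be interesting"
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            # Try the input with lines[i : i + chunk] removed.
            candidate = lines[:i] + lines[i + chunk:]
            if is_interesting(candidate):
                lines = candidate  # keep the smaller variant, retry at the same index
            else:
                i += chunk  # this chunk is needed, move past it
        chunk //= 2  # retry with finer-grained deletions
    return lines
```

For example, with a predicate that only checks whether a `crash()` line is still present, a four-line input reduces to that single line.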
👎 using #cvise (#creduce replacement) to reduce the C code triggering a compiler problem
👍 using cvise to reduce the C code triggering a crash inside cvise
While trying to reduce a C file, clang_delta crashes with the following assertion: 00:00:00 INFO ===< ClangBinarySearchPass::replace-function-def-with-decl (30 T) >=== 00:00:00 WARNING clang_delta...
👎 using #cvise (an alternative to #creduce) to reduce C code that triggers a compiler bug
👍 using cvise to reduce C code that makes cvise itself crash
Challenge 2: #debugging. You spend a ton of time writing the compiler, fingers crossed for getting at least hello world working, but instead you get table/memory out of bounds! Unreachable instruction executed! What on earth went wrong!?? There's no shame in not getting things right on the first try, especially with compilers: one small mistake may be amplified repeatedly at compile-time, making the output a pile of trash.
But you're unlucky if you're targeting #wasm. You may have searched the internet and found blog posts about source maps, DWARF, the V8 inspector, or some wasm engine claiming to support debugging via lldb/gdb. My own experience as of today: they are extremely fragile, not to say non-existent. Give them a try anyway, but keep in mind they don't qualify as your lifeboat, and to get the DWARF stuff working you need a ton of extra effort during code generation!
There are still some strategies you can follow.
First and foremost: crash early. Instrument your code aggressively: whenever you doubt that a property holds at runtime, assert it. It's common for the runtime state to be corrupt already while the module keeps running and only trips later in some seemingly unrelated place. You may also dump logs; they do help sometimes.
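A small sketch of the "crash early" idea, written in Python rather than in generated wasm. `LinearMemory` is a hypothetical toy stand-in for a linear memory, not any real engine's API. Without the asserts, a negative address would silently wrap around to the end of the buffer and the corruption would only surface later.

```python
class LinearMemory:
    """Toy stand-in for a wasm linear memory, with aggressive checks."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def store8(self, addr: int, value: int) -> None:
        # Fail at the faulty access, not at some later unrelated read.
        assert 0 <= addr < len(self.data), f"store8 out of bounds: addr={addr}"
        assert 0 <= value <= 0xFF, f"store8 value does not fit in a byte: {value}"
        self.data[addr] = value

    def load8(self, addr: int) -> int:
        # Python would accept addr=-1 and return the last byte; assert instead.
        assert 0 <= addr < len(self.data), f"load8 out of bounds: addr={addr}"
        return self.data[addr]
```

Note that Python drops `assert` statements when run with `-O`, so checks like these only fire in non-optimized runs.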
Next: shrink it. Use wasm-reduce in #binaryen to shrink the wasm module, or even better, use #creduce to shrink the miscompiled module's assembly source (if you know it's the crime scene), or the offending input that triggers the bug. Shrinking is an absolute must to minimize the debugging overhead. In the worst case you don't get additional insight, but at least you get some coffee breaks to relax :/
Sometimes you have an alternative compiler which emits correct wasm from the same input, which can be regarded as the source of truth. Luckily this was the case for the #ghc wasm backend! GHC has target-specific assembly generators, but also a target-independent C generator, which is meant to ease porting GHC to new platforms. It was tremendously useful when I debugged the wasm backend's code generator part. I even spent extra effort to make the callconv & symbol names coherent between the two codegens, so that I could mix good/bad objects at link-time. This was super useful when narrowing down the actual crime scenes.
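Mixing good and bad objects at link-time amounts to a binary search over modules. The Python sketch below shows that search under two loudly stated assumptions: exactly one module is miscompiled, and `fails_with_bad` is a hypothetical callback you write yourself. It would link the suspect codegen's objects for the given module names and the reference codegen's objects for everything else, run the result, and report whether it misbehaves. This is not GHC tooling.

```python
from typing import Callable, Sequence, Set


def find_bad_module(
    modules: Sequence[str],
    fails_with_bad: Callable[[Set[str]], bool],
) -> str:
    """Binary-search for the single miscompiled module.

    fails_with_bad(names) is a hypothetical user-supplied callback: it builds
    with suspect objects for `names` and reference objects for the rest, and
    returns True if the resulting program misbehaves.
    """
    candidates = list(modules)
    assert candidates, "need at least one module"
    assert fails_with_bad(set(candidates)), "the all-suspect build should fail"
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        if fails_with_bad(set(first_half)):
            candidates = first_half  # culprit is in the first half
        else:
            candidates = candidates[mid:]  # otherwise it is in the second half
    return candidates[0]
```

With N modules this needs roughly log2(N) link-and-run cycles, which is why making the two codegens link-compatible can pay off.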
Another low effort thing to try, especially if your compiler piggybacks on other toolchains like #llvm or binaryen: turn off any optimization. If you're lucky, it's someone else's bug :)
#CReduce is an impressive piece of software. If you're not familiar with it: you give it a C file that exhibits a bug, or some other interesting property (you specify a shell script that returns 0 only if the file is interesting). It then proceeds to shorten the file considerably.
I had a 10 kLOC file on which my analyzer showed buggy behaviour, CReduce shrank it by 99.6% 😅 (so far)
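The "returns 0 only if the file is interesting" check does not have to be a shell script; any executable with the right exit code should work. Below is a minimal Python sketch of such a predicate. The tool name `./my_analyzer`, the file name `input.c`, and the message `internal error` in the trailing comment are placeholders, not anything from the post.

```python
import subprocess
import sys
from typing import Sequence


def is_interesting(cmd: Sequence[str], needle: str, timeout: float = 10.0) -> bool:
    """Return True if running cmd prints needle on stdout or stderr.

    A hang is treated as "not interesting" so the reducer cannot get stuck.
    """
    try:
        proc = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False
    return needle in proc.stdout + proc.stderr


# A script handed to the reducer would end with something like this
# (hypothetical tool, file name, and message):
#   sys.exit(0 if is_interesting(["./my_analyzer", "input.c"], "internal error") else 1)
```

Matching on a specific message rather than on "any failure" matters: otherwise the reducer may happily shrink the file into one that fails for a different, boring reason such as a syntax error.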