I want a compiler to optimize away the doubled up check of the static var initialization, is that too much?
https://godbolt.org/z/xsYT8nMd6
(Yes I know, it boils down to optimizing around atomics and nobody wants to touch that too much)
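For context, a minimal sketch of the pattern in question (the godbolt link has the real codegen; names here are made up):

```cpp
#include <cstdio>

int expensive_init() {
    std::puts("initializing");  // runs exactly once
    return 42;
}

int get() {
    // Thread-safe since C++11: the compiler emits a hidden guard variable
    // and checks "already initialized?" (an acquire load) on every call.
    static int value = expensive_init();
    return value;
}

// Back-to-back calls: the guard is re-checked for the second call, even
// though on this path the static is provably initialized already.
int twice() { return get() + get(); }
```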
https://godbolt.org/z/b66vhKb1f
Is there anything actually stopping the compilers from optimizing wide_load1 into an actual wide load? As I understand the C++ memory model, a relaxed atomic load should be reorderable with other loads.

```
uint8_t arr[4];
std::atomic<uint32_t> a1, a2;

uint32_t wide_load1() {
    uint32_t ret = 0;
    ret |= arr[3] << 24;
    a1.load(std::memory_order_relaxed);
    ret |= arr[2] << 16;
    ret |= arr[1] << 8;
    ret |= arr[0] << 0;
    return ret;
}

uint32_t wide_load2() {
    uint32_t ret = 0;
    ret |= arr[3] << 24;
    ret |= arr[2] << 16;
    ret |= arr[1] << 8;
    ret |= arr[0] << 0;
    return ret;
}
```
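For reference, the wide load I'd hope for is essentially a memcpy, assuming a little-endian target (that matches the byte order the shift-and-or version builds):

```cpp
#include <cstdint>
#include <cstring>

// What the compiler could emit for wide_load2 (and, if reordering past the
// relaxed load is allowed, for wide_load1): one 4-byte load via memcpy.
// Assumes a little-endian target.
inline uint32_t wide_load_memcpy(const uint8_t* p) {
    uint32_t ret;
    std::memcpy(&ret, p, sizeof(ret));  // compiles down to a single 32-bit load
    return ret;
}

// Same result, spelled byte by byte (endian-independent).
inline uint32_t wide_load_bytes(const uint8_t* p) {
    return (uint32_t(p[3]) << 24) | (uint32_t(p[2]) << 16)
         | (uint32_t(p[1]) << 8)  |  uint32_t(p[0]);
}
```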
@malwareminigun In Catch2 it is pretty new. It has been in the works for a loooong time now, but I rarely have the time to sit down and work on it nowadays, especially on things that need concentration and careful plumbing through trackers.
It also doesn't have negative selection (yet? 😃 ), so you can't ask for "everything except 3rd input".
@malwareminigun Sure, but the practical reality is that for large cross-platform projects you end up needing this.
For an example where the tests are not failing: right now we disable 5 tests in our Valgrind CI job. 1 because it causes an internal Valgrind error, 4 because even the 6-hour timeout we have is not enough (Valgrind...).
Of those, 2 are in huge table-driven test sets, so being able to disable just 1 test from the table saves a bunch of complexity in workarounds.
@malwareminigun Catch2's advantage is that it is much simpler to start writing parametrized tests and that the value parametrization models input stream -- you don't need to know how many elements there are ahead of time, it does not even have to be deterministic.
Do you want to run 2 inputs to check for a previous regression and then spend 100ms sending random ones? No problem. If the 6th generated input fails, you run
./SelfTest --rng-seed 2302301031 -g '7' 'my test'
and start debugging.
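In Catch2 itself this stream is spelled with GENERATE and the generator adapters; as a framework-free sketch of the idea (make_inputs and the concrete values are made up):

```cpp
#include <random>
#include <vector>

// Sketch of the "input stream" model: the fixed regression inputs come
// first, then a random tail whose length is a runtime choice -- the total
// number of inputs need not be known (or deterministic) up front.
std::vector<int> make_inputs(unsigned rng_seed, int random_count) {
    std::vector<int> inputs = {2, 3};  // known past regressions, replayed first
    std::mt19937 rng(rng_seed);
    std::uniform_int_distribution<int> dist(0, 1000);
    for (int i = 0; i < random_count; ++i)
        inputs.push_back(dist(rng));   // then the random stream
    return inputs;
}
```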
@malwareminigun A further advantage of GTest's approach here is that it is also very simple to run the different inputs in parallel:
```
$ ctest -j 16
Start 1017: {test}/50
Start 1012: {test}/45
Start 1008: {test}/41
Start 1015: {test}/48
Start 1019: {test}/52
Start 1014: {test}/47
Start 994: {test}/27
Start 1013: {test}/46
Start 1009: {test}/42
Start 999: {test}/32
```
just look at it go :-D
@malwareminigun The advantage of a for loop is that it is dead simple. But it is opaque to the test framework, so if you write a test like that and want to investigate a failure for a specific input, you have to edit the sources.
With value param., you can filter down to the failing input from the runner alone. This means that, e.g., if you know that test foo, input #23 fails on ARM and that's OK, you can specify this in your CI script without having to hardcode the detection into the test itself.
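To make the opacity concrete, here is the for-loop flavor as a sketch (square and the inputs are made up for illustration):

```cpp
#include <cassert>

int square(int x) { return x * x; }

// The for-loop approach: all inputs live inside one test body, so the
// framework sees exactly one test. To rerun just input #23 you have to
// edit this loop -- there is no CLI handle for a single input.
void test_square_all_inputs() {
    const int inputs[]   = {0, 1, 2, 3, 23};
    const int expected[] = {0, 1, 4, 9, 529};
    for (int i = 0; i < 5; ++i) {
        assert(square(inputs[i]) == expected[i]);
    }
}
```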
There are some advantages to how GTest does value parametrization, but the amount of boilerplate needed to get even the simplest parametrization going means that we have a ton of basically duplicated tests.
With Catch2 these 12 tests I am looking at would've been 1, because the barrier to go from 1 concrete test to a value parametrized test is tiny.