Is there anyone here with experience with #JMeter?

I'm generating a dashboard with jmeter.sh -g data.jtl -o dashboard -p my.properties and I've set jmeter.reportgenerator.overall_granularity to 5000. However, every graph still shows elapsed time at a 1-minute granularity.
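For reference, the report-generator granularity properties are read at generation time and the value is in milliseconds (the default of 60000 produces exactly those 1-minute buckets). Assuming the property simply isn't being picked up from my.properties, the documented place for it is user.properties in JMETER_HOME/bin:

```properties
# user.properties (in JMETER_HOME/bin) -- value is in milliseconds;
# the default 60000 yields 1-minute graph buckets
jmeter.reportgenerator.overall_granularity=5000
```

Worth double-checking that the dashboard is regenerated after the change; the property affects only report generation, not what's recorded in the .jtl.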

I'm really not in the mood to plot the graphs myself.

#AskFedi

How we built the load testing process at KISLOROD

Hi, Habr! My name is Eduard, and I head the DevOps department at KISLOROD.

https://habr.com/ru/companies/kislorod/articles/964212/

#load_testing #devops #performance_testing #jmeter #zabbix #web_service_testing #stress_testing #test_server #test_scenarios

Comparing the performance of Virtual Threads vs. Platform Threads in Spring Boot with a JMeter load test:

Virtual Threads show better scalability and faster response times than Platform Threads when handling I/O-bound tasks. For CPU-heavy tasks, however, Virtual Threads offer no comparable advantage.

#VirtualThreads #PlatformThreads #SpringBoot #JMeter #Performance #IO #Programming #Technology

https://www.reddit.com/r/programming/comments/1nd95t0/comparing_virtual_threads_vs_platf

I need help. First the question: On #FreeBSD, with all ports built with #LibreSSL, can I somehow use the #clang #thread #sanitizer on a binary actually using LibreSSL and get sane output?

What I now observe debugging #swad:

- A version built with #OpenSSL (from base) doesn't crash. At least I tried very hard, really stressing it with #jmeter, to no avail. Built with LibreSSL, it does crash.
- Less relevant: the OpenSSL version also performs slightly better, but needs almost twice the RAM
- The thread sanitizer finds nothing to complain about when built with OpenSSL
- It complains a lot with LibreSSL, but the reports look "fishy", e.g. it seems to intercept some OpenSSL API functions (like SHA384_Final)
- It even complains when running with a single-thread event loop.
- I use a single SSL_CTX per listening socket, creating SSL objects from it per connection ... also with multithreading; according to a few sources, this should be supported and safe.
- I can't imagine doing that on a *single* thread could break with LibreSSL; that would make SSL_CTX pretty much pointless
- I *could* imagine sharing the SSL_CTX with multiple threads to create their SSL objects from *might* not be safe with LibreSSL, but no idea how to verify as long as the thread sanitizer gives me "delusional" output 😳
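Until the sanitizer output can be trusted, one low-tech way to test that last hypothesis is to serialize all object creation from the shared context behind a mutex and see whether the crashes stop. A minimal sketch of that guard, with stand-in types instead of the real SSL_CTX/SSL (so this is the pattern, not swad's or LibreSSL's actual code):

```c
#include <pthread.h>
#include <stdlib.h>

/* Stand-ins for SSL_CTX and SSL -- hypothetical types just to
 * illustrate the pattern. */
typedef struct { int created; } Ctx;
typedef struct { Ctx *ctx; } Conn;

static pthread_mutex_t ctxlock = PTHREAD_MUTEX_INITIALIZER;

/* Serialize creation of per-connection objects from the one shared
 * context. If the crashes disappear with this guard in place, the
 * "shared SSL_CTX isn't thread-safe here" hypothesis gains weight;
 * each per-connection object is still used lock-free by the thread
 * that owns it. */
static Conn *conn_create(Ctx *ctx)
{
    pthread_mutex_lock(&ctxlock);
    Conn *c = malloc(sizeof *c);
    if (c) { c->ctx = ctx; ++ctx->created; }
    pthread_mutex_unlock(&ctxlock);
    return c;
}

/* Worker creating (and immediately releasing) many connection objects,
 * mimicking many threads hammering the shared context. */
static void *worker(void *arg)
{
    for (int i = 0; i < 1000; ++i) free(conn_create(arg));
    return 0;
}
```

If the guarded build still crashes, the shared context is probably not the culprit and the search can move elsewhere.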

I now implemented a per-thread #pool to reuse #timer objects in #poser (my lib I use for #swad).

The great news is: this improved performance, which is an unintended side effect (my goal was to reduce RAM usage 🙈😆). I tested with the #kqueue backend on #FreeBSD and sure, this makes sense: So far, I needed to keep a list of destroyed timers that was always checked, to solve an interesting issue: by the time I cancel a timer with #kevent, the expiry event might already be queued, but not yet read by my event loop. Trying to fire events from a timer that doesn't exist any more would of course segfault. That's not necessary any more with the pool approach: the timer WILL exist, and I can just check whether it's still "alive".
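The pooled-timer trick can be sketched roughly like this. All names are made up (this is not poser's actual API), and a production version would also want a generation counter to catch a timer being reused between a cancel and a stale dispatch:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical sketch: canceled timers are marked dead and pooled
 * instead of freed, so a stale kevent expiry that still names the
 * timer can be safely ignored. */
typedef struct Timer {
    bool alive;
    struct Timer *next;      /* free-list link while pooled */
    void (*fire)(void *);
    void *arg;
} Timer;

static _Thread_local Timer *freelist;   /* per-thread: no locking */

Timer *timer_acquire(void (*fire)(void *), void *arg)
{
    Timer *t = freelist;
    if (t) freelist = t->next;
    else t = malloc(sizeof *t);
    if (t) { t->alive = true; t->fire = fire; t->arg = arg; }
    return t;
}

void timer_cancel(Timer *t)
{
    t->alive = false;        /* memory stays valid for stale events */
    t->next = freelist;
    freelist = t;
}

/* Called by the event loop for each expiry event read from kevent() */
void timer_dispatch(Timer *t)
{
    if (t->alive) t->fire(t->arg);   /* stale event: just ignored */
}

/* tiny helper used for testing the sketch */
static void count_fire(void *arg) { ++*(int *)arg; }
```

The performance gain falls out for free: acquire/release from a per-thread free list is a pointer swap, with no allocator traffic and no destroyed-timer list to scan.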

The result? Same hardware as always, and now swad reaches a throughput of 26000 requests per second with (almost) perfect response times. 🥳

I'm still not happy with memory usage. It's better, but I have no explanation for what I observed now:

Ran the same test 3 times: 1000 #jmeter threads, each simulating a distinct client running a loop 2000 times doing one GET and one POST, for a total of 4 million requests. After the first run, the resident set was at 178 MiB. After the second, 245 MiB. And after the third, well, 245 MiB. How ...? 🤯

Also, there's another weird observation I have no explanation for. My main thread delegates accepted connections to worker threads simply "round robin". And each time I run the jmeter test, all these worker threads show increasing CPU usage at a similar rate, until suddenly, one single thread seems to do "more work", which stabilizes when this thread is utilizing almost twice the CPU of all the other worker threads. And when I run the jmeter test again (NOT restarting swad), the same happens again, but this time, it's a *different* thread that "works" a lot more than all the others.

I wonder whether I should accept that scheduling, memory management and so on are all "black magic" and swad is probably "good enough" as is right now. 😆

I now added a #lockfree version of that MPMC job queue which is picked when the system headers claim that pointers are lockfree. Doesn't give any measurable performance gain 😞. Of course the #semaphore needs to stay, the pool threads need something to wait on. But I think the reason I can't get more than 3000 requests per second with my #jmeter stress test for #swad is that the machine's CPU is now completely busy 🙈.
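The "picked when the system headers claim that pointers are lockfree" selection, combined with a bounded MPMC ring in the style of Dmitry Vyukov's classic design, could look like the sketch below. This is an illustration under those assumptions, not swad's actual queue, and as noted the semaphore for blocking waits still has to live outside it:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#if ATOMIC_POINTER_LOCK_FREE == 2   /* the headers' "claim" */

/* Bounded MPMC ring (Vyukov-style). Capacity must be a power of two.
 * Each cell carries a sequence number that tells producers and
 * consumers whether the slot is ready for them. */
typedef struct {
    atomic_size_t seq;
    void *data;
} Cell;

typedef struct {
    Cell *cells;
    size_t mask;
    atomic_size_t enqpos;
    atomic_size_t deqpos;
} JobQueue;

bool jq_init(JobQueue *q, size_t capacity)
{
    q->cells = malloc(capacity * sizeof *q->cells);
    if (!q->cells) return false;
    q->mask = capacity - 1;
    for (size_t i = 0; i < capacity; ++i)
        atomic_init(&q->cells[i].seq, i);
    atomic_init(&q->enqpos, 0);
    atomic_init(&q->deqpos, 0);
    return true;
}

bool jq_push(JobQueue *q, void *job)
{
    size_t pos = atomic_load_explicit(&q->enqpos, memory_order_relaxed);
    for (;;) {
        Cell *c = &q->cells[pos & q->mask];
        size_t seq = atomic_load_explicit(&c->seq, memory_order_acquire);
        intptr_t diff = (intptr_t)seq - (intptr_t)pos;
        if (diff == 0) {    /* slot free: try to claim it */
            if (atomic_compare_exchange_weak_explicit(&q->enqpos,
                    &pos, pos + 1,
                    memory_order_relaxed, memory_order_relaxed)) {
                c->data = job;
                atomic_store_explicit(&c->seq, pos + 1,
                        memory_order_release);
                return true;
            }
        }
        else if (diff < 0) return false;   /* queue full */
        else pos = atomic_load_explicit(&q->enqpos,
                memory_order_relaxed);
    }
}

bool jq_pop(JobQueue *q, void **job)
{
    size_t pos = atomic_load_explicit(&q->deqpos, memory_order_relaxed);
    for (;;) {
        Cell *c = &q->cells[pos & q->mask];
        size_t seq = atomic_load_explicit(&c->seq, memory_order_acquire);
        intptr_t diff = (intptr_t)seq - (intptr_t)(pos + 1);
        if (diff == 0) {    /* slot filled: try to claim it */
            if (atomic_compare_exchange_weak_explicit(&q->deqpos,
                    &pos, pos + 1,
                    memory_order_relaxed, memory_order_relaxed)) {
                *job = c->data;
                atomic_store_explicit(&c->seq, pos + q->mask + 1,
                        memory_order_release);
                return true;
            }
        }
        else if (diff < 0) return false;   /* queue empty */
        else pos = atomic_load_explicit(&q->deqpos,
                memory_order_relaxed);
    }
}

#endif /* ATOMIC_POINTER_LOCK_FREE == 2 */
```

If the CPU is already saturated, though, a faster queue only shaves overhead per job; it can't create cycles that aren't there.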

Need to look into actually saving CPU cycles for further optimizations I guess...

Finally getting somewhere working on the next evolution step for #swad. I have a first version that (normally 🙈) doesn't crash quickly (so, no release yet, but it's available on the master branch).

The good news: It's indeed an improvement to have *multiple* parallel #reactor (event-loop) threads. It now handles 3000 requests per second on the same hardware, with overall good response times and without any errors. I uploaded the results of the stress test here:

https://zirias.github.io/swad/stress/

The bad news ... well, there are multiple.

1. It got even more memory hungry. The new stress test still simulates 1000 distinct clients (trying to do more fails on my machine as #jmeter can't create new threads any more...), but with delays reduced to 1/3 and doing 100 iterations each. This now leaves it with a resident set of almost 270 MiB ... tuning #jemalloc on #FreeBSD to return memory more promptly reduces this to 187 MiB (which is still a lot) and reduces performance a bit (some requests run into 429, overall response times are worse). I have no idea yet where to start trying to improve *this*.

2. It requires tuning to manage that load without errors, mainly using more threads for the thread pool, although *these* threads stay almost idle ... which probably means I have to find ways to make putting work on and off these threads more efficient. At least I have some ideas.

3. I've seen a crash which only happened once so far, no idea as of now how to reproduce. *sigh*. Massively parallel code in C really is a PITA.
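The jemalloc tuning mentioned in point 1 is typically done via MALLOC_CONF; the decay knobs below do exist in jemalloc 5 (FreeBSD's default allocator), but the concrete values here are just an illustration of the trade-off, not what swad ships:

```sh
# Return unused dirty/muzzy pages to the OS after 1s instead of the
# much longer defaults -- lowers RSS at some performance cost.
MALLOC_CONF="dirty_decay_ms:1000,muzzy_decay_ms:1000" ./swad
```

Shorter decay times are exactly the "return memory more promptly" lever: RSS drops sooner, but hot paths pay for re-faulting pages, which matches the observed 429s and worse response times. Check jemalloc(3) on the target system for the exact knobs your version supports.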

Seems the more I improve here, the more I find that *should* also be improved. 🤪

#C #coding #performance

Apache JMeter Dashboard

Working on the next release of #swad, I just deployed an experimental build with the server-side #session completely removed.

Then I ran the same #jmeter stress test on it as before. It simulates 1000 distinct clients, all requesting the login form and then POSTing "guest:guest" login to trigger proof-of-work 50 times in a loop, timed in a way so an average of 1000 requests per second is sent.

After running this once, I thought I hadn't gained much. The old version had a resident set of 95 MiB, the new one 86 MiB. But then, running it two more times, the resident set climbed only to 96 MiB and then 98 MiB, while the old version ended up somewhere around 250 MiB. 😳

So, definitely an improvement. Not sure why it still climbs to almost 100MiB at all, maybe this is #jemalloc behavior on #FreeBSD? 🤔

One side effect of removing the session is that the current jmeter test scenario doesn't hit any rate-limiting any more. So, next step will be to modify the scenario to POST invalid login credentials to trigger that again and see how it affects RAM usage.

Finally testing #swad with some simulated heavy load is getting somewhere. I'm testing with #jmeter, simulating 1000 unique clients (each with a random IP address used in X-Forwarded-For) doing a loop getting the #login form and POSTing a "guest login" (which triggers creating a session) 50 times in a row.

With some tweaking (bumping up the number of worker threads and the queue size for thread jobs, disabling resolution of remote host names), swad can deal with that load on my machine. The resident set size goes up to 100 MiB. Repeating this a few times, I could bump it up to 300 MiB as well, so I'm still unsure whether server-side sessions might be TOO expensive for deploying swad to a really busy site. 🤔 But for now, I'll leave that.

Here are two relevant graphs from these jmeter tests. I'm not sure why, during rampup, some requests fail in weird ways. Paradoxically, this doesn't happen any more once "full load" is reached; could there be an issue with jmeter itself, maybe?

I guess I'll create the next #swad release soon, including a few minor performance tweaks now added and with a new README section with configuration tips for very busy sites.