@trisha_gee pushing tests down to as low a level as possible and making sure they only test one thing should help avoid duplication.
Those validations should be unit tested, and a journey test should not check a validation, for instance.
Integration tests should test integration, not business logic, etc.
It pays to be specific about the type of tests you use, and what they are for.
@wouterla @trisha_gee
Yes, good organization of tests certainly reduces overlap, but I've never worried about overlap. It seems to me that aiming for the minimum number of tests is a concern for efficiency, not effectiveness.
What is the driving force to reduce the number of tests?
@trisha_gee @wouterla
In my experience, "time to run tests" and "number of tests" are not closely correlated.
What makes their tests slow?
- Dependency on external systems, e.g. database?
- Intentional delays waiting for system state to stabilize?
- Stochastic tests to look for infrequent events?
- or something else?
@trisha_gee If the individual tests are fast, I can imagine you'd start looking at other optimizations.
My comments are mostly triggered by this uncertainty about duplication. That seems to imply that the tests are not very targeted. Which would mean that, to me, they're not the right type of tests.
@gdinwiddie
@trisha_gee @gdinwiddie @wouterla #intellij has a test coverage run; with that I would go over them one by one, and within a week at most I would clear 10k tests. The number looks big, but once you jump in it's a very simple task to find intersections and clean up 🤓
Another approach would be to write a #gradle or #mvn plugin to give a report of intersecting tests. Implementing this would probably take one to three weeks, and cleaning up would take another week: minimum four weeks, maximum two months, and this task would be done.
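Such an "intersecting tests" report could work off per-test coverage data. Here is a minimal sketch of the reporting half, in Python rather than as a Gradle/Maven plugin: the per-test covered-line sets are hard-coded for illustration (a real plugin would pull them from a coverage tool such as JaCoCo or coverage.py), and the test names and threshold are invented.

```python
# Hypothetical sketch: flag pairs of tests whose covered-line sets
# overlap almost entirely, using Jaccard similarity.

def overlap_report(coverage, threshold=0.9):
    """Return (test_a, test_b, score) for pairs whose Jaccard
    similarity of covered lines meets or exceeds the threshold."""
    pairs = []
    tests = sorted(coverage)
    for i, a in enumerate(tests):
        for b in tests[i + 1:]:
            inter = len(coverage[a] & coverage[b])
            union = len(coverage[a] | coverage[b])
            if union and inter / union >= threshold:
                pairs.append((a, b, inter / union))
    return pairs

# Invented example data: two near-duplicate pricing tests.
coverage = {
    "test_price_basic":    {10, 11, 12, 13, 14},
    "test_price_rounding": {10, 11, 12, 13, 14, 15},
    "test_validation":     {40, 41, 42},
}
for a, b, score in overlap_report(coverage, threshold=0.7):
    print(f"{a} ~ {b}: {score:.0%} overlap")
    # → test_price_basic ~ test_price_rounding: 83% overlap
```

A high overlap score does not prove duplication (two tests can execute the same lines while asserting different things), so the report is a shortlist for human review, not an automatic delete list.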
@trisha_gee Yes, that is a very common problem. In most cases, the type of tests is a major reason for the slowness. Tests that exercise too much code, and thus are not very specific in what they test.
Moving towards more specific, 'lower level', unit-type tests allows for faster execution, and less duplication in logic exercised.
In that way, this is not about being selective in which tests to run/keep, but in being selective in which type of tests to use for which purpose.
@trisha_gee @gdinwiddie @wouterla
I focus on prioritizing the tests.
Run the fast ones first, and frequently. And those most likely to fail -- those most closely related to the code that was changed, and those that have failed recently.
Defer running slow tests to a batch/server process. Or run them overnight.
...
@trisha_gee @gdinwiddie @wouterla
I would like to see xUnit testing frameworks integrated with code coverage frameworks and change tracking. I'd like them to keep track of the covered lines for each test, and first run the tests that run through the lines that changed, with newly added tests before those.
Also prioritize tests that have failed recently.
Prioritize faster tests over slower. Defer very slow tests to the very end.
Multi-threading and multi-process should also be considered.
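The ordering heuristic described above can be sketched as a sort key. Everything here is hypothetical scaffolding: `TestInfo` and its fields stand in for data a real runner would pull from coverage output and a test-history database.

```python
# Sketch of the proposed test-prioritization heuristic:
# new tests first, then tests covering the changed lines,
# then recently-failed tests, then everything else fastest-first.

from dataclasses import dataclass, field

@dataclass
class TestInfo:
    name: str
    covered_lines: set = field(default_factory=set)
    duration_s: float = 1.0
    failed_recently: bool = False
    is_new: bool = False

def prioritize(tests, changed_lines):
    def key(t):
        return (
            not t.is_new,                           # new tests first
            not (t.covered_lines & changed_lines),  # then tests touching the diff
            not t.failed_recently,                  # then recent failures
            t.duration_s,                           # then fastest first
        )
    return sorted(tests, key=key)

tests = [
    TestInfo("slow_e2e", {1, 2, 3}, duration_s=120.0),
    TestInfo("unit_changed", {42}, duration_s=0.1),
    TestInfo("flaky", set(), duration_s=0.5, failed_recently=True),
    TestInfo("brand_new", set(), duration_s=0.2, is_new=True),
]
order = [t.name for t in prioritize(tests, changed_lines={42})]
print(order)  # ['brand_new', 'unit_changed', 'flaky', 'slow_e2e']
```

Deferring very slow tests "to the very end" (or to a batch server) then just means cutting this sorted list at a time budget.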
@trisha_gee Whenever I have to change a feature along with its tests, I try to detect duplicate tests. I use the code coverage feature: delete a test and see how coverage changes. I'm speaking of tests at different levels that might have duplicates. I also might use a ParameterizedTest to remove code duplication where possible.
I don't do any active searching, because if a piece of code and its tests don't have to change, I don't change them.
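The ParameterizedTest idea (JUnit 5's `@ParameterizedTest` in the thread's context) has a stdlib-Python analogue in `unittest`'s `subTest`, which is what this sketch uses. The `discount` function and its codes are invented for illustration.

```python
# Collapsing near-duplicate tests with parameterization:
# one data-driven test replaces three copy-pasted ones.

import unittest

def discount(price, code):
    # Invented example function under test.
    return price * {"NONE": 1.0, "HALF": 0.5, "FREE": 0.0}[code]

class DiscountTest(unittest.TestCase):
    def test_discount_codes(self):
        cases = [("NONE", 100.0), ("HALF", 50.0), ("FREE", 0.0)]
        for code, expected in cases:
            with self.subTest(code=code):  # each case reported separately
                self.assertEqual(discount(100.0, code), expected)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(DiscountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each failing case is reported individually, so merging the tests loses no diagnostic detail, only the duplicated setup and assertion code.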
@trisha_gee Advice I have heard (but not yet applied, sigh) is to run the test suite under a profiler and look for hot spots - any code that is executed by more than ~5 tests is suspect.
This could be because of common setup code (a code smell) or because of test overlap. Either way, refactor.
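Once you have per-test covered-line data (from a profiler or coverage tool), the "more than ~5 tests" heuristic is a simple count. A minimal sketch with invented data:

```python
# Flag "hot spot" lines exercised by more than N tests: candidates
# for test overlap or shared setup worth refactoring.

from collections import Counter

def hot_lines(coverage, max_tests=5):
    """Map line -> number of tests hitting it, for lines over the limit."""
    counts = Counter()
    for lines in coverage.values():
        counts.update(lines)
    return {line: n for line, n in counts.items() if n > max_tests}

# Invented data: line 100 is shared setup hit by every test.
coverage = {f"test_{i}": {100, 200 + i} for i in range(8)}
print(hot_lines(coverage, max_tests=5))  # {100: 8}
```

As the follow-up posts point out, a hot line is only *suspect*: it may be legitimate shared behavior (e.g. constructing a `Policy`) rather than duplication, so the output needs judgment, not mechanical deletion.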
I'm inclined to think the reaction to "eliminate 'hot spot' code" goes something like ...
"Wow! What a problem! We've found that all the tests for all the different ways of pricing an insurance policy must first create a Policy. We need to eliminate that!!!"
Now how is one supposed to do that?
And in what way(s) would that be desirable?
(Or even the least bit acceptable?)