[Twisted-Python] Test coverage requirements

Glyph Lefkowitz glyph at twistedmatrix.com
Mon Feb 27 17:46:11 MST 2017


> On Feb 27, 2017, at 3:33 PM, Jean-Paul Calderone <exarkun at twistedmatrix.com> wrote:
> 
> On Mon, Feb 27, 2017 at 6:00 PM, Tristan Seligmann <mithrandi at mithrandi.net> wrote:
> On Mon, 27 Feb 2017 at 21:54 Glyph Lefkowitz <glyph at twistedmatrix.com> wrote:
> That said, it has been improving and if it keeps improving at the rate it has been, I expect that we'd be able to put that coverage blocker back in in another 3-4 months.  Perhaps something to talk about at PyCon.
> 
> I think at least one problem that we're suffering from here is our fault, rather than Codecov's: the coverage of the test suite is not stable due to non-determinism in the test suite. That is, the lines executed during a test run are not the same every time due to things like ordering / timing races / etc. This means that "changes" to coverage may show up for a particular PR even though nothing in that PR is actually responsible.
> 
> 
> Changes to Twisted code which are only sometimes covered by the test suite sound like they would violate a 100% coverage rule.  But I guess the experience of looking at a codecov report is so bad/confusing that it's not surprising authors/reviewers might fail to see what's going on and fix the non-determinism.
> 
> Particularly for code that requires coverage measurements on multiple platforms (ie, you basically can't do it locally), it seems like it would be easier (though, to be clear, bad) to just forget about it and hope everything is covered...
> 
> A tool that pointed out coverage differences between multiple runs of the same version of the code would be a useful thing to start pointing out where these flaws in the Twisted test suite lie, right?  And then each area could be given deterministic test coverage instead...

While this is certainly an issue, I don't think it's the issue we're discussing here.  Unreliable coverage is largely mitigated by the fact that the main thing we pay attention to is "patch coverage", meaning coverage of the lines the patch itself touches.  If the new tests are non-deterministic, patch coverage can be seen to fluctuate from commit to commit on the branch (and a PR is rarely a single commit).  This is as opposed to "coverage delta", which only compares overall coverage before and after, and is indeed somewhat unpredictable due to old / bad tests.
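
To make the distinction concrete, here is a minimal sketch of the two metrics. It is a simplification for illustration, not how Codecov actually computes them; the function names and the (filename, line number) representation are assumptions of the sketch.

```python
# Illustrative only: lines are modelled as (filename, lineno) tuples.
# None of these names come from Codecov or coverage.py.

def coverage_ratio(covered, executable):
    """Fraction of executable lines that were covered (1.0 if there are none)."""
    if not executable:
        return 1.0
    return len(covered & executable) / len(executable)


def patch_coverage(patch_lines, covered_after):
    """Coverage restricted to the lines the patch added or changed."""
    return coverage_ratio(covered_after, patch_lines)


def coverage_delta(exec_before, covered_before, exec_after, covered_after):
    """Whole-project coverage after the patch minus coverage before it."""
    return (coverage_ratio(covered_after, exec_after)
            - coverage_ratio(covered_before, exec_before))


# Hypothetical data: ten old lines, nine of them covered before the patch.
exec_before = {("old.py", n) for n in range(1, 11)}
covered_before = {("old.py", n) for n in range(1, 10)}

# The patch adds two fully tested lines, but a flaky old test happens to
# miss old.py line 9 on this particular run.
patch_lines = {("new.py", 1), ("new.py", 2)}
exec_after = exec_before | patch_lines
covered_after = {("old.py", n) for n in range(1, 9)} | patch_lines

print(patch_coverage(patch_lines, covered_after))
print(coverage_delta(exec_before, covered_before, exec_after, covered_after))
```

In this made-up case the patch is fully covered, yet the delta comes out negative purely because of an unrelated flaky line, which is why the delta is the noisier signal.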

So I can say when I've had to overrule codecov, it's almost never been because of flapping coverage lines outside of the patch under consideration (and the patches in consideration either have deterministic tests, or I ask the author to add them).
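
For the flapping lines themselves, the kind of tool Jean-Paul describes could be sketched roughly like this. It assumes each run of the same revision has already been reduced to a set of covered (filename, line number) pairs; how those sets are extracted from the coverage data is left out, and the function name is made up for the example.

```python
# Illustrative only: "flapping_lines" is a hypothetical helper, and each run
# is assumed to be an iterable of (filename, lineno) pairs that were covered.

def flapping_lines(runs):
    """Return lines covered in at least one run but not in every run."""
    runs = [set(run) for run in runs]
    if not runs:
        return set()
    covered_ever = set().union(*runs)
    covered_always = set.intersection(*runs)
    return covered_ever - covered_always


# Three hypothetical runs of the same revision.
runs = [
    {("a.py", 1), ("a.py", 2)},
    {("a.py", 1)},
    {("a.py", 1), ("a.py", 3)},
]

for filename, lineno in sorted(flapping_lines(runs)):
    print(f"{filename}:{lineno} is only sometimes covered")
```

Anything this reports is a line whose coverage depends on ordering or timing rather than on the code under review, so it points at a test that needs a deterministic replacement.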

General improvements to build reliability often reduce coverage unreliability as well.  As we've been using GitHub more, which makes build status and mergeability more visible to reviewers, we've been fixing lots of little build-reliability issues, and this problem continues to get smaller.

-glyph
