[Twisted-Python] Re: [Twisted-web] CPUShare-Twisted

glyph at divmod.com glyph at divmod.com
Sun Jan 22 02:59:12 MST 2006


On Sun, 22 Jan 2006 07:52:18 +0100, Andrea Arcangeli <andrea at cpushare.com> wrote:
>On Sat, Jan 21, 2006 at 11:55:44PM -0500, glyph at divmod.com wrote:
>> If you've pending patches that have not been applied, would you please
>> consider instead to agitate for those patches on the mailing lists, and add
>
>These are the very old ones (ignore the web2 part that is recent).
>
>http://www.cpushare.com/hg/Twisted/?cs=400da64bd5a6

>IIRC you said that (some stuff)

I'm sorry I was unclear, and you typed all that stuff to no effect.  Discussions that draw attention to unapplied patches should really refer to bug URLs in the tracker.

If they don't, nobody can tell how long the patches have been languishing, who was supposed to apply them, or why they weren't applied, unless there are links to dozens of previous mailing list messages in each post.  Also, summaries of these discussions should be attached to the ticket by the reporter or the maintainer, if they advance the issue at all.  Overall, without some support from the tracker, we just don't know whether the issues are really stuck on a serious problem, or whether someone has just become confused about what is required to make progress on the bug.

Even the absence of information on a bug can be useful.  "Why hasn't anybody replied to this for 6 months?  Was there some discussion on IRC?" can lead someone to post a helpful summary of current status if they have a recollection of where it did end up... sometimes, the bugs have even been fixed, and nobody has noted that fact.

>Sure I understand (twisted devs will work from bugtracker)

Thank you.

>My developmnt is generally test-driven.

Maybe, but it sure doesn't sound like it.  If your development is test-driven that means you are used to writing unit-tests *first*, not hacking in a fix and testing *later*, which is what you have repeatedly suggested.  "TDD" is not the same thing as "unit testing".

FWIW Twisted does not require TDD.  I do not personally do TDD much of the time.  I think tests need to be added before a feature is added, but I don't always have a clear enough picture of what the code will look like to write tests, before I've tried to write the code.
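For readers unfamiliar with the distinction, a minimal test-first cycle might look like the following, in plain stdlib unittest (the function and test names are invented for illustration, not from Twisted):

```python
import unittest

# Step 1 of TDD: write the test FIRST, describing the behavior we want,
# before any implementation exists.
class TestParseHeader(unittest.TestCase):
    def test_strips_whitespace_around_value(self):
        self.assertEqual(parse_header("Host:  example.com "),
                         ("Host", "example.com"))

# Step 2: only now write the minimal code that makes the test pass.
def parse_header(line):
    name, _, value = line.partition(":")
    return (name.strip(), value.strip())
```

Running this module under `python -m unittest` executes the test; the point is the ordering, not the tooling.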

>I'm only opposed to unit-test
>mandatory development to fix bugs or add new features. Writing unit-test
>isn't the only way to test code. It's nice to have a unit test, even a
>simple one, but it shouldn't be mandatory.

Hmm.  "Unit Test Mandatory Development" - UTMD.  That sounds like a good acronym.  I think I will use it in the future.

One productive use of this set of threads is that I've repeated our testing policy - UTMD - in a few different ways.  I'm not sure that I can explain it to you (you seem to have some resistance to understanding) but maybe this will be useful to others.

Whether unit tests *should* be required is a discussion that requires some kind of value system.  What's good, what's bad, etc.  Right now unit testing *is* required.  There is a reason.  Unit testing is not the objective here, rather, requiring unit tests provides a mechanism to satisfy a greater requirement.  If you can suggest a better way to achieve that requirement then perhaps we can discuss other strategies.

Twisted is used by lots of different people in lots of different ways.  Before test requirements were adopted, it was quite common for a developer fixing a bug in one system on one platform to break another system on another platform.  We are trying to improve Twisted, and such changes are not improvements.  They simply shuffle around the set of places and times where Twisted works correctly; they don't enlarge it.

Conceptually this makes sense.  Software is extremely complex, etc.  If you want to fix a bug, you need a way to verify that it doesn't introduce a new bug, or at least a way to verify that *previously*-verified behavior is still working as expected, in previously-verified environments.

That is the goal that unit tests serve.  Without unit tests, *we do not know* whether a particular change will continue to work in the face of future changes, or whether it broke past ones.  We can reason about breakage of past changes, at least, but to think that we can actually understand the impact of a patch on a system even the size of something modest like Twisted is hubris.  Every software project has embarrassing releases that break obvious, frequently-used functionality - even projects with *better* testing track-records than Twisted.  I believe Linus coined the term "brown paper bag release", for the brown paper bag you have to wear on your head to avoid being recognized after such an event.

However, we can't even attempt to reason about future changes, because we can't possibly consider them when reviewing a current change.  Is it likely that other things might break this later?  How could we possibly know, without a way to accurately predict the future?

Consider that different people review different patches to Twisted at different times, and they have different skill levels.  I have written and read a LOT of Twisted code, and I doubt that even I understand 90% of it.  This partial ignorance makes reviewing past changes a lot like reviewing future changes - a change might break something in a Twisted subsystem the reviewer didn't even know about.

Of course, unit tests are imperfect too; we don't have 100% coverage, and even if we did, we wouldn't have 100% coverage in combination.  Still, they are the best option that we know about.

Can you suggest an alternative to unit tests that would accomplish this goal of providing some level of knowledge as to whether Twisted is probably improving or just changing randomly between releases?

Here are some objections which don't really address the question, just so I can head these off before they are asked.  Andrea - some of these are quite silly and I don't mean to imply that you are necessarily going to ask all of these questions, but I am now writing this for a general audience, and these *are* questions others have asked me.

"but, my changes are so simple, what could they break"

There is a story about a butterfly and a hurricane that you need to read.  Simple changes can have complex effects that break things horribly.

"not EVERYTHING in Twisted has to be tested.  some easy stuff could break, it's not likely since it doesn't change too often, and you could just do another release"

This leads to a game of whack-a-mole.  One bug pops up, you smash it down.  That makes another bug pop up.  You smash it down.  The whole time, you feel like you are being very productive, because you are fixing all these bugs! Really though, you're just making the same two motions over and over again between different releases.  'back and forth' is an oversimplification, of course.  In reality the cycle probably takes hundreds of releases and goes through dozens of features in various combinations.  Nevertheless, things get fixed, and other things break.

"you can just test it manually"

No, you can't.  There is a HUGE combinatorial explosion of work involved - did you test it with every revision?  Did you test it on Windows?  Did you test it on a slow machine?  Did you test it with Python 2.3?  Did you test it with Python 2.3 - on Windows?  Did you test the OTHER thing with Python 2.3 on Windows?  Did you test the other thing with Python 2.4 on Windows?  What about FreeBSD?  What about QNX?  What about AIX?!??!?  What about Linux 2.4?

Right now this matrix has over 20,000 units of work in it, just based on the current buildbots and the tests they're supposed to be running (and as you can see on the buildbot page, we are still trying to get the EXISTING features into shape, it is no wonder we don't want to rush to accept new ones quickly).

Every unit test that is added does the work of manual testing on 9 configurations on 4 platforms every time someone does a commit, which is several times a day.  Do the math.  Even replicating our *current* automated testing with a manual replacement would take something like a million dollars a week in tester salaries, if we were to pay them.  Open source does produce some really good free labor, but not NEARLY that much, and Twisted is a small project besides.
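"Do the math" can be made concrete.  Using the figures from this message as rough inputs (9 configurations on 4 platforms per commit, "several" commits a day) plus a hypothetical test-suite size, the daily volume of automated test executions works out like this:

```python
# Back-of-the-envelope arithmetic only; the configuration and platform
# counts are the ones quoted in this message, the rest are assumptions.
configurations = 9
platforms = 4
commits_per_day = 3          # "several times a day"
tests = 600                  # hypothetical test-suite size

runs_per_commit = configurations * platforms
test_executions_per_day = runs_per_commit * commits_per_day * tests

print(runs_per_commit)           # 36 manual-test-equivalents per commit
print(test_executions_per_day)   # 64800 per day
```

Replicating even that by hand, at any plausible tester salary, is where the "million dollars a week" figure comes from.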

"other projects release untested code, why can't you"

Didn't your mother ever tell you "If Billy jumped off a bridge, would you jump off a bridge too?"  (Wait, am I being "Mommist" now? ;-))

"Some projects separate testing from development, such as the LTP"

Forgetting any flames about Linux's stability for a moment (let's be fair: for such a large piece of software that changes so fast, it's amazingly robust), this is the brute force approach.

The LTP is sponsored by IBM and SGI.  It is a HUGE project - at one point, I visited the LTP booth at LinuxWorld.  Twisted does not have a booth at any conference.  The Linux test project alone has ten times as many people as Twisted.  With fewer resources, we have to have a better strategy, or we will not find any bugs.  And by the way, even with all the testing that the LTP does, sometimes Twisted even finds regressions in Linux, remember? :)

>> [..] I am sure that it will be full of bugs.
>
>Time will tell.

Code that has passing tests almost by definition has fewer bugs than code which does not have passing tests.  At the very least, it has more bugs in an unresolved quantum state, because you haven't observed them - so the probability of actual bugs is higher.

The whole point of this fork is that you want to put more bugs in and don't want to take the effort to verify that they won't be introduced.  I am not making a prediction about your skill; I am making an observation about the nature of the project.

>Since you made your prediction I'll make mine. I'm sure
>axiom is wasted time in its current API (at least as far as twisted is
>concerned).

Axiom was developed for a specific application.  It is not appropriate for everyone.  Some people like it, some people don't.  The ones who like it can go ahead and use it.

>I don't see how you can advertize axiom saying "We do plan to add some
>later, and perhaps also support other databases in the future.".

>Sure you can add it, but if you do it, the whole axiom api will fall apart
>unless you want to make synchronous queries over the network. The only
>two deferreds you have are during startup and in the testsuite, just
>grep for the word Deferred.

I think you mean we are going to add more Deferreds later?  There will be a different operational mode for 'transact' which returns a Deferred; the exact spelling hasn't been determined yet, but surely the semantics of that mode will be different and it will not work with all existing Axiom code.  (Of course, existing Axiom code will not invoke that mode, so it will continue to work happily enough side-by-side with code which does use it.)
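Since the exact spelling is undetermined, here is only an illustration of the general pattern (this is not the Axiom API): an asynchronous 'transact' hands the work to a worker thread and returns a handle that fires later, sketched with stdlib `concurrent.futures` standing in for a Twisted Deferred:

```python
from concurrent.futures import ThreadPoolExecutor, Future

# Hypothetical sketch only: the real Axiom transact is synchronous,
# and the future asynchronous spelling has not been decided.
_pool = ThreadPoolExecutor(max_workers=1)

def async_transact(fn, *args, **kwargs) -> Future:
    """Run fn in a worker thread; return a Future that fires with its result."""
    return _pool.submit(fn, *args, **kwargs)

def insert_row():
    # Stand-in for a database transaction body.
    return 42

result = async_transact(insert_row).result()
print(result)  # 42
```

Code written against the synchronous mode would simply never call this, which is why the two styles can coexist.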

>Making synchronous sql queries in the
>twisted async model is unacceptable for anything serious.

Everyone is welcome to think that Axiom is not very serious.  I am not a serious person.  Twisted, in fact, is not serious, as you yourself pointed out - Twisted.Quotes proves it.

>Infact even
>sqllite queries are obviously unacceptable once the db grows beyond the
>size of the cache

Only if you're not using an index.  An implication of the current axiom model is that you had better be damn sure that you've got indexes in the right place.
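To make the index point concrete, here is a small sqlite3 sketch (table and column names invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, owner TEXT, size INTEGER)")
conn.executemany("INSERT INTO item (owner, size) VALUES (?, ?)",
                 [("alice" if i % 2 else "bob", i) for i in range(1000)])

# Without this index the query below scans the whole table; with it,
# SQLite can jump straight to the matching rows.
conn.execute("CREATE INDEX item_owner_idx ON item (owner)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM item WHERE owner = ?",
    ("alice",)).fetchall()
print(plan)  # the plan should mention item_owner_idx

count = conn.execute(
    "SELECT COUNT(*) FROM item WHERE owner = ?", ("alice",)).fetchone()[0]
print(count)  # 500
```

Once the table no longer fits in cache, the difference between the indexed and unindexed plans is the difference between a few page reads and reading everything.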

>(and for sure you can't scale the queries over
>different servers to have more ram-cache when using sqllite).

Who says your application has to scale by doing multi-machine queries within a single database?  Google's search team doesn't (at least according to the papers they've published), and I think they know something about scale.  You could apply their same general technique, or the one Netezza uses, to Axiom: either (Google-style) segment your application data into logical groups, and have high-level queries only talk to appropriate nodes, or (Netezza-style) make null queries really fast (Netezza has crazy stuff for this, I think, but Axiom would just use indexes), then run every query on every node in parallel and return results to an aggregating node.  It currently requires extra work, but in our application at least, you rarely want to query the whole universe.  At some point I imagine we will add support for that.
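Both shapes described above can be sketched in plain Python; the "nodes" here are just dicts standing in for separate per-machine databases, and the routing rule is invented purely for illustration (none of this is Axiom API):

```python
# Toy scatter/gather sketch of the two scaling strategies described above.
NODE_COUNT = 3
nodes = [dict() for _ in range(NODE_COUNT)]

def node_for(key):
    # Deterministic segmentation: each key belongs to exactly one node.
    return nodes[sum(map(ord, key)) % NODE_COUNT]

def insert(key, value):
    node_for(key)[key] = value

def point_query(key):
    # Segmented style: only the node that owns the key is consulted.
    return node_for(key).get(key)

def scatter_query(predicate):
    # Scatter/gather style: run the query on every node, then aggregate.
    return sorted(v for node in nodes
                    for k, v in node.items() if predicate(k, v))

for k, v in [("alice", 3), ("bob", 1), ("carol", 7), ("dave", 2), ("erin", 5)]:
    insert(k, v)

print(point_query("carol"))               # 7
print(scatter_query(lambda k, v: v > 2))  # [3, 5, 7]
```

The point is that the aggregation logic lives in the application, not in one giant database that is somehow expected to parallelize itself.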

Anyway, I hope that indicates that I have considered the issue of scale a little bit.  Right now scale is not my biggest concern but I am confident we can handle it.  If Divmod were to have a potential customer approach us and say, "we want Axiom to scale to a hundred million node cluster, and we have a very complex application, and we want to get to that scale within a month.  can we do that?"  I'd say no.

That doesn't mean it will never scale.  If a potential client were instead to say, "we have a six month timeframe, and such-and-such budget, can we launch something and eventually scale to a billion users with axiom?"  I'd say yes, probably.  Depends on our allocation in the budget, of course ;-).  The application would have to be aware of scaling issues in its own code.

It turns out that the application always has to be anyway.  The "sufficiently parallel cluster" of RDBMS machines is like the "sufficiently smart compiler" that LISP people talk about.  The existence of projects like memcached indicates that there is a general problem with the idea that you can just use one giant database and scale it up and up and up.

>Ironically axiom current api would have a chance to work well with
>threads, with twisted single threaded async model not.

Database-managed concurrency is not the same thing as shared-state threading.  You might superficially implement database-managed concurrency with shared-state threads for convenience, but the whole programming model is different - most importantly, you never touch locking from application code.  I need to write a blog post about that or something, but I doubt I will do it justice.  There are easily 3 CS Ph.D. dissertations in that topic and I am not the person to write them.

In fact, you can use the current Axiom API with threads, mostly, and it works about as well as most other Python ORMs.  There are some concurrency issues (also present in several other systems) which I'd like to fix before that is a suggested use though.

>There are good python storage packages to use with twisted and threads
           ^
That word right there is debatable.  I've used, and even written, a few of those and I'm not happy with them.  Again - for a particular application.  Divmod's application is very ambitious and it is not clear that Axiom is the best possible approach for it.  But it seems to be working out OK.

>I can't imagine why you insist on making your inferior
>solution with a design that can't work well with twisted.

Your point: Axiom does not work well with Twisted.

Your evidence: you do not think Axiom works well with Twisted.

This is a rhetorical fallacy.  It is called a "circular" argument.

I have a favorite rhetorical fallacy too, but it's not this one.

My point: Axiom works great with Twisted!

My evidence: There are about 30 people in #divmod who think Axiom works great with Twisted.

This rhetorical fallacy is called an "ad populum" argument, and it's still wrong, but it has a bit more heft to it.

>I'm feeling guilty for risking hitting the harddisk for a few msec when

Hard disk?  You mean "filesystem", surely.  Linux decides to put things which are in RAM onto disk and which are on disk into RAM all the time.

>people clicks on the mailing list archives, and infact I keep two
>webservers exactly to avoid hurting the scalability of the ssl one.

Aah.  And how do you do that?  Inter-process communication.

Divmod does have problems that require extremely low-latency response and concurrency, but it turns out that these are the exception, not the rule.  Allen Short is currently putting Voice-over-IP audio playback into a subprocess so that performance does not suffer from delays which are perfectly acceptable for the interactive web app (everybody has to hit the database to display these web pages anyway, and there is only one disk, so the performance is not going to change if it's in parallel) but are excruciatingly long for delays between sound samples.

We also have plans to scale our service up amongst large groups of commodity machines, with separate, small axiom databases running on each one.  Axiom databases do scale up in size better than you have suggested (I have tested very responsive query and insert performance up to ~5G databases so far, and there is no indication it would slow down significantly anywhere up to a terabyte) but you are definitely not going to be able to run a million-subscriber service out of a single Axiom database.

You make spawning a second webserver sound like a really serious problem.  It's not.  When your application needs parallelism, to maximize utilization, spawn a process.  Sometimes it's OK to block.

At the beginning of the project, I thought very much as you are suggesting: absolutely terrified of blocking for any reason, reasoning about what the kernel would do and what my program would do, but without any solid performance numbers.  I got over it, wrote some simple code that stored and retrieved objects with SQLite, then did some basic measurements and discovered that it was actually adequately fast.
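That "measure before you panic" step is easy to reproduce today; a minimal sketch (schema invented for illustration, and your numbers will of course vary with hardware and disk):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obj (id INTEGER PRIMARY KEY, payload TEXT)")

start = time.perf_counter()
with conn:  # one transaction around all the inserts
    conn.executemany("INSERT INTO obj (payload) VALUES (?)",
                     [("x" * 64,) for _ in range(10_000)])
elapsed = time.perf_counter() - start

count = conn.execute("SELECT COUNT(*) FROM obj").fetchone()[0]
print(count)              # 10000
print(f"{elapsed:.3f}s")  # typically a small fraction of a second in memory
```

Batching the inserts into a single transaction matters enormously here; one transaction per insert is the usual way people convince themselves SQLite is slow.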

>> Perhaps instead you could change the version
>> from SVN-Trunk to 'HG-CPUShare', so that the CPUShare-ness of the code is
>
>Ok, I'll make this change right away. I already did that for the web2 side.

and thanks again for that.

>This is a very fair requirement (changing the version is trivial).
>However I don't see much point in changing the commands if the module
>name is the same. Either I change both, or none.

Definitely the version is the most important thing.  I suggested the command-names because that way pasted shell output without tracebacks would also be visibly identifiable without having to say 'please run xxx --version'.  If you don't think that would be appropriate, I don't mind.



