[Twisted-Python] TimAllen changed [#3956 - Add arraysize option to runQuery in adbapi]

Gerrat Rickert grickert at coldstorage.com
Mon Dec 7 11:47:55 EST 2009


> 1. About `t.e.adbapi.Transaction.cursor`: It seems that the only
>read/write attribute on a cursor object is `arraysize`, and it seems
>clunky to mess with `Transaction`'s public interface just for that. How
>about leaving the instance variable as `_cursor` and adding
>`getArraySize`/`setArraySize` methods? (a property would be even better,
>but that requires a new-style class)
>
>I'm kind of ambivalent about the whole approach of this patch, really:
>the only method that `runQuery()` ever calls on the cursor is
>`fetchall()`, which DBAPI-2.0 describes with "Note that the cursor's
>`arraysize` attribute can affect the performance of this operation."
>Presumably in sensible DBAPI modules, `fetchall()` will read chunks as
>large as possible rather than limiting itself to `arraysize`, but Gerrat
>appears to have found a module that needlessly limits itself, so some
>configuration is needed.
>
>However, do we really need to set a separate `arraysize` for every
>query? Considering we always call `fetchall()`, presumably we'll always
>want to use whatever `arraysize` makes `fetchall()` fastest. I doubt
>there's a value that would work for every DBAPI module in every Twisted
>installation all over the world, but it seems sensible that every query
>in a given `ConnectionPool` would want to use the same `arraysize`. How
>about adding a `cp_arraysize` keyword parameter to `ConnectionPool`, and
>applying that setting in `_runInteraction()`? It's a pretty easy way to
>configure `arraysize`, it has no backwards compatibility problems, and
>it shouldn't be too hard for each Twisted user to find a value that
>improves things overall in their environment.
>
>Of course, when you're pushing for performance there's always
>exceptions, and some users might need to set `arraysize` differently for
>different queries, and maybe use other calls than `fetchall()`. They can
>continue using `runInteraction()` as they presumably already do.

Sorry I've dropped the ball on this whole request that I initiated.  I'm
kind of swamped at work this time of year (I should have more time
starting mid-Jan.)  I have a minute to weigh in though.

As for the exact details on how this is implemented, I don't have any
strong preferences.  I like the suggestion to add the `cp_arraysize`
parameter to `ConnectionPool` and apply it in `_runInteraction()`...wish
I'd thought of it.
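
A rough sketch of how that suggestion could look, under loud assumptions:
this is not Twisted's code. `run_interaction` and its `arraysize` argument
are hypothetical stand-ins for `ConnectionPool._runInteraction()` and the
proposed `cp_arraysize`, and the standard sqlite3 module is used only so
the example runs on its own.

```python
import sqlite3


def run_interaction(conn, interaction, arraysize=None):
    """Hypothetical stand-in for _runInteraction(): apply one pool-wide
    arraysize to the cursor before handing it to the interaction."""
    cursor = conn.cursor()
    try:
        if arraysize is not None:
            # DB-API 2.0 defines arraysize as a read/write cursor attribute.
            cursor.arraysize = arraysize
        result = interaction(cursor)
        conn.commit()
        return result
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()


def run_query(cursor, sql):
    # Per the discussion, runQuery() only executes and then calls fetchall().
    cursor.execute(sql)
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

# Every query run this way shares the same arraysize setting.
rows = run_interaction(
    conn, lambda cur: run_query(cur, "SELECT n FROM t ORDER BY n"), arraysize=500
)
```

Callers of a `runQuery()`-style helper would not change at all; only the
one setting supplied when the pool is created would differ per
installation.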

I also agree that it's unfortunate that the performance of `fetchall()`
is impacted by the `arraysize` attribute; but I think that if it's a
deficiency, the issue is with the DBAPI-2.0 specification, not with this
implementation.  (It would be kind of pointless mentioning that
`arraysize` could affect the performance of this method, if this method
ignored `arraysize`.)
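
To make the spec's wording concrete: DBAPI-2.0 ties `arraysize` directly
to `fetchmany()` (it is the default batch size) and only notes that it can
affect `fetchall()`. A small sketch with the standard sqlite3 module,
chosen only because it ships with Python; the table `t` is made up, and
this says nothing about how any other DBAPI module behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

cur = conn.cursor()
cur.arraysize = 4
cur.execute("SELECT n FROM t ORDER BY n")

first_batch = cur.fetchmany()  # no size given, so arraysize (4) is used
remaining = cur.fetchall()     # all rows still pending, whatever arraysize is
```

Whether `fetchall()` also uses `arraysize` internally as a per-round-trip
chunk size is left to each module, which is why a tunable setting helps.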

Regards,
	Gerrat


