Ticket #2315 (closed enhancement: fixed)

Opened 4 years ago

Last modified 3 years ago

Request to add int8 string support to basic protocols

Reported by: metcalfc Owned by: therve
Priority: highest Milestone:
Component: core Keywords:
Cc: therve, exarkun Branch:
Author: Launchpad Bug:

Description

It would be helpful if support for an int8 string receiver were added to the basic protocol suite. There already is support for int16 and int32. It would make sense to cover the three basic int types.

I'm including the source for an int8 implementation.

class Int8StringReceiver(protocol.Protocol, _PauseableMixin):
    """A receiver for int8-prefixed strings.

    An int8 string is a string prefixed by 1 bytes, the 8-bit length of
    the string encoded in network byte order.

    This class publishes the same interface as NetstringReceiver.
    """

    recvd = ""

    def stringReceived(self, msg):
        """Override this.
        """
        raise NotImplementedError

    def dataReceived(self, recd):
        """Convert int8 prefixed strings into calls to stringReceived.
        """
        self.recvd = self.recvd + recd
        while len(self.recvd) > 0 and not self.paused:
            length = ord(self.recvd[0]) 
            if len(self.recvd) < length+1:
                break
            packet = self.recvd[1:length+1]
            self.recvd = self.recvd[length+1:]
            self.stringReceived(packet)

    def sendString(self, data):
        """Send an int8-prefixed string to the other end of the connection.
        """
        assert len(data) < 256, "message too long"
        self.transport.write(struct.pack("!h",len(data)) + data)

Attachments

Int8StringReceiver.py Download (1.2 KB) - added by metcalfc 4 years ago.
Implementation of an int8 string receiver
Int8StringReceiverTest.py Download (0.7 KB) - added by metcalfc 4 years ago.
A test case for the Int8 string receiver.

Change History

Changed 4 years ago by metcalfc

Implementation of an int8 string receiver

  Changed 4 years ago by glyph

  • owner changed from glyph to metcalfc

Thank you for your contribution, but this will also need test cases if it is to be accepted.

  Changed 4 years ago by metcalfc

I updated a test based on Int32. Attached.

Changed 4 years ago by metcalfc

A test case for the Int8 string receiver.

  Changed 4 years ago by therve

  • priority changed from normal to highest
  • owner metcalfc deleted
  • keywords review added

  Changed 4 years ago by therve

  • cc therve added
  • keywords review removed
  • owner set to therve

  Changed 4 years ago by therve

  • keywords review added
  • owner therve deleted

Ready to review in int8-2315.

I generalized the problem with creating IntNStringReceiver. I also add tests to Int16 in the process, and some docs.

follow-up: ↓ 7   Changed 4 years ago by exarkun

  • owner set to therve
  • keywords review removed
  • illegal_strings and partial_strings should be adjusted to fit the naming convention.
  • All of the implementations of test_partial, test_receive, and test_tooLongSend are almost identical. It'd be good to factor them into a mixin and use instance attributes overridden on several subclasses for the parts of them which differ.
  • It is probably a defect that sendString raises AssertionError. We should have a custom exception for this case. For compatibility it might need to subclass AssertionError. And I have no idea how we deprecate the ability to catch it with 'except AssertionError:'. Maybe this is a separate ticket, but it'd be good for the new code not to be introduced with this mistake.
  • IntNStringReceiver.dataReceived still talks about int32
  • 'i', 'h', and 'b' are all signed variations of their respective types. There should be tests for longer values, I guess. I guess no one noticed the existing bug in Int32StringReceiver (which used 'i' before this branch) since hardly anyone sends around 2GB strings all at once.

in reply to: ↑ 6 ; follow-up: ↓ 9   Changed 4 years ago by therve

Replying to exarkun:

* illegal_strings and partial_strings should be adjusted to fit the naming convention.

OK done.

* All of the implementations of test_partial, test_receive, and test_tooLongSend are almost identical. It'd be good to factor them into a mixin and use instance attributes overridden on several subclasses for the parts of them which differ.

That seems pretty good.

* It is probably a defect that sendString raises AssertionError. We should have a custom exception for this case. For compatibility it might need to subclass AssertionError. And I have no idea how we deprecate the ability to catch it with 'except AssertionError:'. Maybe this is a separate ticket, but it'd be good for the new code not to be introduced with this mistake.

I've made the new error class, I'm not really sure how to deprecate AssertionError. Semantically it doesn't seem so wrong...

* IntNStringReceiver.dataReceived still talks about int32

OK removed.

* 'i', 'h', and 'b' are all signed variations of their respective types. There should be tests for longer values, I guess. I guess no one noticed the existing bug in Int32StringReceiver (which used 'i' before this branch) since hardly anyone sends around 2GB strings all at once.

We could actually create a Int64 class... Thoughts ?

  Changed 4 years ago by therve

  • keywords review added
  • owner therve deleted

in reply to: ↑ 7   Changed 4 years ago by exarkun

  • owner set to therve
  • keywords review removed

Replying to therve:

[snip] We could actually create a Int64 class... Thoughts ?

We could, but I was only suggesting changing the structFormat strings to use "I", "H", and "B". :)

  Changed 4 years ago by therve

  • owner changed from therve to exarkun
  • keywords review added

Right :).

  Changed 3 years ago by glyph

  • keywords review removed
  • owner changed from exarkun to therve

Woah, woah, woah. Major problem: "!I" is not the same thing as "I". One of the "illegal" strings in Int32TestCase got changed by this patch, and it got changed to a perfectly valid string. Perhaps this interactive session will clearly explain what I mean:

glyph@legion>>> struct.unpack("!I6s", "\x00\x00\x00\x10aaaaaa")
(16L, 'aaaaaa')
glyph@legion>>> struct.unpack("I6s", "\x00\x00\x00\x10aaaaaa")
(268435456L, 'aaaaaa')

I noticed this just by inspecting the code, but it's also helpful to note that half of the AMP tests fail and they hang halfway through.

More minor things I noticed before realizing this:

I think the exception should be called StringTooLongError; TooLongStringError just sounds backwards to me (and the docstring implies that is the correct word ordering).

As long as you're modifying sendString's docstring, add an @type for "data" to indicate it is a str. The fewer places we have ambiguity about unicode, the better :-).

  Changed 3 years ago by therve

  • keywords review added
  • owner therve deleted

OK sorry about that :).

  Changed 3 years ago by exarkun

  • keywords review removed
  • owner set to therve
  • Typos in IntNStringReceiver class docstring
    • @ivar recv:
    • length prefixes protocol
  • Typo in sendString - maxixum
  • test_tooLongSend is catching OverflowError, but I think it means MemoryError. It probably shouldn't catch anything at all though. I'd rather see the MemoryError in the test output than have it appear as though the test passed.

Int8StringReceiver's class docstring talks about an 8-bit length in network byte order. I don't think it makes sense to talk about the byte order of 8 bits worth of data.

A bunch of new classes in test_protocol don't have docstrings and should.

This time through, I notice that the previous implementations of these classes were pretty inefficient, and that has basically been preserved in this version. There should be another ticket for removing the O(n**2) complexity from dataReceived.

  Changed 3 years ago by exarkun

Urr, also, several tests from twisted.test.test_stateful fail.

  Changed 3 years ago by therve

  • owner changed from therve to exarkun
  • keywords review added

That looks better now. test_tooLongSend really means OverFlowError, because it builds an int that doesn't fit in 32 bits. However, the test should pass on a 64 bits machine (I didn't check). Instead I raised a Skip here.

I'm not sure about the complexity. You probably mean that we should hold data received in a list instead of a string ?

  Changed 3 years ago by exarkun

  • owner changed from exarkun to therve
  • keywords review removed

For test_tooLongSend, even on a 64 bit machine, the test will need to allocate 4 gigs of RAM. That's not likely to succeed on any boxes we have access to :) The best suggestion I have for what to do about this is to try and test the base functionality provided by IntNStringReceiver more directly so that the specific subclasses don't need as much testing. For example, IntNStringReceiver could be instantiated and have prefixLength and structFormat set on it directly (by the way, I wonder if having prefixLength separately defined at all is a good idea - it must always be the same as struct.calcsize(structFormat)). Then we know the '!I' case works since it is a data-driven specialization of the general-case code which the tests demonstrate works.

A related point is that none of the length-specific test subclasses actually verifies the behavior is correct for that length. If I change Int32StringReceiver's prefixLength and structFormat to 2 and '!H' respectively, all except one of its tests still pass. This doesn't strike me as ideal. Having a couple length-specific tests that actually test again length-specific would probably be useful.

I'm not sure if I've expressed these ideas very clearly (I don't think I've been doing well at that lately). If I haven't, I'd be happy to make the further changes I think would be beneficial (I'd rather be able to express the ideas clearly, but just writing the code is easier).

Regarding the complexity, yes, that's what I meant.

  Changed 3 years ago by therve

  • cc exarkun added
  • owner therve deleted
  • keywords review added

OK, I've made some adjustements. For test_tooLongSend, I removed it from the 32 bits case. Your suggestion looked like it was already what done for Int16/Int8 so I haven't been too imaginative here.

I've tried to use calcsize, but I've been hit by the lack of a __init__ method in the protocol. The other solution is to use a metaclass and I'm a bit reluctant to do that.

I've added some specific tests to check the data received and send.

I've created #2611 for the complexity.

  Changed 3 years ago by exarkun

  • owner set to therve
  • keywords review removed

I can understand wanting to avoid a metaclass :)

I like the change to use calcsize in the subclasses, anyway.

What about also removing prefixLength and structFormat from the base class, so that any subclass will fail rapidly with an AttributeError if subclasses's forget to do something (instead of bogusly comparing an integer against None)?

recvd is still spelled wrong in the IntNStringReceiver class docstring.

  Changed 3 years ago by therve

  • owner therve deleted
  • keywords review added

Ups forgot of the doc typo. I removed the default attributes, that looks good now.

  Changed 3 years ago by exarkun

  • keywords review removed
  • owner set to therve

Looks good, please merge.

  Changed 3 years ago by therve

  • status changed from new to closed
  • resolution set to fixed

(In [20308]) Merge int8-2315

Author: therve Reviewer: exarkun Fixes #2315

Add Int8StringReceiver to the list of basic protocols, and refactor Int16StringReceiver and Int32StringReceiver for not having duplicate code. Also add tests to Int16StringReceiver.

Note: See TracTickets for help on using tickets.