[Twisted-Python] Supporting a two-part client protocol.

Colin Dunklau colin.dunklau at gmail.com
Tue Feb 4 15:49:50 MST 2020


On Tue, Feb 4, 2020 at 6:12 PM Go Luhng <goluhng at gmail.com> wrote:
>
> Thanks Colin and Barry for the reply. I read the sans-io docs and it
> is an attractive approach.
>
> I believe I have a plan going forward, but I'm not sure what you mean
> by explicit vs implicit state machine, if you care to elaborate.

IntNStringReceiver has a state machine, but it's embedded in the
protocol implementation, so it's implicit:
https://github.com/twisted/twisted/blob/twisted-19.10.0/src/twisted/protocols/basic.py#L682
It's not that easy to tell what's going on there, at first glance. The
dataReceived method has _most_ of the state machine implementation,
but it fiddles with instance attributes, and that length check in
sendString could be considered a parser detail, rather than part of
the protocol itself.
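
To make the implicit/explicit distinction a bit more concrete, here is
a rough sketch of what an explicit parser for length-prefixed strings
might look like. This is not how IntNStringReceiver is written: the
names Int32StringParser, StringTooLong, readData and iterStrings are
made up for illustration, and it assumes a big-endian unsigned 32-bit
prefix. The whole state is the buffer plus _expected, which says
whether the parser is waiting for a prefix or for a body:

```python
import struct


class StringTooLong(Exception):
    """Raised when a length prefix announces more than max_length bytes."""


class Int32StringParser(object):
    # Hypothetical class for illustration; not part of Twisted.
    # Assumption: big-endian unsigned 32-bit length prefix.
    prefix_format = '!I'
    prefix_length = struct.calcsize(prefix_format)

    def __init__(self, max_length):
        self.max_length = max_length
        self._buffer = b''
        # Explicit state: None means "waiting for a length prefix",
        # an int means "waiting for that many bytes of body".
        self._expected = None

    def readData(self, data):
        # Input only: no parsing and no I/O happens here.
        self._buffer += data

    def iterStrings(self):
        while True:
            if self._expected is None:
                if len(self._buffer) < self.prefix_length:
                    return  # not enough data for a prefix yet
                prefix = self._buffer[:self.prefix_length]
                (length,) = struct.unpack(self.prefix_format, prefix)
                if length > self.max_length:
                    raise StringTooLong(length)
                self._buffer = self._buffer[self.prefix_length:]
                self._expected = length
            if len(self._buffer) < self._expected:
                return  # body incomplete, wait for more data
            string = self._buffer[:self._expected]
            self._buffer = self._buffer[self._expected:]
            self._expected = None
            yield string
```

A protocol's dataReceived would then only need to call readData and
loop over iterStrings, much like the line-based example further down.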

The situation with LineOnlyReceiver is similar:
https://github.com/twisted/twisted/blob/twisted-19.10.0/src/twisted/protocols/basic.py#L421
Now that one is simple enough that it's reasonably clear what's going
on, which makes it a good candidate for a simple example (analysis
first, code after).

The rewritten version is clearly more code, but its clearer separation
of concerns is a boon... especially given that this is a reeeeal
simple example dealing with one of the simplest possible protocols.
Your protocol will undoubtedly be much more complex, so the benefit
should be a lot more obvious there.

In the original, the parsing details are mixed in with the
higher-level semantics of the protocol, especially with respect to the
max line length handling. In the "composed" version (admittedly not
the best name), the parser is explicit, and entirely divorced from the
protocol. It's easier to understand, simpler (even trivial) to test in
isolation, and winds up being reusable outside of a Twisted Protocol.
Hey, this is starting to sound like that sans-io thingie!

To map LineParser's semantics to sans-io terminology, readData is for
getting "input", and iterLines (actually the generator iterator it
makes) produces "events": a "line event", or a "line too darn long"
event (via the exception).

Link for easier viewing
(https://gist.github.com/cdunklau/4f8c72222295680ca20e3d4401f385b1),
reproduced here for list archive posterity:

    import collections

    from twisted.internet import protocol


    class LineParser(object):
        def __init__(self, delimiter, max_length):
            self.delimiter = delimiter
            self.max_length = max_length
            self._buffer = b''
            self._lines = collections.deque()

        def readData(self, data):
            lines = (self._buffer + data).split(self.delimiter)
            self._buffer = lines.pop()
            self._lines.extend(lines)

        def iterLines(self):
            while self._lines:
                line = self._lines.popleft()
                if len(line) > self.max_length:
                    raise LineLengthExceeded(line)
                yield line
            if len(self._buffer) > self.max_length:
                raise LineLengthExceeded(self._buffer)


    class LineLengthExceeded(Exception):
        def __init__(self, culprit):
            super().__init__(culprit)
            self.culprit = culprit


    class ComposedLineOnlyReceiver(protocol.Protocol):
        delimiter = b'\r\n'
        MAX_LENGTH = 16384
        _parser = None

        def dataReceived(self, data):
            """
            Translates bytes into lines, and calls lineReceived.
            """
            if self._parser is None:
                self._parser = LineParser(self.delimiter, self.MAX_LENGTH)

            self._parser.readData(data)
            try:
                for line in self._parser.iterLines():
                    if self.transport.disconnecting:
                        # This is necessary because the transport may be
                        # told to lose the connection by a line within a
                        # larger packet, and it is important to disregard
                        # all the lines in that packet following the one
                        # that told it to close.
                        return
                    self.lineReceived(line)
            except LineLengthExceeded as e:
                return self.lineLengthExceeded(e.culprit)

        def lineReceived(self, line):
            """
            Override this for when each line is received.
            @param line: The line which was received with the delimiter removed.
            @type line: C{bytes}
            """
            raise NotImplementedError

        def sendLine(self, line):
            return self.transport.writeSequence((line, self.delimiter))

        def lineLengthExceeded(self, line):
            return self.transport.loseConnection()
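
As for "trivial to test in isolation", here is a small sketch of what
that could look like with no reactor and no transport involved.
LineParser and LineLengthExceeded are repeated from above only so the
snippet runs on its own; collectEvents is a hypothetical helper
invented for this example, and the 8-byte limit is arbitrary:

```python
import collections


class LineLengthExceeded(Exception):
    def __init__(self, culprit):
        super().__init__(culprit)
        self.culprit = culprit


class LineParser(object):
    # Copied from the example above so this snippet is self-contained.
    def __init__(self, delimiter, max_length):
        self.delimiter = delimiter
        self.max_length = max_length
        self._buffer = b''
        self._lines = collections.deque()

    def readData(self, data):
        lines = (self._buffer + data).split(self.delimiter)
        self._buffer = lines.pop()
        self._lines.extend(lines)

    def iterLines(self):
        while self._lines:
            line = self._lines.popleft()
            if len(line) > self.max_length:
                raise LineLengthExceeded(line)
            yield line
        if len(self._buffer) > self.max_length:
            raise LineLengthExceeded(self._buffer)


def collectEvents(parser):
    """
    Hypothetical test helper: drain the parser and return a tuple of
    (lines, culprit), where culprit is None unless a line was too long.
    """
    lines = []
    try:
        for line in parser.iterLines():
            lines.append(line)
    except LineLengthExceeded as e:
        return lines, e.culprit
    return lines, None


parser = LineParser(b'\r\n', 8)

# A partial line stays buffered until its delimiter arrives.
parser.readData(b'one\r\ntw')
assert collectEvents(parser) == ([b'one'], None)
parser.readData(b'o\r\nthree\r\n')
assert collectEvents(parser) == ([b'two', b'three'], None)

# Too much data without a delimiter produces the "too long" event.
parser.readData(b'way too long, no delimiter yet')
assert collectEvents(parser) == ([], b'way too long, no delimiter yet')
```

The same checks would fit in a plain unittest.TestCase; nothing here
needs trial or a fake transport.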


