Opened 6 years ago
Last modified 6 years ago
#5565 enhancement assigned
Bring eventlet back home
Reported by: | oubiwann | Owned by: | oubiwann |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | core | Keywords: | eventlet, actor, concurrency |
Cc: | Glyph, Donovan Preston, Jean-Paul Calderone, jesstess, David Reid, Tristan Seligmann | Branch: | branches/actor-model-coroutines-5565 |
Author: | oubiwann |
Description
Eventlet was started by Bob Ippolito and Donovan Preston. Donovan recently mentioned that his whole initial interest in working on eventlet was digging deeply into Python coroutines, implementing an event loop for them, and exploring lots of edge cases.
At the TSF dinner at PyCon 2012, Donovan and Glyph talked about this and about Donovan's interest in bringing the lessons learned from eventlet back into Twisted. These lessons, though, take the form of the Actor model (in concurrent programming literature, research, and implementations) and the related Address (receiver, mailbox) model.
In other words, asynchronous message passing.
This ticket aims to flesh out and land Donovan's exploratory code as a new feature in Twisted.
Attachments (1)
Change History (16)
comment:1 Changed 6 years ago by
comment:2 Changed 6 years ago by
Branch: | → branches/5565-actor-model |
---|---|
Cc: | dash David Reid added |
Status: | new → assigned |
comment:3 Changed 6 years ago by
Author: | → oubiwann |
---|---|
Branch: | branches/5565-actor-model → branches/actor-model-coroutines-5565 |
(In [33971]) Created the starting branch for this work.
comment:4 Changed 6 years ago by
Cc: | dash removed |
---|
comment:5 Changed 6 years ago by
Some chatter today on #twisted regarding this feature:
5:48:56 PM jonathanj: since #5565 contains little to no actual information, what's the advantage of such a model? 5:49:29 PM itamar: actors are what erlang uses 5:50:02 PM itamar: and so probably they have some explanations somewhere 5:52:19 PM jonathanj: i understand the model 5:52:45 PM itamar: sadly I haven't ever used it, so I can't say why or if it's better 5:52:48 PM jonathanj: but Erlang didn't just throw Actors in and have something cool, the whole language is modelled around a different thing than what people normally expect from concurenncy 5:52:52 PM itamar: dreid: thoughts? 5:53:03 PM itamar: jonathanj: well, people have been using actors e.g. with java 5:53:05 PM dreid: Actors are great. 5:53:18 PM itamar: because? 5:56:17 PM dash: applying the term 'actors' to a concurrency model in a programming language is about as accurate as using 'lambda' to describe python functions 5:56:35 PM dash: it's an abstract model of computation, that has some vague resemblance to programming language behaviours 5:58:13 PM dreid: itamar: Lots of synchronous programs which communicate via well defined interfaces. 5:59:55 PM dash: the original description of the actor model was entirely asynchronous 6:00:10 PM glyph: jonathanj: the advantage of such a model is that people think that in order to have "easy" concurrency (gevent, eventlet) they need to throw out the baby with the bathwater 6:00:12 PM dash: just messages and more messages 6:00:27 PM glyph: jonathanj: i.e. 
to do all their socket I/O incorrectly and muddle together their transports and their protocols et cetera et cetera, and generally not use twisted 6:00:54 PM glyph: jonathanj: just as Twisted has support for heavyweight OS threads, we should have support for microthreads, so that people who want to use them can still live happily in the greater Twisted metropolitan area 6:01:11 PM glyph: jonathanj: Actors are basically a slightly less gross version of microthreads 6:01:30 PM dreid: dash: practical actors also encapsulate some state though. 6:01:31 PM glyph: if you write things in an actor-y way you might even have a chance of doing it correctly, whereas if you do it in the usual gevent idiom you are probably just screwed 6:01:36 PM jonathanj: glyph: thank you for writing that 6:02:08 PM jonathanj: glyph: it would be nice to see some more prose on the topic before the feature lands on trunk 6:02:22 PM glyph: jonathanj: my 'deferToCoroutine' comment was supposed to be a pithier version of that 6:03:02 PM itamar: jonathanj: ideally the branch would include a howto 6:03:07 PM dreid: Actors don't care about your concurrency model. 6:03:08 PM itamar: otherwise maybe it shouldn't pass review 6:03:13 PM itamar: *waves at oubiwann* 6:03:25 PM dreid: Which is part of what makes them great. 6:04:53 PM dreid: Depending on the system you're building your actors may be in the same process, a different process on the same machine, or another machine across the globe. And the user code should not change to support this. And in the same process they can be cooperatively scheduled, or preemptively scheduled across different os threads and the user code should not change to support this. 6:05:07 PM dash: dreid: yay, locality 6:05:48 PM dreid: Joe Armstrong would say that erlang's "processes" are not a concurrency model they're an object model. 6:06:09 PM dreid: He'd also say that the mistake smalltak made was assuming all messages needed a response. 
6:06:22 PM dash: dreid: Kay would partially agree, I think 6:06:31 PM dash: (E fixes this, FWIW) 6:07:05 PM dreid: dash: probably. Erlang also fixes this. 6:07:38 PM itamar: oubiwann: and a howto, too, I hope? 6:08:48 PM oubiwann: itamar: a howto will be critical; I don't want to land the branch without one 6:09:43 PM itamar: awesome 6:15:35 PM oubiwann: jonathanj: also, there are going to be copious blog posts during all this work 6:15:45 PM oubiwann: so there's going to be lots and lots of info available 6:16:26 PM oubiwann: with a howto providing a solid intro and how to use t.p.coroutines effectively in various cases
comment:6 Changed 6 years ago by
Thoughts on where the code should live:
4:20:04 PM oubiwann: I'd like to see what you might want to name Donovan's actor-model concurrency project 4:20:19 PM oubiwann: and where it might want to live in twisted.* 4:20:46 PM glyph: I think I'm mostly serious about deferToCoroutine 4:21:33 PM glyph: so, maybe an actor/scheduler somewhere in twisted/python? 4:21:38 PM glyph: like ThreadPool? 6:03:33 PM glyph: see twisted.python.threadpool 6:03:50 PM glyph: that seems like about the same level the base actors stuff should be at 6:04:07 PM glyph: then deferToActor can be in twisted.internet somewhere (maybe twisted.internet.task, I dunno) due to the nastiness with the twisted.internet.defer circularity 6:06:03 PM oubiwann: okay, comparing t.p.threadpool and t.i.threads 6:06:06 PM oubiwann: I see your point now 6:11:25 PM glyph: cool
comment:7 Changed 6 years ago by
The comment above was pasted without any formatting. Also, there was a bunch of stuff thrown together. I'll split it into two topics (one in this comment, and another to follow). If I could delete that unformatted comment from trac, I would...
Discussion of motivation for this feature and general chatter about the actor pattern:
5:48:56 PM jonathanj: since #5565 contains little to no actual information, what's the advantage of such a model?
5:49:29 PM itamar: actors are what erlang uses
5:50:02 PM itamar: and so probably they have some explanations somewhere
5:52:19 PM jonathanj: i understand the model
5:52:45 PM itamar: sadly I haven't ever used it, so I can't say why or if it's better
5:52:48 PM jonathanj: but Erlang didn't just throw Actors in and have something cool, the whole language is modelled around a different thing than what people normally expect from concurrency
5:52:52 PM itamar: dreid: thoughts?
5:53:03 PM itamar: jonathanj: well, people have been using actors e.g. with java
5:53:05 PM dreid: Actors are great.
5:53:18 PM itamar: because?
5:58:13 PM dreid: itamar: Lots of synchronous programs which communicate via well defined interfaces.
5:59:55 PM dash: the original description of the actor model was entirely asynchronous
6:00:10 PM glyph: jonathanj: the advantage of such a model is that people think that in order to have "easy" concurrency (gevent, eventlet) they need to throw out the baby with the bathwater
6:00:12 PM dash: just messages and more messages
6:00:27 PM glyph: jonathanj: i.e. to do all their socket I/O incorrectly and muddle together their transports and their protocols et cetera et cetera, and generally not use twisted
6:00:54 PM glyph: jonathanj: just as Twisted has support for heavyweight OS threads, we should have support for microthreads, so that people who want to use them can still live happily in the greater Twisted metropolitan area
6:01:11 PM glyph: jonathanj: Actors are basically a slightly less gross version of microthreads
6:01:30 PM dreid: dash: practical actors also encapsulate some state though.
6:01:31 PM glyph: if you write things in an actor-y way you might even have a chance of doing it correctly, whereas if you do it in the usual gevent idiom you are probably just screwed
6:01:36 PM jonathanj: glyph: thank you for writing that
6:02:22 PM glyph: jonathanj: my 'deferToCoroutine' comment was supposed to be a pithier version of that
6:03:07 PM dreid: Actors don't care about your concurrency model.
6:03:25 PM dreid: Which is part of what makes them great.
6:04:53 PM dreid: Depending on the system you're building your actors may be in the same process, a different process on the same machine, or another machine across the globe. And the user code should not change to support this. And in the same process they can be cooperatively scheduled, or preemptively scheduled across different os threads and the user code should not change to support this.
6:05:07 PM dash: dreid: yay, locality
6:05:48 PM dreid: Joe Armstrong would say that erlang's "processes" are not a concurrency model they're an object model.
6:06:09 PM dreid: He'd also say that the mistake smalltalk made was assuming all messages needed a response.
6:06:22 PM dash: dreid: Kay would partially agree, I think
6:06:31 PM dash: (E fixes this, FWIW)
6:07:05 PM dreid: dash: probably. Erlang also fixes this.
comment:8 Changed 6 years ago by
Some more discussion on general requirements for the branch:
6:02:08 PM jonathanj: glyph: it would be nice to see some more prose on the topic before the feature lands on trunk
6:03:02 PM itamar: jonathanj: ideally the branch would include a howto
6:03:08 PM itamar: otherwise maybe it shouldn't pass review
6:03:13 PM itamar: *waves at oubiwann*
6:07:38 PM itamar: oubiwann: and a howto, too, I hope?
6:08:48 PM oubiwann: itamar: a howto will be critical; I don't want to land the branch without one
6:09:43 PM itamar: awesome
6:15:35 PM oubiwann: jonathanj: also, there are going to be copious blog posts during all this work
6:15:45 PM oubiwann: so there's going to be lots and lots of info available
6:16:26 PM oubiwann: with a howto providing a solid intro and how to use t.p.coroutines effectively in various cases
6:31:01 PM dreid: oubiwann: No docstrings, no tests, shouldn't be called coroutines. twisted.python.actors might be better.
6:31:19 PM glyph: oubiwann: tests first!!!
6:31:45 PM exarkun: It can't be tests first, because it's a copy of code fzZzy wrote ten years ago, right?
6:31:51 PM exarkun: (and it didn't have tests then either)
6:32:52 PM glyph: exarkun: it's a copy of an _idea_ he had ten years ago, based on an idea Alan Kay had thirty years before that
6:33:05 PM glyph: exarkun: but the original idea was smalltalk, which is where test-driven development was invented so QED or something
6:33:28 PM exarkun: I thought it was actually the code
6:34:23 PM glyph: the code was code he wrote last week.
comment:9 Changed 6 years ago by
FYI, I'm tracking work item planning and status here:
https://blueprints.launchpad.net/twisted/+spec/actor-model-implementation
comment:10 Changed 6 years ago by
dreid and oubiwann hosted an IRC meeting yesterday regarding initial thoughts on features, interfaces, etc.
Transcript available here:
Summary
- dreid discussed his experiences in Erlang
- dreid offered suggestions on defining the actor model interface in Twisted
- dreid expressed an interest in implementing actors on top of t.i.task.Cooperator
- oubiwann expressed concerns about potential overhead in using t.i.task.Cooperator
- oubiwann preferred the use of messageReceived (Twisted) vs. results sending a message back to the actor (Erlang, selective receive)
- ralphm noted the similarities of selective receive and how t.w.xish.util.EventDispatcher handles XMPP stanzas
- we briefly discussed interaction with Deferreds
- dreid said that we should support "links" which, in general, means that an actor needs to have a way of killing itself and if it raises an exception, it should die; additionally, when an actor dies, other actors need to get a message about that
- oubiwann downloaded a bunch of research on the actor model (many from the 70s, 2 from the 80s, and 2 from the 00s) and posted a link to the collection
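To make two of the points above concrete (delivery through a messageReceived callback, and "links" that notify other actors when one dies), here is a minimal sketch in plain Python. It is not the code in the branch and does not use any Twisted API; the names Actor, Recorder, link, die, run and demo are all hypothetical, and scheduling is a simple round-robin loop rather than t.i.task.Cooperator.

```python
from collections import deque


class Actor:
    """Hypothetical actor: private state, a mailbox, a messageReceived hook."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()
        self.links = []       # actors to notify when this one dies
        self.alive = True

    def link(self, other):
        self.links.append(other)

    def send(self, message):
        # Asynchronous: only enqueue; delivery happens later in run().
        if self.alive:
            self.mailbox.append(message)

    def messageReceived(self, message):
        raise NotImplementedError

    def die(self, reason):
        # A dead actor drops its pending mail and tells its links why it died.
        self.alive = False
        self.mailbox.clear()
        for other in self.links:
            other.send(("exit", self.name, reason))


def run(actors):
    """Deliver one message per live actor per pass until nothing is pending."""
    progressed = True
    while progressed:
        progressed = False
        for actor in actors:
            if actor.alive and actor.mailbox:
                progressed = True
                message = actor.mailbox.popleft()
                try:
                    actor.messageReceived(message)
                except Exception as exc:
                    # An uncaught exception kills the actor.
                    actor.die(repr(exc))


class Recorder(Actor):
    """Example actor: remembers what it receives, fails on "boom"."""

    def __init__(self, name):
        super().__init__(name)
        self.seen = []

    def messageReceived(self, message):
        if message == "boom":
            raise ValueError("bad message")
        self.seen.append(message)


def demo():
    worker = Recorder("worker")
    supervisor = Recorder("supervisor")
    worker.link(supervisor)
    worker.send("hello")
    worker.send("boom")
    worker.send("never delivered")
    run([worker, supervisor])
    return worker, supervisor
```

In demo(), the worker handles "hello", dies on "boom", and the supervisor receives an ("exit", "worker", reason) message; the third message is dropped along with the dead actor's mailbox. Whether a real implementation should drop or redirect pending mail is a design question this sketch does not try to answer.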
Links
- https://launchpad.net/corotwine
- http://twistedmatrix.com/documents/current/api/twisted.python.context.ContextTracker.html
- http://doc.pypy.org/en/latest/stackless.html
- https://github.com/boundary/scalang/
- http://pragprog.com/book/jaerlang/programming-erlang
- http://www.twistedmatrix.com/users/oubiwann/actorModel/papers/
- http://pre.aps.org/abstract/PRE/v82/i5/e056104
- http://www.amazon.com/dp/0671657135
- https://github.com/dreid/cotools
comment:11 Changed 6 years ago by
Cc: | Tristan Seligmann added |
---|
comment:12 follow-up: 13 Changed 6 years ago by
Don't forget that isolation between actors is an essential component of any actor system. This will be difficult to get right in Python. In python-actors and mailbox.py I simply forced messages between Actors to be serialized to JSON, and deserialized before delivery. This is gross, but it discourages casual sharing. Itamar suggested that some immutable data structures could be provided to use for message passing, and I had discussed with glyph the idea of a Unique wrapper that somehow gives up the local reference to some data when passing the message. Aliasing is always a problem though.
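For readers who have not seen the JSON trick, a rough sketch of the idea follows. It is not the actual python-actors or mailbox.py code; SerializingMailbox and its put/get methods are hypothetical names, and only the standard library json module is used.

```python
import json
from collections import deque


class SerializingMailbox:
    """Hypothetical mailbox that copies every message through JSON."""

    def __init__(self):
        self._queue = deque()

    def put(self, message):
        # json.dumps raises TypeError for objects JSON cannot represent,
        # so arbitrary live objects cannot be shared between actors.
        self._queue.append(json.dumps(message))

    def get(self):
        # The receiver gets a freshly built structure, never the sender's
        # own object, so later mutation by the sender is invisible.
        return json.loads(self._queue.popleft())


def demo():
    box = SerializingMailbox()
    original = {"job": 1, "items": [1, 2]}
    box.put(original)
    original["items"].append(3)   # sender mutates after sending
    return original, box.get()
```

The received message still has two items even though the sender appended a third after sending. The cost is the one already described as gross: every message pays for encoding and decoding, and JSON changes some types on the way through (tuples arrive as lists, non-string dictionary keys arrive as strings).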
deferToActor should probably use a process pool and a shared memory segment if possible to make the benefits even more obvious. This would allow code written using the new APIs to saturate all the cores on a machine, even in Python. I was thinking of a thread pool before, but I don't even see the point of that because of the GIL. Existing Twisted code could get immediate benefit by using this to offload any actual blocking computations onto other processes, without having to use any of the message passing or supervision features of actors. (Is there any sort of built-in process pool stuff in Twisted already?)
deferToActor should probably take a filename, and not a function. Otherwise, how does the function get into the other process? I have found in both python and js that having spawn take a filename instead of a function also helps to reinforce the idea of isolation between actors. With each actor in its own file, it feels obvious and natural that there's isolation. When passing a function to spawn, it's not as obvious. What happens to closed-over variables? Mutable closed-over variables?
comment:13 Changed 6 years ago by
Replying to fzzzy:
deferToActor should probably take a filename, and not a function. Otherwise, how does the function get into the other process? I have found in both python and js that having spawn take a filename instead of a function also helps to reinforce the idea of isolation between actors. With each actor in its own file, it feels obvious and natural that there's isolation. When passing a function to spawn, it's not as obvious. What happens to closed-over variables? Mutable closed-over variables?
I can see what you're saying here, but, augh, no. It should take a function, but a function with the verified property of actually existing as a bound name in a top-level module when you import it, and if the function lacks that property, it should give you a clear error message explaining what you have to do to fix it.
For example, look at picloud.com's documentation and imagine what they must be doing to support such a system.
Passing a filename means you can't deploy your Python code in unusual ways, like in a zipfile. And if you're going to be building a ginormous cluster of machines dedicated to running auto-deployed actor code, you will really want your stuff to be able to be in an optimized deployment container, not an exploded pile of stuff on the filesystem.
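The "verified property" mentioned above could be checked along these lines. This is only a sketch of one possible check, not anything that exists in Twisted; check_importable is a hypothetical helper, and it relies solely on the standard __module__ and __qualname__ attributes plus importlib.import_module.

```python
import importlib


def check_importable(func):
    """Return 'module:name' if func is bound to a top-level module name.

    Hypothetical helper: raises ValueError with a hint otherwise.
    """
    module_name = getattr(func, "__module__", None)
    qualname = getattr(func, "__qualname__", "")
    # Lambdas have the qualname "<lambda>"; nested functions and methods
    # have dotted qualnames such as "outer.<locals>.inner".
    if not module_name or "." in qualname or not qualname.isidentifier():
        raise ValueError(
            "%r is not a top-level function; define it with 'def' at module "
            "level so another process can import it by name" % (func,))
    module = importlib.import_module(module_name)
    # The name must still point at this exact object after import.
    if getattr(module, qualname, None) is not func:
        raise ValueError(
            "%s.%s does not refer to %r; bind the function to a module-level "
            "name" % (module_name, qualname, func))
    return "%s:%s" % (module_name, qualname)
```

Because the function is identified by module and name rather than by path, this works the same whether the module was loaded from a directory or from a zipfile, which is the deployment concern raised above. Closures and lambdas are rejected up front, which also sidesteps the question of what happens to closed-over variables.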
comment:14 Changed 6 years ago by
As a follow up to fzzzy's post in comment number 12 on this ticket, we had a chat on IRC. The full log is here:
Here's a summary:
- isolation is definitely where it's at
- the process pool is a good idea
- passing code to another node/actor could be done with AMP or using Erlang's protocol
- interoperability with other languages is a good thing
- maybe we can use therve's twotp to support this (see ticket #5569)
- we definitely want to be able to execute functions on remote nodes (hooray!)
- we also want higher-level code to wrap the actor model code, e.g., deployment infrastructure
- the majority of folks seem to agree that passing a function will be better than a filename, though fzzzy's sentiment is appreciated by all
Changed 6 years ago by
Attachment: | actor-model-twisted-irc-meeting-01.txt added |
---|
externally linked logfile
comment:15 Changed 6 years ago by
External links break (even if they're hosted on twistedmatrix.com). Trac supports attachments; please use those.
As I put it in the referenced conversation, "We have deferToThread, why not deferToCoroutine".
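To illustrate what that analogy might mean in practice, here is a toy sketch using plain generators. Nothing here is a Twisted API: Scheduler, defer_to_coroutine, count_to and demo are hypothetical names, and the result is handed to an ordinary callback where a real implementation would presumably fire a Deferred.

```python
from collections import deque


class Scheduler:
    """Hypothetical round-robin scheduler for generator-based coroutines."""

    def __init__(self):
        self._ready = deque()

    def defer_to_coroutine(self, generator, on_result):
        # Like deferToThread in spirit: hand the work off now and receive
        # the result later through a callback instead of blocking.
        self._ready.append((generator, on_result))

    def run(self):
        while self._ready:
            generator, on_result = self._ready.popleft()
            try:
                next(generator)            # run until the next yield
            except StopIteration as stop:
                on_result(stop.value)      # the generator's return value
            else:
                self._ready.append((generator, on_result))


def count_to(n):
    """Sum 1..n, yielding after each step so other coroutines get a turn."""
    total = 0
    for i in range(1, n + 1):
        total += i
        yield
    return total


def demo():
    results = []
    scheduler = Scheduler()
    scheduler.defer_to_coroutine(count_to(3), results.append)
    scheduler.defer_to_coroutine(count_to(1), results.append)
    scheduler.run()
    return results
```

demo() returns [1, 6]: the shorter coroutine finishes first even though it was submitted second, because the two are interleaved at each yield rather than run to completion in order.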