wiki:GoogleSOC

Version 23 (modified by itamar, 3 years ago) (diff)

python 3 readded

Google Summer of Code 2012

PyCon 2012 sprint  Entertaining metrics PyCon 2011 sprint

Getting in Touch

If you're interested in participating, please email itamar at itamarst dot org. (If you don't get a timely response, try the mailing list, my spam filter can be aggressive sometimes).

If applicable, you may also be interested in http://twistedmatrix.com/trac/wiki/WomenOutreach2012

The basic process:

  1. You get in touch, start getting familiar with the Twisted development process (see below).
  2. We create a project proposal together (see http://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2012/faqs#student_application_looks)
  3. You apply on the Google Summer of Code web page (https://www.google-melange.com/gsoc/homepage/google/gsoc2012) for the Python Software Foundation group, with the project proposal we created.
  4. Your project is (hopefully) approved - not all students will be accepted, unfortunately, depending on how many slots Google gives the project and how many applicants there are.

Project Ideas

These are just ideas, we're open to other ideas as well.

IPv6

IPv6: IPv6 is the new networking standard for the Internet; World IPv6 Day is happening soon (http://www.worldipv6day.org/). We are still missing SSL, UDP, and multicast IPv6 support, not to mention IPv6 DNS lookups. Add IPv6 support for these transports with a full API and test coverage for these additions. Add relevant examples and documentation.

Requirements: In addition to Python, a decent understanding of networking and socket APIs.

Debugging UI

Build a user interface (a GUI, or maybe even Web-based). The idea is to have a debugging user interface for running a Twisted process that shows existing listening ports and open connections, and allows you to view bytes flowing over the transports. This would look you look at a running Twisted program and see what was going on inside and what data was flowing through it. (Twisted currently includes a half-broken "gladereactor" that is a gtk-based half-finished implementation of this). You'll learn about Twisted's internals, and get to build an interesting user interface and present information in a useful way.

Requirements: In addition to Python, some knowledge of either GUI programming or Web programming.

Sphinx documentation

Finish converting our documentation from a custom format and set of tools that we maintain (Lore) to the Python community standard - RestructuredText (ReST) as used by http://sphinx.pocoo.org/. In particular, resolve all tickets in the Lore to Sphinx milestone and then help transition the release process and twistedmatrix.com to Sphinx by participating in a release. You'll learn about releasing software, documentation tools, and maybe some data transformation (from HTML to RestructuredText).

Requirements: In addition to knowledge of Python, knowledge of HTML and perhaps ReST would be useful.

Don't break in other locales

Twisted currently has some bugs when used in certain non-English-configuration ("locales"), e.g. French. For example, date formatting might be wrong, or uppercase/lowercase conversion might work unexpectedly. As a first step you'd want to create a custom locale for Linux that was malicious about everything (e.g. lower-casing returns random unexpected bytes, dates are funky, etc.). This would be used to find places where Twisted is making US-centric assumptions, but would actually be a useful project for the Linux community in general. Then, fix resulting bugs in Twisted. Some relevant background material: http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html

Requirements: This is probably easier to do on Linux than on Windows, so some basic familiarity with Linux, and a computer configured to use it, in addition to knowledge of Python.

Automatic Coding Standard Enforcement

Twisted applies certain naming and style standards to all contributed code. These are things like giving all classes and functions docstrings, documenting all parameters and instances attributes, having the correct amount of whitespace between definitions, etc. Currently, a human reviewer needs to check all of these things and point out violations to contributors. The development process would be sped up by a tool which can automatically make these simple, mechanical checks, freeing up human reviewer time to focus on more important aspects of proposed changes. Integration with the issue tracker, to automatically generate this feedback before a ticket is submitted for review, would also be a big win. For background about the Twisted development process, see ReviewProcess.

Requirements: Little Twisted-specific knowledge is necessary, but general Python knowledge, perhaps with prior parsing experience (or a willingness to learn about parsing).

Expanded Endpoints Support

Twisted recently added two new related high-level APIs, one for specifying what address servers should listen for connections on, the other for specifying what address a client should connect to. These APIs are known as "endpoints", mostly implemented in twisted.internet.endpoints. Only a few of the many kinds of addresses that Twisted supports have had endpoint support added to them so far. Additionally, there are some kinds of connections which it was very awkward to set up before the invention of the endpoints APIs, so there is no convenient way to set them up at all.

Some endpoints Twisted does have are for TCP (eg, when using the string syntax to create a TCP server, "tcp:port=80:interface=192.168.1.1"), SSL, and UNIX domain sockets. Some endpoints it doesn't have, and could be easily implemented by wrapping existing code: serial ports, standard I/O. A bit more difficult, but still mostly using existing code, would be endpoints for child processes and SSH connections. SOCKS proxies would require writing additional (but fairly simple) network code. Rather trickier would be support HTTPS proxies. Many more are possible.

For this project, add a lot of new endpoint implementations to Twisted, starting the easiest and moving on the more difficult ones. Some endpoint implementations will be simple wrappers around existing APIs. Others will involve some network programming, where there was no convenient existing API to provide the functionality at all.

Requirements: General Python knowledge, some networking knowledge would be helpful but that may be acquired over the course of the project.

Python 3 Preparation

In preparation for Python 3 support, there is a lot of cleanup work that needs to be done:

  • Remove use of old APIs that have more modern equivalents.
  • Add support code for features Python 3 is missing (execfile).
  • Get rid of deprecated code that is no longer supported.

A partial list can be found here: http://twistedmatrix.com/trac/query?status=assigned&status=new&status=reopened&group=status&milestone=Python-3.x

There's also additional development to be done, e.g. the FilePath path management library should probably get Unicode support in Python 2, so that it's easier to write code that works in both Python 2 and 3.

The goal is *not* to port Twisted to Python 3, but rather to make it easier to support Twisted on Python 3 later on.

Requirements: Good knowledge of Python 2, and ideally the differences with Python 3.

Other Tasks, Some of which Google may not like, but some of which may be workable

  • Finish half-done documentation: Merge existing documentation from elsewhere, e.g. Conch in 60 seconds, or half-finished documentation that is in the tracker (there's the start of an IMAP howto, for example.)
  • For a particular subproject, review and improve existing examples and howtos and add missing documentation. Every example needs a description, e.g. http://twistedmatrix.com/documents/current/web/examples/index.html has descriptions but http://twistedmatrix.com/documents/current/names/examples/index.html doesn't. Every example should document its purpose, how it is run, and what it should do. Examples should use current coding and documentation standards and shouldn't use deprecated code. Documentation should use current coding and documentation standards in code snippets, and should use the preferred APIs. Some subprojects have very little documentation or examples and simply need more. Audit and update relevant man pages.
  • Coverage: For a particular subproject, go for 100% API documentation and unit test code coverage.
  • Test suite cleanup: overall test suite cleanup, remove deprecations, fix deprecation warnings, get to the bottom of and fix our various recurring Windows errors, fix Windows compiler warnings, fix non-deterministic tests.
  • Finish half-written code, documentation, bug fixes, etc.: We have >200 tickets that at some point had code written and received at least one code review, but for various reasons never made it in. This is a great way to learn because you don't have to start from scratch, and there's a variety of tasks (from minor fixes to documentation to new features) most of which have code reviews as a starting point. See http://twistedmatrix.com/trac/report/16 for a list.
  • PyPy: Have PyPy be a fully supported platform, with all tests passing.

Getting Started with Twisted Development

Here are some of the tools we use:

  1. IRC
  2. the Trac issue tracker
  3. the svn revision control system
  4. the diff and patch utilities

If you have not used IRC before, please go through this short tutorial on installing and using an IRC client: https://openhatch.org/wiki/Open_Source_Comes_to_Campus/UMD/Laptop_setup#Goal_.233:_install_an_IRC_client

Please familiarize yourself with how Twisted uses Trac by exploring the reports from this link: http://twistedmatrix.com/trac/report

With sufficient exploration you should be able to answer the following questions (you don't have to tell me the answers, just make sure you know how to find them):

  1. What is the oldest open Twisted ticket?
  2. How many tickets are currently waiting on review? How do we denote tickets waiting on a review (hint: look at the keyword)
  3. Using the custom query page at http://twistedmatrix.com/trac/query, find out how many open tickets we have that have the "documentation" keyword.

If you have not used svn before, please go through the svn training mission at: http://openhatch.org/missions/svn

If you have not used diff and patch before, please go through the diff and patch training mission at: http://openhatch.org/missions/diffpatch

If you have any questions while going through this material, don't hesitate to get in touch. You can email me, and you can also find me as itamar on the Freenode IRC network; jesstess on Freenode can also be of help if I'm not around.

Once you've gone through this material, the next steps will be to:

  1. Meet up on IRC (stop by the #twisted channel)
  2. Check out and explore the Twisted source code
  3. Read a few parts of the Twisted core documentation and run some example Twisted scripts
  4. Run the Twisted test suite - cd into the checkout folder and run "bin/trial twisted".

And then you'll be in great shape to start working on your first Twisted ticket. We'll suggest a first ticket when you get in touch over email; you'd want to fix it in your Twisted svn checkout, run svn diff, upload the resulting patch to the ticket and then add the "review" keyword to the ticket. This process is described in detail here: http://twistedmatrix.com/trac/wiki/TwistedDevelopment#SubmittingaPatch