Wednesday, March 14, 2012

2012 Language Summit Report

This year's Language Summit took place on Wednesday, March 7 in Santa Clara, CA before the start of PyCon 2012. As in previous years, in attendance were members of the various Python VMs, packagers from various Linux distributions, and members of several community projects.

The Namespace PEPs

The summit began with a discussion of PEPs 382 and 402, led largely by Barry Warsaw. The decision was ultimately deferred, as the group appeared to want parts of both PEPs.

As of Monday at the PyCon sprints, both PEPs have been rejected (see the Rejection Notice at the top of each PEP). Martin von Loewis posted to the import-sig list that a resolution has been found and Eric Smith will draft a new PEP on the ideas agreed upon there. Effectively, PEP 382 has been outright rejected, while portions of PEP 402 will be accepted.

importlib Status

Brett Cannon announced that there is a completed and available branch of CPython using importlib at http://hg.python.org/sandbox/bcannon/. See the bootstrap_importlib named branch.

Discussion began by outlining the only real existing issue, which lies in stat'ing of directories. There's a minor backwards incompatibility issue with time granularity. However, everyone agreed that it's so unlikely to be of issue that it's not a showstopper and the work can move forward. Additionally, there was an optimization made around the stat calls, which was arrived at independently by each of Brett, Antoine Pitrou, and P.J. Eby.

The topic of performance came up and Brett explained that the current pure-Python implementation is around 5% slower. Thomas Wouters exclaimed that 5% slower is actually really good, especially given some recent benchmark work he was doing showing that changing compilers sometimes shows a 5% difference in startup time. There was a shared feeling that 5% slower was not something to hold up integration of the code, which pushed discussion happily along.

Brett went on to explain what the bootstrapping actually looks like, even asserting that the implementation finds what could be the first real use of frozen modules! Guido's first response was, "you mean to tell me that after 20 years we finally found a use for freezing code?"

importlib._bootstrap is a frozen module containing the necessary builtins to operate, along with some re-implementations of a small number of functions. Some of the libraries included in the frozen module are warnings, _os (select code from posix), and marshal.

Another compatibility issue was brought up but, again, was not considered worth halting progress over. importlib does not support a negative level count, which is used for implicit relative imports, and it was agreed that it's acceptable to continue not supporting it.
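For context, the level in question is the last argument to the built-in __import__ function. The following is a minimal sketch, assuming a Python 3 interpreter in which negative levels are rejected outright; the helper name try_import is made up for illustration.

```python
import os


def try_import(name, level):
    """Call __import__ with an explicit level and report what happened."""
    try:
        return __import__(name, globals(), locals(), [], level)
    except ValueError as exc:
        # Python 3 refuses negative levels instead of trying an
        # implicit relative import first, as Python 2 did with level=-1.
        return exc


absolute = try_import("os", 0)      # level 0: absolute import only
rejected = try_import("os", -1)     # negative level: not supported
```

Here absolute is the os module itself, while rejected holds the ValueError raised for the unsupported level.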

In the future, import.c will likely be stripped down, and numerous hooks and much of the importlib API will be exposed.
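As a small illustration of the public importlib API, here is a sketch of importing a module by name at runtime. The choice of the json module is arbitrary and only serves the example.

```python
import importlib
import sys

# Import a module by its dotted name at runtime; "json" is just an example.
json_module = importlib.import_module("json")

# The result is the same object the import statement would bind.
same_object = json_module is sys.modules["json"]
round_trip = json_module.loads(json_module.dumps({"summit": 2012}))
```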

As for merging with the default branch, it was pretty universally agreed upon that this should happen for 3.3 and it should happen soon in order to get mileage on the implementation throughout the alpha and beta cycles. Since this will be happening shortly, Brett is going to follow-up to python-dev with some cleanup details and look for reviews.

Release Schedule PEPs

Discussion on PEPs 407 and 413 followed the importlib talk. Like the namespace PEP discussion, several ideas were tossed around but the group didn't arrive at any conclusion on acceptability of the PEPs.

Immediately, the idea of splitting out the standard library to be on its own was resurrected, which could lend itself to both PEPs. Some questions remain, namely where the test suite would live. Additionally, there may need to be some distinction between the tests which cover standard libraries versus the tests which cover language features.

The topic of versioning came up, with three distinctions needing to be made. We would seem to need a version of the language spec, a version of the implementation, and a version of the standard library.
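To make the distinction concrete, here is a rough sketch of what a running interpreter can already report about itself. It assumes nothing beyond the sys and platform modules; note that there is no separate standard library version to query, so that third number is absent from the sketch.

```python
import platform
import sys

versions = {
    # The language level a program can rely on, e.g. (3, 3).
    "language": tuple(sys.version_info[:2]),
    # Which VM is running the code, e.g. "CPython" or "PyPy".
    "implementation": platform.python_implementation(),
    # The full version string reported by the interpreter.
    "implementation_version": platform.python_version(),
}
```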

Many commenters mentioned that these PEPs make things too complicated. Additionally, there was a question about whether there are enough users who care about either of these changes being made. Several of us stated that we could use the quicker releases, but with so many users being stuck on old versions for one reason or another, it was unclear who would take the releases.

Thomas Wouters mentioned a good point about the difficulty in lining up the so-called Python "LTS" releases with other Python consumers who do similar LTS-style releases. Ubuntu and their LTS schedule was a prime example, as well as the organizations who plan releases atop something like Ubuntu. Many of the Linux distribution packagers in attendance seemed to agree.

One thing that seemed to have broad agreement was that shortening the standard library turnaround time would be a good thing in terms of new contributors. Few people are interested in writing new features that might not be released for over a year -- it's just not fun. Even with bug fixes, sometimes the duration can be seen as too long, to the point where users may end up just fixing our problems from within their own code if possible.

Guido went on to make a comment about how we hope to avoid the mindset some have of "my package isn't accepted until it's in the standard library". The focus continues to be on projects being hosted on PyPI, being successful out in the wild, then vetted for acceptance in the standard library after maturity of the project and its APIs.

It was suggested that perhaps speeding up bug fix releases could be a good move, but we would need to check with release managers to ensure they're on board and willing to expend the effort to produce more frequent releases. As with the new feature releases, we need to be sure there's an audience to take the new bug fixes.

There was also some discussion about what have previously been called "sumo" releases. Given that some similar releases are already made by third-party vendors, the idea didn't seem to gain much traction.

Funding from the Python Software Foundation

PSF Chairman Steve Holden joined the group after lunch to mention that the foundation has resources available to assist development efforts, especially given the sponsorship success of this year's conference. While the foundation can't and won't dictate what should be coded up, they're open to proposals about the types of work to be funded.

Steve and Jesse Noller were adamant about the support not only being for all Python implementations, but also for third-party projects. What's needed to begin funding for a project is a concrete proposal on what will be accomplished. They stressed that the money is ready and waiting -- proposals are the way to unlock it.

Some ideas for how to use the funding came from Steve but also from around the room. One idea which started off the discussion was the idea of funding one-month sabbaticals. Then comes the issue of who might be available. Some suggested that freelance consultants in the development community might be the ones we should try to engage. Those with full-time employment may find it harder to acquire such a sabbatical, but the possibility is open to anyone.

Another thought was potential funding of someone to do spurts of full-time effort on the bug tracker, ideally someone already involved in the triage effort. This type of funding would hope to put an end to the times when it takes three days to fix a bug and three years for the patch to be accepted. Some thought this might be a nice idea in the short term, but it could be tough work and burn out the individual(s) involved. If anyone is up for it, they're encouraged to propose the idea to the foundation.

Along similar lines of tracker maintenance, Glyph Lefkowitz of the Twisted project had an idea to fund code reviews over code-writing efforts. Some thought this might be a good way to push forward the regex/re situation, given that the regex module is very large and most felt that the only thing holding it back from some form of inclusion is an in-depth review. The cdecimal module was mentioned as another project that could use some review assistance.

The code review funding could also help push forward ports of third-party projects to Python 3, specifically including Twisted, which the group felt should receive some of this funding.

Along the way it was remarked that the core-mentors group has been a success in involving new contributors. Kudos to those involved with that list.

virtualenv Inclusion

In about two minutes, discussion on PEP 405 came and went. Carl Meyer mentioned that a reference implementation is available and is working pretty well. A look from the OSX maintainers would be beneficial, and both Ned Deily and Ronald Oussoren were in attendance. It seemed like one of the only things left in terms of the PEP was to find someone to make a declaration on it, and Thomas Wouters put his name out there if Nick Coghlan wasn't going to do it (update: Nick will be the PEP czar).

PEP 397 Inclusion

Without much Windows representation at the summit, discussion was fairly quick, but it was pretty much agreed that PEP 397 was something we should accept. Brian Curtin spoke in favor of the PEP, as well as mentioning ongoing work on the Windows installer to optionally add the executable's directory to the Path.

After discussion outside of the summit, it was additionally agreed upon that the launcher should be installed via the 3.3 Windows installer, while it can also live as a standalone installer for those not taking 3.3. Additionally, there needs to be some work done on the PEP to remove much of the low-level detail that is coupled too tightly with the implementation, e.g., explaining of the location of the py.ini file.

speed.python.org

After generous hardware donations, the http://speed.python.org site has gone live and is currently running PyPy benchmarks. We need to make a decision on what benchmarks can be used as well as what benchmarks should be used when it comes to creating a Python 3 suite. As we get implementations on Python 3 we'll want to scale back 2.7 testing and push forward with 3.x.

The project suffers not from a technological problem but from a personnel problem, which was thought to be another area that funding could be used for. However, even if money is on the table, we still need to find someone with the time, the know-how, and the drive to complete the task. Ideally the starting task would be to get PyPy and CPython implementations running and comparing. After that, there are a number of infrastructure tasks in line.

PEP 411 Inclusion

PEP 411 proposes the inclusion of provisional packages into the standard library. The recently discussed regex and ipaddr modules were used as examples of libraries to include under this PEP. How this inclusion should be implemented and denoted to users was the major discussion point.

It was first suggested that documentation notes don't work -- we can't rely only on documentation to be the single notification point, especially for this type of code inclusion. Other thoughts were some type of flag on the library to specify its experimental status. Another thought was to emit a warning on import of a provisional library, but it's another thing that we'd likely want to silence by default in order to not affect user code, in the hope that developers are running their test suite with warnings enabled. However, as with other times we've gone down this path, we run the risk of developers just disabling warnings altogether if they become annoying.
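The warning idea can be sketched roughly as follows. This is not a proposal from the summit: the ProvisionalWarning class and the note_provisional helper are hypothetical names, and subclassing PendingDeprecationWarning is simply one assumed way to get a warning that the default filters ignore.

```python
import warnings


class ProvisionalWarning(PendingDeprecationWarning):
    """Hypothetical category for provisional-API notices."""


def note_provisional(name):
    # stacklevel=2 attributes the warning to the caller, not this helper.
    warnings.warn(
        "%s is provisional and its API may change" % name,
        ProvisionalWarning,
        stacklevel=2,
    )


# A test suite that opts in to seeing every warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    note_provisional("foo")

messages = [str(w.message) for w in caught]
```

With the "always" filter in place the notice is recorded; under the default filters a PendingDeprecationWarning subclass would normally stay silent, which is the trade-off described above.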

As has been suggested on python-dev, importing a provisional library from a special package, e.g., from __experimental__ import foo, was pretty strongly discouraged. If the library gains a consistent API, it penalizes users once it moves from provisional status to being officially accepted. Aliasing just exacerbates the problem.

The PEP boils down to being about process, and we need to be sure that libraries being included use the ability to change APIs very carefully. We also need to make people, especially the library author, aware of the need to be responsive to feedback and open to change as the code reaches a wider audience.

Looking back, Jesse Noller suggested multiprocessing would have been a good candidate for something like this PEP is suggesting. Around this time, it was suggested that Michael Foord's mock could gain some provisional inclusion within unittest, perhaps as unittest.mock. Instead, given mock's stable API and wide use among us, along with the need for a mocking library within our own test suite, it was agreed to just accept it directly into the standard library without any provisional status.
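For readers who have not used mock, here is a brief sketch of the style of testing it enables, assuming an interpreter where the library is importable as unittest.mock. The render function and the URL are invented for the example.

```python
from unittest import mock

# Stand-in for a collaborator we do not want to call for real.
fetch = mock.Mock(return_value="cached page")


def render(fetcher, url):
    """Code under test: it only needs something callable."""
    return fetcher(url).upper()


result = render(fetch, "http://example.invalid/")

# The mock records how it was used, so the test can check the call.
fetch.assert_called_once_with("http://example.invalid/")
```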

While on the topic of regex's role within the PEP, Thomas Wouters suggested that regex be introduced into the standard library, bypassing any provisional status. From there, the previously known re module could be moved to the sre name, and there didn't appear to be any dissenting opinion there.

It should also be noted to users of provisional libraries that the library maintainers would need to exercise extreme care and be very conservative in changing of the APIs. The last thing we want to do is introduce a good library but as a moving target to its users.

Keyword Arguments on all builtin functions

As recently came up on the tracker, it was suggested that wider use of keyword arguments in our APIs would likely be a good thing. Gregory P. Smith suggested that we leave single-argument APIs alone, which was agreed upon. However, the overall change got some push back as "change for change's sake".

In order to support this, the PyArg_ParseTuple function would need to do more work, and it's already known to be somewhat slow. Alternatively, PyArg_Parse is much faster, and the tuple version could take a thing or two from it regardless of any wide scale change to builtins.

There does exist some potential break in compatibility when replacing a builtin function with a Python one, where positional-only arguments suddenly get a potentially conflicting name.
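A small sketch of that difference, assuming CPython, where the builtin len takes its argument positionally only. The py_len function is a made-up pure-Python replacement used purely for illustration.

```python
def py_len(obj):
    """A pure-Python stand-in for len(); 'obj' is now a public name."""
    return len(obj)


# The Python version accepts its argument by keyword...
by_keyword = py_len(obj=[1, 2, 3])

# ...while the C builtin only accepts it positionally.
try:
    len(obj=[1, 2, 3])
    builtin_accepts_keyword = True
except TypeError:
    builtin_accepts_keyword = False
```

Once callers can write py_len(obj=...), the parameter name becomes part of the API and can clash with keyword names a caller wants to pass through.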

It was widely agreed upon that we should avoid any blanket rules and keep changes to places where it makes sense rather than make wholesale changes. We also need to be mindful of documentation and doc strings being kept to match the actual keyword argument names as well as keep them in sync.

OrderedDict was suggested as the container for keyword arguments, but Guido and Gregory were unsure of use-cases for that. Whether or not we use a traditional or ordered dictionary, it was suggested that we could possibly use a decorator to handle some of this. We could even go as far as exposing something like PyArg_ParseTuple as a Python-level function.

PEP 362, a proposal for a function signature object, would help here and with decorators in general. It seems that all that's left with that PEP is another look and someone to declare on it.
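To give a feel for what a signature object offers, here is a sketch that assumes an interpreter providing inspect.signature. The connect function and its parameters are arbitrary examples.

```python
import inspect


def connect(host, port=80, *, timeout=None):
    """Example function; the parameters are arbitrary."""


sig = inspect.signature(connect)
names = list(sig.parameters)

# bind() applies the normal argument-matching rules without calling
# the function, which is what a wrapping decorator usually needs.
bound = sig.bind("example.invalid", timeout=5)
arguments = dict(bound.arguments)
```

A decorator can use bind() this way to validate or inspect a call before forwarding it to the wrapped function.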

Porting to Python 3

We moved on to talk about Python 3 porting, starting with the current strategies and how they're working out. Single-codebase porting is working better than expected for most of us, although except handling is a bit messy when supporting versions like 2.4. Having a lot of options, from 3to2 to 2to3, then the single codebase through parallel trees, is a really good thing. However, it's hard for us to choose a strategy for projects, so we don't, which is why most documentation tries to lay numerous strategies out there.
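The except handling wrinkle looks roughly like this in a single codebase. This is a sketch of one common workaround rather than a recommendation from the summit, and parse_port is an invented example function.

```python
import sys


def parse_port(text):
    """Return (port, error_message) without version-specific syntax."""
    try:
        return int(text), None
    except ValueError:
        # "except ValueError, e" is Python 2 only and
        # "except ValueError as e" needs 2.6+, so fetch the
        # active exception from sys.exc_info() instead.
        exc = sys.exc_info()[1]
        return None, str(exc)


ok = parse_port("8080")
bad = parse_port("http")
```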

It was suggested that the documentation could stand to gain more real-world porting examples, ideally pointing to changesets of these projects. The thought of our porting documentation gaining a cookbook-style approach seemed to get some agreement as a good idea.

Hash Randomization

Release candidates are available to all branches receiving security fixes, and in the meantime, David Malcolm found and reported a security issue in the upstream expat project. However, since the upstream fix includes many other fixes at the same time, we should pick up only the security fix at this time and leave the bug fixes for the next bug fix release of the relevant branches.

New dict Implementation

Since the implementation makes sense and the tests pass, it was quickly agreed upon that Mark Shannon's PEP 412 should be accepted. As with other changes agreed upon in this summit, we'd like for the change to be pushed soon in order to get mileage on it throughout the alpha and beta cycles. With this acceptance comes commit access for Mark so that he can maintain the code.

It was also remarked that the only user-visible difference that this implementation brings is a difference in sort ordering, but the recent hash randomization work makes this a moot point.

New pickle Protocol

PEP 3154, mentioned by Lukasz Langa, specifies a new pickle protocol -- version 4. Lukasz mentioned exception pickling in multiprocessing as being an issue, and Antoine solved it with this PEP. While qualified names provide some help, it was agreed upon that this PEP needs more attention.
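As a rough illustration, the sketch below assumes an interpreter where protocol 4 and the __qualname__ attribute are both available. It shows the qualified name of a nested class and requests protocol 4 explicitly for some plain data; the Outer and Inner classes are invented for the example.

```python
import pickle


class Outer:
    class Inner:
        """Nested class, the kind of object qualified names describe."""


# The dotted path from the module's top level to the object.
qualified_name = Outer.Inner.__qualname__

# Request protocol 4 explicitly; plain data round-trips as usual.
payload = pickle.dumps({"protocol": 4, "items": [1, 2, 3]}, protocol=4)
restored = pickle.loads(payload)
```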


If you have any questions or comments, please post to python-dev.

Thanks to Eric Snow and Senthil Kumaran for contributing to this post.

Wednesday, August 24, 2011

Meet the Team: Brett Cannon

This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.

Name: Brett Cannon
Location: San Francisco, CA, USA
Home Page: https://profiles.google.com/bcannon
Blog: http://sayspy.blogspot.com

How long have you been using Python?

Since the fall of 2000

How long have you been a core committer?

Since April 2003 (shortly after PyCon 2003).

How did you get started as a core developer? Do you remember your first commit?

I became a core developer thanks to incessantly bugging people to commit patches for me (a trick that doesn't quite work as well as it used to; perk of getting in before Python's popularity spike in 2003/2004). Starting in August 2002 I revitalized the Python-Dev Summaries (which lasted for about 2.5 years). While writing the Summaries I would fairly regularly pick up on little issues that needed fixing. Since I was already talking on python-dev fairly regularly I simply asked folks to check my patches and commit them for me. One day Guido just asked why I didn't commit myself, I said I didn't have commit rights, and then he more or less said "you do now".

As for my first commit (changeset 28686), it was fixing some string escapement in time.strptime() (which happens to be my first contribution to Python itself).

Which parts of Python are you working on now?

I typically focus on the import machinery and making the Python language work well across all VMs.

What do you do with Python when you aren't doing core development work?

I managed to use Python a little bit in my PhD thesis by implementing some server-side stuff in Python. Otherwise all of my personal projects use Python as much as possible. And my future job at Google is going to be mostly in Python.

What do you do when you aren't programming?

I'm somewhat of a movie junkie with selective bits of TV tossed in (losing my television in the summer of 2000 to a heat wave was one of the best things that ever accidentally happened to me; marrying my wife has been the best thing I did on purpose =). Otherwise I read a lot; mostly magazines and websites, but with some book always under progress.

Monday, August 8, 2011

Meet the Team: Michael Foord

This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.

Name: Michael Foord
Location: Northampton UK
Home Page: http://www.voidspace.org.uk/

How long have you been using Python?

I first started using Python as a hobby in 2002. I started using Python full time for work in 2006. When I started programming with Python it was with a group of guys who wanted to write a program to aggregate information from a Play By Email game. None of us had done any programming for a while and we had just decided on using Smalltalk when someone suggested we try Python. I quickly fell in love with Python.

How long have you been a core committer?

I became a core-committer at PyCon in 2009. It was originally because of my involvement with IronPython.

How did you get started as a core developer? Do you remember your first commit?

During the PyCon 2009 sprints I worked with Gregory Smith, another core developer, to incorporate some improvements to unittest contributed by Google.

Which parts of Python are you working on now?

After the initial work on unittest at the PyCon sprint I took on fixing other issues and making improvements to unittest, which was without a maintainer. I became the maintainer of unittest but also contribute to other parts of the standard library.

I'm involved in supporting Python in various other minor ways, such as looking after Planet Python, being a PSF member, helping out on the python.org webmaster alias and so on.

What do you do with Python when you aren't doing core development work?

For my day job I do web development for Canonical. I work on some of the web services infrastructure around the Canonical websites and also some of the services that integrate with Ubuntu itself. That's good fun and it's a great team.

In my spare time I work on projects like unittest2 (a backport of the improvements of unittest for other platforms), mock (a testing library that provides mock objects and support for monkey patching in tests) and a whole bunch of other smaller stuff.

I'd like to write more, but having devoted the best part of two years to writing IronPython in Action I doubt I'll take on any large writing projects soon.

What do you do when you aren't programming?

I'm very involved in a church in Northampton (UK), which takes a lot of my time and I help with administration for a charity we run. This is one reason why working for Canonical is good - I can work from home and having put my roots down here I won't move anywhere else (I certainly don't stay for the weather). Needless to say there isn't much Python programming happening in Northampton. My first full time programming gig was with an amazing team in London, which was a two hour door to door commute each way. I managed four years of that, and really enjoyed the job, but having escaped the commute I'm not likely to ever go back.

I also enjoy gaming on the XBox. Unfortunately if I find a game I like I can get sucked into it for weeks so I have to be careful. I've avoided World of Warcraft and EVE Online for this reason... I also organise a monthly geek meet in Northampton. There aren't enough Python programmers for a Python user group but we have a good collection of geeks of all sorts. We normally just get together in a pub and chew the fat or show off our latest gadgets.

Monday, July 11, 2011

A Python Launcher For Windows

Mark Hammond (author of pywin32 and long-time supporter of Python on Windows) has written PEP 397, which describes a new launcher for Python on Windows. Vinay Sajip (author of the standard library logging module) has recently created an implementation of the launcher, downloadable from https://bitbucket.org/vinay.sajip/pylauncher/downloads

The launcher allows Python scripts (.py and .pyw files) on Windows to specify the version of Python which should be used, allowing simultaneous use of Python 2 and 3.

Windows users should consider downloading the launcher and testing it, to help the Python developers iron out any remaining issues. The launcher is packaged as a standalone application, and will support currently available versions of Python. The intention is that once the launcher is finalised, it will be included as part of Python 3.3 (although it will remain available as a standalone download for users of earlier versions).

Two versions of the launcher are available - launcher.msi which installs in the Program Files directory, and launchsys.msi which installs in Windows' System32 directory. (There are also 64-bit versions for 64-bit versions of Windows).

Some Details About the Launcher

The full specification of the behaviour of the launcher is given in PEP 397. To summarise the basic principles:

  • The launcher supplies two executables - py.exe (the console version) and pyw.exe (the GUI version).
  • The launcher is registered as the handler for .py (console) and .pyw (GUI) file extensions.
  • When executing a script, the launcher looks for a Unix-style #! (shebang) line in the script. It recognises executable names python (system default python), python2 (default Python 2 release) and python3 (default Python 3 release). The precise details can easily be customised on a per-user or per-machine basis.
  • When used standalone, the py.exe command launches the Python interactive interpreter. Command line switches are supported, so that py -2 launches Python 2, py -3 launches Python 3, and py launches the default version.
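The shebang handling described above can be approximated in a few lines. This is only a sketch of the idea, not the launcher's actual implementation: the VIRTUAL_COMMANDS table, the regular expression, and the requested_major function are all hypothetical, and the real rules in PEP 397 cover more cases.

```python
import re

# Hypothetical mapping from a shebang command name to a major version;
# None stands for "whatever the system default is".
VIRTUAL_COMMANDS = {"python": None, "python2": 2, "python3": 3}

# Accepts "#!python3", "#!/usr/bin/python3" and "#!/usr/bin/env python3".
SHEBANG = re.compile(r"^#!\s*(?:/usr/bin/env\s+|/usr/bin/)?(\w+)")


def requested_major(first_line):
    """Return 2, 3, or None for the default, based on a script's first line."""
    match = SHEBANG.match(first_line)
    if not match:
        return None
    return VIRTUAL_COMMANDS.get(match.group(1))
```

For example, requested_major("#!/usr/bin/env python2") gives 2, while a first line with no shebang at all falls back to None, meaning the default Python.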

Simple Usage Instructions

When it is installed, the launcher associates itself with .py and .pyw scripts. Unless you do anything else, scripts will be run using the default Python on the machine, so you will see no change. One thing you might like to do, if you use the console a lot, is to add .py to your PATHEXT variable so that scripts don't get executed in a separate console.

To specify that a script must use Python 2, simply add:

#!/usr/bin/env python2

as the first line of the script. (This is a Unix-compatible form. If you don't need Unix compatibility, #!python2 will do).

If on the other hand, you want to specify that a script must use Python 3, add:

#!/usr/bin/env python3

as the first line.

You can also start the Python interpreter using any of the following commands:

# Default version of Python
py
# Python 2
py -2
# Python 3
py -3

For this to work, the py.exe executable must be on your path. This is automatic with the launchsys version of the installer, but the install directory (C:\Program Files\Python Launcher) must be added manually to PATH with launcher.msi.

Further Reading

The following email threads on python-dev cover some of the key discussions:

CPython 3.2.1 Released

On behalf of the python-dev team, release manager Georg Brandl has announced the final release of CPython 3.2.1. Windows installers and tarballs are available as of July 10, so please consider upgrading to this release.

The What's New document lists all of the new features in 3.2, and the Misc/NEWS file in the source lists each bug fixed.

If you find any issues with this release or any other, please report them to http://bugs.python.org/.

Wednesday, July 6, 2011

3.2.1 Release Candidate 2 Released

Following up a big month of releases in June, the second release candidate of the 3.2.1 line is now ready. Since the first release candidate on May 15, over 40 issues have been fixed. We encourage everyone to test their projects with this candidate to get one last look before the final release of 3.2.1.

What's fixed?

I/O

#1195 spent a few years without a fix, but a simple addition to clear errors before calling fgets solves the problem of interrupting sys.stdin.read() with CTRL-D inside of input(). The io system saw a cleanup in #12175: the readall method now returns None when the underlying read() returns None, and a ValueError is now raised when a file can't be opened.

Although this isn't new for RC2, #11272 is an important 3.2.1 fix to input() on Windows - the fixing of a trailing \r. The issue has been reported many times over and affects many people (distutils upload command anyone?), so hopefully 3.2.1 does the trick for you.

Windows

3.2.0 brought a new feature for Windows: os.symlink support. With that feature came #12084: os.stat was improperly evaluating Windows symlinks, so the inner workings of the various stat functions were corrected.

A user noticed that os.path.isdir was slow, and the fact that it relied on os.stat contributed to that, especially when evaluating symlinks (which are generally twice as slow as regular files). While os.path.isdir isn't anyone's performance bottleneck, it's called numerous times on interpreter startup so changing it in #11583 to use GetFileAttributes gives a tiny speedup to build on.
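The relationship between these functions can be seen in a short sketch. It assumes a platform and account that are allowed to create symlinks (on Windows this may require a special privilege), and the directory names are arbitrary.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "real_dir")
    link = os.path.join(root, "link_to_dir")
    os.mkdir(target)
    # May raise OSError on Windows without the symlink privilege.
    os.symlink(target, link, target_is_directory=True)

    # os.stat() follows the link and describes the directory...
    followed_is_dir = stat.S_ISDIR(os.stat(link).st_mode)
    # ...while os.lstat() describes the link itself.
    link_is_link = stat.S_ISLNK(os.lstat(link).st_mode)
    # os.path.isdir() follows links too.
    isdir_result = os.path.isdir(link)
```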

subprocess

Creating a Popen object with unexpected arguments was causing an AttributeError, but that was reported in #12085 and was fixed by the reporter. Due to a change in 3.2.0, Popen wasn't correctly handling empty environment variables, specifically the env argument. #12383 was created for the issue and was promptly fixed.
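As an illustration of the kind of call involved, the sketch below passes an environment containing an empty variable to a child process through the env argument. EMPTY_SETTING is a made-up variable name, and how the empty value appears to the child may differ between platforms.

```python
import os
import subprocess
import sys

# Copy the current environment and add a variable with an empty value.
# EMPTY_SETTING is a made-up name for this example.
child_env = dict(os.environ, EMPTY_SETTING="")

# The child prints the repr of what it finds in its own environment.
child_code = "import os; print(repr(os.environ.get('EMPTY_SETTING')))"
output = subprocess.check_output(
    [sys.executable, "-c", child_code],
    env=child_env,
)
seen_by_child = output.decode().strip()
```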

...and more!

For a full list of changes through 3.2.1 RC2, check out the change log and download it now!

As always, please report any issues you find to http://bugs.python.org. We appreciate your help in making great Python releases.

Tuesday, June 14, 2011

June Releases - 2.6.7, 2.7.2, 3.1.4

June is a big month for Python releases, with an update coming out of all active branches.

2.6.7

A new source-only release of Python 2.6.7 is available, providing fixes to three security issues. Now that the 2.6 line is in security-mode, these releases will happen on an as-needed basis until October 2013 in source-only form. If you require binary installers, you should consider an upgrade to 2.7 or 3.2.

2.6.7 is the first release to contain a fix to the previously covered urllib vulnerability. Additionally, an smtpd DoS vulnerability (Issue #9129) and SimpleHTTPServer.list_directory XSS vulnerability (Issue #11442) are fixed.

2.7.2

The last minor version of the 2.x line, 2.7, received over 150 bug fixes since 2.7.1 in November 2010. 2.7.2 source and binary installers are available as of June 12, which include the security fixes mentioned in 2.6.7.

A number of crashes are fixed: a situation when Python incorrectly used non-Python managed memory while it was being modified by another thread, when deleting __abstractmethods__ from a class, accessing a memory-mapped file past its length, and several others.

A fix to getpass corrects a regression with regard to CTRL-C and CTRL-Z handling. multiprocessing received a number of fixes, including treating Windows services like frozen executables and a correction to a race condition when terminating multiprocessing.Pool workers. mmap was fixed to work with file sizes and offsets larger than 4 GB even on 32-bit builds, and a TypeError is now raised rather than segfaulting when trying to write to a non-writeable map.
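The mmap behaviour can be sketched as follows. The example is written for Python 3 rather than 2.7, where the same idea applies, and the file contents are arbitrary.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as handle:
    handle.write(b"read only data")
    handle.flush()

    # Map the whole file (length 0) without write access.
    with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
        first_bytes = view[:4]
        try:
            view[0:1] = b"R"
            write_error = None
        except TypeError as exc:
            # Writing to a read-only map is reported as an exception.
            write_error = exc
```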

For a full list of changes, see the 2.7.2 news file.

3.1.4

3.1.4 is the last bug-fix release of the 3.1.x line, sending 3.1 into security-mode as the 3.2 line carries on. 3.1.4 contains over 100 bug fixes since the 3.1.3 release in November 2010. As with 2.7.2, binary installers are available as of June 12, and 3.1.4 is the first 3.x release to contain the security fixes listed in 2.6.7.

3.1.4 corrects some problems with __dir__ lookups on objects, dates past 2038 in the Windows implementation of os.stat and os.utime, and a number of 64-bit cleanups. The io library saw a number of changes in returning None when nothing was read and raising appropriate exceptions in other spots. ctypes callback arguments were fixed on 64-bit Windows and a crash was also remedied.

For a full list of changes, see the 3.1.4 news file.

3.2.1

3.2.1 is currently in the release candidate phase, with one round already completed and a second release candidate expected soon. We would greatly appreciate 3.2 users trying out the release candidates to ensure we cover any issues you may be seeing. If you have any bugs to report, please file them on bugs.python.org.