distutils – Carl Banks' Blog

Python is my favorite programming language.

Like all programming languages, it has things I don’t like. The thing about Python is, it has a lot fewer things I don’t like than other languages. A lot. Even the things I don’t like are relatively minor in the end.

So of course I made a page listing what I don’t like.

• Boolean Handling

The Python treatment of booleans is, by far, my biggest gripe with Python. (Which, you know, is a pretty good thing to have as a biggest gripe.)

I think booleans should be completely disjoint from all other types, as they are in Java. An if-condition should have to evaluate to a boolean; anything else should raise an exception. The and-, or-, and not-operators should accept only boolean operands and return only boolean results. (Though I would accept arrays of booleans for the and-, or-, and not-operators; but not for if-conditions.)

I don’t deny that it’s convenient to use an “or” that returns the first “true” value, or that it’s sometimes a marginal improvement in clarity.

I just think this idiom is too error prone, and very often too misleading, to justify its convenience. There’s ordinary carelessness, of course, where someone writes a function like this:


def op(datum=None):
    result = datum or default

while not stopping to consider that the empty string would be a legal value for datum in this case. But there’s a more insidious danger that can’t be chalked up to carelessness: when the acceptable values for datum change after the function is written. This leads to subtle breakage.

As far as the user is concerned, this function’s correct behavior is to “use the default when the argument is not specified”, but as-is the function is not robust to future changes or uses. The function is a poor, non-robust implementation of the desired behavior.
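A minimal sketch of the difference, assuming a hypothetical module-level DEFAULT in place of the default above. The explicit test against None keeps falsy but legal values such as the empty string, which the or-idiom silently throws away:

```python
DEFAULT = "fallback"  # hypothetical default value, for illustration only


def op_fragile(datum=None):
    # Falls back to DEFAULT for *any* falsy datum: None, "", 0, [], ...
    return datum or DEFAULT


def op_robust(datum=None):
    # Falls back to DEFAULT only when the argument was not supplied.
    return DEFAULT if datum is None else datum


print(repr(op_fragile("")))  # 'fallback'
print(repr(op_robust("")))   # ''
```

If None itself ever becomes a legal value for datum, a private sentinel object would be needed instead; that is outside this sketch.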

To me this feels like a ticking bomb, a bug waiting to hatch from its egg.

More generally: I find that the boundary between what Python considers to be true and false rarely corresponds exactly to what I’m trying to do in cases like the above. Usually the true/false test only works for a certain set of expected values that I have in my mind. When the acceptable values are generalized, when you want to expand this function’s role, does the true/false test still work? I find it rarely does.

• True Values

Ok, so Python doesn’t have disjoint booleans. Fine.

But I don’t agree with the values Python considers true and false. In particular, I disagree with the notion of empty lists being false.

This is because (unlike for numbers or strings) it’s not self-evident what false should be. Python lists, tuples, sets, and dicts consider empty containers false and all other containers true. However, for other related types this doesn’t hold. Notably, all built-in Python iterator types are always true, even if they will yield no more values. This can lead to subtle code breakage in situations like this:


def func(iterable):
    if not iterable:
        return
    initialize()
    for item in iterable:
        do_something_with_item(item)
    finalize()

Now, this piece of code was written with the expectation that the user would pass a list or tuple in. However, if some wily user decides to pass, say, a generator in, the result is that it runs the initialization and finalization even if the generator will yield no values. This is wasteful at best and possibly buggy at worst.
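One way to make the function behave the same for lists and generators is to stop asking for a truth value and instead try to pull the first item. This is only a sketch: the log parameter is a hypothetical stand-in for initialize(), do_something_with_item(), and finalize(), so that the behavior can be observed.

```python
def func(iterable, log):
    # `log` is a hypothetical list recording what happened; it stands in
    # for initialize(), do_something_with_item(), and finalize().
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        return  # genuinely empty: skip initialization and finalization
    log.append("initialize")
    log.append(first)
    for item in iterator:
        log.append(item)
    log.append("finalize")


empty_generator = (x for x in [])
print(bool(empty_generator))  # True, even though it yields nothing
```

The explicit next() call replaces the "if not iterable" test, so an exhausted generator and an empty list take the same early return.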

Another example of container-like types that don’t treat emptiness as false are NumPy arrays. They wisely don’t even go there, raising an exception if someone tries to get the truth value of an array with more than one element. For numerical programming it makes sense to apply boolean operations element-by-element.

The point of all this is that the one-size-fits-all idea that empty is the one true value for false doesn’t work. For various container and container-like objects, it makes sense for false to be something else, or for there to be no notion of false at all.

That’s why I think Python should do away with the notion of emptiness being false, and require an explicit test for emptiness where it’s desired.

• Pathname manipulation

This is a common gripe from Pythonistas. The built-in way to manipulate paths in Python is with the os.path module. One would type os.path.join(dirname,basename) to splice a pathname together from a directory and filename, for instance.

Many Pythonistas don’t like typing all that out for a simple operation. I don’t either, but that’s not my biggest issue with os.path. My issue is that it isn’t that powerful.

One of the greatest things about Python is that it almost never lacks a quick and easy way to do something that ought to be quick and easy. (In fact, Python often makes things that ought to be hard as hell quick and easy.) The glaring exception to this is os.path.

Somehow there are many useful things that os.path doesn’t do. A big one would be something like os.path.splitall: completely splitting a filename into a list of path components. Another is relativizing a pathname; os.path can make a relative path absolute but not vice versa.
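For readers on later Python versions, both operations can be sketched with the standard library alone: pathlib arrived in Python 3.4 and os.path.relpath in Python 2.6. The splitall name below is a hypothetical helper, not an os.path function, and the example sticks to POSIX-style paths so that the results do not depend on the platform.

```python
import posixpath
from pathlib import PurePosixPath


def splitall(path):
    # Hypothetical helper: split a POSIX-style path into all its components.
    return list(PurePosixPath(path).parts)


print(splitall("/usr/lib/python/site.py"))
# ['/', 'usr', 'lib', 'python', 'site.py']

# Relativize one absolute path against a starting directory.
print(posixpath.relpath("/usr/lib/python", start="/usr/share"))
# ../lib/python
```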

The annoying thing is it’s right at that point where it’s annoying but not quite annoying enough to move me. I don’t do pathname manipulations enough that I ever feel like working out my own solution to this problem once and for all. And I don’t really like the third-party solutions, so I’m kind of stuck with this annoyance. Maybe I’ll sit down and do it one day.

• Distutils

Distutils is one of those things that is almost crucially useful at certain times, but makes me shake my head in astonishment about how much better it could have been.

It was pretty much designed with poor anticipation of user needs. If you need to do something that isn’t exactly what the authors envisioned, it can make life supremely difficult. It’s not versatile at all.

Among the things it doesn’t easily do is letting the user specify custom build flags (I have to edit setup.py to do that, and sometimes setup.py is so convoluted in its attempt to be intelligent that I can hardly find the place). I come across this issue a lot since I have Python installed in an unusual location.

• Setuptools

On a related note, there is setuptools, which is becoming the de facto super-distutils. While it fixes some of distutils’s issues, it adds its own, and it’s probably even less versatile than distutils was when it comes to its own issues and the alternative needs of users.

Besides that, setuptools also routinely and mind-bogglingly rudely downloads and installs packages on the user’s behalf without asking. This literally makes me want to punch Phillip Eby (the author) through the monitor. Yes, I realize I can shut the behavior off, but it’s annoying.

One thing about setuptools is that it seems needlessly complicated for what it’s trying to do. In particular, it sets up some sort of homebrew metadata scheme (entry hooks are part of this), and many packages can’t be imported without going through package resources. The inability to simply import modules is, in my opinion, a terrible design decision. Modules that do this make things difficult when distributing binary packages.

Conversely, authors who distribute packages that depend on these entry hooks have likewise made a bad design decision. It might make sense for enterprise-deployed libraries (where you can depend on users to follow arcane corporate procedures), but not for your average third-party library downloaded off the Internet. PyOpenGL was a notable offender here, and I abandoned it mostly on account of this.

• Lack of a set-and-test

People coming to Python from Perl often criticize it for not supporting the following pattern:


if (/someregexp/) {
    do_something();
} elsif (/someotherregexp/) {
    do_something_else();
} elsif (/someotherregexpagain/) {
    do_still_another_thing();
} else {
    do_default_thing();
}

The idiom is useful for many applications, especially text processing (though it should be refactored when scaling to larger rules). Python, however, doesn’t support this without workarounds. (The issue with Python is that you can’t execute code between if- and elif-conditions, and a regexp search would need to do that.)

That Python doesn’t support this isn’t such a big deal; it’s fairly common but is still a rather specialized use, and the workarounds aren’t too bad. It’s still a bit annoying.
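Two sketches of how the Perl pattern can be approximated, with made-up patterns and return values used purely for illustration. The first is a table-driven workaround that runs on any Python version. The second relies on assignment expressions, which were only added in Python 3.8, so it is not available on older interpreters.

```python
import re


def classify_table(line):
    # Workaround for any Python version: try (pattern, handler) pairs
    # in order and stop at the first one that matches.
    rules = [
        (r"^\d+$", lambda m: "number"),
        (r"^(\w+)=(\w+)$", lambda m: "assignment to " + m.group(1)),
    ]
    for pattern, handler in rules:
        match = re.search(pattern, line)
        if match:
            return handler(match)
    return "default"


def classify_walrus(line):
    # Python 3.8+ only: ":=" binds the match object inside the condition.
    if (m := re.search(r"^\d+$", line)):
        return "number"
    elif (m := re.search(r"^(\w+)=(\w+)$", line)):
        return "assignment to " + m.group(1)
    else:
        return "default"
```

The table version also scales more gracefully to a larger rule set, which is the refactoring the Perl idiom would eventually need anyway.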

So that’s it. (Well, that’s all I could think of right now.) All in all, if these are my biggest gripes, it’s a pretty good thing. They are really very minor issues compared to other languages.


Minor Python (Programming Language) Complaints