The Python IAQ:
| Q: What is an Infrequently Answered Question? |
A question is infrequently answered either because few people know the answer or because it concerns an obscure, subtle point (but a point that may be crucial to you). I thought I had invented the term for my Java IAQ, but it also shows up at the very informative About.com Urban Legends site. There are lots of Python FAQs around, but this is the only Python IAQ. (There are a few Infrequently Asked Questions lists, including a satirical one on C.)
| Q: The code in a finally clause will never fail to execute, right? |
What never? Well, hardly ever. The code in a finally clause does get executed after the try clause whether or not there is an exception, and even if sys.exit is called. However, the finally clause will not execute if execution never gets to it. This would happen regardless of the value of choice in the following:
import time

try:
    if choice:
        while 1:
            pass
    else:
        print "Please pull the plug on your computer sometime soon..."
        time.sleep(60 * 60 * 24 * 365 * 10000)
finally:
    print "Finally ..."
| Q: Polymorphism is great; I can sort a list of elements of any type, right? |
Wrong. Consider this:
>>> x = [1, 1j]
>>> x.sort()
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in ?
    x.sort()
TypeError: cannot compare complex numbers using <, <=, >, >=
(The number 1j is the square root of -1.) The problem is that the sort method (in the current implementation), compares elements using the __lt__ method, which refuses to compare complex numbers (because they are not orderable). Curiously, complex.__lt__ has no qualms about comparing complex numbers to strings, lists, and every other type except complex numbers. So the answer is you can sort a sequence of objects that support the __lt__ method (and possibly other methods if the implementation happens to change).
As for the first part of the question, "Polymorphism is great", I would agree, but Python sometimes makes it difficult because many Python types (such as sequence and number) are defined informally.
| Q: Can I do ++x and x++ in Python? |
Literally, yes and no; but for practical purposes, no. What do I mean by that?
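A quick check makes the "yes and no" concrete. This is a minimal sketch, assuming a Python 3 interpreter (the behavior shown is the same in Python 2): ++x parses, but only as two unary plus operators, while x++ does not parse at all.

```python
x = 5

# ++x parses, but only as two unary plus operators: +(+x). x is unchanged.
y = ++x
assert y == 5 and x == 5

# --x likewise means -(-x).
assert --x == 5

# x++ does not parse at all.
try:
    compile("x++", "<example>", "eval")
except SyntaxError:
    postfix_parses = False
else:
    postfix_parses = True

# The Python spelling of an increment is an assignment statement.
x += 1
```

So "literally yes" for ++x, because it is legal syntax, and "practically no", because it never changes x.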
The deeper question is: why doesn't Python allow x++? I believe it is the same reason why Python does not allow assignments in expressions: Python wants to clearly separate statements and expressions. If you believe they should be distinct, then disallowing ++ is probably the best decision. On the other hand, advocates of functional languages argue that statements should be expressions. I'm with my fellow Dane, Bjarne Stroustrup, on this one. He said in The Design and Evolution of C++ ``If I were to design a language from scratch, I would follow the Algol68 path and make every statement and declaration an expression that yields a value''.
| Q: Can I use C++'s syntax for ostreams: cout << x << y ... ? |
You can. If you don't like writing ``print x, y'' then you can try this:
import sys

class ostream:
    def __init__(self, file):
        self.file = file

    def __lshift__(self, obj):
        self.file.write(str(obj))
        return self

cout = ostream(sys.stdout)
cerr = ostream(sys.stderr)
nl = '\n'
(This document shows code that belongs in a file above the horizontal line and example uses of it below the line.) This gives you a different syntax, but it doesn't give you a new convention for printing--it just packages up the str convention that already exists in Python. This is similar to the toString() convention in Java. C++ has a very different convention: instead of a canonical way to convert an object to a string, there is a canonical way to print an object to a stream (well, semi-canonical---a lot of C++ code still uses printf). The stream approach is more complicated, but it does have the advantage that if you need to print a really huge object you needn't create a really huge temporary string to do it.
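To see the wrapper in use, here is a minimal sketch, assuming Python 3. The class is renamed OStream and the StringIO buffer is added purely for illustration; the technique is the one above.

```python
import io
import sys

class OStream:
    """Minimal stream wrapper: `stream << obj` writes str(obj)."""
    def __init__(self, file):
        self.file = file

    def __lshift__(self, obj):
        self.file.write(str(obj))
        return self          # returning self is what allows chaining

cout = OStream(sys.stdout)
nl = '\n'

x, y = 42, [1, 2, 3]
cout << x << " " << y << nl      # writes "42 [1, 2, 3]" and a newline

# The same wrapper works on any object with a write() method.
buffer = io.StringIO()
OStream(buffer) << "x = " << x << nl
captured = buffer.getvalue()
```

Because each << returns the stream itself, the chain is just repeated calls to __lshift__ from left to right.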
| Q: What if I like C++'s printf? |
It's not a bad idea to define a printf in Python. You could argue that printf("%d = %s", num, result) is more natural than print "%d = %s" % (num, result), because the parens are in a more familiar place (and you get to omit the %). Furthermore, it's oh-so-easy:
def printf(format, *args): print format % args,
Even in a one-liner like this, there are a few subtleties. First, I had to decide whether to add the comma at the end or not. To be more like C++, I decided to add it (which means that if you want a newline printed, you have to add it yourself to the end of the format string). Second, this will still print a trailing space. If you don't want that, use sys.stdout.write instead of print. Third, is this good for anything besides being more C-like? Yes; you need a printing function (as opposed to a print statement) for use in places that accept functions, but not statements, like in lambda expressions and as the first argument to map. In fact, such a function is so handy, that you probably want one that does not do formatting:
def prin(x): print x,
Now map(prin, seq) will print each element of seq, but map(print, seq) is a syntax error. I've seen some careless programmers (well, OK, it was me, but I knew I was being careless) think it would be a good idea to fit both these functions into one, as follows:
def printf(format, *args): print str(format) % args,
Then printf(42), printf('A multi-line\n message') and printf('%4.2f', 42) all work. But the ``good idea'' thought gets changed to ``what was I thinking'' as soon as you do printf('100% guaranteed'), or anything else with a % character that is not meant as a formatting directive. If you do implement this version of printf, it needs a comment like this:
def printf(format, *args):
    """Format args with the first argument as format string, and print.
    If the format is not a string, it is converted to one with str.
    You must use printf('%s', x) instead of printf(x) if x might
    contain % or backslash characters."""
    print str(format) % args,
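In Python 3, print is already a function, so part of the motivation goes away, but a C-style printf is still a one-liner. This is a minimal sketch, assuming Python 3. The file parameter is an addition for this example, so that the output can be captured.

```python
import io
import sys

def printf(format, *args, file=None):
    """Write format % args with no added space or newline (C-style)."""
    target = file if file is not None else sys.stdout
    target.write(format % args)

out = io.StringIO()
printf("%d = %s\n", 6 * 7, "answer", file=out)

# A function, unlike a statement, can be used inside a lambda.
show = lambda item: printf("[%s]", item, file=out)
for item in ("a", "b"):
    show(item)

# The pitfall from the text: a stray % is read as a formatting directive,
# so the formatting step raises before anything is written.
try:
    printf("100% guaranteed", file=out)
except (TypeError, ValueError) as error:
    failure = type(error).__name__
```

Writing through sys.stdout.write, as suggested above, avoids the trailing space or newline that print would add.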
| Q: Is there a better syntax for dictionary literals? All my keys are identifiers. |
Yes! I agree that it can be tedious to have to type the quote marks around your keys, especially for a large dictionary literal. At first I thought it might be a useful change to Python to add special syntax for this; maybe {a=1, b=2} for what you now have to write as {'a':1, 'b':2}. But then I realized that you can almost have this syntax by defining a one-line function:
def Dict(**dict): return dict
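A short usage sketch, assuming Python 3; the keys alice and bob are made up for illustration. The built-in dict constructor accepts keyword arguments in the same way, so the helper is mostly a matter of taste.

```python
def Dict(**entries):
    """Return the keyword arguments as a dictionary."""
    return entries

ages = Dict(alice=31, bob=27)

# The built-in dict constructor accepts keyword arguments the same way.
same = dict(alice=31, bob=27)
```

Either way, this only works when every key is a valid Python identifier.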
A reader suggested that Perl has a similar special notation for hashes; you can write either ("a", 1, "b", 2) or (a => 1, b => 2) for hash literals in Perl. This is the truth, but not the whole truth. "man perlop" says "The => digraph is mostly just a synonym for the comma operator ..." and in fact you can write (a, 1, b, 2), where a and b are barewords. But, as Dag Asheim points out, if you turn strict on, you'll get an error with this; you must use either strings or the => operator. And Larry Wall has proclaimed that "There will be no barewords in Perl 6."
| Q: Is there a similar shortcut for objects? |
Indeed there is. When all you want to do is create an object that holds data in several fields, the following will do:
class Struct:
    def __init__(self, **entries): self.__dict__.update(entries)
Essentially what we are doing here is creating an anonymous class. OK, I know that the class of an instance such as globals = Struct(answer=42) is Struct, but because we are adding slots to it, it's like creating a new, unnamed class (in much the same way that lambda creates anonymous functions). I hate to mess with Struct because it is so concise the way it is, but if you add the following method then you will get a nice printed version of each structure:
    def __repr__(self):
        args = ['%s=%s' % (k, repr(v)) for (k,v) in vars(self).items()]
        return 'Struct(%s)' % ', '.join(args)
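Putting the two pieces together, here is a minimal sketch, assuming Python 3. The field names are invented for the example.

```python
class Struct:
    """A bag of named fields, set from keyword arguments."""
    def __init__(self, **entries):
        self.__dict__.update(entries)

    def __repr__(self):
        args = ['%s=%r' % (k, v) for (k, v) in vars(self).items()]
        return 'Struct(%s)' % ', '.join(args)

options = Struct(answer=42, linelen=80, font='courier')
options.answer += 1            # fields are ordinary attributes
options.verbose = True         # new fields can be added after creation
```

In Python 3 the standard library's types.SimpleNamespace offers much the same thing.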
| Q: That's great for creating objects; How about for updating? |
Well, dictionaries have an update method, so you could do d.update(Dict(a=100, b=200)) when d is a dictionary. There is no corresponding method for objects, so you have to do obj.a = 100; obj.b = 200. Or you could define one function to let you do update(x, a=100, b=200) when x is either a dictionary or an object:
import types

def update(x, **entries):
    if type(x) == types.DictType: x.update(entries)
    else: x.__dict__.update(entries)
    return x
This is especially nice for constructors:
def __init__(self, a, b, c, d=42, e=None, f=()):
    update(self, a=a, b=b, c=c, d=d, e=e, f=f)
| Q: Can I have a dict with a default value of 0 or [ ] or something? |
I sympathize that if you're keeping counts of something, it's much nicer to be able to say count[x] += 1 than to have to say count[x] = count.get(x, 0) + 1. And as of Python 2.2, it is easy to subclass the builtin dict class to do this. I call my version DefaultDict. Note the use of copy.deepcopy; it wouldn't do to have every key in the dict share the same [] as the default value (we waste time copying 0, but the time lost is not too bad if you do more updates and accesses than initializations):
import copy

class DefaultDict(dict):
    """Dictionary with a default value for unknown keys."""
    def __init__(self, default):
        self.default = default

    def __getitem__(self, key):
        if key in self: return self.get(key)
        return self.setdefault(key, copy.deepcopy(self.default))
Note that without DefaultDict the d[w1][w2] += 1 in the bigram example would have to be something like:
d.setdefault(w1,{}).setdefault(w2, 0); d[w1][w2] += 1
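For comparison, here is how bigram counting might look with DefaultDict. This is a minimal sketch, assuming Python 3, and the sentence is made up. The standard library's collections.defaultdict covers the same need, but takes a factory function rather than a value to copy.

```python
import collections
import copy

class DefaultDict(dict):
    """Dictionary with a default value for unknown keys."""
    def __init__(self, default):
        self.default = default

    def __getitem__(self, key):
        if key in self:
            return self.get(key)
        return self.setdefault(key, copy.deepcopy(self.default))

words = "the cat saw the dog saw the cat".split()

# Nested counts: bigrams[w1][w2] is how often w2 followed w1.
bigrams = DefaultDict(DefaultDict(0))
for w1, w2 in zip(words, words[1:]):
    bigrams[w1][w2] += 1

# collections.defaultdict takes a factory function instead of a value.
counts = collections.defaultdict(int)
for w in words:
    counts[w] += 1
```

The deepcopy is what keeps each bigrams[w1] a separate inner dictionary rather than one shared object.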
| Q: Hey, can you write code to transpose a matrix in 0.007KB or less? |
I thought you'd never ask. If you represent a matrix as a sequence of sequences, then zip can do the job:
>>> m = [(1,2,3), (4,5,6)]
>>> zip(*m)
[(1, 4), (2, 5), (3, 6)]
To understand this, you need to know that f(*m) is like apply(f, m). This is based on an old Lisp question, the answer to which is Python's equivalent of map(None,*m), but the zip version, suggested by Chih-Chung Chang, is even shorter. You might think this is only useful for an appearance on Letterman's Stupid Programmer's Tricks, but just the other day I was faced with this problem: given a list of database rows, where each row is a list of ordered values, find the list of unique values that appear in each column. So I wrote:
possible_values = map(unique, zip(*db))
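The same two tricks as a minimal sketch, assuming Python 3, where zip returns an iterator and so needs list() around it to display the result. The db rows are hypothetical, and sorted(set(...)) stands in for the unique function.

```python
m = [(1, 2, 3), (4, 5, 6)]
transposed = list(zip(*m))          # zip(*m) is zip((1, 2, 3), (4, 5, 6))

# Hypothetical rows for illustration: (name, color, size).
db = [("apple", "red", 1), ("pear", "green", 1), ("plum", "red", 2)]
possible_values = [sorted(set(column)) for column in zip(*db)]
```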
| Q: The f(*m) trick is cool. Does the same syntax work with method calls, like x.f(*y)? |
This question reveals a common misconception. There is no syntax for method calls! There is a syntax for calling a function, and there is a syntax for extracting a field from an object, and there are bound methods. Together these three features conspire to make it look like x.f(y) is a single piece of syntax, when actually it is equivalent to (x.f)(y), which is equivalent to (getattr(x, 'f'))(y). I can see you don't believe me. Look:
class X:
    def f(self, y): return 2 * y
So the answer to the question is: you can put *y or **y (or anything else that you would put into a function call) into a method call, because method calls are just function calls.
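The equivalences claimed above can be checked directly. This is a minimal sketch, assuming Python 3.

```python
class X:
    def f(self, y):
        return 2 * y

x = X()
direct = x.f(21)                    # the usual spelling
bound = x.f                         # attribute lookup yields a bound method
via_bound = bound(21)               # ...which is called like any function
via_getattr = getattr(x, 'f')(21)   # the same lookup, spelled out

args = [21]
kwargs = {'y': 21}
starred = x.f(*args)
double_starred = x.f(**kwargs)
```

All five calls go through the same bound method and return the same value.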
| Q: Can you implement abstract classes in Python in 0 lines of code? Or 4? |
Java has an abstract keyword so you can define abstract classes that cannot be instantiated, but can be subclassed if you implement all the abstract methods in the class. It is a little known fact that you can use abstract in Python in almost the same way; the difference is that you get an error at runtime when you try to call the unimplemented method, rather than at compile time. Compare:
## Python
class MyAbstractClass:
    def method1(self): abstract

class MyClass(MyAbstractClass):
    pass
Don't spend too much time looking for the abstract keyword in the Python Language Reference Manual; it isn't there. I added it to the language, and the great part is, the implementation is zero lines of code! What happens is that if you call method1, you get a NameError because there is no abstract variable. (You might say that's cheating, because it will break if somebody defines a variable called abstract. But then any program will break if someone redefines a variable that the code depends on. The only difference here is that we're depending on the lack of a definition rather than on a definition.)
If you're willing to write abstract() instead of abstract, then you can define a function that raises a NotImplementedError instead of a NameError, which makes more sense. (Also, if someone redefines abstract to be anything but a function of zero arguments, you'll still get an error message.) To make abstract's error message look nice, just peek into the stack frame to see who the offending caller is:
def abstract():
    import inspect
    caller = inspect.getouterframes(inspect.currentframe())[1][3]
    raise NotImplementedError(caller + ' must be implemented in subclass')
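Here is how the abstract() version behaves when a subclass forgets to implement the method. This is a minimal sketch, assuming Python 3; MyConcreteClass is added for contrast.

```python
import inspect

def abstract():
    """Raise NotImplementedError naming the method that called us."""
    caller = inspect.getouterframes(inspect.currentframe())[1][3]
    raise NotImplementedError(caller + ' must be implemented in subclass')

class MyAbstractClass:
    def method1(self): abstract()

class MyClass(MyAbstractClass):
    pass

class MyConcreteClass(MyAbstractClass):
    def method1(self): return 'implemented'

try:
    MyClass().method1()
except NotImplementedError as error:
    message = str(error)
```

Entry [1] of the outer frames is the caller of abstract, and field [3] of that entry is the function name, which is how the message comes to name method1.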
| Q: How do I do Enumerated Types (enums) in Python? |
The reason there is no one answer to this question in Python is that there are several answers, depending on what you expect an enum to be. If you just want some variables, each with a unique integer value, you can do:
red, green, blue = range(3)
The drawback is that whenever you add a new variable on the left, you have to increment the number on the right. This is not so bad, though, because if you get it wrong Python will raise an error. It's probably better hygiene to isolate your enums in a class:
class Colors:
    red, green, blue = range(3)
Now Colors.red yields 0, and dir(Colors) may be useful (although you need to ignore the __doc__ and __module__ entries). If you need control over what values each enum variable will have, you can use the Struct class from several questions ago as follows:
Enum = Struct
Colors = Enum(red=0, green=100, blue=200)
While these simple approaches usually suffice, some people want more. There are Enum implementations at python.org, ASPN, and faqts. Here is my version, which is (almost) all things to all people, while still being reasonably concise (44 lines, 22 of which are code):
class Enum:
    """Create an enumerated type, then add var/value pairs to it.
    The constructor and the method .ints(names) take a list of variable names,
    and assign them consecutive integers as values. The method .strs(names)
    assigns each variable name to itself (that is variable 'v' has value 'v').
    The method .vals(a=99, b=200) allows you to assign any value to variables.
    A 'list of variable names' can also be a string, which will be .split().
    The method .end() returns one more than the maximum int value.
    Example: opcodes = Enum("add sub load store").vals(illegal=255)."""

    def __init__(self, names=[]): self.ints(names)

    def set(self, var, val):
        """Set var to the value val in the enum."""
        if var in vars(self).keys(): raise AttributeError("duplicate var in enum")
        if val in vars(self).values(): raise ValueError("duplicate value in enum")
        vars(self)[var] = val
        return self

    def strs(self, names):
        """Set each of the names to itself (as a string) in the enum."""
        for var in self._parse(names): self.set(var, var)
        return self

    def ints(self, names):
        """Set each of the names to the next highest int in the enum."""
        for var in self._parse(names): self.set(var, self.end())
        return self

    def vals(self, **entries):
        """Set each of var=val pairs in the enum."""
        for (var, val) in entries.items(): self.set(var, val)
        return self

    def end(self):
        """One more than the largest int value in the enum, or 0 if none."""
        try: return max([x for x in vars(self).values() if type(x)==type(0)]) + 1
        except ValueError: return 0

    def _parse(self, names):
        ### If names is a string, parse it as a list of names.
        if type(names) == type(""): return names.split()
        else: return names
Here's an example of how to use it:
>>> opcodes = Enum("add sub load store").vals(illegal=255)
>>> opcodes.add
0
>>> opcodes.illegal
255
>>> opcodes.end()
256
>>> dir(opcodes)
['add', 'illegal', 'load', 'store', 'sub']
>>> vars(opcodes)
{'store': 3, 'sub': 1, 'add': 0, 'illegal': 255, 'load': 2}
>>> vars(opcodes).values()
[3, 1, 0, 255, 2]
Notice that the methods are cascaded, so you can combine .strs, .ints and .vals on a single line after the constructor. Notice the helpful use of dir and vars, and that they are free of clutter with anything other than the variables you define. To iterate over all the enumerated values, you can use for x in vars(opcodes).values(). Notice that you can have non-integer values for enum variables if you want, using the .strs and .vals methods. Finally, notice that it is an error to duplicate a variable name or value. Sometimes you want to have duplicate values (e.g. for aliases); if you need that, either delete the line that raises a ValueError, or use, for example, vars(opcodes)['first_op'] = 0. If there's one thing I dislike most, it's the potential for confusion between vals and values; maybe I can think of a better name for vals.
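For readers on Python 3.4 or later, the standard library's enum module covers much of the same ground. The sketch below assumes that module and is not a port of the class above; the names Opcode and Color are chosen for illustration.

```python
import enum

class Opcode(enum.IntEnum):
    add = 0
    sub = 1
    load = 2
    store = 3
    illegal = 255

# The functional form takes a space-separated string of names,
# but numbers the members from 1 rather than 0.
Color = enum.Enum('Color', 'red green blue')

names = [op.name for op in Opcode]
```

Iterating over the class yields the members in definition order, which plays the role of vars(opcodes).values() above.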
| Q: Why is there no ``Set'' data type in Python? |
Well, there is a proposal to add a Set type, but it is still under review. Until it is approved, you can either create your own Set type, or implement them with either lists or dictionaries, or sorted lists (using the bisect module). The first two easily support literal creation, iteration, membership test, and adding or deleting an element. With sorted lists, you'd need to write your own functions for testing set membership and doing removal. Clearly this is only worth it if you will have many more membership tests than insertions or deletions, and if you have elements that can't be hashed. The code and the time complexity for each operation are summarized here, along with the methods (==, hash, or cmp) that elements of the set must support for each of the three techniques:
| Implementation | Iteration | Membership | Add | Remove | Requirement |
|---|---|---|---|---|---|
| s=[1,2,3] list | for e in s O(n) | e in s O(n) | s.append(e) O(1) | s.remove(e) O(n) | e.__eq__ |
| s={1:1, 2:1, 3:1} dict | for e in s O(n) | e in s O(1) | s[e] = 1 O(1) | del s[e] O(1) | e.__hash__ |
| s=[1,2,3]; s.sort() sorted list | for e in s O(n) | define a function O(log n) | bisect.insort(s, e) O(n) | define a function O(n) | e.__cmp__ |
The list-based implementation is asymptotically slower, and usually slower in practice as well. Also, it's worse than we show above, in that if there is a chance that the element you want to add is already in the set, then you have to do "if e not in s: s.append(e)", and if there is a chance the element you want to delete is not, then you need "if e in s: s.remove(e)". The dictionary-based approach has better asymptotic performance for large sets, but it has two problems. First, it only works for hashable elements, so you can't use it to make a set of lists, or of sets. Tuples, ok. Second, the syntax of literal dictionaries is awkward. You could use Set(1, 2, 3) with the following function:
def Set(*elements):
    "Create a set of {e: 1} entries for e in elements."
    set = {}
    for e in elements:
        set[e] = 1
    return set
The main reason why the set proposal has not gone forward is that there is no consensus on what to do about non-hashable elements, particularly mutable types. I would vote for caveat emptor: I would put a hash method on lists (and instances), and the user should know enough not to alter a value that is a key in a dict, just like the user has to know not to alter a value that is held in any other data structure that needs the original copy. But I understand the principle of least surprise, and that core Python features like dicts (and perhaps in the future sets) need to be especially un-surprising.
If Python is not going to provide a set type for you, you need to support it yourself. You can't add new methods to lists and dicts, but you can define some functions that will work on anything that supports the "in" protocol, as follows:
import types

false, true = 0, 1   # the Boolean names used in this document

def set(elements):
    """Coerce the input sequence into a set. See also the function Set.
    set([10, 20]) ==> {10: 1, 20: 1}; set(_) ==> {10: 1, 20: 1}"""
    if isinstance(elements, types.DictType):
        return elements.copy()
    set = {}
    for e in elements:
        set[e] = 1
    return set

def is_set(s):
    "Return true if argument is a dict."
    return isinstance(s, types.DictType)

def unique(seq):
    """Remove duplicate elements from seq. Assumes hashable elements.
    Ex: unique([1, 2, 3, 2, 1]) ==> [1, 2, 3] # order may vary"""
    return set(seq).keys()

def intersection(a, b):
    """Return the intersection of two dicts or sequences.
    Ex: intersection([1, 2, 3], [2, 3, 4]) ==> [2, 3]"""
    if not is_set(a): a = set(a)
    return [x for x in b if x in a]

def union(a, b):
    """Return a sequence with all the elements of a and b.
    Ex: union([1, 2, 3], [2, 3, 4]) ==> [1, 2, 3, 4]"""
    s = set(a)
    for x in b:
        s[x] = 1
    return s.keys()

def set_difference(a, b):
    """Return the elements of a that are not in b"""
    if not is_set(b): b = set(b)
    return [x for x in a if x not in b]

def is_subset(a, b):
    """Return true iff a is a subset of b."""
    if len(a) > len(b): return false
    if not is_set(b): b = set(b)
    for x in a:
        if x not in b: return false
    return true
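Python 2.4 and later do have a built-in set type, so on a current interpreter most of these helpers reduce to operators. This is a minimal sketch, assuming Python 3. The unique function here keeps first occurrences in order, which is a small change from the dict-based version above.

```python
a = {1, 2, 3}
b = set([2, 3, 4])

both = a & b                 # intersection
either = a | b               # union
only_a = a - b               # set difference
contained = {2, 3} <= a      # subset test

def unique(seq):
    """Remove duplicates from seq, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# frozenset is hashable, so it can be an element of another set.
nested = {frozenset(a), frozenset(b)}
```

Elements still have to be hashable, so the question of what to do about lists and other mutable values was settled in favor of excluding them.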
| Q: Should I, could I use a Boolean type? |
Some languages enforce a Boolean type. In Java, for example, you have to say if (i != 0) rather than if (i). The advantage of this strict-typing approach is that it catches some type errors at compile time, and makes it less likely that you'll suffer from writing if (i=1) when you meant if (i==1). As you know, Python has no Boolean type. Instead, certain values are considered false, namely 0, 0.0, 0j, [], {}, '', None, and any instance of a class that returns false for the __nonzero__() method or zero for the __len__() method. Everything else is considered true. The advantage of this promiscuous approach is that code can be more concise. For example, you can write functions that return a useful non-false value when they are applicable, and None when they can't come up with an answer. Then you can write return f() or g() or h() instead of the more verbose
tmp = f()
if tmp is not None: return tmp
tmp = g()
if tmp is not None: return tmp
return h()
Suppose you wanted Python to be more like Java. What can you do? The first step is to use variable names instead of 0 and 1 (as you would in C and most other languages):
false, true = range(2)
or
false = 0; true = not false
This is worth doing in almost all cases. You do pay a speed penalty (about 30%) in looking up the name 'true' rather than loading the constant 0 or 1, but if you're that concerned with speed, you probably need to re-code in C.
The next step is to define a Boolean type that is distinct from integers, None and everything else. Here's one possibility:
class Bool:
    def __init__(self, val): self.val = not not val
    def __nonzero__(self): return self.val
    def __cmp__(self, other): return cmp(self.val, other.val)
    def __repr__(self): return ['false', 'true'][self.val]

true, false = Bool(1), Bool(0)
Now you can type in false or true at the interactive prompt and see the answer true returned. This is satisfying, but if you then try 1 + 1 == 2, you'll get back 1, not true. There is no way to impose your Bool type into the rest of the Python types (without breaking other people's code), so it's probably not worth doing. The code here does show some interesting idioms, however.
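Python 2.3 and later include a built-in bool type with the constants True and False, which removes the "rest of the Python types" problem because comparisons themselves return it. This is a minimal sketch, assuming Python 3; the Empty class is hypothetical.

```python
comparison = (1 + 1 == 2)

# bool is a subclass of int, so older code that treats truth values as
# 0 and 1 keeps working.
is_int_subclass = issubclass(bool, int)
total = True + True
label = ['false', 'true'][comparison]

# Truth testing of other objects is unchanged: zero, empty containers
# and None are still false.
falsy = [value for value in (0, 0.0, 0j, [], {}, '', None) if not value]

# A class opts in through __bool__ (named __nonzero__ in Python 2) or __len__.
class Empty:
    def __len__(self):
        return 0
```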
| Q: Can I do the equivalent of (test ? result : alternative) in Python? |
Java and C++ have the ternary conditional operator (test ? result : alternative). Python does not (again, because Python is committed to a clear distinction between expressions and statements). In the last question, we saw ['false', 'true'][self.val] as an alternative for (val==1) ? "true" : "false". Another possibility in Python is self.val and 'true' or 'false'; some find it idiomatic while others find it confusing. It only works when the result after the "and" is guaranteed to be true. If it could be false, you'd better use an if. If you're pathologically committed to squeezing everything into a single expression, you could do:
(test and [result] or [alternative])[0]
or
[lambda: result, lambda: alternative][not not test]()
But don't say I told you to do it. You can even package this up in a call. The approved naming convention for variables that mimic keywords is to add a trailing underscore. So we have:
def if_(test, result, alternative=None):
    "If test is true, 'do' result, else alternative. 'Do' means call if callable."
    if test:
        if callable(result): result = result()
        return result
    else:
        if callable(alternative): alternative = alternative()
        return alternative
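A usage sketch for if_, assuming Python 3; the helper names safe_ratio and label are invented for the example. Python 2.5 and later also have a conditional expression, result if test else alternative, which evaluates only the branch it needs and is shown alongside for comparison.

```python
def if_(test, result, alternative=None):
    "If test is true, 'do' result, else alternative. 'Do' means call if callable."
    if test:
        if callable(result): result = result()
        return result
    else:
        if callable(alternative): alternative = alternative()
        return alternative

def safe_ratio(a, b):
    # The lambda delays the division so it only runs when b is nonzero.
    return if_(b != 0, lambda: a / b, 0)

def label(val):
    # Conditional expression: only the chosen branch is evaluated.
    return 'true' if val else 'false'

# The and/or idiom goes wrong when the "true" result is itself false.
wrong = 1 and '' or 'alternative'      # 'alternative', not ''
right = '' if 1 else 'alternative'     # ''
```

Without the lambda, a / b would be evaluated before if_ is even called, which is why the "call if callable" convention matters.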
Now, suppose for some reason you strongly prefer the syntax "if (test) ..." over "if(test, ..." (and, you never want to leave off the alternative part). You could try this:
def _if(test):
    # Called as _if (test) (result) (alternative)
    return lambda result: \
           lambda alternative: \
               [delay(alternative), delay(result)][not not test]()

def delay(f):
    if callable(f): return f
    else: return lambda: f
If u cn rd ths, u cn gt a jb in fncnl prg (if thr wr any).
| Q: What other major types are missing from Python? |
One great thing about Python is that you can go a long way with numbers, strings, lists, and dicts. But there are a few major types that are still missing. For me, the most important is a mutable string. Doing str += x over and over, is slow, and manipulating lists of characters (or lists of sub-strings) means you give up some of the nice string functions. One possibility is array.array('c'). Another is UserString.MutableString, although its intended use is more educational than practical. A third is the mmap module and a fourth is cStringIO. None of these is perfect, but together they provide enough choices. After that, I find I often want a queue of some sort. There is a standard library Queue module, but it is specialized for queues of threads. Because there are so many options, I won't lobby for a standard library implementation of queues. However, I will offer my implementation of three types of queue, FIFO, LIFO, and priority:
"""
This module provides three types of queues, with these constructors:
  Stack([items])  -- Create a Last In First Out queue, implemented as a list
  Queue([items])  -- Create a First In First Out queue
  PriorityQueue([items]) -- Create a queue where minimum item (by <) is first
Here [items] is an optional list of initial items; if omitted, queue is empty.
Each type supports the following methods and functions:
  len(q)          -- number of items in q (also q.__len__())
  q.append(item)  -- add an item to the queue
  q.extend(items) -- add each of the items to the queue
  q.pop()         -- remove and return the "first" item from the queue
"""

import operator

def Stack(items=None):
    "A stack, or last-in-first-out queue, is implemented as a list."
    return items or []

class Queue:
    "A first-in-first-out queue."
    def __init__(self, items=None): self.start = 0; self.A = items or []
    def __len__(self): return len(self.A) - self.start
    def append(self, item): self.A.append(item)
    def extend(self, items): self.A.extend(items)

    def pop(self):
        A = self.A
        item = A[self.start]
        self.start += 1
        # Reclaim the space used by popped items once it is worth the copy.
        if self.start > 100 and self.start > len(A)/2:
            del A[:self.start]
            self.start = 0
        return item

class PriorityQueue:
    "A queue in which the minimum element (as determined by cmp) is first."
    def __init__(self, items=None, cmp=operator.lt):
        self.A = []; self.cmp = cmp
        if items: self.extend(items)

    def __len__(self): return len(self.A)

    def append(self, item):
        A, cmp = self.A, self.cmp
        A.append(item)
        i = len(A) - 1
        # The parent of i is (i-1)//2, matching the children 2*i+1 and
        # 2*i+2 used in heapify.
        while i > 0 and cmp(item, A[(i-1)//2]):
            A[i], i = A[(i-1)//2], (i-1)//2
        A[i] = item

    def extend(self, items):
        for item in items: self.append(item)

    def pop(self):
        A = self.A
        if len(A) == 1: return A.pop()
        e = A[0]
        A[0] = A.pop()
        self.heapify(0)
        return e

    def heapify(self, i):
        """Assumes A is an array whose left and right children are heaps,
        move A[i] into the correct position. See CLR&S p. 130"""
        A, cmp = self.A, self.cmp
        left, right, N = 2*i + 1, 2*i + 2, len(A)-1
        if left <= N and cmp(A[left], A[i]):
            smallest = left
        else:
            smallest = i
        if right <= N and cmp(A[right], A[smallest]):
            smallest = right
        if smallest != i:
            A[i], A[smallest] = A[smallest], A[i]
            self.heapify(smallest)
Notice the idiom ``items or [].'' It would be very wrong to do something like
def Stack(items=[]): return items
to indicate that the default is an empty list of items. If we did that, then different stacks would share the same list. By making the default value be None (a false value that is outside the range of valid inputs), we can arrange so that each instance gets its own fresh list. One possible objection to the use of this idiom in this example: a user who does
s = Stack(items)
might expect that s and items become identical, but that only happens when items is not empty. I would say that this objection is not too serious, because no such promise is explicitly made. (Indeed, a user might also expect that items remains unmodified, which is only the case when items is empty.)
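Both the shared-default trap and the objection can be seen in a few lines. This is a minimal sketch, assuming Python 3; the function names are made up so the two versions can sit side by side.

```python
def shared_stack(items=[]):
    # The default list is created once, when the def statement runs.
    return items

def fresh_stack(items=None):
    # `items or []` builds a new list on every call that needs one.
    return items or []

a = shared_stack()
b = shared_stack()
a.append('oops')               # b sees this too: a and b are one list

c = fresh_stack()
d = fresh_stack()
c.append('fine')               # d is unaffected

# The objection from the text: an empty list passed in is not reused...
mine = []
s = fresh_stack(mine)
# ...but a non-empty one is, so appends to the stack show up in it.
theirs = [1]
t = fresh_stack(theirs)
t.append(2)
```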
| Q: How do I do the Singleton Pattern in Python? |
I assume you mean that you want a class that can only be instantiated once, and raises an exception if you try to make another one. The simplest way I know to do that is to define a function that enforces the idea, and call the function from the constructor in your class:
def singleton(object, instantiated=[]):
    "Raise an exception if an object of this class has been instantiated before."
    assert object.__class__ not in instantiated, \
        "%s is a Singleton class but is already instantiated" % object.__class__
    instantiated.append(object.__class__)

class YourClass:
    "A singleton class to do something ..."
    def __init__(self, args):
        singleton(self)
        ...
You could also mess around with metaclasses so that you could write class YourClass(Singleton), but why bother? Before the Gang of Four got all academic on us, ``singleton'' (without the formal name) was just a simple idea that deserved a simple line of code, not a whole religion.
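In use, the second instantiation fails. This is a minimal sketch, assuming Python 3 run without the -O flag (which strips assert statements); the Registry class is hypothetical.

```python
def singleton(obj, instantiated=[]):
    "Raise an exception if an object of this class has been instantiated before."
    # The mutable default is deliberate here: it is the shared registry.
    assert obj.__class__ not in instantiated, \
        "%s is a Singleton class but is already instantiated" % obj.__class__
    instantiated.append(obj.__class__)

class Registry:
    "A singleton class (hypothetical, for illustration)."
    def __init__(self):
        singleton(self)
        self.entries = {}

first = Registry()
try:
    second = Registry()
except AssertionError as error:
    complaint = str(error)
```

This is one of the few places where a mutable default argument is exactly what is wanted.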
| Q: Is no "news" good news? |
I presume you mean is it good that Python has no new keyword. It is indeed. In C++, new is used to mark allocation on the heap rather than the stack. As such, the keyword is useful. In Java, all objects are heap-allocated, so new has no real purpose; it only serves as a reminder of the distinction between a constructor and other static methods. But making this distinction probably does more harm than good in Java, because the distinction is a low-level one that forces implementation decisions that really should be delayed. I think Python made the right choice in keeping the syntax of a constructor call the same as the syntax of a normal function call.
For example, let's reconsider our ill-fated Bool class. Suppose we wanted to enforce the idea that there should be only one true and one false object of type Bool. One way to do that is to rename the class Bool to _Bool (so that it won't be exported), and then define a function Bool as follows:
def Bool(val):
    if val: return true
    else: return false

true, false = _Bool(1), _Bool(0)
This makes the function Bool a factory for _Bool objects (although admittedly a factory with an unusually small capacity). The point is that the programmer who calls Bool(1) should not know or care if the object returned is a new one or a recycled one (at least in the case of immutable objects). Python syntax allows that distinction to be hidden, while Java syntax does not.
There is some confusion in the literature; some people use the term "Singleton Pattern" for this type of factory, where there is a singleton object for each different argument to the constructor. I vote with what I believe is the majority in my definition of Singleton in the previous question. You can also encapsulate this pattern in a class. We'll call it "CachedFactory." The idea is that you write
class Bool:
    ... ## see here for Bool's definition

Bool = CachedFactory(Bool)
and then the first time you call Bool(1) the argument list (1,) gets delegated to the original Bool class, but any subsequent calls to Bool(1) return that first object, which gets kept in a cache:
class CachedFactory:
    def __init__(self, klass):
        self.cache = {}
        self.klass = klass
    def __call__(self, *args):
        # Return the cached object if these arguments were seen before.
        if args in self.cache:
            return self.cache[args]
        else:
            obj = self.cache[args] = self.klass(*args)
            return obj
|
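As a rough illustration of how such a factory behaves, the sketch below wraps a small class and checks object identity. The Point class is hypothetical and exists only for the demonstration; the code assumes Python 3, and it assumes that all constructor arguments are hashable, since they are used as a dictionary key.

```python
class CachedFactory:
    def __init__(self, klass):
        self.cache = {}
        self.klass = klass

    def __call__(self, *args):
        # args is a tuple, so it can serve as a dictionary key
        # as long as every argument is hashable.
        if args not in self.cache:
            self.cache[args] = self.klass(*args)
        return self.cache[args]

class Point:
    """Hypothetical value class, used only for illustration."""
    def __init__(self, x, y):
        self.x, self.y = x, y

# Rebind the name: Point is now a factory object, not the class itself.
Point = CachedFactory(Point)

a = Point(1, 2)
b = Point(1, 2)   # same arguments, so the cached object comes back
c = Point(3, 4)   # new arguments, so a new object is built
```

Here a is b, while c is a different object. One consequence of the rebinding is that Point is no longer a class, so isinstance(a, Point) raises a TypeError; the original class is still reachable as Point.klass.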
One thing to notice is that nothing rests on classes and constructors; this pattern would work with any callable. When applied to functions in general, it is called the "Memoization Pattern". The implementation is the same, only the names are changed:
class Memoize:
    def __init__(self, fn):
        self.cache = {}
        self.fn = fn
    def __call__(self, *args):
        # Return the cached result if these arguments were seen before.
        if args in self.cache:
            return self.cache[args]
        else:
            result = self.cache[args] = self.fn(*args)
            return result
|
Now you can do fact = Memoize(fact) and get factorials computed in amortized O(1) time, not O(n): once a value has been computed, asking for it again costs only a dictionary lookup.
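A small sketch makes the saving visible. It assumes Python 3 and a recursive fact that calls itself through its global name, which is what lets the recursive calls hit the cache too; the calls list is a hypothetical bookkeeping device that records each time the function body actually runs.

```python
class Memoize:
    def __init__(self, fn):
        self.cache = {}
        self.fn = fn

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.fn(*args)
        return self.cache[args]

calls = []  # every argument for which the function body really ran

def fact(n):
    calls.append(n)
    if n <= 1:
        return 1
    # The name fact is looked up at call time, so after the rebinding
    # below this recursive call goes through the Memoize object.
    return n * fact(n - 1)

fact = Memoize(fact)

first = fact(5)    # runs the body for 5, 4, 3, 2, 1
second = fact(5)   # answered entirely from the cache
third = fact(6)    # runs the body only for 6; fact(5) is cached
```

After these three calls, calls holds 5, 4, 3, 2, 1, 6: the second call did no work at all, and the third did a single multiplication.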
| Q: Can I have a history mechanism like in the shell? |
Yes. Is this what you want?
>>> from shellhistory import h
h[2] >>> 7*8
56
h[3] >>> 9*9
81
h[4] >>> h[2]
56
h[5] >>> 'hello' + ' world'
'hello world'
h[6] >>> h
[None, 9, 56, 81, 56, 'hello world']
h[7] >>> h[5] * 2
'hello worldhello world'
h[8] >>> h[7] is _ is h[-1]
1
|
How does this work? The variable sys.ps1 is the system prompt. By default it is the string '>>> ' but you can set it to anything else. If you set it to a non-string object, the object's __str__ method gets called. So we'll create an object whose __str__ method appends the most recent result (the variable _) to a list called h (for history), and then returns a prompt string that includes the length of the list followed by '>>>'.
Or at least that was the plan. As it turns out (at least on the IDLE 2.2 implementation on Windows), sys.ps1.__str__ gets called three times before the prompt is printed, not just once. Don't ask me why. To combat this, I only append _ when it is not already the last element in the history list. I don't bother inserting None into the history list, because it's not displayed by the Python interactive loop, and I don't insert h itself into h, because the circularity could lead to problems printing or comparing.
Another complication was that the Python interpreter actually attempts to print '\n' + sys.ps1 (when it should print the '\n' separately, or print '\n' + str(sys.ps1)), which means that sys.ps1 needs an __radd__ method as well.
Finally, my first version would fail if imported as the very first input in a Python session (or in the .python startup file). After some detective work, it turned out that this is because the variable _ is not bound until after the first expression is evaluated. So I catch the exception if _ is unbound. That gives us:
import sys
h = [None]
class Prompt:
    "Create a prompt that stores results (i.e. _) in the array h."
    def __init__(self, str='h[%d] >>> '):
        self.str = str
    def __str__(self):
        try:
            if _ not in [h[-1], None, h]: h.append(_)
        except NameError:
            pass
        return self.str % len(h)
    def __radd__(self, other):
        return str(other) + str(self)
sys.ps1 = Prompt()
|
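The de-duplication and the __radd__ trick can be exercised outside an interactive session. The following is only a sketch under a few assumptions: it targets Python 3, it reads the last result explicitly from builtins._ (where the interactive display hook stores it) instead of relying on name lookup, and show_result is a hypothetical helper that mimics the interpreter asking for the prompt three times after displaying a value.

```python
import builtins

h = [None]  # history list; slot 0 is a placeholder, as in the original

class Prompt:
    def __init__(self, template='h[%d] >>> '):
        self.template = template

    def __str__(self):
        # builtins._ does not exist until a first result has been displayed.
        last = getattr(builtins, '_', None)
        # Skip None, skip h itself, and skip a value already recorded last.
        if last is not None and last is not h and last != h[-1]:
            h.append(last)
        return self.template % len(h)

    def __radd__(self, other):
        # Supports expressions of the form '\n' + prompt.
        return str(other) + str(self)

prompt = Prompt()

def show_result(value):
    """Hypothetical helper: pretend the interpreter displayed `value`,
    then converted the prompt to a string three times."""
    builtins._ = value
    rendered = [str(prompt) for attempt in range(3)]
    return rendered[-1]

p1 = show_result(56)      # recorded once, despite three __str__ calls
p2 = show_result(81)
p3 = show_result(None)    # None is never recorded
joined = '\n' + prompt    # handled by __radd__
```

After this runs, h is [None, 56, 81]. Note a side effect of the de-duplication: evaluating the same value twice in a row records it only once.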
| Q: How do I time the execution of my functions? |
Here's a simple answer:
def timer(fn, *args):
    "Time the application of fn to args. Return (result, seconds)."
    import time
    start = time.clock()
    return fn(*args), time.clock() - start
|
There's a more complex answer in my utils module.
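On current Python 3 releases time.clock no longer exists (it was removed in Python 3.8), so the simple timer needs a different clock. The sketch below swaps in time.perf_counter, which is my substitution and not part of the original answer; the call to sum is just an arbitrary workload for the usage example.

```python
import time

def timer(fn, *args):
    "Time the application of fn to args. Return (result, seconds)."
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Usage: time the summation of the first million integers.
result, seconds = timer(sum, range(1000000))
```

The result is returned alongside the elapsed time, so the timed call can be dropped into existing code without losing its value.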
| Q: What does your .python startup file look like? |
Currently it looks like this, but it's been changing a lot:
from __future__ import nested_scopes
import sys, os, string, time
from utils import *

################ Interactive Prompt and Debugging ################

try:
    import readline
except ImportError:
    print "Module readline not available."
else:
    import rlcompleter
    readline.parse_and_bind("tab: complete")

h = [None]

class Prompt:
    def __init__(self, str='h[%d] >>> '):
        self.str = str
    def __str__(self):
        try:
            if _ not in [h[-1], None, h]: h.append(_)
        except NameError:
            pass
        return self.str % len(h)
    def __radd__(self, other):
        return str(other) + str(self)

if os.environ.get('TERM') in [ 'xterm', 'vt100' ]:
    sys.ps1 = Prompt('\001\033[0:1;31m\002h[%d] >>> \001\033[0m\002')
else:
    sys.ps1 = Prompt()
sys.ps2 = ''
|
Thanks to Amit J. Patel, Max M, Dan Winkler, Chih-Chung Chang, Bruce Eckel, Kalle Svensson, Mike Orr, Steven Rogers and others who contributed ideas and corrections.
Peter Norvig