
9 Worst Python Practices

Recently I have been reviewing some old systems, a few of which have become painful to work with because of bad coding habits. I have also written a piece of bad code myself that caused the server load to soar. So I want to summarize these bad Python coding habits to remind myself to stay away from such "worst practices".

Some of the examples below cause performance problems, some lead to hidden bugs or make future maintenance and refactoring difficult, and others are simply, in my opinion, not Pythonic enough.

Use mutable objects as default parameters

This bad habit has already been covered in many technical articles.

Let's look at the wrong way first:

def use_mutable_default_param(idx=0, ids=[]):
    ids.append(idx)
    print(ids)

use_mutable_default_param(idx=1)
use_mutable_default_param(idx=2)

[1]
[1, 2]

There are two key reasons for this:

  1. The function itself is an object, and the default parameter values are bound to the function object when the function is defined.
  2. The append method modifies the object in place, so the next time the function is called, the bound default parameter is no longer an empty list.
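
The binding described above can be observed through the function's `__defaults__` attribute. This is only a minimal sketch; `append_to_shared_default` is a hypothetical name used for illustration.

```python
def append_to_shared_default(idx=0, ids=[]):
    # `ids` refers to the single list created when `def` was executed.
    ids.append(idx)
    return ids


first = append_to_shared_default(1)
second = append_to_shared_default(2)

# Both calls returned the very same list object.
print(first is second)

# Default values live on the function object as a tuple,
# and the list stored there has been mutated by the calls.
print(append_to_shared_default.__defaults__)
```

Because the list lives on the function object, every call that omits `ids` sees whatever earlier calls left in it.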

The correct practice is as follows:

def donot_use_mutable_default_param(idx=0, ids=None):
    if ids is None:
        ids = []  # a fresh list is created on every call
    ids.append(idx)
    print(ids)

Don't specify the exception type in try...except

Although using try...except in Python does not cause serious performance problems, catching every type of exception indiscriminately often masks other bugs and makes them difficult to trace.

In general, try...except should be used sparingly so that problems surface early in the development phase. When you do use it, catch the most specific exception type you can, and in the except clause either write the exception information to the log or re-raise the exception after handling it.
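
As a rough sketch of that advice, the example below catches only the exceptions that `int()` is expected to raise and logs them. The function `parse_port` and its default value are hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def parse_port(raw_value, default=8080):
    """Convert a configuration value to an int, falling back to a default."""
    try:
        return int(raw_value)
    except (TypeError, ValueError):
        # Only the failures int() is expected to raise are handled here;
        # anything else still propagates to the caller.
        logger.exception("Invalid port value %r, using default %d",
                         raw_value, default)
        return default


print(parse_port("8000"))
print(parse_port("not-a-number"))
print(parse_port(None))
```

A bare `except:` in the same place would also swallow unrelated errors, such as a typo in a variable name inside the try block.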

Redundant code about dict

I can often see such code:

d = {}
datas = [1, 2, 3, 4, 2, 3, 4, 1, 5]
for k in datas:
    if k not in d:
        d[k] = 0 
    d[k] += 1

In fact, you can use the data structure collections.defaultdict to implement such a function more simply and elegantly:

from collections import defaultdict

default_d = defaultdict(int)  # missing keys start at int(), i.e. 0
datas = [1, 2, 3, 4, 2, 3, 4, 1, 5]
for k in datas:
    default_d[k] += 1

Similarly, this code:

# d is a dict
if 'list' not in d:
    d['list'] = []
d['list'].append(x)

can be replaced with a single line:

# d is a dict
d.setdefault('list', []).append(x)

Likewise, the following two patterns smell strongly of C:

# d is a dict
for k in d:
    v = d[k]
    # do something

# l is a list
for i in range(len(l)):
    v = l[i]
    # do something

It is better to write them in a more Pythonic way:

# d is a dict
for k, v in d.items():  # d.iteritems() in Python 2
    # do something

# l is a list
for i, v in enumerate(l):
    # do something

Actually, enumerate takes a second parameter that specifies where the numbering starts. If you want the sequence number to start from 1, you can use enumerate(l, 1).
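
A small sketch of the start parameter, reusing a few of the sample names from this article:

```python
names = ['Jone', 'Aric', 'Luise']

# The second argument only changes the counter;
# iteration still begins at the first element of the list.
numbered = list(enumerate(names, 1))
for number, name in numbered:
    print(number, name)
```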

Use the flag variable instead of for...else

Again, code like this is common:

search_list = ['Jone', 'Aric', 'Luise', 'Frank', 'Wey']
found = False
for s in search_list:
    if s.startswith('C'):
        found = True
        # do something when found

if not found:
    # do something when not found
    print('Not found')

In fact, using for...else will be more elegant:

search_list = ['Jone', 'Aric', 'Luise', 'Frank', 'Wey']
for s in search_list:
    if s.startswith('C'):
        # do something when found
        break
else:
    # runs only if the loop finished without hitting break
    print('Not found')

Excessive use of tuple unpacking

Python lets you unpack a tuple directly:

# human = ('James', 180, 32)
name, height, age = human

This is very neat, and much nicer than writing name = human[0]. However, it is often abused, with the result that human gets unpacked like this all over the program.

If you later need to add a gender field, sex, to human, then every one of these unpack operations has to be modified, even where sex is not used in the logic at all.

# human = ('James', 180, 32, 0)
name, height, age, _ = human
# or
# name, height, age, sex = human

There are several ways to solve this problem:

  1. Stick with name = human[0], and add sex = human[3] only in the places that need the gender information.
  2. Use a dict to represent human.
  3. Use namedtuple:

from collections import namedtuple

human = namedtuple('human', ['name', 'height', 'age', 'sex'])
h = human('James', 180, 32, 0)
# then you can use h.name, h.sex and so on everywhere.

Use import * everywhere

Using import * is a lazy habit that not only pollutes the current namespace, but also defeats code checking tools such as pyflakes. Later, when reading or debugging the code, it is often difficult to work out which module a third-party function came from among a pile of import * statements.

In short, this habit does nothing but harm.
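
To make the namespace pollution concrete, here is a minimal sketch using two standard library modules that both define `sqrt`. The star imports are run through `exec` in a separate dictionary purely so the example does not pollute its own namespace; that detail is an illustration device, not a recommended pattern.

```python
import math
import cmath

# Explicit module prefixes keep the origin of each function visible.
print(math.sqrt(4))
print(cmath.sqrt(-4))

# With star imports, the later import silently replaces the earlier name.
namespace = {}
exec("from math import *\nfrom cmath import *", namespace)
shadowed_sqrt = namespace["sqrt"]
print(shadowed_sqrt is cmath.sqrt)
```

After both star imports, a plain `sqrt` refers to the complex version, and nothing at the call site reveals that.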

File operation

Do not use f = open('filename') for file operations. Use with open('filename') as f so that the context manager takes care of the messy details for you, such as closing the file when an exception occurs.
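
The sketch below shows the file being closed both on a normal exit and when the body raises. The temporary path and the simulated `RuntimeError` are assumptions made only for this illustration.

```python
import os
import tempfile

# A throwaway file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as f:
    f.write("hello\n")
# Leaving the with block has closed the file.
print(f.closed)

try:
    with open(path) as f:
        first_line = f.readline()
        raise RuntimeError("simulated failure while processing the file")
except RuntimeError:
    pass
# The file was closed even though the body raised.
print(f.closed)
```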

Use __class__.__name__ to determine type

I once ran into this bug: to implement a particular feature, I wrote a new class B(A) and overrode several of A's methods in B. The implementation was simple, but some functionality that worked for A no longer worked. Eventually I found the cause: some logic elsewhere checked entity.__class__.__name__ == 'A'.

Do not use __class__.__name__ unless you really want to pin the check to one exact type in the inheritance hierarchy (that is, to exclude subclasses that may appear in the future); use the built-in function isinstance instead. After all, the many underscores in the names of these two attributes signal that they are not meant for everyday use.
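
A minimal sketch of the difference, using empty stand-ins for the classes A and B mentioned above:

```python
class A:
    pass


class B(A):
    pass


entity = B()

# The name comparison rejects the subclass instance...
name_check = entity.__class__.__name__ == 'A'
# ...while isinstance accepts instances of A and of any subclass of A.
instance_check = isinstance(entity, A)

print(name_check, instance_check)
```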

Multi-layer function calls inside a loop

Multi-layer function calls inside a loop bring two hidden risks:

  1. Python has no inline functions, so every function call carries a certain amount of overhead. When the logic is simple, that overhead makes up a considerable share of the total cost.
  2. What's more, when you maintain the code later, you are likely to overlook the fact that the function is called inside a loop. You will then tend to add calls inside the function that are relatively expensive but do not need to run on every iteration, such as time.localtime(). In a straightforward loop, I think most programmers would put time.localtime() outside the loop, but they will not once multi-layer function calls are involved.

So I suggest that, unless the logic is particularly complicated, you write it directly inside the loop instead of wrapping it in function calls. If you must wrap it in a function, tell subsequent maintainers through the function's name or comments that this function will be used inside a loop.
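
The second risk can be sketched as follows. Both `format_record_slow` and `format_record` are hypothetical helpers invented for this illustration; the point is only where the `time.localtime()` call ends up.

```python
import time


def format_record_slow(record):
    # Hidden cost: the local time is looked up again for every record.
    now = time.localtime()
    return "%s %s" % (time.strftime("%Y-%m-%d", now), record)


def format_record(record, date_prefix):
    # Called inside a loop: the caller computes date_prefix once.
    return "%s %s" % (date_prefix, record)


records = ["a", "b", "c"]

# The loop-invariant value is computed once, outside the loop.
date_prefix = time.strftime("%Y-%m-%d", time.localtime())
formatted = [format_record(r, date_prefix) for r in records]
print(formatted)
```

Passing the precomputed value as a parameter keeps the function wrapper while making the per-iteration cost visible at the call site.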

Python is a very easy language to get started with. Its strict indentation requirements and rich built-in data types help most Python code turn out reasonably well. However, it is also easy to write bad code in Python. The list above covers only a small part of the bad practices out there. If anything is missing, please feel free to tell me.



9. Writing new code in Python 2 in 2018.

Pretty good up until that last point on optimising loops. If something is constant throughout all iterations of the loop then it should be calculated once, outside the loop.

If code inside a loop is more readable/maintainable when expressed through functions then do that first. Only if you then have a speed problem that profiling points to that loop should you think of optimising that loop.

I didn't know about the issue with using a mutable object as a default parameter. Thanks for the good suggestions.

Thanks for this article. I used most of these techniques when reading more of the new features in Python 2.7. However I have discovered new things from this article I didn't know about. 

The flag example can be replaced with a one-line Pythonic version using the function `any` and a generator expression.

`found = any((s.startswith('C') for s in search_list))`