Some Tips For Python Variables And Code Quality

Defining and using variables is always one of the first skills to master in any programming language.

There is a strong connection between variables and the quality of the code. Among the many concerns about variables, naming them well is especially important.

How To Name Variables

There is a famous witticism in the field of computer science:

There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton

There is no need to dwell on the difficulty of the first, "cache invalidation", because anyone who has used a cache will understand. As for the second, "naming things", I know it from experience too. One of the darkest afternoons of my career was spent sitting in front of the monitor, scratching my head over a proper name for a new project.

And the things you name most often when programming are variables. Giving a variable a good name matters, because proper naming can greatly improve the overall readability of the code.

The following points are the basic principles I try to follow when naming variables.

1. Variable names should be descriptive and not too broad

Within an acceptable length, a variable name should describe the content it points to as accurately as possible. So try not to use overly broad words as variable names:

  • BAD: day, host, cards, temp
  • GOOD: day_of_week, hosts_to_reboot, expired_cards
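As a small illustration of the difference, the sketch below builds the same list under a broad name and a descriptive one. The Card class, its fields, and the sample data are hypothetical, made up only for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Card:
    # Hypothetical record type, used only for this illustration.
    number: str
    expires_on: date


today = date(2024, 1, 1)
all_cards = [
    Card("1111", date(2023, 6, 30)),
    Card("2222", date(2025, 6, 30)),
]

# Too broad: "cards" does not say which cards these are.
cards = [card for card in all_cards if card.expires_on < today]

# Descriptive: the name states what the list contains.
expired_cards = [card for card in all_cards if card.expires_on < today]
```

Both lists hold the same data, but only the second name tells the reader what to expect without looking at the filter.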

2. Variable names should let readers guess their types

Everyone who learns Python knows that it is a dynamically typed language with no variable type declarations (at least before PEP 484). So when you see a variable, you have no way to know its type except by guessing from the context.

However, there are some intuitive conventions about the relationship between variable names and variable types. Here is my summary.

"What kind of variable will be treated as a bool type?"

The defining feature of a Boolean variable is that it has only two possible values: "true" or "false". So words such as is and has make good prefixes for its name. The principle is to let whoever reads the name expect that the variable can only be "true" or "false".

Here are a few good examples:

  • is_superuser: means "whether it is a superuser", and there are only two values: true / false.
  • has_error: means "whether it has an error", and there are only two values: true / false.
  • allow_vip: means "whether to allow VIP", and there are only two values: true / false.
  • use_msgpack: means "whether to use msgpack", and there are only two values: true / false.
  • debug: means "whether to enable debug mode", and it is treated as bool mainly because of convention.
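A minimal sketch of how such names read at the point of use. The function and its permission rule are hypothetical; only the naming pattern comes from the list above.

```python
def can_delete_post(is_superuser, is_author, has_error=False):
    """Return True when deletion is allowed (hypothetical rule)."""
    if has_error:
        return False
    # Each name reads as a yes/no question, so the condition reads like a sentence.
    return is_superuser or is_author
```

A call such as can_delete_post(is_superuser=True, is_author=False) can be understood without opening the function body.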

"What kind of variable will be treated as an int/float type?"

If a variable has a name associated with numbers, it will be treated as an int/float type by default. Here are a few common examples:

  • All words that are interpreted as numbers, such as: port, age, radius, etc.
  • Words ending with _id, such as: user_id, host_id
  • Words that begin or end with length/count, such as: length_of_username, max_length, users_count

Note: Do not use an ordinary plural to represent an int-type variable. For example, it is best to use number_of_apples, trips_count instead of apples, trips.
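The note above can be sketched as follows, with made-up sample data: the plural noun is kept for the collection, and the count gets its own explicit name.

```python
# Plural noun: the reader expects a collection.
apples = ["fuji", "gala", "honeycrisp"]

# Explicit count name: the reader expects an int.
number_of_apples = len(apples)

# With both names in place, neither line needs a comment about types.
has_enough_apples = number_of_apples >= 3
```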

Other types

For the types such as str, list, tuple, and dict, there is no uniform rule that allows us to guess the variable type by name. For example, headers may be a header list or a dict containing header information.

The recommended approach is to document the types explicitly. In the docstrings of functions and methods, use the Sphinx format (Sphinx is the tool used to build the official Python documentation) to label the types of all variables.
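A minimal sketch of a docstring in the Sphinx field-list style. The function itself, its parameters, and the X-Timeout header are hypothetical; only the :param:, :type:, :returns: and :rtype: fields are the Sphinx convention being shown.

```python
def build_request_headers(headers, timeout):
    """Merge caller-supplied headers with defaults (hypothetical helper).

    :param headers: extra header values keyed by header name
    :type headers: dict[str, str]
    :param timeout: request timeout in seconds
    :type timeout: float
    :returns: the merged headers, including a timeout hint
    :rtype: dict[str, str]
    """
    merged = {"Accept": "application/json"}
    merged.update(headers)
    merged["X-Timeout"] = str(timeout)
    return merged
```

With the docstring in place, the reader no longer has to guess whether headers is a list or a dict.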

3. Use Hungarian notation appropriately

I first learned about Hungarian notation from a blog post by Joel Spolsky. In short, Hungarian notation abbreviates the "type" of a variable and puts it at the front of the variable name.

Note that the "type" mentioned here does not refer to the type in the traditional sense (such as int/str/list), but to the type related to your code business logic.

For example, there are two variables in your code, students and teachers, and both point to lists of Person objects. With Hungarian notation you can rename them like this:

students -> pl_students
teachers -> pl_teachers

where pl is an abbreviation of "person list". So, if you see a variable beginning with pl_, you'll know the type of value it points to.

Hungarian notation is a good choice in many cases because it can improve the readability of your code, especially when there are many variables and the same type occurs repeatedly. Just take care not to overuse it.
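A short sketch of the pl_ prefix in use. The Person class and the sample names are hypothetical; the prefix is the one from the example above.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical class, used only for this illustration.
    name: str


pl_students = [Person("Ann"), Person("Bob")]
pl_teachers = [Person("Carol")]


def names_of(pl_people):
    # The pl_ prefix signals that a list of Person objects is expected.
    return [person.name for person in pl_people]


# Two values with the same prefix can be combined safely.
pl_everyone = pl_students + pl_teachers
```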

4. The variable name should be as short as possible, but never too short.

We mentioned earlier that variable names should be descriptive. However, if you put no limit on that principle, you are likely to end up with a highly descriptive name like this: how_much_points_need_for_level2. If your code is full of such long variable names, reading it becomes a disaster.

The length of a good variable name should be within two or three words. For example, the above name can be abbreviated as points_level2.

You should avoid using short names that have only one or two letters in most cases, such as i, j, and k (which are used for the array index very frequently). It is always better to replace them with names that have a clear meaning, such as person_index.
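To make the comparison concrete, here is a small sketch with made-up data that writes the same loop twice, once with i and once with a named index.

```python
persons = ["Ann", "Bob", "Carol"]
seats = ["A1", "A2", "A3"]

# Harder to follow: the reader must work out what i indexes.
assignments_short = {}
for i in range(len(persons)):
    assignments_short[persons[i]] = seats[i]

# Clearer: the index name says what it counts.
assignments = {}
for person_index, person in enumerate(persons):
    assignments[person] = seats[person_index]
```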

Exceptions for using short names

Sometimes there are exceptions to the above principles. When a well-defined but long name has to be used repeatedly, it is perfectly fine to abbreviate it in order to keep the code concise. But it is best not to use too many such short names in the same piece of code, so that it stays easy to understand.

For example, short names are often used as aliases when importing modules in Python. Likewise, the commonly used gettext function in Django's i18n support is usually abbreviated to _.

5. Other considerations

Here are some other considerations for naming variables:

  • Do not use variable names that are too similar in the same piece of code, such as users, users1 and users3 all at once
  • Do not use negated words in variable names. For example, use is_special instead of is_not_normal
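The second point is easiest to see in a condition. The sketch below uses a made-up order total and fee rule, and computes the same fee with a negated name and with a positive one.

```python
order_total = 120

# Negated name: reading the condition means undoing a double negative.
is_not_normal = order_total > 100
if not is_not_normal:
    fee_with_negated_name = 5
else:
    fee_with_negated_name = 0

# Positive name: the condition reads directly.
is_special = order_total > 100
fee_with_positive_name = 0 if is_special else 5
```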

How To Use Variables Better

So far we have covered how to give variables good names. Now let's look at some details to watch when using them.

1. Be consistent

If the image variable is called photo in a method, then it shouldn't be changed to image in other places, or it will make the reader confused: "Are image and photo the same thing?"

In addition, although Python is a dynamically typed language, that doesn't mean you should use the same variable name for a str first and for a list later. Keep the type that a given variable name refers to consistent.
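A minimal sketch of the type-consistency point, with made-up data: the first version rebinds one name from str to list, the second gives each type its own name.

```python
raw_tags = "python, naming, variables"

# Confusing: "tags" starts as a str and is then rebound to a list.
tags = raw_tags
tags = [tag.strip() for tag in tags.split(",")]

# Clearer: each name refers to one type for its whole lifetime.
tag_names = [tag.strip() for tag in raw_tags.split(",")]
```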

2. Try not to use globals()/locals()

Perhaps you were excited when you first discovered the pair of built-in functions globals()/locals(), and couldn't wait to write "extremely simple" code like this:

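# Named render_trip so that it does not shadow Django's render().
def render_trip(request, user_id, trip_id):
    user = User.objects.get(id=user_id)
    trip = get_object_or_404(Trip, pk=trip_id)
    # The helper must not share its name with the local variable, or the call
    # raises UnboundLocalError. check_suggested is a placeholder name.
    is_suggested = check_suggested(user, trip)
    # Omit three lines by using locals(). Cool!
    return render(request, 'trip.html', locals())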

Don't do that. It will only make the people who read the code (including yourself three months later) hate you, because they have to keep track of every variable defined in the function (just think what happens when the function grows to 200 lines). On top of that, locals() passes along variables that the template does not need.

And The Zen of Python says: Explicit is better than implicit. So, you'd better write the code like this:

    return render(request, 'trip.html', {
        'user': user,
        'trip': trip,
        'is_suggested': is_suggested,
    })
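The leak can be demonstrated without Django. In this sketch the two functions, the dict stand-ins for database lookups, and the api_secret variable are all hypothetical; only the behavior of the built-in locals() is real.

```python
def build_context_implicit(user_id, trip_id):
    user = {"id": user_id}            # stand-in for a database lookup
    trip = {"id": trip_id}            # stand-in for a database lookup
    api_secret = "not-for-templates"  # incidental local variable
    # Everything local, including the arguments and the secret, is returned.
    return locals()


def build_context_explicit(user_id, trip_id):
    user = {"id": user_id}
    trip = {"id": trip_id}
    api_secret = "not-for-templates"  # stays inside the function
    # Only the names listed here reach the caller.
    return {"user": user, "trip": trip}
```

The implicit version hands over five names, while the explicit version hands over exactly the two that were intended.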

3. The variable should be defined as close as possible to its use.

Everyone knows this principle, yet many people (including me) pick up a habit when they start learning to program: putting all the variable definitions together at the top of the function or method.

def generate_trip_png(trip):
    path = []
    markers = []
    photo_markers = []
    text_markers = []
    marker_count = 0
    point_count = 0
    ... ...

Doing so only makes your code look neat; it does not improve readability.

Better yet, define each variable as close as possible to its use. Then, when reading the code, you can follow its logic more easily, because you don't have to work out what a variable is and where it was defined.
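A minimal sketch of the same idea, using a hypothetical summarize_trip function: each variable appears directly before the code that needs it.

```python
def summarize_trip(points):
    # Defined right before the loop that fills it.
    distances = []
    for start, end in zip(points, points[1:]):
        distances.append(abs(end - start))
    total_distance = sum(distances)

    # Defined only at the point where the summary is built.
    marker_count = len(points)
    return {"total_distance": total_distance, "marker_count": marker_count}
```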

4. Use namedtuple/dict reasonably to have the function return multiple values.

Python functions can return multiple values:

def latlon_to_address(lat, lon):
    # (the lookup that sets country, province and city is omitted here)
    return country, province, city

# Unpack and define multiple variables one time.
country, province, city = latlon_to_address(lat, lon)

However, this brings a small problem: what if latlon_to_address needs to return the district as well?

If so, you need to find every place where latlon_to_address is called and add the extra variable; otherwise a ValueError: too many values to unpack will come up:

country, province, city, district = latlon_to_address(lat, lon)
# or use _ to ignore the extra return value
country, province, city, _ = latlon_to_address(lat, lon)

For a function with multiple return values like this, it is more convenient to use a namedtuple or dict. Adding a return value then does not break any of the existing call sites:

# 1. Use dict
def latlon_to_address(lat, lon):
    return {
        'country': country,
        'province': province,
        'city': city
    }

addr_dict = latlon_to_address(lat, lon)

# 2. Use namedtuple
from collections import namedtuple

Address = namedtuple("Address", ['country', 'province', 'city'])

def latlon_to_address(lat, lon):
    return Address(
        country=country,
        province=province,
        city=city
    )

addr = latlon_to_address(lat, lon)

However, there is also a drawback. Although the code now tolerates changes better, callers can no longer rely on the convenient x, y = f() form: a dict does not unpack into its values, and unpacking a namedtuple by position brings back the original problem. The choice is up to you.
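The trade-off can be sketched with a namedtuple that has already gained a district field. The coordinates and the returned values are placeholders invented for this example.

```python
from collections import namedtuple

# "district" was added later, as the fourth field.
Address = namedtuple("Address", ["country", "province", "city", "district"])


def latlon_to_address(lat, lon):
    # Placeholder result; a real implementation would look the values up.
    return Address(
        country="Country A",
        province="Province B",
        city="City C",
        district="District D",
    )


addr = latlon_to_address(30.0, 120.0)

# Callers that read fields by name keep working after the new field is added.
location_label = f"{addr.city}, {addr.country}"

# Positional unpacking still works, but it has to list every field again.
country, province, city, district = addr
```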

5. Limit the number of variables in a single function

The capacity of the human brain is limited. Research on short-term memory suggests that we can hold only a handful of names in mind at once (a commonly cited figure is about seven, give or take two). So, when one of your functions is too long (generally, a function that does not fit on one screen is considered a bit too long) and contains too many variables, split it into several small functions before it grows further.
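A small sketch of such a split, using hypothetical report functions: each helper keeps only one or two names in play, and the top-level function reads as a summary of the steps.

```python
def parse_scores(raw_scores):
    return [int(score) for score in raw_scores.split(",")]


def average(scores):
    return sum(scores) / len(scores)


def format_report(name, average_score):
    return f"{name}: {average_score:.1f}"


def build_report(name, raw_scores):
    # The intermediate values live inside the helpers, not here.
    scores = parse_scores(raw_scores)
    return format_report(name, average(scores))
```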

6. Delete those useless variables in time

This principle is simple and easy to follow. But if you ignore it, the impact on the quality of your code is devastating: people who read your code will feel they have been fooled.

Here is an example:

def fancy_func():
    # Reader: Well, it defines a fancy_vars here. 
    fancy_vars = get_fancy()
    ... ...(after a lot of code)

    # Reader: Is it the end? Where did the previous fancy_vars go? Has it been eaten by a cat?
    return result

So, turn on the code analysis in your IDE and clean up the variables that have been defined but never used.

7. Don't define variables you don't need

Sometimes we may think: "Well, this value may be modified or reused in the future, so let's define it as a variable first."

def get_best_trip_by_user_id(user_id):
    user = get_user(user_id)
    trip = get_best_trip(user_id)
    result = {
        'user': user,
        'trip': trip
    }
    return result

In fact, the "future" you have in mind will never come. The three temporary variables in the code can be removed completely:

def get_best_trip_by_user_id(user_id):
    return {
        'user': get_user(user_id),
        'trip': get_best_trip(user_id)
    }

There is no need to sacrifice the current readability of the code for changes that may never happen. If you need a variable later, add it later.

Conclusion

Variables are an important part of any programming language, so it's worth spending a little time thinking when you define and use them. Your code will be better for it.