Write Better Functions

Functions are the building blocks of many applications. Most software developers, current or aspiring, have some experience writing functions. Everyone writes them differently, and there are different approaches to getting the most out of your functions. Bad functions are hard to maintain, less stable, and difficult to reuse. Writing better functions will make your application more stable, easier to read and maintain, and simpler to extend.

Today, I will walk you through some of the principles I try to follow when writing functions. For my examples, I will use Python, as it is easy to read and widely known. Although Python is a dynamically typed scripting language, I believe the principles here apply broadly to most other languages.

A Bad Function

Before we review good principles, let’s first look at a bad example. print_emails() is the function we’re focusing on here, while main() is just the usage of it.

def print_emails(emails):
    for email in emails:
        valid = True

        if len(email) > 254:
            valid = False

        parts = email.split('@')
        if len(parts) != 2:
            valid = False
        else:
            # Only inspect the parts when there is exactly one '@';
            # otherwise parts[1] may not exist.
            if not (len(parts[0]) > 0 and len(parts[0]) <= 64):
                valid = False

            if len(parts[1]) > 255:
                valid = False

            labels = parts[1].split('.')
            for label in labels:
                if not (len(label) > 0 and len(label) <= 63):
                    valid = False

        if valid:
            print(email + ' is valid.')
        else:
            print(email + ' is invalid.')


def main():
    emails = [
        'test@example.com',
        'notavalidemail',
        'also@not@valid',
    ]
    print_emails(emails)


if __name__ == '__main__':
    main()

This function takes a list of emails, validates each one (in a very simplified manner), and prints a message describing whether each email is valid. While it probably isn’t the worst function you’ve ever seen, it can be improved a lot. In the following sections, we’ll discuss some principles for improving function design and apply them to this example.

Principle I: Minimize Knowledge

The first problem with our example function is that its parameter has not been fully simplified. It takes in a list of emails instead of a single email. Why does this matter? Imagine we have only a single email we want to print the validity of. This is how we will have to call print_emails():

some_email = 'someemail@somedomain.com'
print_emails([some_email])

This violates the principle of least knowledge, also known as the Law of Demeter. The fact that emails are stored in a list is a detail that the validation and printing logic does not need to know, yet every caller has to wrap a single email in a list just to use it. Although this seems like a trivial example, little infractions like this can add up and make a system’s design and maintainability suffer.

Objects have their own set of rules in the Law of Demeter as well. Most importantly, a method or function with an object parameter A may call methods on A, but may not call methods on an object B returned from one of A’s methods. This minimizes coupling and limits the knowledge that your method or function needs.
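To make the object rule concrete, here is a minimal sketch. The Wallet and Customer classes and both charge functions are hypothetical names invented for this illustration; they are not part of the email example.

```python
# Hypothetical classes, used only to illustrate the Law of Demeter.
class Wallet:
    def __init__(self, balance):
        self.balance = balance

    def deduct(self, amount):
        if amount > self.balance:
            raise ValueError('insufficient funds')
        self.balance -= amount


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet

    def get_wallet(self):
        return self._wallet

    def balance(self):
        return self._wallet.balance

    def pay(self, amount):
        # The customer deals with its own wallet, so callers do not have to.
        self._wallet.deduct(amount)


def charge_reaching_through(customer, amount):
    # Violation: calls a method on the Wallet returned by get_wallet(),
    # so this function now depends on both Customer and Wallet.
    customer.get_wallet().deduct(amount)


def charge(customer, amount):
    # Compliant: only calls a method on the parameter itself.
    customer.pay(amount)


customer = Customer(Wallet(100))
charge(customer, 30)
print(customer.balance())  # 70
```

Both functions produce the same result, but charge() keeps working if Customer later stores its money somewhere other than a Wallet.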

Application

Now, let’s apply this principle to our example.

def print_emails(emails):
    for email in emails:
        print_email(email)


def print_email(email):
    valid = True

    if len(email) > 254:
        valid = False

    parts = email.split('@')
    if len(parts) != 2:
        valid = False
    else:
        # Only inspect the parts when there is exactly one '@';
        # otherwise parts[1] may not exist.
        if not (len(parts[0]) > 0 and len(parts[0]) <= 64):
            valid = False

        if len(parts[1]) > 255:
            valid = False

        labels = parts[1].split('.')
        for label in labels:
            if not (len(label) > 0 and len(label) <= 63):
                valid = False

    if valid:
        print(email + ' is valid.')
    else:
        print(email + ' is invalid.')

As you can see, we’ve split the function in two. print_email() validates and prints a single email, and print_emails() handles the details of the data structure.

Principle II: Do One Thing, Well

Functions should do only one thing, and do it really well. This principle is part of the Unix philosophy. Functions that try to do too much are hard to reuse, because a caller rarely wants every one of their effects at once. Many functions can be broken down into smaller ones that can act like building blocks for new, unimagined functionality in the future. Building your application this way results in better maintainability, reduced future development effort, and improved readability of your code.

Consider our example function. What if I have a list of domains I want to determine the validity of? What if I want to validate an email without printing anything? Fortunately, this functionality already exists. Unfortunately, it is embedded in a function that does things we don’t want, so refactoring will be necessary. This kind of issue can be avoided by applying this principle during the original function’s design.

A principle that goes hand-in-hand with this one is keeping your functions short. Long functions are harder to follow and generally break one or more of the principles listed here. There are many different rules of thumb for keeping functions short, but I like to make mine short enough that I never have to scroll to see the entirety of a function.

Application

Before applying this principle to our example, let us first examine what it does in order to determine how we can break it down. Here are the steps:

  1. Validate the email as a whole, checking for max length and a single @ symbol.
  2. Validate the front-part (local part) of the email.
  3. Validate the domain of the email.
  4. Validate each domain label (text between the dots).
  5. Print the email validity message.

We will use these steps to break this function down into smaller functions.

def print_emails(emails):
    for email in emails:
        print_email(email)


def print_email(email):
    if is_valid_email(email):
        print(email + ' is valid.')
    else:
        print(email + ' is invalid.')


def is_valid_email(email):
    if len(email) > 254:
        return False

    parts = email.split('@')
    if len(parts) != 2:
        return False

    return is_valid_local_part(parts[0]) and is_valid_domain(parts[1])


def is_valid_local_part(text):
    return len(text) > 0 and len(text) <= 64


def is_valid_domain(text):
    if len(text) > 255:
        return False

    labels = text.split('.')
    for label in labels:
        if not is_valid_domain_label(label):
            return False

    return True


def is_valid_domain_label(text):
    return len(text) > 0 and len(text) <= 63

Our code is now modular and each function has its own specific job that it does well. Also, we can now validate a domain:

is_valid_domain('concisecoder.io')
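The other question raised earlier, validating an email without printing anything, is covered too. As a rough sketch, the block below repeats the validators in condensed form so that it runs on its own, then adds filter_valid_emails(), a hypothetical helper that is not part of the original example.

```python
def is_valid_domain_label(text):
    return 0 < len(text) <= 63


def is_valid_domain(text):
    if len(text) > 255:
        return False
    return all(is_valid_domain_label(label) for label in text.split('.'))


def is_valid_local_part(text):
    return 0 < len(text) <= 64


def is_valid_email(email):
    if len(email) > 254:
        return False
    parts = email.split('@')
    if len(parts) != 2:
        return False
    return is_valid_local_part(parts[0]) and is_valid_domain(parts[1])


def filter_valid_emails(emails):
    # Hypothetical helper: reuses the validator without printing anything.
    return [email for email in emails if is_valid_email(email)]


emails = ['test@example.com', 'notavalidemail', 'also@not@valid']
print(filter_valid_emails(emails))  # ['test@example.com']
```

None of this was possible while the validation logic was buried inside a function that also printed.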

Principle III: Give Descriptive Names

Perhaps the best improvement to a function is a good, descriptive name. A descriptive name makes your code more readable, and combined with principles discussed previously, can make your code self-documenting. If your function follows Principle II, your function should do only one thing, and you should be able to come up with a descriptive name easily. If you are having trouble with this, you may need to break your function down more.

A descriptive name describes exactly what a function does, without any ambiguity. For example, a function that sorts Person objects by their age would best be named sort_people_by_age(people) rather than sort(people) or sort_people(people). Notice that your function name doesn’t have to do all of the work; your parameters can do some of the talking too. Give your parameters good names to be most descriptive.
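As an illustration, one possible implementation of that function is sketched below. The Person dataclass and its name and age fields are assumptions made for this example, as is the choice to return a new list rather than sort in place.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical type for this illustration.
    name: str
    age: int


def sort_people_by_age(people):
    # Returns a new list ordered youngest first; the input is left unchanged.
    return sorted(people, key=lambda person: person.age)


people = [Person('Ada', 36), Person('Grace', 45), Person('Alan', 41)]
for person in sort_people_by_age(people):
    print(person.name, person.age)
```

The call site reads almost like a sentence, which is the point: nobody has to open the function body to learn what is being sorted or by which attribute.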

It shouldn’t take too many words to make a descriptive name, but I do tend to err on the long side in favor of fully-spelled words over less-obvious abbreviations. A name that is too long is another red flag that your function may not have a specific enough job.

Application

print_emails() isn’t quite as descriptive as it could be. After all, we’re not just printing emails, are we? We’re printing an email address’s validity. I think print_emails_validity() is a better name. You could be even more explicit and use print_email_addresses_validity(), but I think that is a little too lengthy. Unless you validate both email addresses and messages in your application, I think the former is explicit enough.

def print_emails_validity(emails):
    for email in emails:
        print_email_validity(email)


def print_email_validity(email):
    if is_valid_email(email):
        print(email + ' is valid.')
    else:
        print(email + ' is invalid.')


def is_valid_email(email):
    if len(email) > 254:
        return False

    parts = email.split('@')
    if len(parts) != 2:
        return False

    return is_valid_local_part(parts[0]) and is_valid_domain(parts[1])


def is_valid_local_part(text):
    return len(text) > 0 and len(text) <= 64


def is_valid_domain(text):
    if len(text) > 255:
        return False

    labels = text.split('.')
    for label in labels:
        if not is_valid_domain_label(label):
            return False

    return True


def is_valid_domain_label(text):
    return len(text) > 0 and len(text) <= 63


def main():
    emails = [
        'test@example.com',
        'notavalidemail',
        'also@not@valid',
    ]
    print_emails_validity(emails)


if __name__ == '__main__':
    main()

Summary

Our simple application is now a lot cleaner after applying these three principles. The principle of minimizing knowledge keeps our functions modular and decoupled. Descriptive names make our code self-documenting and easy to understand. Shorter, well-defined functions allow our application to become much more than we originally intended.

These are far from the only principles for writing better functions, but I believe they are the most fundamental. Follow them and enjoy the benefits of cleaner code.
