The previous article discussed writing clean classes. It started by showing how a class should be organized: constants, variables, and then methods. It then defined what is known as a "God Object", which led to the question of what the ideal size of a class should be and how to measure it. After that, it presented the SOLID principles, which enhance object-oriented design and keep the system flexible and maintainable.
While each of the previous articles in this series took one component and discussed it separately, this one considers several different parts of writing clean code. We will examine comments, and what makes good comments and what makes bad ones; how to cleanly format code using indentation and alignment; why errors should be handled using exceptions rather than returned codes; and finally, how to cleanly use third-party code.
"Don't comment bad code - rewrite it." - Brian W. Kernighan and P. J. Plauger
Comments exist because programming languages lack expressiveness; if the languages were expressive enough, comments would be unnecessary. Uncle Bob described a comment as a failure, because needing one means we were not able to express ourselves in code.
Code changes. It gets rewritten, restructured, or even deprecated in favor of other code. A forgotten comment on a piece of code that has since been modified becomes outdated. Worse than that, it can be misinformative. One might argue that a comment should be kept in sync with the code, accurate and relevant to what the code does. That's all good, but it is easier said than done. Comments tend to be overlooked, simply because developers prefer reading code to reading comments. Because of that, developers' energy should instead be channeled towards making the code expressive and clean.
What Makes Good Comments?
Although comments are considered evil, sometimes they are beneficial. However, if what they say could be expressed in the code instead, they should be removed. Here is a list of comments that could be considered OK to write.
- Legal and Licence Comments
These comments contain copyright, authorship, and licensing information. They are usually found at the top of each source file.
- Informative Comments
These explain what a piece of code does. We agreed that making the code itself expressive, say by renaming a function or a variable, is better than adding a comment, yet sometimes an informative comment is useful. For example, the following comment describes the date and time output format of the 'format_date' function.
# format_date returns the date and time formatted as: MM/DD/YYYY HH:mm
order_date = format_date('2017', '07', '03', '09', '00')
Although, as mentioned, it is better to change the function name to be more descriptive.
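As a minimal sketch of that renaming, the format can move out of the comment and into the function name. The name format_date_as_mm_dd_yyyy_hh_mm and its body are assumptions made for illustration; the original format_date implementation is not shown in this article, and the argument order (year, month, day, hour, minute) is inferred from the example call.

```python
from datetime import datetime


def format_date_as_mm_dd_yyyy_hh_mm(year, month, day, hour, minute):
    """Hypothetical replacement for format_date; the name states the format."""
    moment = datetime(int(year), int(month), int(day), int(hour), int(minute))
    return moment.strftime('%m/%d/%Y %H:%M')


# The call site now documents itself, so the comment is no longer needed.
order_date = format_date_as_mm_dd_yyyy_hh_mm('2017', '07', '03', '09', '00')
```

The name is longer, but a reader no longer has to trust a comment that may drift out of date.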
- Explanatory Comments
These explain the purpose behind a piece of code. The following example shows an attempt to guard the execution of the code with the existence of a lock file. Even though this is not a good solution to the problem, the comment helps to explain the intent of the programmer.
from celery import group


@celery.task
def push_orders_to_billing_service(max_at_once=10000):
    orders = get_unprocessed_orders(processed='no', limit=max_at_once)
    if orders:
        """Create a file to prevent other processes from running
        while pushing orders"""
        create_lock_file()
        g = group(
            fetch_and_push_order.s(order, 0) for order in orders
        )
        g.apply_async()
- Warning Comments
These warn other developers about the consequences of executing a particular piece of code.
# This code takes some time to complete
# Don't run unless you have some time to kill
- TODO Comments
TODOs are jobs that the coder could not complete for some reason, such as a tight schedule or an implementation that could not be achieved with the current resources. These comments can also be helpful for pointing out a deprecated feature that should be deleted in the future.
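A TODO comment might look like the following sketch. The function, the flat rate, and the pending carrier pricing are all hypothetical and serve only to show the shape of the comment.

```python
def calculate_shipping_cost(weight_kg):
    """Return a flat-rate shipping cost (hypothetical example)."""
    # TODO: replace the flat rate with carrier-based pricing once the
    # carrier API credentials are available.
    flat_rate_per_kg = 2.5
    return weight_kg * flat_rate_per_kg
```

A useful TODO says what is missing and why it was postponed, so it can be found and removed later.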
- Docblock or Docstring Comments
These are used as documentation, usually for APIs. There are different styles for the doc strings. There are also tools that could generate static documentation websites from these doc blocks, serving as a reference for other developers willing to use the API. The following is an example of the Google style Python docstring.
def function_with_types_in_docstring(param1, param2):
    """Example function with types documented in the docstring.

    `PEP 484`_ type annotations are supported. If attribute, parameter, and
    return types are annotated according to `PEP 484`_, they do not need to be
    included in the doc string:

    Args:
        param1 (int): The first parameter.
        param2 (str): The second parameter.

    Returns:
        bool: The return value. True for success, False otherwise.

    .. _PEP 484:
        https://www.python.org/dev/peps/pep-0484/

    """
What Makes Bad Comments?
Most comments are bad; we already established that. They are mostly justifications and excuses for bad code. The following list contains some types of bad comments.
- Unclear Comments
These are the ones that leave you wondering what the author intended.
"Any comment that forces you to look in another module for the meaning of that comment has failed to communicate to you and is not worth the bits it consumes" - Robert C. Martin
- Redundant Comments
Those are the unnecessary comments that contribute nothing to understanding the code. The redundant comment sometimes can be longer than the code itself.
...
class User(object):
    def __init__(self):
        """Logger: log to a central log"""
        self._logger = Logger()
...
- Misleading Comments
These are the comments that give false information about a piece of code, such as a function, variable, or module. They are as destructive as the misinformative names that were discussed before, and they are among the worst types of comments.
- Noisy Comments
These comments restate the obvious and provide no information, which makes them noisy and useless.
...
class User(object):
    """ The username of this user"""
    _username = ''
...
The golden rule when you feel the need for a comment is to stop and consider whether the intent could be expressed by rewriting the code, maybe by using a function or a variable. A good comment that was not updated when the code it describes was rewritten becomes misinformative and misleading, and is therefore bad.
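The rule above can be sketched as a small before-and-after. The Employee record, the eligibility condition, and the bonus amount are hypothetical; the point is only that the intent moves from a comment into a well-named function.

```python
from collections import namedtuple

# Hypothetical employee record used only for this illustration.
Employee = namedtuple('Employee', ['is_hourly', 'age'])


# Before: the comment carries the intent.
def bonus_before(employee):
    # Check whether the employee is eligible for full benefits
    if employee.is_hourly and employee.age > 65:
        return 1000
    return 0


# After: the intent lives in a function name, so no comment is needed.
def is_eligible_for_full_benefits(employee):
    return employee.is_hourly and employee.age > 65


def bonus_after(employee):
    if is_eligible_for_full_benefits(employee):
        return 1000
    return 0
```

Both versions behave the same, but the second cannot silently drift away from its explanation, because the explanation is the code.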
Code formatting is necessary. It makes the code more communicative and shows professionalism in the developer's work. Code readability affects the maintainability and extensibility of the system.
Developers are hired to build software and to add features and functionality to applications, but there is a good chance that this functionality will change in the future, and that other developers will have to rewrite it. Therefore, it is formatting and code neatness, not building the functionality, that differentiates the decent developer from the others, because good developers write code that is consumed by other people first and by the compiler second. The style and discipline remain, but the code will not necessarily.
To start with formatting, keep the file size as small as possible. The smaller, the better. Up to a couple of hundred lines is still OK, yet not as efficient. After all, we all agree that a smaller set of code lines is easier to understand.
The code should read like a story; top to bottom. We discussed this when dealing with classes and functions. Just like a good story, the code should be easy to read and follow. Therefore, the ordering of the related functions/methods is important.
Surround top-level function and class definitions with two blank lines.
Method definitions inside a class are surrounded by a single blank line.
Extra blank lines may be used (sparingly) to separate groups of related functions. Blank lines may be omitted between a bunch of related one-liners (e.g. a set of dummy implementations).
Use blank lines in functions, sparingly, to indicate logical sections.
Python accepts the control-L (i.e. ^L) form feed character as whitespace; many tools treat these characters as page separators, so you may use them to separate pages of related sections of your file. Note, some editors and web-based code viewers may not recognize control-L as a form feed and will show another glyph in its place.
Keep associated functionality close. If multiple functions are related, then they should be held close to each other, making them easier to detect without moving your eyes a bunch.
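The ordering and spacing advice can be sketched as follows. The function names and the report format are hypothetical; what matters is that the high-level function comes first, its helpers follow directly below it in the order they are called, and top-level definitions are separated by two blank lines.

```python
def build_report(orders):
    """High-level step first: the headline of the story."""
    total = sum_totals(orders)
    return format_report(total)


def sum_totals(orders):
    """Called first by build_report, so it sits directly below it."""
    return sum(order['total'] for order in orders)


def format_report(total):
    """Called last by build_report, still kept next to its caller."""
    return 'Total revenue: {:.2f}'.format(total)
```

Python resolves the helper names when build_report is called, not when it is defined, so placing the caller above its helpers works without any forward declarations.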
Limit the line length of all code lines, including comments and doc strings. PEP 8 states that all lines should be restricted to 79 characters maximum, while the line length should be restricted to 72 characters for flowing long blocks of text with fewer structural restrictions such as doc strings and comments.
Limit all lines to a maximum of 79 characters.
For flowing long blocks of text with fewer structural restrictions (doc strings or comments), the line length should be limited to 72 characters.
Limiting the required editor window width makes it possible to have several files open side-by-side, and works well when using code review tools that present the two versions in adjacent columns.
The default wrapping in most tools disrupts the visual structure of the code, making it more difficult to understand. The limits are chosen to avoid wrapping in editors with the window width set to 80, even if the tool places a marker glyph in the final column when wrapping lines. Some web based tools may not offer dynamic line wrapping at all.
Some teams strongly prefer a longer line length. For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the nominal line length from 80 to 100 characters (effectively increasing the maximum length to 99 characters), provided that comments and docstrings are still wrapped at 72 characters.
The Python standard library is conservative and requires limiting lines to 79 characters (and docstrings/comments to 72).
The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.
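A short sketch of the two wrapping styles follows. The variable names and values are hypothetical and were chosen only to make the expression long enough to need wrapping.

```python
# Hypothetical values, used only to build a long expression.
gross_wages = 52000
taxable_interest = 310
dividends = 120
ira_deduction = 2000
student_loan_interest = 450

# Discouraged: a backslash continuation breaks if anything follows it,
# even trailing whitespace.
income_with_backslash = gross_wages + taxable_interest + dividends \
    - ira_deduction - student_loan_interest

# Preferred: wrap the expression in parentheses and break inside them.
income_with_parentheses = (gross_wages
                           + taxable_interest
                           + dividends
                           - ira_deduction
                           - student_loan_interest)
```

Both assignments compute the same value; the parenthesized form simply relies on Python's implied line continuation instead of an explicit backslash.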
Indentation is significant. It comes naturally in some languages like Python and Ruby. I won't contribute to the war of tabs vs. spaces; use whatever pleases you. However, I would like to point out that according to an article on Stack Overflow's blog, developers who use spaces make more money than those who use tabs.
And finally, this might seem obvious, but follow your team's formatting style if one exists.
Errors happen. Whether because of abnormal input or a failure of the machine, things go wrong sometimes, and developers are responsible for ensuring that the code still fulfills its functionality. As a result, it is difficult to find source code that is not overwhelmed with error handling logic all over the place. It is important to have error handling logic in the code, but it should not obscure the code. Here are some points to consider for handling errors.
When handling errors, one should use exceptions instead of creating custom logic such as setting an error flag or returning an error code that the caller could check. The problem with error codes is that the caller has to check for the error immediately after the call, and this is easily forgotten. Throwing an exception is cleaner, and the calling logic is not obscured by error handling.
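The difference can be sketched with a small example. The STOCK inventory, the function names, and the OutOfStockError class are hypothetical and exist only to contrast the two styles.

```python
# Hypothetical in-memory inventory, used only for illustration.
STOCK = {'apple': 3}


# Error-code style: the caller must remember to check the return value.
def reserve_with_code(item, quantity):
    if STOCK.get(item, 0) < quantity:
        return -1  # error code; nothing forces the caller to look at it
    STOCK[item] -= quantity
    return 0


class OutOfStockError(Exception):
    """Raised when the requested quantity is not available."""


# Exception style: the happy path reads straight through.
def reserve(item, quantity):
    available = STOCK.get(item, 0)
    if available < quantity:
        raise OutOfStockError(
            'cannot reserve {} x {!r}: only {} in stock'.format(
                quantity, item, available))
    STOCK[item] -= quantity
```

A caller that ignores the -1 from reserve_with_code carries on as if the reservation succeeded, whereas an unhandled OutOfStockError stops execution and cannot be overlooked.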
Start with the try-catch statement first. Because try blocks act as a transaction, the catch statement has to leave the application in a consistent state regardless of what happens in the try statement. Doing so lets you define what the user of the code should expect, no matter what goes wrong in the try block.
The exception message should carry enough context to make it easy to determine the source and the location of the error.
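Both points can be sketched together, assuming a hypothetical transfer function and a hypothetical TransferError class. The try block is treated as a transaction: if anything fails, the handler restores the previous balances and raises an exception whose message says what was attempted and why it failed.

```python
class TransferError(Exception):
    """Hypothetical exception carrying context about a failed transfer."""


def transfer(accounts, source, target, amount):
    """Move amount between accounts, leaving balances unchanged on failure."""
    snapshot = dict(accounts)  # remember the state before the "transaction"
    try:
        accounts[source] -= amount
        if accounts[source] < 0:
            raise ValueError('insufficient funds')
        accounts[target] += amount
    except (KeyError, ValueError) as error:
        # Roll back so the caller always sees a consistent state.
        accounts.clear()
        accounts.update(snapshot)
        raise TransferError(
            'transfer of {} from {!r} to {!r} failed: {!r}'.format(
                amount, source, target, error)) from error
```

Python spells the handler as except rather than catch, but the idea is the same: whatever happens inside try, the caller either gets a completed transfer or the original balances plus a descriptive error.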
Third-Party Code Integration
"Don't reinvent the wheel" is the most repeated phrase in this industry. Therefore, we use third-party frameworks, libraries, and plugins. They help us by giving more functionality in less time. However, by using them, we are introducing complexity to our systems. We need to integrate those components and subsystems with our existing system cleanly.
This might seem odd, but when using third-party code, we should write tests for it. One might argue that this is not our job, yet it serves our best interest. These tests are known as "learning tests". They focus on what we need from the third-party code's API.
Learning tests cost nothing, because we have to learn the API anyway in order to use it. Writing these tests is an easier and more isolated way to gain the desired knowledge and increase our understanding of the third-party code. After all, learning tests confirm that the third-party code works as expected.
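A learning test might look like the following sketch. To keep it runnable anywhere, the standard library json module stands in for a third-party package; that substitution, the test class name, and the sample data are assumptions made for illustration.

```python
import json
import unittest


class JsonLearningTest(unittest.TestCase):
    """Learning tests: pin down only the behavior our own code relies on."""

    def test_round_trip_preserves_nested_data(self):
        order = {'id': 7, 'items': ['pen', 'ink'], 'paid': True}
        self.assertEqual(json.loads(json.dumps(order)), order)

    def test_invalid_input_raises_value_error(self):
        # json.JSONDecodeError is a subclass of ValueError.
        with self.assertRaises(ValueError):
            json.loads('{not valid json}')
```

Saved in a test module, these can be run with python -m unittest. If a new release of the library changes a behavior we depend on, the learning tests fail before our own code does.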