2.4. Statements: When Expressions Are Not Enough

Although well-constructed expressions are easy to understand, some Python statements improve the readability of our code in complex situations. A statement is a line or block of code that does not return a value. The simplest example of a statement in Python is the assignment statement, which stores a value in a variable and returns nothing.

In [1]: a = 5

Other useful statements are the function definition, the if-elif-else block, and the try-except block, all discussed below.

2.4.1. Function

We have already seen how a lambda expression can store a simple function, but its body is restricted to a single expression. This disallows, for example, assigning a value to a variable in the body of the function. For more complicated situations, we will define functions using the function definition statement.

The syntax for a function definition is:

def name( parameters ):
    """ docstring """
    statements

You can make up any names you want for the functions you create, except that you can’t use a name that is a Python keyword, and the names must follow the rules for legal identifiers that were given previously. The parameters specify what information, if any, you have to provide in order to use the new function. Another way to say this is that the parameters specify what the function needs to do its work.

There can be any number of statements inside the function, but they have to be indented from the def. In the examples in this book, we will use the standard indentation of four spaces. Function definitions are one of several compound statements we will see, all of which follow the same pattern:

  1. A header line that begins with a keyword and ends with a colon.
  2. An optional docstring that documents the function (function definitions only).
  3. A body consisting of one or more Python statements, each indented the same amount (four spaces is the Python standard) from the header line.

docstrings

If the first thing after the function header is a string (some tools insist that it must be a triple-quoted string), it is called a docstring and gets special treatment in Python and in some programming tools.

One way to retrieve this information is to enter the expression <function_name>.__doc__ in the interactive interpreter, which returns the docstring for the function. So the string you write as documentation at the start of a function is retrievable by Python tools at runtime. This is different from comments in your code, which are discarded when the program is parsed.
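As a small illustration of this difference, the sketch below defines a hypothetical function named cube (it does not appear elsewhere in this text) and reads its docstring back at runtime. The comment inside the body is not stored on the function object.

```python
def cube(x):
    """Return x raised to the third power."""
    # This comment is discarded; it is not stored on the function object.
    return x * x * x

# The docstring is kept as an attribute of the function object.
print(cube.__doc__)
```

The built-in help function displays the same text, so help(cube) is another way to read the documentation from the interpreter.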

By convention, Python programmers use docstrings for the key documentation of their functions.

In a function definition, the keyword in the header is def, which is followed by the name of the function and some parameters enclosed in parentheses. The parameter list may be empty, or it may contain any number of parameters separated from one another by commas. In either case, the parentheses are required. Recall that the parameter list is more specifically known as the formal parameters. This list of names describes those things that the function will need to receive from the user of the function. When you use a function, you provide values to the formal parameters.

The following code gives an example of defining a function named square that computes the square of x. This value is saved to a local variable y and returned using the return statement.

In [1]: def square(x):
   ...:     y = x * x
   ...:     return y
   ...: 

In [1]: number = 10

In [1]: result = square(number)

In [1]: result

More details about creating functions in Python will be provided in the upcoming chapter named Functional Programming.

2.4.2. Chained conditionals

The conditional expression allows us to write simple branching logic in the midst of our expressions, but chaining conditional expressions to form more complicated logic can become hard to read. Python provides a statement-block structure that allows us to express more complicated Boolean logic in a clearer way. This is sometimes referred to as a chained conditional.

In [1]: def whichIsBigger(x, y):
   ...:     """ returns a string describing which value is larger"""
   ...:     if x < y:
   ...:         return "x is less than y"
   ...:     elif x > y:
   ...:         return "x is greater than y"
   ...:     else:
   ...:         return "x and y must be equal"
   ...: whichIsBigger(2,3)
   ...: 

The flow of control for a chained conditional can be drawn as a flowchart. The chart can be laid out in different orientations, but the resulting pattern is the same.

[Figure: flowchart of a chained conditional]

elif is an abbreviation of else if. Again, exactly one branch will be executed. There is no limit on the number of elif clauses, but only a single (and optional) final else clause is allowed, and it must be the last branch in the statement.

Each condition is checked in order. If the first is false, the next is checked, and so on. If one of them is true, the corresponding branch executes, and the statement ends. Even if more than one condition is true, only the first true branch executes.
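To make the "first true branch wins" rule concrete, here is a minimal sketch. The function describe_size and its thresholds are made up for illustration. Several of its conditions can be true at once, yet only the first true one decides the result.

```python
def describe_size(n):
    """Return a label for n; conditions are checked from top to bottom."""
    if n > 100:
        return "huge"
    elif n > 10:
        return "large"
    elif n > 0:
        return "positive"
    else:
        return "zero or negative"

# 500 also satisfies n > 10 and n > 0, but only the first branch runs.
print(describe_size(500))
print(describe_size(50))
print(describe_size(-3))
```

Reordering the branches would change the result: if the n > 0 test came first, every positive number would be labeled "positive".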


Check your understanding

    What will the following code print if x = 3, y = 5, and z = 2?

    if x < y and x < z:
        print("a")
    elif y < x and y < z:
        print("b")
    else:
        print("c")
    
  • (A) a
    While the value in x is less than the value in y (3 is less than 5) it is not less than the value in z (3 is not less than 2).
  • (B) b
    The value in y is not less than the value in x (5 is not less than 3).
  • (C) c
    Since the first two Boolean expressions are false the else will be executed.

2.4.3. Exception Handling Flow-of-control

Sometimes it is useful to catch situations in which we can anticipate that our function will crash. Examples include dividing by zero or trying to open a non-existent file. Python provides a statement for catching these kinds of errors: the try-except statement block.

2.4.4. What is an exception?

An exception is a signal that a condition has occurred that can’t be easily handled using the normal flow-of-control of a Python program. Exceptions are often defined as being “errors” but this is not always the case. All errors in Python are dealt with using exceptions, but not all exceptions are errors.

Just like everything else in Python, exceptions are objects: each one is an instance of an exception class. When our code encounters an error, an exception is raised.
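The following sketch illustrates that an exception is an ordinary object with a class. It uses the try-except statement introduced just below, and the variable name caught is chosen only for this example.

```python
try:
    1 / 0
except ZeroDivisionError as err:
    # err is bound only inside this block, so keep a reference to it.
    caught = err

# The exception is an instance of the built-in ZeroDivisionError class,
# which in turn is a kind of Exception.
print(type(caught))
print(isinstance(caught, Exception))
```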

The try-except statement can run our code and catch any exception, allowing us to perform some alternate action. Here is an example of a try-except block for catching division by zero.

In [1]: def safeDivision(x,y):
   ...:     """ computes x/y and returns None on any exception"""
   ...:     try:
   ...:         output = x/y
   ...:         return output
   ...:     except:
   ...:         output = None
   ...:         return output
   ...: print(safeDivision(2,1))
   ...: print(safeDivision(2,0))
   ...: 

Because the bare except clause catches every exception raised inside the try block, a call to safeDivision never fails with an error; it returns None instead. There are many other patterns related to exceptions that will be covered in the appendix.
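Catching every exception can also hide genuine mistakes, such as passing a string by accident. As one possible variation (a sketch, not the version used in this text), the hypothetical function safe_division_narrow below names the exception class it expects, so only division by zero is turned into None.

```python
def safe_division_narrow(x, y):
    """Compute x / y, returning None only when y is zero."""
    try:
        return x / y
    except ZeroDivisionError:
        # Only division by zero is handled here; any other
        # exception (for example a TypeError) still propagates.
        return None

print(safe_division_narrow(2, 1))
print(safe_division_narrow(2, 0))
```

Which version is preferable depends on whether the caller wants unexpected errors to be silenced or reported.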

2.4.5. The for loop

Readers familiar with another programming language will probably notice how few loops are used in this book. When working with data, we are able to use another construct, called a comprehension, to describe most situations that would use a loop in another language. There will be times when more conventional loops are convenient, so we will give a brief description of the Python syntax for loops here.

When we drew the square, it was quite tedious. We had to move then turn, move then turn, etc. etc. four times. If we were drawing a hexagon, or an octagon, or a polygon with 42 sides, it would have been a nightmare to duplicate all that code.

A basic building block of all programs is the ability to repeat some code over and over again. In computer science, we refer to this repetitive idea as iteration. In this section, we will explore some mechanisms for basic iteration.

The for statement allows us to write programs that implement iteration. As a simple example, let’s say we have some friends, and we’d like to send them each an email inviting them to our party. We don’t quite know how to send email yet, so for the moment we’ll just print a message for each friend.
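The interactive listing for this example is not reproduced here. A minimal sketch of such a loop, assuming a list of seven placeholder names and a made-up message, could look like this:

```python
for name in ["Joe", "Amy", "Brad", "Angelina", "Zuki", "Thandi", "Paris"]:  # header
    print("Hi", name, "Please come to my party on Saturday!")  # loop body
```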

When this loop runs, one line is printed for each friend. Here’s how it works:

  • name in this for statement is called the loop variable.
  • The list of names in the square brackets is called a Python list. Lists are very useful. We will have much more to say about them later.
  • Line 2 is the loop body. The loop body is always indented. The indentation determines exactly what statements are “in the loop”. The body is performed one time for each name in the list.
  • On each iteration or pass of the loop, first a check is done to see if there are still more items to be processed. If there are none left (this is called the terminating condition of the loop), the loop has finished. Program execution continues at the next statement after the loop body.
  • If there are items still to be processed, the loop variable is updated to refer to the next item in the list. This means, in this case, that the loop body is executed here 7 times, and each time name will refer to a different friend.
  • At the end of each execution of the body of the loop, Python returns to the for statement, to see if there are more items to be handled.
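The steps in the list above can be spelled out by hand. The sketch below is only an approximation of what the for statement does behind the scenes: it uses the built-in iter and next functions together with the while and try-except statements described elsewhere in this section, and the names friends, iterator, and greeted are placeholders.

```python
friends = ["Joe", "Amy", "Brad"]  # placeholder names

iterator = iter(friends)       # ask the list for an iterator over its items
greeted = []
while True:
    try:
        name = next(iterator)  # update the loop variable to the next item
    except StopIteration:
        break                  # terminating condition: no items are left
    greeted.append(name)       # the loop body runs once per item

print(greeted)
```

In practice the for statement is preferred, because it performs all of this bookkeeping for us.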

As a program executes, the interpreter always keeps track of which statement is about to be executed. We call this the control flow, or the flow of execution of the program. Flow of control is often easy to visualize and understand if we draw a flowchart. This flowchart shows the exact steps and logic of how the for statement executes.

[Figure: flowchart of the for statement]


2.4.6. Aside: The accumulator pattern

The above program should represent a pattern that is familiar to anyone who has learned to program in an imperative or object-oriented language. Here is another program that follows the same pattern.

In [1]: def square(x):
   ...:     runningtotal = 0
   ...:     for counter in range(x):
   ...:         runningtotal = runningtotal + x
   ...:     return runningtotal
   ...: 

In [1]: toSquare = 10

In [1]: squareResult = square(toSquare)

In [1]: squareResult

In the program above, notice that the variable runningtotal starts out with a value of 0. Next, the iteration is performed x times. Inside the for loop, the update occurs. runningtotal is reassigned a new value which is the old value plus the value of x.

This pattern of iterating the updating of a variable is commonly referred to as the accumulator pattern. We refer to the variable as the accumulator. This pattern will come up over and over again. Remember that the key to making it work successfully is to be sure to initialize the variable before you start the iteration. Once inside the iteration, it is required that you update the accumulator.
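The same two ingredients, initializing before the loop and updating inside it, appear in the following sketch. The factorial function is an additional illustration and is not taken from this text; here the accumulator starts at 1 because the update is a multiplication.

```python
def factorial(n):
    """Return 1 * 2 * ... * n using the accumulator pattern."""
    product = 1                      # initialize the accumulator before the loop
    for factor in range(1, n + 1):
        product = product * factor   # update the accumulator on every pass
    return product

print(factorial(5))
```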

Note

What would happen if we put the assignment runningtotal = 0 inside the for statement? Not sure? Try it and find out.


Note

The accumulator pattern is another pattern that won’t appear very often in this text. In fact, the chapter titled Recursion will later illustrate how pure functional programs use recursion as an alternative to loops, and Higher Order Functions will show how this pattern can be abstracted in the form of the reduce function. Being able to abstract an accumulator pattern to a reduction will be an important skill when using Hadoop and Spark.
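As a preview, and only as a sketch of the idea, the squaring loop above can be written as a reduction with functools.reduce from the standard library. The name square_with_reduce is hypothetical; the later chapter may present the idea differently.

```python
from functools import reduce

def square_with_reduce(x):
    """Add x to a running total x times, expressed as a reduction."""
    # The lambda receives the accumulated total and the current counter;
    # it ignores the counter and adds x. The final 0 is the initial total.
    return reduce(lambda total, counter: total + x, range(x), 0)

print(square_with_reduce(10))
```

The initial value 0 plays the role of the initialization step, and the lambda plays the role of the update inside the loop.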

2.4.7. The while loop

There is another Python statement that can also be used to build an iteration. It is called the while statement. The while statement provides a much more general mechanism for iterating. Similar to the if statement, it uses a boolean expression to control the flow of execution. The body of while will be repeated as long as the controlling boolean expression evaluates to True.

The following figure shows the flow of control.

[Figure: flow of control for the while statement]

We can use the while loop to create any type of iteration we wish, including anything that we have previously done with a for loop. For example, the program in the previous section could be rewritten using while. Instead of relying on the range function to produce the numbers for our summation, we will need to produce them ourselves. To do this, we will create a variable called aNumber and initialize it to 1, the first number in the summation. Every iteration will add aNumber to the running total until all the values have been used. In order to control the iteration, we must create a boolean expression that evaluates to True as long as we want to keep adding values to our running total. In this case, as long as aNumber is less than or equal to the bound, we should keep going.

Here is a new version of the summation program that uses a while statement.
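The interactive listing is not reproduced here. A minimal sketch consistent with the surrounding description might look like the following. The variable names aNumber, aBound, and theSum come from the text, while the function name sumTo and its docstring are assumptions.

```python
def sumTo(aBound):
    """Return the sum 1 + 2 + ... + aBound."""
    theSum = 0
    aNumber = 1
    while aNumber <= aBound:
        theSum = theSum + aNumber   # accumulator update
        aNumber = aNumber + 1       # step toward making the condition False
    return theSum

print(sumTo(4))
```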


You can almost read the while statement as if it were in natural language. It means, while aNumber is less than or equal to aBound, continue executing the body of the loop. Within the body, each time, update theSum using the accumulator pattern and increment aNumber. After the body of the loop, we go back up to the condition of the while and reevaluate it. When aNumber becomes greater than aBound, the condition fails and flow of control continues to the return statement.


Note

The names of the variables have been chosen to help readability.

More formally, here is the flow of execution for a while statement:

  1. Evaluate the condition, yielding False or True.
  2. If the condition is False, exit the while statement and continue execution at the next statement.
  3. If the condition is True, execute each of the statements in the body and then go back to step 1.

The body consists of all the statements below the header with the same indentation.

This type of flow is called a loop because the third step loops back around to the top. Notice that if the condition is False the first time through the loop, the statements inside the loop are never executed.
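The three steps, including the case where the body never runs, can be observed with a small sketch. The countdown function below is a made-up example that records each value it visits.

```python
def countdown(n):
    """Return the values visited while counting down from n to 1."""
    visited = []
    while n > 0:            # step 1: evaluate the condition
        visited.append(n)   # step 3: execute the body, then go back to step 1
        n = n - 1
    return visited          # step 2: reached once the condition is False

print(countdown(3))
print(countdown(0))   # the condition is False at once, so the body never runs
```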

The body of the loop should change the value of one or more variables so that eventually the condition becomes False and the loop terminates. Otherwise, the loop will repeat forever. This is called an infinite loop. An endless source of amusement for computer scientists is the observation that the directions written on the back of the shampoo bottle (lather, rinse, repeat) create an infinite loop.

In the case shown above, we can prove that the loop terminates because we know that the value of aBound is finite, and we can see that the value of aNumber increments each time through the loop, so eventually it will have to exceed aBound. In other cases, it is not so easy to tell.

Note

Introduction of the while statement causes us to think about the types of iteration we have seen. The for statement will always iterate through a sequence of values like the list of names for the party or the list of numbers created by range. Since we know that it will iterate once for each value in the collection, it is often said that a for loop creates a definite iteration because we definitely know how many times we are going to iterate. On the other hand, the while statement is dependent on a condition that needs to evaluate to False in order for the loop to terminate. Since we do not necessarily know when this will happen, it creates what we call indefinite iteration. Indefinite iteration simply means that we don’t know how many times we will repeat, but eventually the condition controlling the iteration will fail and the iteration will stop. (Unless we have an infinite loop, which is of course a problem.)

What you will notice here is that the while loop is more work for you — the programmer — than the equivalent for loop. Using a while loop you have to control the loop variable yourself. You give it an initial value, test for completion, and then make sure you change something in the body so that the loop terminates.

So why have two kinds of loop if for looks easier? This next example shows an indefinite iteration where we need the extra power that we get from the while loop.
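The example this paragraph refers to is not included in this text. As a stand-in, and not necessarily the example originally intended, the sketch below uses the well-known 3n + 1 rule: halve n when it is even, otherwise replace it with 3n + 1, and stop when n reaches 1. The function name collatz_steps is hypothetical, and n is assumed to be a positive integer.

```python
def collatz_steps(n):
    """Count the steps the 3n + 1 rule takes to bring n down to 1."""
    steps = 0
    while n != 1:           # we cannot tell in advance how many passes this takes
        if n % 2 == 0:
            n = n // 2      # even: halve it
        else:
            n = 3 * n + 1   # odd: triple it and add one
        steps = steps + 1
    return steps

print(collatz_steps(6))
```

There is no obvious sequence to hand to a for statement here, because the number of repetitions depends on values computed inside the loop. Whether this loop terminates for every positive integer is a famous unproven conjecture.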


Check your understanding

    True or False: You can rewrite any for-loop as a while-loop.
  • (A) True
    Although the while loop uses a different syntax, it is just as powerful as a for-loop and often more flexible.
  • (B) False
    Often a for-loop is more natural and convenient for a task, but that same task can always be expressed using a while loop.

    The following code contains an infinite loop. Which is the best explanation for why the loop does not terminate?

    n = 10
    answer = 1
    while n > 0:
        answer = answer + n
        n = n + 1
    print(answer)
    
  • (A) n starts at 10 and is incremented by 1 each time through the loop, so it will always be positive
    The loop will run as long as n is positive. In this case, we can see that n will never become non-positive.
  • (B) answer starts at 1 and is incremented by n each time, so it will always be positive
    While it is true that answer will always be positive, answer is not considered in the loop condition.
  • (C) You cannot compare n to 0 in while loop. You must compare it to another variable.
    It is perfectly valid to compare n to 0. Though indirectly, this is what causes the infinite loop.
  • (D) In the while loop body, we must set n to False, and this code does not do that.
    The loop condition must become False for the loop to terminate, but n by itself is not the condition in this case.