# Python Lesson 4

## 1 Lesson outline

- Native Python data structures: tuples, dicts and sets
- List, set, and dict comprehensions. Sequence built-ins.
- Python Functions
- Some (hopefully good) advice…

## 2 Native Python data structures: tuples, dicts and sets

### 2.1 Tuples

A *tuple* is a sequence of Python objects similar to a list: values are accessed with square brackets and tuples can be sliced. Tuples are created with a simple comma-separated list of values (parentheses are optional).

```python
# this is a tuple
tup_0 = 1,2,3,4,5
# and this is also a tuple
tup_1 = (2,3,4,5,6)
# and this is a tuple of tuples...
tup_2 = ((2,3,-1), (0,1,2.4), 3, (-33,-22))
###
print(tup_0)
print(tup_1[3])
print(tup_2[1])
```

The major difference with respect to lists is that tuples are immutable.

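```python
tup_0[3] = 4  # raises TypeError: tuples do not support item assignment
```

```python
tup_3 = 4, (2,3,-1), [4,4,5,5], True
#
tup_3[2].append(6)  # allowed: the list stored inside the tuple is mutable
#
print(tup_3)
#
tup_3[3] = False  # raises TypeError
```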
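The cells above stop at the first error when run as a whole. As a minimal sketch, assuming a hypothetical tuple called `tup_demo`, the same two behaviours can be checked without interrupting execution by catching the exception:

```python
# Hypothetical tuple mixing immutable and mutable elements
tup_demo = (4, (2, 3, -1), [4, 4, 5, 5], True)

# Rebinding an element of a tuple is not allowed
try:
    tup_demo[3] = False
    error_name = None
except TypeError as error:
    error_name = type(error).__name__

print(error_name)  # TypeError

# A mutable element can still be modified in place
tup_demo[2].append(6)
print(tup_demo)  # (4, (2, 3, -1), [4, 4, 5, 5, 6], True)
```

In other words, immutability applies to the tuple itself (which objects it holds), not to the objects it holds.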

With the `+` operator you can join tuples, and with `*` you can concatenate several copies of a tuple

```python
print(tup_0 + tup_3)
print(4*tup_0)
```

A common use of tuples is variable assignment. Whenever you provide a tuple-like expression of variables on the left-hand side of an assignment, Python unpacks the values on the right-hand side.

```python
a, b, c, d = tup_2
print(a)
print(b)
print(c)
print(d)
```

This makes it especially easy to swap variable values

```python
print("a = ", a)
print("b = ", b)
a,b = b,a
print("a = ", a)
print("b = ", b)
```

This feature can also be used in loops to assign the loop variables

```python
tup_loop = (2,3,-1), (4,4,5), (5,6,7), (5,-1,0)
#
for var_1, var_2, var_3 in tup_loop:
    print("var_1 = {0}, var_2 = {1}, var_3 = {2}".format(var_1,var_2,var_3))
```

### 2.2 Dicts

Dicts, also called *hashes*, are associative arrays. They can be thought of as lists whose index is not constrained to be a number but can be another kind of object. The index in this case is called the *key*, and therefore hashes are mutable collections of *key-value* pairs of Python objects. The values of a hash can be any Python object, but keys must be hashable, which in practice means immutable objects such as numbers, strings, or tuples of immutable objects.

They can be created using curly braces and the colon as the separator between keys and values. You can access or set element values as in lists

```python
hash_0 = {"Guerras Médicas" : ["Termópilas", "Artemisio", "Salamina", "Platea"],
          "Even integers": (0,2,4,6,8)}
#
print(hash_0)
#
print(hash_0["Guerras Médicas"])
#
hash_0["Guerras Médicas"].append("Micala")
#
hash_0["Fantastic Sea Creatures"] = ("Moby Dick", "The Kraken", "Mermaids")
#
print(hash_0)
```

You can also create a hash from a list of tuples of two elements using the `dict` function.

```python
seq = (1,3),(2,6),(3,9),(4,12)
dict_example = dict(seq)
print(dict_example)
```

Once a hash is created, you can extract its keys and the corresponding values with the `keys` and `values` methods. Since Python 3.7 both outputs follow the insertion order of the hash, so they keep the correspondence between keys and values.
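As a small illustration of that correspondence, the sketch below uses a hypothetical dict called `squares` and pairs the output of `keys` with the output of `values`:

```python
# Hypothetical dict used only for illustration
squares = {1: 1, 2: 4, 3: 9}

keys = list(squares.keys())
values = list(squares.values())
print(keys)    # [1, 2, 3]
print(values)  # [1, 4, 9]

# The i-th key corresponds to the i-th value
for key, value in zip(keys, values):
    print(key, "->", value, squares[key] == value)
```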

You can extract a value from a dict using the `get` method, extract and remove the value using the `pop` method, and delete a value using `del(hash[key_value])`.

```python
print(dict_example)
one_get = dict_example.get(1)
two_pop = dict_example.pop(2)
del(dict_example[3])
print(one_get)
print(two_pop)
print(dict_example)
```

### 2.3 Default values (*)

The following situation is very common: you need to read a hash key and, if the key exists, take the corresponding value as input; if it does not exist, take a default value instead. This can be achieved with an `if` block

```python
if (key_value in a_hash):
    value = a_hash[key_value]
else:
    value = default_value
```

Both the `get` and `pop` methods of hashes accept a default value as a second argument, which is returned when the given key is not in the hash

```python
value = a_hash.pop(key_value, default_value)
```
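The snippets above are schematic. A concrete sketch, with a hypothetical dict called `stock`, may help to see the difference between the two methods: `get` leaves the hash untouched, while `pop` also removes the key.

```python
# Hypothetical dict used only for illustration
stock = {"apples": 3, "pears": 5}

# get returns the stored value, or the default when the key is missing
in_stock = stock.get("apples", 0)
missing = stock.get("cherries", 0)
print(in_stock, missing)  # 3 0

# pop also removes the key; the default avoids a KeyError for missing keys
first_pop = stock.pop("pears", 0)
second_pop = stock.pop("pears", 0)
print(first_pop, second_pop)  # 5 0
print(stock)  # {'apples': 3}
```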

When setting values, you may also need a default value. Imagine you are reading a list of numbers and you want to separate them by their last digit into a dict of lists. We create a vector of random integer values using the function `np.random.randint`.

```python
import numpy as np

random_nums = np.random.randint(0,5000,[30])
last_digit_hash = {}
for number in random_nums:
    #
    last_digit = number % 10
    #
    if (last_digit in last_digit_hash):
        last_digit_hash[last_digit].append(number)
    else:
        last_digit_hash[last_digit] = []
        last_digit_hash[last_digit].append(number)
#
print(last_digit_hash)
```

The `setdefault` method greatly simplifies this task.

```python
random_nums = np.random.randint(0,5000,[30])
last_digit_hash = {}
for number in random_nums:
    last_digit = number % 10
    #
    last_digit_hash.setdefault(last_digit, []).append(number)
#
print(last_digit_hash)
```

### 2.4 Sets

A set is a collection of *unique* elements with no particular order. Sets can be thought of as the keys of a hash without the corresponding values. They can be created with the `set` function or with curly braces.

```python
set_0 = {"a", 0, 1, "bc", 0.33, 0, 1}
set_1 = set(["a", "b", "c", "a", "a"])
print(set_0)
print(set_1)
```

As could be expected, the set data structure supports the mathematical set operations: intersection, union or difference, among others (you can find a complete list of Python set operations in Real Python Sets).

```python
# Union
print(set_0.union(set_1))
print(set_0|set_1)
#
# Intersection
print(set_0.intersection(set_1))
print(set_0 & set_1)
#
# Difference
print(set_0.difference(set_1))
print(set_0 - set_1)
```
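A few of the other set operations can be sketched in the same way. The sets `evens` and `small` below are hypothetical, and the results are sorted before printing because sets have no guaranteed order.

```python
# Hypothetical sets used only for illustration
evens = {0, 2, 4, 6, 8}
small = {0, 1, 2, 3}

# Symmetric difference: elements in exactly one of the two sets
sym_diff = evens ^ small
print(sorted(sym_diff))  # [1, 3, 4, 6, 8]

# Subset test and membership test
is_subset = {0, 2} <= evens
has_three = 3 in evens
print(is_subset, has_three)  # True False
```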

### 2.5 Comprehensions and built-in sequence functions

List, dict, and set *comprehensions* are a terse and neat “Pythonic” way to define new structures in your program. In the list comprehensions case they have the syntax

```python
list_0 = [expr for value in collection if condition]
```

which is equivalent to the loop

```python
list_0 = []
#
for value in collection:
    #
    if (condition):
        list_0.append(expr)
```

The filter condition is optional.

For example, using a loop we can create a list, called `list_mults`, with the non-negative integers less than 4000 that are exactly divisible by both 7 and 13.

```python
list_mults = []
total = 0
for number in range(4000):
    if (number % 7 == 0 and number % 13 == 0):
        list_mults.append(number)
        total+=1
print(total, list_mults)
```

We can repeat the same task using a comprehension in a more Pythonic way.

```python
list_mults2 = [number for number in range(4000) if (number % 7 == 0 and number % 13 == 0)]
#
# Checking if both lists are equal.
list_mults2 == list_mults
```

The extension to sets and dicts is direct.

```python
# Dicts
{key_expr(iter): value_expr(iter) for iter in collection if condition}
#
#
# Sets
{set_expr(iter) for iter in collection if condition}
```
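The patterns above are schematic. As a concrete sketch, assuming a hypothetical list of words, a dict comprehension can map each word to its length and a set comprehension can collect the distinct lengths that pass a filter:

```python
# Hypothetical input list used only for illustration
words = ["tuple", "dict", "set", "list", "set"]

# Dict comprehension: map each word to its length (repeated keys collapse)
word_lengths = {word: len(word) for word in words}
print(word_lengths)  # {'tuple': 5, 'dict': 4, 'set': 3, 'list': 4}

# Set comprehension: distinct lengths of the words longer than 3 characters
long_lengths = {len(word) for word in words if len(word) > 3}
print(sorted(long_lengths))  # [4, 5]
```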

Apart from comprehensions, there are several quite useful built-in sequence functions for working with lists and other structures. One of them is `enumerate`, which we have already covered. Other useful built-ins are `sorted`, `reversed`, and `zip`.

The built-in `sorted` returns a new, sorted list. You can pass `sorted` a `key`: a function that, applied to each element, provides the value used for sorting.

```python
random_nums = np.random.randint(0,1000,[30])
print(sorted(random_nums))
print(sorted(random_nums, key=str))
print(random_nums)
```
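Since the random input changes on every run, a deterministic sketch with a hypothetical list called `values` may make the effect of `key` easier to follow:

```python
# Hypothetical list used only for illustration
values = [10, -3, 7, -25, 2]

by_value = sorted(values)           # plain numerical order
by_abs = sorted(values, key=abs)    # ordered by absolute value
by_str = sorted(values, key=str)    # ordered as if they were strings

print(by_value)  # [-25, -3, 2, 7, 10]
print(by_abs)    # [2, -3, 7, 10, -25]
print(by_str)    # [-25, -3, 10, 2, 7]
print(values)    # [10, -3, 7, -25, 2], the original list is unchanged
```

With `key=str` the comparison is made character by character, which is why `10` comes before `2`.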

Lists (and Numpy arrays, as in this example) can also be sorted with their `sort` method. This is an in-place sorting, and the method itself returns `None`

```python
print(random_nums)
print(random_nums.sort())  # prints None: sort works in place
print(random_nums)
```

The `reversed` built-in returns an *iterator* that traverses a sequence in reverse order.

```python
for number in reversed(range(10)):
    print(number)
```

The `zip` built-in pairs up the elements of two or more given sequences. The output is an iterator of tuples, which can be turned into a list of tuples with `list`.

```python
names = ["Lisa", "Auxi", "Julia", "Lisanna", "Curro"]
random_nums = np.random.randint(0,20,[5])
zipped = list(zip(names, random_nums, sorted(random_nums)))
print(list(zipped))
```

This comes in quite handy for defining hashes from two sequences

```python
hash_example = dict(zip(names, random_nums))
print(hash_example)
```

It is also used to iterate in a loop over the elements of several sequences at once

```python
for (var_1, var_2, var_3) in zip(seq_1, seq_2, seq_3):
    #
    # Code block
    #
```

For example

```python
for name, value_1, value_2 in zipped:
    print("Name {0}: ({1}, {2})".format(name, value_1, value_2))
```

```python
names = ["Lisa", "Auxi", "Julia", "Lisanna", "Curro"]
random_nums = np.random.randint(0,20,[5])
# Here zip is used directly: the zip object is consumed by the loop
zipped = zip(names, random_nums, sorted(random_nums))
for name, value_1, value_2 in zipped:
    print('Name {0}: ({1}, {2})'.format(name, value_1, value_2))
```

You can also transform a native Python structure into a Numpy ndarray using the `np.array` command

```python
print(type(names))
npnames = np.array(names)
print(type(npnames))
print(npnames.dtype)
print(npnames.shape)
```

Numpy makes an educated guess to assign the best fitting type to the data.

```python
l1 = [1,2,3,4,5,6]
l2 = [1, 2., 3.3, 0, 4, -1]
npl1 = np.array(l1)
npl2 = np.array(l2)
print(type(npl1), type(npl2))
print(npl1.dtype, npl2.dtype)
print(npl1.shape, npl2.shape)
```

You can apply `np.array` to a Numpy ndarray; in this way you obtain a copy of the initial data and not a reference to them. A similar command is `np.asarray`, but if its input is already a Numpy ndarray it does not copy the data.

## 3 Python Functions

### 3.1 Basic concepts

Defining functions lets you wrap code for later reuse, making life simpler (functions also greatly help with organization and optimization). Let’s start with a very simple function that converts a temperature from kelvin to degrees Celsius. Function definitions start with the `def` keyword, and functions return their result(s) with the `return` keyword. If there is no `return` statement, the returned value is `None`.

```python
def Kelvin_2_Celsius(T):
    return T - 273.15
#
Temp = 273.16 # Water triple point
print("{0} K are {1} ºC".format(Temp, Kelvin_2_Celsius(Temp)))
```

Another simple function, transforming from degrees Fahrenheit to Kelvin, and adding a *docstring* with the info about the function

```python
def Fahren_2_Kelvin(Temp):
    '''
    Function to transform from degrees Fahrenheit to degrees Kelvin.

    Input:
    Temp :: Temperature expressed in degrees Fahrenheit.
    '''
    return ((Temp - 32.) * (5./9.)) + 273.15 # Notice that 5/9 and 5./9. are not necessarily equal... (Python 2.7)
######################################
print('Water triple point: ', Kelvin_2_Celsius(273.16), 'ºC')
#
print('Water freezing point: ', Fahren_2_Kelvin(32), 'K')
print('Water boiling point: ', Fahren_2_Kelvin(212), 'K')
#
print('Water freezing point: ', Kelvin_2_Celsius(Fahren_2_Kelvin(32)), 'ºC')
print('Water boiling point: ', Kelvin_2_Celsius(Fahren_2_Kelvin(212.)), 'ºC')
```

The *docstring* is the multiline string just after the function definition that contains relevant information about the function for the end user. It can be accessed with the function attribute `__doc__`.
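As a minimal sketch of how the attribute is read, the hypothetical function `rectangle_area` below carries a docstring that is then retrieved through `__doc__`; the built-in `help(rectangle_area)` displays the same text in an interactive session.

```python
def rectangle_area(width, height):
    '''Return the area of a rectangle.

    Input:
    width, height :: side lengths, in the same length unit.
    '''
    return width * height

# The whole docstring is stored as a plain string
print(rectangle_area.__doc__)

# For instance, its first line works as a one-sentence summary
first_line = rectangle_area.__doc__.splitlines()[0]
print(first_line)  # Return the area of a rectangle.
```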

We can define a function inside another function. This is shown in the next example, which computes the body mass index used as an example when we explained conditionals

```python
def bmi_range(weight, height):
    '''
    Body mass index

    Input:
    weight (kg)
    height (m)
    '''
    def bmi_val(weight, height):
        return weight/height**2
    #
    bmi_value = bmi_val(weight, height)
    #
    if bmi_value < 15:
        bmi_r = "Very severely underweight"
    elif bmi_value < 16:
        bmi_r = "Severely underweight"
    elif bmi_value < 18.5:
        bmi_r = "Underweight"
    elif bmi_value < 25:
        bmi_r = "Normal(healthy weight)"
    elif bmi_value < 30:
        bmi_r = "Overweight"
    elif bmi_value < 35:
        bmi_r = "Obese Class I (Moderately obese)"
    elif bmi_value < 40:
        bmi_r = "Obese Class II (Severely obese)"
    else:
        bmi_r = "Obese Class III (Very severely obese)"
    #
    return bmi_r
##
bmi_range(70,1.80)
```

We can also use functions to benchmark loops versus list comprehensions. We compute twice the value of each of the first `N` non-negative integers, once with a function built around a loop and once with a list comprehension, and benchmark both using the magic function `%timeit`.

```python
def f_loop(number):
    twice = []
    for num in range(number):
        twice.append(num*2)
    return twice
####
# loop
%timeit f_loop(10000)
# list comprehension
%timeit [num*2 for num in range(10000)]
```

Note that `range` and `np.arange` are not equivalent. Both are iterables, but `range` is lazy and produces its values on demand, whereas `np.arange` builds the full data array as soon as it is invoked, with an extra burden for the system (for more info check Lesson 7). Notice the difference with

```python
def f_loop_arange(number):
    import numpy as np
    twice = []
    for num in np.arange(number):
        twice.append(num*2)
    return twice
####
%timeit f_loop_arange(10000)
```

Of course, the vectorized calculation with *Numpy* is way faster than the previous two

```python
%timeit np.arange(10000)*2
```

Exercise 4.1

We can define these functions in an external file and read the file from the notebook using the magic function `%run`.

A function can have multiple arguments as well as multiple `return` statements (only one of them takes effect in a given invocation). There may also be no explicit `return`, in which case the function returns `None`.
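Both situations can be sketched with two hypothetical functions: `sign_label`, which has several `return` statements, and `greet`, which has none.

```python
def sign_label(value):
    '''Hypothetical helper: only one of the return statements runs per call.'''
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "zero"

def greet(name):
    '''Hypothetical helper with no explicit return statement.'''
    print("Hello,", name)

print(sign_label(3), sign_label(-2), sign_label(0))  # positive negative zero

result = greet("Lisa")
print(result)  # None
```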

### 3.2 Positional and keyword arguments

With regard to arguments, there are two argument types: *positional* and *keyword* arguments. Both can be found in the following example where we define a function that computes the saturation vapor pressure of water vapor over liquid water or ice for a given temperature.

```python
def Magnus(Temp, ice = False):
    '''Function to compute the saturation vapor pressure E(T) in hPa units
    for water vapor on liquid water or ice according to Magnus formula.

    Ref. Alduchov and Eskridge, J. Appl. Met. 35 (1996) 601

    Arguments:
        Temp :: Temperature expressed in degrees Celsius.
        ice  :: If True compute E(T) over ice.

    Example:
        Magnus(35.0)
        56.17569318925043
    '''
    #
    import numpy as np
    #
    # AERKi and AERK parameters
    (A, B, C) = (22.587, 273.86, 6.1121) if ice else (17.625, 243.04, 6.1094)
    #
    E_value = C*np.exp((A*Temp)/(B+Temp))
    #
    return E_value
```

The `Temp` argument is a **positional** one and the `ice` argument is of **keyword** type. Keyword arguments always follow positional ones and are not mandatory: whenever they are not provided, the default value is assumed. When they are passed by keyword, their order is not relevant. Positional arguments can also be passed by keyword in the invocation to increase code readability.

```python
print(Magnus(35))
print(Magnus(35, ice = True))
print(Magnus(Temp = 35, ice = False))
```

Frequently, `None` is used as the default value of keyword arguments. This helps prevent unforeseen side effects that can arise whenever a mutable object is used as the default value of a parameter. Such side effects stem from the fact that the default value is evaluated only once, when the `def` statement is executed and the function is defined, and not each time the function is called.

Therefore, when a function is defined, Python stores in an attribute called `__defaults__` a reference to the default values of its keyword arguments. These objects are not recreated when the function is called, which can give rise to unexpected situations. Let’s see an example of this

```python
def ftest(keyw_arg_0 = [], keyw_arg_1 = ["2222"]):
    keyw_arg_0.append("0000")
    keyw_arg_1.append("1111")
    return keyw_arg_0, keyw_arg_1

print(ftest.__defaults__)
#
print(ftest())
for i in range(5):
    print(ftest())
#
print(ftest.__defaults__)
```

As mentioned above, this can be solved by using `None` as the default value and creating the mutable object at run time, inside the function

```python
def ftest(keyw_arg_0 = None, keyw_arg_1 = None):
    if keyw_arg_0 is None:
        keyw_arg_0 = []
    if keyw_arg_1 is None:
        keyw_arg_1 = ["2222"]
    keyw_arg_0.append("0000")
    keyw_arg_1.append("1111")
    return keyw_arg_0, keyw_arg_1

print(ftest.__defaults__)
#
print(ftest())
for i in range(5):
    print(ftest())
#
print(ftest.__defaults__)
```

Exercise 4.2

Exercise 4.3

There are situations in programming where we do not know the precise number of positional parameters of a function. This is solved in Python with *tuple references*: adding an asterisk (`*`) in front of the last positional parameter name makes it collect the extra arguments in a tuple. For example, we can compute the arithmetic and geometric means of a set of values as follows

```python
def argeo_mean(first_value, *values):
    '''Compute the arithmetic and geometric mean of a set of values'''
    gmean = first_value
    amean = first_value
    n_terms = 1
    for value in values:
        amean += value
        gmean *= value
        n_terms += 1
    return amean/n_terms, gmean**(1/n_terms)
#
print(argeo_mean(1), argeo_mean(1, 2), argeo_mean(1, 2, 4))
```

You can also use the star operator in a function invocation. In this case the operator unpacks the list, passing each of its elements as a separate argument. Therefore, if you need to run the previously defined function over a list you can do it as follows

```python
arguments = [1,2,4,32]
print(argeo_mean(*arguments))
# which is equivalent to
print(argeo_mean(arguments[0],arguments[1],arguments[2],arguments[3]))
```

The star operator can be used together with `zip` to easily rearrange the structure of lists, for example to recover the different sequences that have been previously zipped.

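```python
# Rebuild the list of tuples first: a bare zip object can only be traversed once
zipped = list(zip(names, random_nums, sorted(random_nums)))
list(zip(*zipped))
```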
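A deterministic sketch of the same idea, with hypothetical lists `names_demo` and `scores_demo`, shows that `zip(*...)` undoes a previous `zip`, returning one tuple per original sequence:

```python
# Hypothetical sequences used only for illustration
names_demo = ["Lisa", "Auxi", "Julia"]
scores_demo = [7, 3, 9]

pairs = list(zip(names_demo, scores_demo))
print(pairs)  # [('Lisa', 7), ('Auxi', 3), ('Julia', 9)]

# Unpack the list of pairs as separate arguments of zip to "unzip" it
unzipped_names, unzipped_scores = zip(*pairs)
print(unzipped_names)   # ('Lisa', 'Auxi', 'Julia')
print(unzipped_scores)  # (7, 3, 9)
```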

We can also have an indeterminate number of keyword parameters in a function. The function receives them as a hash, using the *double asterisk*, `**`.

```python
def f(a, b = 0, **kwargs):
    print(kwargs)
    return a+b
#
print(f(1))
print(f(1,b=2))
print(f(1,b=2, c=34, d="test", e="My dog", f=None))
```
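The double asterisk also works in the invocation, where it unpacks a hash into keyword arguments. The sketch below assumes a hypothetical function `describe` and a hypothetical hash `options`:

```python
def describe(a, b=0, **kwargs):
    '''Hypothetical function: return a+b and the names of the extra keywords.'''
    return a + b, sorted(kwargs)

# Hypothetical hash of keyword arguments
options = {"b": 2, "c": 34, "d": "test"}

# Equivalent to describe(1, b=2, c=34, d="test")
unpacked = describe(1, **options)
print(unpacked)  # (3, ['c', 'd'])
```

The key `b` matches a named parameter, so only `c` and `d` end up in `kwargs`.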

### 3.3 Returning multiple values

A function can return several values and not only one. The values are returned as a tuple and can be assigned to different variables or to a data structure.

```python
def E_T_WI(Temp):
    return Magnus(Temp, ice = False), Magnus(Temp, ice = True)
##
##
E_water, E_ice = E_T_WI(22)
print(E_water, E_ice)
#
E_wice = E_T_WI(22)
print(E_wice)
```

But you can also return values as a dictionary

```python
def E_T_WI_hash(Temp):
    return {"E_Water": Magnus(Temp, ice = False), "E_Ice": Magnus(Temp, ice = True)}
##
##
E_wice = E_T_WI_hash(22)
print(E_wice)
print(E_wice["E_Water"])
print(E_wice["E_Ice"])
```

Exercise 4.4

### 3.4 Variables scope

Another aspect of interest is that any variable defined in a function belongs by default to a *local* namespace, which is destroyed once the function returns.

```python
s = 10
t = 20
print("0: ",s, t)

def function_t():
    s = 5 # local variable
    print("1: ", s, t)
    return s

function_t()
print("2: ", s, t)
```

Note that once a variable is assigned anywhere in a function it is local throughout that function, so it cannot be referenced before its assignment

```python
s = 10
t = 20
print("0: ",s, t)

def function_t():
    print(s) # ERROR!
    s = 5 # local variable
    print("1: ", s, t)
    return s

function_t()
print("2: ", s, t)
```
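The error raised in this situation is an `UnboundLocalError`. A minimal sketch that catches it, reusing the names `s` and `function_t` from the example above, could look as follows:

```python
s = 10

def function_t():
    try:
        print(s)  # s is local here (assigned below) but has no value yet
        caught = False
    except UnboundLocalError:
        caught = True
    s = 5  # this assignment makes s local in the whole function body
    return s, caught

result = function_t()
print(result)  # (5, True)
print(s)       # 10, the global s is untouched
```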

A variable can be declared as *global* inside the function, which solves the previous error, but one should be careful with this. Often, global variables increase the code complexity without offering much in return.

```python
s = 10
print("0: ",s)

def function_t():
    global s
    print("1 :", s)
    s = 5
    print("2: ", s)
    return s

function_t()
print("3: ", s)
```

Note that we have changed inside the function the value of the variable.

Exercise 4.5

### 3.5 Functions are references

A function name is a *reference* for the function. Therefore, we can assign multiple names to the same function, and if some of these names are deleted we can still access the function through the rest of them.

```python
bmi_result = bmi_range
bmi_result(45,1.55)
```
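The statement about deleted names can be checked with a short sketch. The function `double` and its second name `twice_of` are hypothetical:

```python
def double(x):
    '''Hypothetical function used only for illustration.'''
    return 2 * x

twice_of = double           # a second name for the same function object
same_object = twice_of is double
print(same_object)          # True

del double                  # remove the original name only
print(twice_of(21))         # 42, the function is still reachable
print(twice_of.__name__)    # double
```

The function object keeps the name it was defined with, regardless of which reference is used to call it.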

We can therefore pass function names (references) as arguments to other functions. Let’s have a look at a simple example

```python
def call_function(f, temp):
    print("I'm going to call function f on temperature ", temp)
    return f(temp)
################################
print(call_function(Fahren_2_Kelvin, 44))
print()
print(call_function(Kelvin_2_Celsius, 44))
```

By the way, if you try to print the function using the *f* argument you will obtain the representation of the function object, including its memory address. You can access the function name using the `__name__` attribute as follows

```python
def call_function(f, temp):
    print("I'm going to call function", f.__name__," on temperature ", temp)
    return f(temp)
################################
print(call_function(Fahren_2_Kelvin, 44))
print()
print(call_function(Kelvin_2_Celsius, 44))
```

Another example

```python
def apply_trig(trig_func, exponent, angle):
    return trig_func(angle)**exponent
#########
print(apply_trig(np.sin, 2, np.linspace(0,2*np.pi,20)))
print(apply_trig(np.cos, 2, np.linspace(0,2*np.pi,20)))
```

A function can also output a reference to a new function. A simple example of this is as follows

```python
def f_0(a_value):
    def f_1(x):
        return a_value*x*(-a_value + x) # computes a*x**2 - x*a**2
    return f_1
####################
g_1 = f_0(1)
g_2 = f_0(2)
print(g_1(20), g_2(10))
```

We can use several arguments too

```python
import matplotlib.pyplot as plt

def ellipse(a_value, b_value):
    def f_ell(x):
        return b_value*(1-(x/a_value)**2)**0.5
    return f_ell
#####################################
cal_ell = ellipse(2,1)
x_val = np.linspace(-2,2,220)
upper_ellipse = cal_ell(x_val)
####################################
plt.plot(x_val,upper_ellipse)
plt.plot(x_val,-upper_ellipse)
plt.axis('equal')
```

And using the asterisk notation we can also deal with an unknown number of parameters as in this case where we are given a certain number of terms in the Taylor expansion of a given function, the sine function in this case

```python
def taylor_f(x0, *coef_values):
    def f_t(x):
        res = 0
        for index, coef in enumerate(coef_values):
            res += coef*(x-x0)**index
        return res
    return f_t
#######################################
# sin(x) = x - x**3/3! + x**5/5! - x**7/7!
sin_0 = taylor_f(0, 0, 1)
sin_1 = taylor_f(0, 0, 1, 0, -1/6)
sin_2 = taylor_f(0, 0, 1, 0, -1/6, 0, 1/120)
########################################
x_val = np.linspace(0,np.pi,220)
s0 = sin_0(x_val)
s1 = sin_1(x_val)
s2 = sin_2(x_val)
plt.plot(x_val,np.sin(x_val))
plt.plot(x_val, s0, label = "Order 1")
plt.plot(x_val, s1, label = "Order 3")
plt.plot(x_val, s2, label = "Order 5")
plt.legend()
```

Exercise 4.6

## 4 Some advice for future programming

- Document your code generously. Include docstrings explaining what a function does, what its arguments are, and what the output format is, and provide an example in the docstring so that the function can be tested. Keep in mind the well-known quote *Documentation is like sex; when it’s good, it’s very, very good, and when it’s bad, it’s better than nothing.*
- Also use comments in your code to explain what you are doing (see previous item). Include expected physical units in your comments.
- Use clear variable names that indicate their purpose. If you are debugging code written some time ago, it is of great help to come across a variable named *valence_neutrons* rather than *vn*, or worse, *n* or, even worse, *x*.
- Follow the motto *Don’t duplicate, reuse often*. This can be applied in different contexts. For example, if there is a constant in your program whose value is `34`, define it at the beginning of the code (`irrep_label = 34`) and then use the variable name in the code. When the day arrives that `34` needs to be changed to `30`, you do not have to find and replace every `34` instance in your code (a bug-prone task); you only need to change the initial variable assignment.
- Again the motto *Don’t duplicate, reuse often*. If you find yourself repeating lines of code in different functions, create a function and call it. This is similar to the previous item, but even more important, as the bugs in this case are more difficult to trap.
- Before coding, stop for a while, think carefully about the task you are trying to solve and, if it is a complex one, break it into simpler steps and deal with each one of them. Check your code using simple cases, ideally ones whose solution you know.
- When an error happens, read the error message and your code carefully.
- Insert diagnostics in your code that depend on a keyword argument (e.g. `verbose = False`) and are printed only for a given argument value.
- Practice RDD (*Rubber Duck Debugging*). Ask your personal guru. Warning: gurus can be hot-tempered. You can also ask in a forum like `stackoverflow`. Be polite and read (and follow) the forum policy. Sometimes forum gurus can also be hot-tempered.
- If your code is complex enough, you might learn about *breakpoints*.
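Several of these recommendations can be combined in a few lines. The sketch below is only an illustration: the constant `ZERO_CELSIUS_K` and the function `celsius_to_kelvin` are hypothetical names, chosen to show a named constant, a docstring with an example, units in the comments, and diagnostics controlled by a `verbose` keyword argument.

```python
ZERO_CELSIUS_K = 273.15  # kelvin; defined once and reused through its name

def celsius_to_kelvin(temp_celsius, verbose=False):
    '''Transform a temperature from degrees Celsius to kelvin.

    Arguments:
        temp_celsius :: temperature expressed in degrees Celsius.
        verbose      :: if True, print a diagnostic line.

    Output: temperature in kelvin (float).

    Example:
        celsius_to_kelvin(0.0)
        273.15
    '''
    temp_kelvin = temp_celsius + ZERO_CELSIUS_K
    if verbose:
        # Diagnostics are printed only when explicitly requested
        print("celsius_to_kelvin:", temp_celsius, "ºC ->", temp_kelvin, "K")
    return temp_kelvin

print(celsius_to_kelvin(0.0))                 # 273.15
print(celsius_to_kelvin(25.0, verbose=True))  # also prints the diagnostic line
```

The example in the docstring doubles as a simple test case with a known solution.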

Created: 2024-02-21 Wed 16:42