Introduction to Python#

Python is the gold standard programming language for data science and machine learning, and for many other things. For detailed information, start here.

You can use it for simple tasks, like a pocket calculator, but also for very complex tasks such as machine learning.

Let’s start with some simple operations.

Standard operators, as one would use on a pocket calculator:

Addition: +
Subtraction: -
Multiplication: *
Division: /
Exponentiation: **

1+3+4

8-9

-1

8*6

6**2

6**2+59+99*55
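
A minimal sketch of how Python evaluates the last expression above: exponentiation is applied first, then multiplication, then addition. The variable names result, grouped, and quotient are only chosen for illustration. The sketch also shows that / returns a number with decimals, even when the division has no remainder.

```python
# Operator precedence: ** first, then *, then +
result = 6**2 + 59 + 99*55          # 36 + 59 + 5445
print(result)                       # 5540

# Parentheses change the order of evaluation
grouped = (6**2 + 59 + 99) * 55     # 194 * 55
print(grouped)                      # 10670

# Division with / always gives a float
quotient = 8 / 2
print(quotient)                     # 4.0
```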

Exercise 1.0#

Think of a calculation and write the operation in the cell below.

Variables#

Variables can store numbers, text, or more complex structures. Here are three examples: numeric holds a number, character stores text, and logical holds either True or False. The = sign assigns a piece of information to the variable.

Assigning a variable simply means that, for example, the variable numeric now has the value 10.

The print function then lets us display the information stored in a variable.

numeric = 10
character = "Hi!"
logical = False

print(numeric)
print(character)
print(logical)
print("Hello World")

10
Hi!
False
Hello World

This is powerful because we can also use variables to manipulate the values they hold.

x = numeric
x+5

z = 9
x+z

It is critical to remember which type of information is stored in a variable. For example, x+character will lead to an error message, since we cannot simply add a string to a numerical variable. However, 'x'+character will work. The critical difference is that, with the quotation marks, we do not use the variable named x but the character x, and we join it with the text stored in character.
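
The following sketch illustrates the difference, using the values assigned earlier. The try/except lines are only there so that the cell keeps running after the error; they are not needed otherwise. The names combined and converted, and the conversion with str(), are additions for illustration.

```python
x = 10
character = "Hi!"

# Adding a number and a string raises a TypeError
try:
    x + character
except TypeError as error:
    print("This does not work:", error)

# 'x' in quotation marks is the character x, not the variable x
combined = 'x' + character
print(combined)      # xHi!

# To attach the value of x instead, convert it to a string first
converted = str(x) + character
print(converted)     # 10Hi!
```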

name = "Xenia"
begruessung = "Dear " + name + ", how are you?"

print(begruessung)

name = "Joe"
begruessung = "Dear " + name + ", how are you?"

print(begruessung)

Dear Xenia, how are you?
Dear Joe, how are you?

With the type() function, one can identify the type of information stored in the variable.

type(z)

int

int indicates that an integer is stored in z (i.e., a number without decimals).

type(x+1)

int

float indicates that the stored information is a number with decimals.

type(0.5)

float

str indicates that the variable stores string information. Strings are basically just sequences of text characters.

type(character)

str

bool stands for Boolean. A Boolean value is either True or False and is, for example, the result of comparing two values. Python == Good Programming Language should return True, whereas SPSS == Good Programming Language should return False.

type(logical)

bool

This is probably a good time to talk about naming conventions for your variables. Variable names should be informative while staying short. You really want to do this, because when you look at your scripts in the future, you will not remember the logic behind some abstract variable name (trust me on this one).

      a = 10 (not really informative)

      AVariableThatICreatedAtTheBeginningOfMyScript = 10 (Not informative and way too long)

      ExperimentalResults = 10 (On point!)

Last but not least: variable names cannot contain spaces. Write them using a structure like MyVariable or my_variable, or whatever you like. You cannot do

      My Variable = 5

In principle, names with spaces could work for column names within dataframes (we will learn about these later on), but I highly recommend always applying these conventions.

Further, if you want to load files from your local device and their names do not follow this convention, things will get really tricky.

Exercise 2.0#

Define two variables and manipulate them.

Logical operators:#

< for less than
> for greater than
<= for less than or equal to
>= for greater than or equal to
== for equal to each other
!= for not equal to each other

y = 10
print(x != y)
print(x > z)

False
True

z == 10

False

The result of a comparison is always a logical value (True or False).
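
Because the result is an ordinary value, it can be stored in a variable and inspected with type(). A small sketch, assuming the values x = 10 and z = 9 from above; the name is_greater is made up for this example.

```python
x = 10
z = 9

# The comparison result is stored like any other value
is_greater = x > z
print(is_greater)          # True
print(type(is_greater))    # <class 'bool'>
```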

`if`, `else` clauses#

These statements are used to run different code depending on the result of a logical comparison.

if indicates that if some statement is True, something should be done.

The general syntax for these clauses (as well as loops, functions…) is based on indentation.

      if ... :                                (Don't forget the `:`)
                continue here                 (You can use `tab` after `:` to progress to the next level)
                if 2nd level:
                          continue here       (You can also use `nested` structures if you feel that it is necessary)
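
The schema above could look like this in runnable form. This is only a sketch, assuming x = 10 and z = 9 as before; the variable deepest_level is a hypothetical helper that records how far into the nested structure the code got.

```python
x = 10
z = 9
deepest_level = 0

if x > z:
    # First level: runs only if x > z
    deepest_level = 1
    print("x is greater than z")
    if x > 5:
        # Second level: runs only if both conditions are True
        deepest_level = 2
        print("x is also greater than 5")

print(deepest_level)    # 2
```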

if x > z:
    print ("x is greater than z")

x is greater than z

With the else statement, all remaining alternatives are covered. In our case, these are all situations in which x is not greater than z. The following example shows such a situation.

z = z+x

if x > z:
    print ("x is greater than z")
else:
    print ("z is greater than x")    

z is greater than x

More specific statements can be achieved with elif, which means else if and allows us to specify additional cases.

if x > z:
    print ("x is greater than z")
elif z == x+50:
    print ("z is equal to x plus 50")
else:
    print ("z is greater than x")  

z is greater than x

We can also use logical operators to check if two conditions are true at the same time. Use the and keyword to combine two conditions and check them in one line.

For the combined condition to be True, both conditions need to be met.

if x > 10 and z > 5:
    print("x is greater than 10 and z is greater than 5")
else:
    print("At least one of the two conditions is not met")

At least one of the two conditions is not met

The keyword or can be used to check if at least one of two conditions is met.

if x > 10 and z > 5:
    print("x is greater than 10 and z is greater than 5")
elif x > 10 or z > 5:
    print("Only one of the two conditions is met")
else:
    print("Neither of the two conditions is met")

Only one of the two conditions is met
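
To see why the two keywords behave differently here, it can help to look at the two conditions separately. A minimal sketch, assuming x = 10 and z = 19 as in the cells above; the names first and second are only for illustration.

```python
x = 10
z = 19

first = x > 10     # False, because 10 is not greater than 10
second = z > 5     # True

print(first and second)    # False: and needs both conditions to be True
print(first or second)     # True: or needs at least one condition to be True
```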

Exercise 3.0#

Define a new variable called temperature and assign to it an integer or float between -10 and 40. This will represent the current temperature in Celsius.

Now, write an if/elif/else block that checks the value of temperature and prints a message based on the temperature range.

If the temperature is below 0, print: “It’s freezing!”

If the temperature is between 0 (inclusive) and 15, print: “It’s a bit chilly”

If the temperature is between 15 (inclusive) and 25, print: “Nice weather”

If the temperature is between 25 (inclusive) and 35, print: “It’s getting hot”

If the temperature is 35 or higher, print: “It’s boiling!”

You can change the value of temperature to test different outputs and better understand how the if/elif/else structure works.

temperature = 40

if temperature < 0:
    print("It's freezing!")
elif temperature >= 0 and temperature < 15:
    print("It's a bit chilly")
elif temperature >= 15 and temperature < 25:
    print("Nice weather")
elif temperature >= 25 and temperature < 35:
    print("It's getting hot")
else:
    print("It's boiling!")

Bonus#

We have two monkeys, a and b, and the parameters a_smile and b_smile indicate if each is smiling. We are in serious trouble if they are both smiling or if neither of them is smiling.

Using if/else statements, return True if we are in trouble.

For this exercise, you just need to paste your solution into the monkey_trouble function. Make sure to indent the return True and return False statements after your conditions.

Meaning:

      if my_solution:
                return True
      else:
                return False

We will talk more about functions and the return keyword later. For now, just try to fill in your solution.

If your solution is correct, you will see True, True, False, False printed below the code cell.

def monkey_trouble(a_smile, b_smile):
    if a_smile == b_smile:
        return True
    else:
        return False

print(monkey_trouble(True, True))   # should return True
print(monkey_trouble(False, False)) # should return True
print(monkey_trouble(True, False))  # should return False
print(monkey_trouble(False, True)) # should return False

True
True
False
False
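
Since a comparison already evaluates to True or False, the if/else block is not strictly required. The following variant is only a sketch of that idea; the function name monkey_trouble_short is made up here, and the exercise itself still asks for the if/else version.

```python
def monkey_trouble_short(a_smile, b_smile):
    # The comparison itself is already True or False, so return it directly
    return a_smile == b_smile

print(monkey_trouble_short(True, True))      # True
print(monkey_trouble_short(False, False))    # True
print(monkey_trouble_short(True, False))     # False
print(monkey_trouble_short(False, True))     # False
```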

Variables that include multiple items (Lists):#

We can store multiple pieces of information in one variable. This is what we call a list. A list is a built-in Python data type. You can store all kinds of information (different data types) in it.

list_1 = [x,z]
print(list_1)

[10, 19]

list_2 = [x,z,50,18]
list_2

[10, 19, 50, 18]

list_3 = ["x","z",50,78]
list_3

['x', 'z', 50, 78]

list_4 = ["x","z","50","78"]
list_4

['x', 'z', '50', '78']

The len function can be used to count the number of items of a list.

len(list_1)

len(list_3)

Besides len, there are other built-in functions that we can use with the list data type:

len() calculate length of list
max() calculate max of list
min() calculate min of list
sum() calculate sum of list
sorted() return a sorted list
list() cast to type list – convert tuple to list or a generator to list
any() return True if the truthiness of any value is True in the list
all() return True if the truthiness of all the values is True in the list
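
A short sketch of these functions in action. The list numbers holds the same values as list_2 above; the other inputs are made up for illustration.

```python
numbers = [10, 19, 50, 18]    # same values as list_2 above

print(len(numbers))       # 4
print(max(numbers))       # 50
print(min(numbers))       # 10
print(sum(numbers))       # 97
print(sorted(numbers))    # [10, 18, 19, 50]
print(list((1, 2, 3)))    # [1, 2, 3] (a tuple converted to a list)

print(any([0, False, 3]))    # True: 3 counts as True
print(all([1, 0, 3]))        # False: 0 counts as False
```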

We can also slice our lists to return several values at once.

List slicing allows you to pick out specific elements from a list. In order to understand list slicing, you need to know five things:
1. Start index – this is the index from which the slice of a list is taken. This index is included.
2. End index – this is the index up to which the slice is taken. This index is not included.
3. Step size – you can specify a skip factor that allows you to skip a certain number of values.
4. The colon is used to separate the start index, the end index, and the step size.
5. Start index, end index, and step size are all optional; my_list[:] returns a copy of the whole list.
Here are some variations of slicing:
1. start index, but no end index – my_list[3:] – will take all elements from index 3 to the end of the list, including the last value
2. end index, but no start index – my_list[:10] – will take all elements from the start of the list up to index 10, but not include the value at index 10
3. both start and end index – my_list[3:10] – will take all elements from index 3 up to index 10, but not include the value at index 10
4. start and end index and step size – my_list[3:10:2] – same as #3, but takes every other element
Shout-out to: Mohammad Zia
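
The four variations can be tried out on a small example list. In this sketch, my_list is a made-up list in which every value equals its own index, so it is easy to see which positions were selected.

```python
# Every value equals its own index
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

print(my_list[3:])        # [3, 4, 5, 6, 7, 8, 9, 10, 11]
print(my_list[:10])       # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(my_list[3:10])      # [3, 4, 5, 6, 7, 8, 9]
print(my_list[3:10:2])    # [3, 5, 7, 9]
```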

We can use a numerical index to return a specific element from our list

list_4[2]

'50'

list_4[1]

'z'

Exercise:#

Try any of the slicing operations with the lists we created

Exercise 4.0#

Did you expect list_4[1] to return “z”? Which index will give you the value “x”, i.e., the first element?

In most programming languages, indexing starts at 0! This means that the first element in your array, list (…) is always the 0th element by index. If you want to get the last element, you can always use -1.
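
A quick sketch with list_4 from above, showing the first element, the last element, and the second-to-last element:

```python
list_4 = ["x", "z", "50", "78"]

print(list_4[0])     # x  (first element)
print(list_4[-1])    # 78 (last element)
print(list_4[-2])    # 50 (second-to-last element)
```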

Another way to check specific items in a list would be based on if/else statements. Here the operator in is central.

if x in list_2:
    print("x is in list_2")
else:
    print("x is not in list_2")

x is in list_2

if x in list_3:
    print("x is in list_3")
else:
    print("x is not in list_3")

x is not in list_3

Note that list_3 contains the string "x", not the value stored in the variable x (10), so the check fails.

Exercise 5.0#

Use an if/else statement to check if the value 78 is in list_3 and list_4.

if 78 in list_3 and 78 in list_4:
    print("It's in both lists")
elif 78 in list_3 or 78 in list_4:
    print("It's in one of these lists")
else:
    print("It's nowhere")

Bonus: Can you think of an even simpler way to check if the value 78 is in list_3 and list_4?

78 in list_3 or 78 in list_4
78 in list_3 and 78 in list_4
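
When running these checks, keep in mind that the integer 78 and the string "78" are different values. A small sketch with the two lists defined earlier:

```python
list_3 = ["x", "z", 50, 78]
list_4 = ["x", "z", "50", "78"]

print(78 in list_3)      # True: list_3 contains the integer 78
print(78 in list_4)      # False: list_4 only contains the string "78"
print("78" in list_4)    # True
```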

String formatting#

To finish this section, we will talk about the concept of string formatting.

String formatting refers to combining numerical and string information in one string. This can be very helpful when you want to display both data types together.

We will look at two ways to do this. The first method is called the f-string. With the f-string, we can format selected parts of our string.

      f"MyString {Information I want to format}"

We can also call the .format method of our string. The outcome is basically the same, although the use-cases might differ. For the sake of this course you should just know that there are two ways to do this, but we will only practice the f-string later on.

target_list = [0,1,2,3,4,5,6,7]
target = 4

if target in target_list:
    print(f"I am using the f-string method to display the target value {target} in our list called target_list")

if target in target_list:
    print("I am using the .format method to display the target value {} in our list called target_list".format(target))

I am using the f-string method to display the target value 4 in our list called target_list
I am using the .format method to display the target value 4 in our list called target_list
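
As a further sketch, both methods can be checked against each other. The variables name and temperature and the sentences are made up for this example. The last line uses an optional format specification (:.1f) inside the curly braces to round the number to one decimal place, which goes slightly beyond what is covered above.

```python
name = "Xenia"
temperature = 21.456

with_fstring = f"Dear {name}, it is {temperature} degrees."
with_format = "Dear {}, it is {} degrees.".format(name, temperature)

print(with_fstring)                   # Dear Xenia, it is 21.456 degrees.
print(with_fstring == with_format)    # True

# Optional format specification: round to one decimal place
rounded = f"It is {temperature:.1f} degrees."
print(rounded)                        # It is 21.5 degrees.
```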