Python Basics: From Zero to Full Monty

September 27, 2017
notes study tutorial python

Python Basics

@drob published a blog post this past week titled “Why is Python Growing So Quickly?” with some interesting findings. I have been meaning to write up some posts (by which I mean dust off some of my old notes) on introductory Python and R. My path to learning Python started with Learn Python the Hard Way, which may still be a good place to start, although there are now plenty of MOOCs, such as MIT’s Intro to Comp Sci and Programming in Python course, and other offerings that weren’t around when I started. I also briefly read [@jakevdp](https://twitter.com/jakevdp)’s A Whirlwind Tour of Python, which looks like a great resource for new learners.

Regardless, I think a learner’s best friend will be Philip Guo & co.’s PythonTutor. I can’t say enough about how useful this tool is in helping you to understand (and visualize) how your code is processed. It’s pretty sweet.

My hope is to convert these notes to lecture slides (likely using the xaringan package) to help introduce my colleagues to object-oriented programming and Python.

## Download & Install Python

I think the best place to start is installing the most up-to-date version of Python. I would recommend Anaconda, as this includes base Python, Jupyter Notebook, the Spyder IDE and many of the more commonly used packages.

As an aside, my go-to IDE of late has been VSCode, which I switched to from Atom. If I remember correctly, Learn Python the Hard Way recommended TextWrangler back in the day, but now they recommend Atom.

pip

If there are some packages that were not included with the Anaconda install, you can use pip to install them. They have a nice tutorial on the basics of using pip here.

For example, to install NumPy:

pip3 install numpy

Hello, World!

print('Hello, World!')
## Hello, World!

## Basic Operations

### Arithmetic Operators

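| Operation | Math | Code |
|---|---|---|
| Addition | $a + b$ | `a + b` |
| Subtraction | $a - b$ | `a - b` |
| Multiplication | $a \times b$ | `a * b` |
| Division | $a \div b$ | `a / b` |
| Floor Division | $\lfloor a \div b \rfloor$ | `a // b` |
| Modulo | $a \bmod b$ | `a % b` |
| Absolute value | $\lvert a \rvert$ | `abs(a)` |
| Exponent | $a^b$ | `a**b` |
| Square root | $\sqrt{a}$ | `import math; math.sqrt(a)` |
| Matrix product | $AB$ | `a @ b` |
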
x += 1  # same thing as x = x + 1
x = "foo"
x += "bar"
print(x)
## foobar
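
To make the table above concrete, here is a minimal sketch that runs several of the operators on two example values. The values a = 17 and b = 5 are my own arbitrary choices for illustration, not part of the original notes.

```python
import math

a = 17  # arbitrary example values
b = 5

print(a / b)           # 3.4  (true division always returns a float)
print(a // b)          # 3    (floor division rounds down)
print(a % b)           # 2    (modulo gives the remainder)
print(abs(-a))         # 17   (absolute value)
print(a ** b)          # 1419857
print(math.sqrt(16))   # 4.0  (math.sqrt returns a float)
```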

### Comparison Operators

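| Operation | Math | Code |
|---|---|---|
| Greater than | $a > b$ | `a > b` |
| Less than | $a < b$ | `a < b` |
| Equal to | $a = b$ | `a == b` |
| Not equal to | $a \neq b$ | `a != b` |
| Greater than or equal to | $a \geq b$ | `a >= b` |
| Less than or equal to | $a \leq b$ | `a <= b` |
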
a = 10
b = 2
print(a > b)
## True
print(a == b)
## False

### Logical Operators

| Operation | Math | Code |
|---|---|---|
| And | $a \land b$ | `a and b` |
| Or | $a \lor b$ | `a or b` |
| Not | $\lnot a$ | `not a` |

a = 10
b = 2
print((a > 4) and (b < 4))
## True
x={1,2,3}
y={2,3,4}
print(x & y)
## {2, 3}
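
The example above only exercises and, so here is a small sketch covering or and not as well. It also contrasts the keyword and with the & symbol, which on sets means intersection rather than a logical test. The variable names simply reuse the ones from the example above.

```python
a = 10
b = 2

print((a > 4) or (b > 4))    # True: only one side needs to be True
print((a > 4) and (b > 4))   # False: both sides must be True
print(not (a > 4))           # False: not flips the result

x = {1, 2, 3}
y = {2, 3, 4}
print(x & y)     # {2, 3}: set intersection
print(x and y)   # {2, 3, 4}: "and" returns its last operand when both are truthy
```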

### Assignment Operators

Can be interpreted as reassigning a as “a (some operator) b.” For example, a += b should be thought of as reassigning a as a + b.

  • a += b
  • a -= b
  • a *= b
  • a /= b
  • a //= b
  • a %= b
  • a **= b
a = 10
b = 2
a **= b
print(a)
## 100

### Identity Operators

  • a is b
  • a is not b
a = 10
b = 10
print(a is b)
## True
print(a is not b)
## False
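
A caveat worth a quick sketch: is checks whether two variables point to the very same object, while == checks whether two objects have equal values. The integer example above prints True because CPython happens to reuse small integers, which is an implementation detail rather than something to rely on. With lists the difference is easier to see.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a separate list that happens to have the same contents
c = a           # a second label for the same list object

print(a == b)   # True: the contents are equal
print(a is b)   # False: they are two different objects
print(a is c)   # True: both names point to one object
```

In practice, is tends to be reserved for comparisons against None (e.g. `if value is None:`), with == used for everything else.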

### Membership Operators

  • a in b
  • a not in b
a = 10
b = 2
z = [1, 2, 3]
print(a in z)
## False
print(b in z)
## True

## Syntax

I thought about writing a lengthy composition on how I feel about indentation, but to be brief, I love it. I didn’t quite appreciate the wisdom of indentation until I had several lines of code and tried to troubleshoot. I’ve since started using indentation even when coding in R since getting to taste the sweetness in Python.

For style, sometimes you’d just like to have things go over to the next line without disrupting your code. This can be done using \ or enclosing things within parentheses.

For example,

x = 1 + 2 + 3 + 4 +\
    5 + 6 + 7 + 8

Or

x = (1 + 2 + 3 + 4 +
     5 + 6 + 7 + 8)

If you’d like to write multiple statements on the same line, you can use a semicolon (;):

x = 3; y = 7

Variables and Objects

- table of object types, i.e. integer, float, string, list, dictionary, etc.

  • int - int()
  • float - float()
  • str - str()
  • bool - bool()
  • NoneType
  • list
  • tuple
  • dictionary
x = "Mirza"
print(type(x))
## <class 'str'>
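
As a rough sketch of the list above, the snippet below asks type() about one example value of each kind and then uses the conversion functions. The example values are made up for illustration.

```python
# One made-up example value per object type listed above
examples = [42, 3.14, "Mirza", True, None, [1, 2], (1, 2), {"a": 1}]

for value in examples:
    print(type(value).__name__, "->", repr(value))

# The conversion functions build a new object of the requested type
print(int("7") + 1)       # 8
print(float(3))           # 3.0
print(str(10) + "!")      # 10!
print(bool(0), bool(1))   # False True
```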

Variables can be assigned to objects, e.g. integer, list, etc. A variable is never assigned to another variable, and will only reference objects. Another way of thinking about this that I think is more intuitive is that the variable is merely a label pointing to the object.

x = 5 # the variable x is pointing to 5 (the object)
y = x # saying that y and x should point to the SAME object
print(y)
## 5
y += 2 # now, y should be pointing to 7 (and x unchanged)
print(x)
## 5
print(y)
## 7

In the above example, print(y) initially gives us the same value as our variable, x. However, we then reassign y to point to another object (in this case, the integer 7) using the augmented assignment y += 2. The variable x in this instance has not been reassigned, so it is unchanged: it still points to the same object as when it was first assigned.

Let’s now try another example, but this time we will reassign x.

x = [1, 2, 3]
y = x
print(y)
## [1, 2, 3]
x = "foo" # x is now pointing to another object
print("y =", y)
## y = [1, 2, 3]
print("x =", x)
## x = foo

Here, we start by assigning x to an object (a list in this example), and then point y to that same object just as we had done in the previous example. However, here we will then reassign x (in this example, now pointing to a string). Using our print() function, we see that y still points to the same list object it was assigned to and that x refers to the string object it was reassigned to.

What happens if we modify the object?

Mutable vs Immutable

If an object is mutable, it can be changed in place, and the change is visible through every variable pointing to that object. However, if an object is immutable, it is essentially “unchangeable.” In other words, once an immutable object is created, its size and contents cannot be changed.

Let’s highlight this with an example using a mutable object, e.g. a list.

x = [1, 2, 3]
y = x
x.append(4) # modify x
print("y =", y)
## y = [1, 2, 3, 4]
print("x =", x)
## x = [1, 2, 3, 4]

In this example, we start out by pointing the variables x and y to the same object. We subsequently modify the mutable object, i.e. change the object itself. Because our variables are still pointing at it, they both represent the same object, which has just been modified.
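
For contrast, here is a minimal sketch of the same steps with an immutable object (a string). Because a string cannot be changed in place, += has to build a new string and repoint x, so y keeps pointing at the original.

```python
x = "foo"
y = x             # x and y point to the same string object
x += "bar"        # creates a NEW string and points x at it
print("x =", x)   # x = foobar
print("y =", y)   # y = foo
```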

Methods

Python is an object-oriented programming language and each object type that we encounter has its own set of methods. We’ll get into some of the more commonly used methods for different objects, such as strings, lists, etc. An example of a method was the append method we used for the list above.

y = [1, 2, 3]
y.append(27)
print(y)
## [1, 2, 3, 27]

TIP: use dir() to get list of methods for a data type.

x = "Mirza"
print(dir(x))
## ['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
print(dir(list))
## ['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
print(dir(int))
## ['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']

Common base functions in Python

  • help, e.g. help(max) (or ?max in IPython and Jupyter)
  • max
  • len
  • del (technically a statement rather than a function, e.g. del my_list[0])
  • round - e.g. round(3.141592), round(3.141592, 3)
  • sorted - e.g. sorted(my_list, reverse=True)
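
A short sketch of these in action. The list my_list is hypothetical example data chosen for illustration.

```python
my_list = [3, 1, 4, 1, 5, 9]   # hypothetical example data

print(max(my_list))                    # 9
print(len(my_list))                    # 6
print(round(3.141592))                 # 3
print(round(3.141592, 3))              # 3.142
print(sorted(my_list, reverse=True))   # [9, 5, 4, 3, 1, 1]

del my_list[0]                         # removes the element at index 0
print(my_list)                         # [1, 4, 1, 5, 9]
```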

Numeric Types

  • Integers
  • Floating point numbers
  • Complex numbers
# Division
print(15 / 7)
## 2.142857142857143
# Floor Division
print(15 // 7)
## 2
print(20 // 7)
## 2
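
The three numeric types listed above can be told apart with type(). The sketch below is only illustrative; the values are arbitrary.

```python
print(type(7))          # <class 'int'>
print(type(7.0))        # <class 'float'>
print(type(3 + 4j))     # <class 'complex'>

z = 3 + 4j
print(z.real, z.imag)   # 3.0 4.0
print(abs(z))           # 5.0 (the magnitude of the complex number)

print(7 + 1.5)          # 8.5 (mixing int and float gives a float)
print(-15 // 7)         # -3  (floor division rounds toward negative infinity)
```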

Lists

Lists are mutable sequences of objects of any type. Unlike sets and dictionaries, lists are ordered by position, so their elements can be retrieved by index.

tl;dr: Lists are ordered and mutable.

Use indexing to get a single element and slicing to get multiple elements from a list.

Zero-based indexing (unlike R)

yo = [1, 2, 3, 4, 5]
print(yo[0])
## 1

Negative indexing, e.g. [-1] gives you the last element of the list

yo = [1, 2, 3, 4, 5]
print(yo[-1])
## 5

Slicing - use : to perform list slicing:

  • e.g. list[2:4] will select the elements at index 2 and 3 (not 4 though)
  • e.g. list[:4] gives you the elements at index 0 through 3 (again, remember that the element at index 4 is excluded)
  • e.g. list[3:] gives you the elements from index 3 to the last element of the list

yo = [1, 2, 3, 4, 5]
print(yo[2:4])
## [3, 4]
print(yo[:4])
## [1, 2, 3, 4]
print(yo[3:])
## [4, 5]

Use a second : to stipulate step size with slicing:

yo = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(yo[::2])
## [1, 3, 5, 7, 9, 11]
print(yo[1:9:3])
## [2, 5, 8]
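
The step can also be negative, which walks the list backwards. A small sketch using the same yo list:

```python
yo = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

print(yo[::-1])      # the whole list reversed: [12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(yo[-3:])       # the last three elements: [10, 11, 12]
print(yo[10:2:-3])   # indices 10, 7, 4: [11, 8, 5]
```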

You can even have a list of lists, i.e. sub-lists, and so on. Subset a list of lists with chained indexing, e.g. list[1][0] gives you element 0 of the sub-list at index 1:

listed = [[1, 10], [42, 3], [3, 20], [7, 13]]
print(listed[2])
## [3, 20]
print(listed[1][0])
## 42

Change the value of an element in a list. For example, my_list[3] = "Bartleby" changes the element at index 3 (the fourth element) to "Bartleby":

my_list = [1, 2, 3, 4, 5] # avoid naming a variable list, which would shadow the built-in list()
my_list[3] = "Bartleby"
print(my_list)
## [1, 2, 3, 'Bartleby', 5]

You can also change multiple values using the same approach with slicing:

my_list = [1, 2, 3, 4, 5, 6, 7, 8]
my_list[1:4] = ["Bartleby", "the", "scrivener"]
print(my_list)
## [1, 'Bartleby', 'the', 'scrivener', 5, 6, 7, 8]

Add elements to a list with + or the list.append("something") method:

Concatenate a list with +:

melville = ["Bartleby"]
print(melville + ["the", "scrivener"])
## ['Bartleby', 'the', 'scrivener']
print(melville) #notice how this remains unchanged
## ['Bartleby']

Add to a list with the append method:

melville = ["Bartleby"]
melville.append("Jones")
print(melville)
## ['Bartleby', 'Jones']

Remove elements from a list with del, e.g. del my_list[3] removes the element at index 3 (the fourth element) from the list:

my_list = [1, 2, 3, 4, 5]
del my_list[3]
print(my_list)
## [1, 2, 3, 5]

Lists are mutable, so a change made through one variable shows up in every variable pointing to the same list:

a = [1, 2, 3]
b = a
b[1] = 100
print(a)
## [1, 100, 3]

To keep from running into this problem, you’ll essentially want to copy the list when creating your new variable as opposed to having both a and b pointing to the same list object. You have 2 options to do this:

  • Use b = list(a), or
  • Use b = a[:]
a = [1, 2, 3]
b = list(a)
b[1] = 100
print(a)
## [1, 2, 3]
print(b)
## [1, 100, 3]
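
Here is a sketch of the slice option, plus the copy() method, which does the same job. One caveat I am adding here: all of these make a shallow copy, so sub-lists inside the list are still shared. The standard library's copy.deepcopy is one way around that.

```python
import copy

a = [1, 2, 3]
b = a[:]        # slice copy
c = a.copy()    # equivalent copy using the list method
b[1] = 100
c[2] = 300
print(a, b, c)  # [1, 2, 3] [1, 100, 3] [1, 2, 300]

# Shallow copies still share any nested lists
nested = [[1, 2], [3, 4]]
shallow = list(nested)
deep = copy.deepcopy(nested)
shallow[0][0] = 99
print(nested)   # [[99, 2], [3, 4]]: the inner list was shared
print(deep)     # [[1, 2], [3, 4]]: the deep copy is fully independent
```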

How many objects are in our list?

Use the len() function:

my_list = [1, 2, 3, 4, 5]
print(len(my_list))
## 5

The index method tells you which element of your list is a certain object, i.e. the index of its first occurrence:

jackson_5 = ["just", "call", "my", "name", "and", "I'll", "be", "there"]
print(jackson_5.index("there"))
## 7

Count the number of times an element occurs in a list with the count method:

my_list = [1, 4, 1, 3, 5, 3, 5, 2, 5, 2, 6]
print(my_list.count(2))
## 2

Remove the first occurrence of a certain element from a list with the remove method:

my_list = [1, 4, 1, 3, 5, 3, 5, 2, 5, 2, 6]
my_list.remove(1)
print(my_list)
## [4, 1, 3, 5, 3, 5, 2, 5, 2, 6]

Sort the elements of a list with the sort method or sorted() function:

Unlike the sorted() function, list.sort() modifies our original list in place, whereas sorted() creates a completely new list, i.e. our original list is unchanged.

Using the list.sort() method:

my_list = [1, 4, 1, 3, 5, 3, 5, 2, 5, 2, 6]
my_list.sort()
print(my_list)
## [1, 1, 2, 2, 3, 3, 4, 5, 5, 5, 6]

Using the sorted() function:

my_list = [1, 4, 1, 3, 5, 3, 5, 2, 5, 2, 6]
sorted(my_list)
print(my_list) # using sorted(), my_list itself remains unchanged
## [1, 4, 1, 3, 5, 3, 5, 2, 5, 2, 6]
print(sorted(my_list))
## [1, 1, 2, 2, 3, 3, 4, 5, 5, 5, 6]

Reverse the order of elements in a list using the reverse method:

my_list = [1, 4, 1, 3, 5, 3, 5, 2, 5, 2, 6]
my_list.reverse()
print(my_list)
## [6, 2, 5, 2, 5, 3, 5, 3, 1, 4, 1]

Strings

TODO

Strings are immutable

bartleby = 'Bartleby in a singularly mild, firm voice, replied, “I would prefer not to."'
print(bartleby)
## Bartleby in a singularly mild, firm voice, replied, “I would prefer not to."
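
To see what immutability means in practice, the sketch below tries to change one character in place, which raises a TypeError, and then shows that string methods hand back a new string instead. The shortened quote is just for illustration.

```python
not_to = "I would prefer not to."

try:
    not_to[0] = "U"   # strings do not support item assignment
except TypeError as err:
    print("TypeError:", err)

louder = not_to.upper()   # returns a new string; not_to is untouched
print(not_to)             # I would prefer not to.
print(louder)             # I WOULD PREFER NOT TO.
```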

Determine the length of a string with len()

bartleby = 'Bartleby in a singularly mild, firm voice, replied, “I would prefer not to."'
print(len(bartleby))
## 76

Slicing:

bartleby = 'Bartleby in a singularly mild, firm voice, replied, “I would prefer not to."'
print(bartleby[52:])
## “I would prefer not to."
print(bartleby[-24:])
## “I would prefer not to."
print(bartleby[0:8])
## Bartleby

Searching your string with in:

bartleby = 'Bartleby in a singularly mild, firm voice, replied, “I would prefer not to."'
print("I would prefer not to" in bartleby)
## True

Adding strings, aka concatenation, with +:

hi = "foo" + "bar"
print(hi)
## foobar
first_name = "Herman"
last_name = "Melville"
full_name = first_name + " " + last_name
print(full_name)
## Herman Melville
migos = "Versace"
print(3 * migos)
## VersaceVersaceVersace

Convert all the characters to uppercase with string.upper():

not_to = "I would prefer not to."
print(not_to.upper())
## I WOULD PREFER NOT TO.

Convert all the characters to lowercase with string.lower():

loud = "WHAT IS HAPPENING?"
print(loud.lower())
## what is happening?

Capitalize the string with string.capitalize():

uncap = "can i be capitalized?"
print(uncap.capitalize())
## Can i be capitalized?

Replace parts of a string with string.replace(“o”, “ou”), which would replace all of the o’s in our string w/ “ou”

POTUS = "Donald J. Trump"
newPOTUS = POTUS.replace("Trump", "Drumpf") # strings are immutable, so I create a new object, newPOTUS
print(newPOTUS)
## Donald J. Drumpf

Find the index of the first occurrence of certain characters in a string with string.index(“a”):

hello = "Hello"
print(hello.index("o"))
## 4

Create a list of the words in a string using string.split(" "):

bartleby = 'Bartleby in a singularly mild, firm voice, replied, “I would prefer not to."'
bartleby2 = bartleby.split(" ")
print(bartleby2)
## ['Bartleby', 'in', 'a', 'singularly', 'mild,', 'firm', 'voice,', 'replied,', '“I', 'would', 'prefer', 'not', 'to."']
print(type(bartleby2))
## <class 'list'>
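yo = "hello\nhow are you\nman?" # using \n gives a new line
print(yo)
## hello
## how are you
## man?
# rstrip() only strips trailing whitespace, so it would leave yo as is;
# replace the newlines with spaces to put it all on one line
yoyo = yo.replace("\n", " ")
print(yoyo)
## hello how are you man?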
import string
alphabet = string.ascii_letters
lower_alpha = string.ascii_lowercase
print(alphabet)
## abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
print(lower_alpha)
## abcdefghijklmnopqrstuvwxyz

Tuples

Unlike lists, tuples are immutable. Whereas lists are typically used to store homogeneous data, tuples are typically used to store heterogeneous data (although this is not a real rule). Like lists, tuples are also ordered and can be indexed by position (unlike sets and dictionaries).

In some ways working with tuples is similar to working with lists (except that tuples are immutable).

tup = (1, 2, 3, 4, 5)
print(tup[2])
## 3
print(tup + (6, 7))
## (1, 2, 3, 4, 5, 6, 7)

Tuples are immutable, which is why trying to replace an element of a tuple with another object (e.g. the element at index 2 with 15) or adding another element to our tuple gives us an error:

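tup = (1, 2, 3, 4, 5)

tup[2] = 15
## TypeError: 'tuple' object does not support item assignment
tup = (1, 2, 3, 4, 5)

print(tup.append(6))
## AttributeError: 'tuple' object has no attribute 'append'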

A tuple of one object:

tup1 = (2,)
tuperror = (2)
print(tup1)
## (2,)
print(tuperror)
## 2
print(type(tup1))
## <class 'tuple'>
print(type(tuperror))
## <class 'int'>

Calculating the sum of a tuple:

myTuple = (7, 8, 9)
print(sum(myTuple))
## 24

Unpacking a tuple

myTuple = (7, 8, 9)
(x, y, z) = myTuple
print(x)
## 7
print(y)
## 8
print(z)
## 9
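
Unpacking shows up in a few other handy places. The sketch below is my own illustration: swapping two variables, collecting the remaining items with a starred name, and unpacking a function that returns a tuple (the built-in divmod).

```python
a, b = 1, 2
a, b = b, a          # swap without a temporary variable
print(a, b)          # 2 1

first, *rest = (7, 8, 9)
print(first, rest)   # 7 [8, 9] (the starred name collects the rest as a list)

quotient, remainder = divmod(17, 5)   # divmod returns a tuple
print(quotient, remainder)            # 3 2
```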

More unpacking tuples:

tuple_list = [(1,2), (3, 4), (5, 6), (7, 8)]
for (x, y) in tuple_list:
  print(x, y)
## 1 2
## 3 4
## 5 6
## 7 8

range()

Often used in for loops, etc. The upside of using range is that it’s light on memory, because Python only has to store 3 numbers: start, stop and step size. The examples below use the list function to show the values that range produces, but wrapping range() in list() in real code would only slow things down and use more memory.

print(list(range(5))) # use list() to see the actual list of objects
## [0, 1, 2, 3, 4]

print(list(range(1,11))) # set starting point as 1 and end at 10 (so set as 11)
## [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(list(range(1, 11, 2))) # start at 1, but step size is set as 2
## [1, 3, 5, 7, 9]
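
A rough way to see the memory point is sys.getsizeof, which reports the size of the container object in bytes. The exact numbers depend on your Python version and platform, so treat this as a sketch rather than a benchmark. The size of one million is an arbitrary choice.

```python
import sys

big_range = range(1_000_000)   # arbitrary size for illustration
big_list = list(big_range)

print(sys.getsizeof(big_range))   # small, and the same for any length of range
print(sys.getsizeof(big_list))    # far larger: the list holds a reference to every element

# A range still supports indexing, len() and membership tests without being expanded
print(big_range[10], len(big_range), 999_999 in big_range)
```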

Dictionaries

TODO

Dictionaries map key objects to value objects. The dictionaries themselves are mutable, but the key objects must be immutable (more precisely, hashable), e.g. strings, numbers or tuples, while the value objects can be either mutable or immutable.

Create an empty dictionary:

  • use {}, or
  • use dict()

scores = {} # creates an empty dictionary
scores = dict() # also creates an empty dictionary

Indexing in dictionaries:

age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}
print(age["Kyrie"])
## 25

Changing the value object in a dictionary:

For example, today is LeBron’s birthday, so let’s adjust his age to reflect that.

age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}
print(age["LeBron"])
## 32
age["LeBron"] += 1
print(age["LeBron"])
## 33

List the key or value objects:

  • dict.keys() lists the keys
  • dict.values() lists the values

age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}
print(age.keys())
## dict_keys(['Kyrie', 'LeBron', 'Giannis', 'Vince'])
print(age.values())
## dict_values([25, 32, 22, 40])

Add a key, value object to your dictionary:

age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}
age["Kristaps"] = 22
print(age)
## {'Kyrie': 25, 'LeBron': 32, 'Giannis': 22, 'Vince': 40, 'Kristaps': 22}

Test for object membership:

age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}
print("Dirk" in age)
## False
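
Note that in tests the keys, not the values. The sketch below, my own addition using the same age dictionary, shows what happens when you index a missing key, how the get method avoids the error, and how to test the values instead.

```python
age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}

try:
    age["Dirk"]                  # indexing a missing key raises KeyError
except KeyError as err:
    print("KeyError:", err)

print(age.get("Dirk"))           # None: get returns None instead of raising
print(age.get("Dirk", 0))        # 0: or a default of your choosing
print(25 in age)                 # False: in looks at the keys
print(25 in age.values())        # True: test the values explicitly

del age["Vince"]                 # remove a key and its value
print(age)                       # {'Kyrie': 25, 'LeBron': 32, 'Giannis': 22}
```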

Sets

Sets are unordered collections of unique items.

num1 = {1, 2, 3, 4, 5}
num2 = {3, 4, 5, 6, 7, 8}
print(num1 | num2) # same thing as num1.union(num2)
## {1, 2, 3, 4, 5, 6, 7, 8}
print(num1 & num2) # same thing as num1.intersection(num2)
## {3, 4, 5}
print(num1 - num2)
## {1, 2}
print(num2 - num1)
## {8, 6, 7}
print(num1 ^ num2) # gives items only appearing in one set
# same thing as num1.symmetric_difference(num2)
## {1, 2, 6, 7, 8}
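
The "unique items" part is what makes sets handy for de-duplicating. A minimal sketch, with made-up example data:

```python
jersey_numbers = [23, 6, 23, 34, 6, 11]   # made-up data with repeats
unique_numbers = set(jersey_numbers)

print(len(jersey_numbers), len(unique_numbers))   # 6 4

unique_numbers.add(23)   # already present, so nothing changes
unique_numbers.add(30)
print(sorted(unique_numbers))   # [6, 11, 23, 30, 34]

print({1, 2} <= {1, 2, 3})      # True: <= tests whether the left set is a subset
```

Since sets are unordered, sorted() is used here to print the items in a predictable order.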

If, Elif, Else

if some_test:
  [block of code]
elif another_test:
  [block of code]
else:
  [block of code]
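
A concrete version of the template above might look like the following. The thresholds and labels are invented purely for illustration.

```python
years = 32   # example value

if years < 25:
    label = "young"
elif years < 35:
    label = "prime"
else:
    label = "veteran"

print(label)   # prime
```

Only the first branch whose test is True runs; the rest are skipped.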

For and While Loops

for i in range(5):
  print(i)
## 0
## 1
## 2
## 3
## 4
age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}
# Can use age.keys() to go over the keys in your dictionary
for name in age.keys():
  print(name)
## Kyrie
## LeBron
## Giannis
## Vince

# More simply, you can iterate over the dictionary itself, e.g. age, below
for name in age:
  print(name)
## Kyrie
## LeBron
## Giannis
## Vince

# Use indexing to also give the value object with the corresponding key
for name in age:
  print(name, age[name])
## Kyrie 25
## LeBron 32
## Giannis 22
## Vince 40

# To iterate in alphabetical order, use sorted()
for name in sorted(age.keys()):
  print(name, age[name])
## Giannis 22
## Kyrie 25
## LeBron 32
## Vince 40

# For reverse alphabetical order, use the reverse=True argument in sorted()
for name in sorted(age.keys(), reverse=True):
  print(name, age[name])
## Vince 40
## LeBron 32
## Kyrie 25
## Giannis 22

x = 5
while x > 1:
  x -= 1
print(x)
## 1
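
Two further loop patterns that pair well with the examples above, sketched with the same age dictionary: items() hands you each key and value together, and enumerate() adds a running counter. The last part shows break, which exits a while loop early.

```python
age = {"Kyrie": 25, "LeBron": 32, "Giannis": 22, "Vince": 40}

# items() yields (key, value) tuples, unpacked here into two names
for name, years in age.items():
    print(name, years)

# enumerate() adds a counter; start=1 makes it begin at 1 instead of 0
for position, name in enumerate(sorted(age), start=1):
    print(position, name)

# break leaves the loop as soon as the condition is met
x = 5
steps = 0
while True:
    x -= 1
    steps += 1
    if x <= 1:
        break
print(x, steps)   # 1 4
```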

List Comprehension

Simple method:

numbers = range(10)
squares = []
for number in numbers:
  square = number ** 2
  squares.append(square)
  
print(squares)
## [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

List Comprehension method

numbers = range(10)
squares = [number**2 for number in numbers]
print(squares)
## [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
total = sum([i**2 for i in range(6)])
print(total)
## 55

Another example, sum the odd numbers from 0 to 9 using list comprehension:

print(sum([i for i in range(10) if i%2 != 0]))
## 25

Upsides of using List Comprehension:

  • Speed
  • Simplicity & Elegance
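
If you want to check the speed claim on your own machine, the standard library's timeit module gives a rough comparison. This is only a sketch: the function names are hypothetical helpers, the sizes are arbitrary, and the timings will vary with your hardware and Python version.

```python
import timeit

def squares_loop(n):
    # hypothetical helper: build the list with a for loop and append
    squares = []
    for number in range(n):
        squares.append(number ** 2)
    return squares

def squares_comprehension(n):
    # hypothetical helper: build the same list with a comprehension
    return [number ** 2 for number in range(n)]

# Both approaches should produce identical results
print(squares_loop(10) == squares_comprehension(10))

loop_time = timeit.timeit(lambda: squares_loop(1000), number=200)
comp_time = timeit.timeit(lambda: squares_comprehension(1000), number=200)
print(f"for loop:      {loop_time:.4f} s")
print(f"comprehension: {comp_time:.4f} s")
```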


Functions

def and return

def add(a,b):
  mysum = a + b
  return mysum
print(add(34,5))
## 39

Can get multiple outputs by “returning” a tuple:

def add_and_sub(a,b):
  mysum = a + b
  mydiff = a - b
  return (mysum, mydiff)
print(add_and_sub(5,3))
## (8, 2)
def intersect(s1, s2):
  res = []
  for i in s1:
    if i in s2:
      res.append(i)
  return res
list1 = [1,4,5,3,7,2]
list2 = [3,6,7,1,8]
print(intersect(list1, list2))
## [1, 3, 7]
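
As a point of comparison, the same result can be had with the set operations and list comprehensions covered earlier. The function name intersect_comprehension is my own, used only for this sketch.

```python
list1 = [1, 4, 5, 3, 7, 2]
list2 = [3, 6, 7, 1, 8]

# Set intersection, sorted because sets are unordered
print(sorted(set(list1) & set(list2)))   # [1, 3, 7]

def intersect_comprehension(s1, s2):
    # hypothetical variant: keeps the order of s1 and uses a set for fast lookups
    lookup = set(s2)
    return [item for item in s1 if item in lookup]

print(intersect_comprehension(list1, list2))   # [1, 3, 7]
```

Note that the set version drops duplicates, whereas the loop and comprehension versions keep any repeats found in the first list.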

Example - create a function to calculate factorials:

def factorial(n):
  if n == 0:
    return 1
  else:
    N = 1
    for i in range(1, n+1):
      N *= i
    return N
     
print(factorial(1))
## 1
print(factorial(3))
## 6
print(factorial(4))
## 24
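
For comparison, here is a sketch of a recursive version (the name factorial_recursive is my own) alongside the standard library's math.factorial, which is what you would normally reach for in real code.

```python
import math

def factorial_recursive(n):
    # hypothetical alternative: n! = n * (n - 1)!, with 0! = 1 as the base case
    if n == 0:
        return 1
    return n * factorial_recursive(n - 1)

print(factorial_recursive(4))   # 24
print(math.factorial(4))        # 24
```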

Reading and Writing Files

TODO

# "w" opens the file for writing (creating it, or overwriting it if it exists)
with open("new_file.txt", "w") as f:
  f.write("Hello\nWorld!")
# the with block closes the file automatically when it ends
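
To round out the reading half, here is a minimal sketch that writes the same text and then reads it back. It uses a temporary directory (my own choice, so the example cleans up after itself) and with blocks, which close the file automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "new_file.txt")

    with open(path, "w") as f:    # "w" = write
        f.write("Hello\nWorld!")

    with open(path, "r") as f:    # "r" = read (the default mode)
        contents = f.read()       # the whole file as one string

    with open(path, "r") as f:
        lines = [line.rstrip("\n") for line in f]   # one string per line

print(contents)
print(lines)   # ['Hello', 'World!']
```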
