MITx: 6.00.1x Introduction to Computer Science and Programming Using Python
- Difference between algorithm and program: An algorithm is a conceptual idea, a program is a concrete instantiation of an algorithm
- Stored program computer: store a sequence of instructions (program) inside the computer
Types of knowledge
- Declarative: Statements of facts, say a statement that is true but doesn't tell you how to do it.
- E.g. Eating chocolate in excess is bad for your health.
- Imperative: Tells you how to do/accomplish something, solve a problem.
- E.g. A recipe to make chocolate.
Properties of a programming language
- Primitive constructs: numbers, strings, simple operators
- E.g. In natural languages like English or Spanish the primitive constructs would be "words"
- Expression: legal combinations of one or more primitives
- Syntax: Which sequence of characters and symbols constitute a well-formed expression (sentence), it doesn't care about meaning or if it states a logical idea
- E.g. in a programming language correct:
2.7182 + 3.1416
; incorrect: 3.1416"abc"
- E.g. in natural language correct: I love ice-cream ; incorrect: cat dog boy
- Static semantics: Which well-formed expression (syntactically correct) have a meaning, make sense
- E.g. in a programming correct:
var = 2.7182 + 3.1416
; Static semantics error: var = 3.1416/"abc"
- E.g. in natural language Static semantics errors : I are big ; yo subo para abajo
- Semantics: What is the meaning of an expression
- E.g. in a programming language expressions always have one meaning
- E.g. in natural language context, intonation can change meaning: "I cannot praise this student too highly" this could mean literally as "I really like this student" or taken in sarcastic manner "Its impossible for me to praise this student"
Types of programming languages
- Compiled: compile and covert all the source code into an executable that has all the sequence of operations that can be executed
- Interpreted: convert each expression on the fly to a sequence of operations that can be executed by the processor to preform that single expression
Core elements of a program
- Turing Complete Language: Language that can fully simulate a Turing machine. It must have sequencing, branching and iteration
Python Objects
- Everything in python is an object and code itself is an object (Every object has a type)
- A particular object is an instance of a type
- Data objects: Things that capture information that can be manipulated to determine more information
- Programs manipulate data objects
- The behavior of an operation/operator/functions is specified by the type of objects involved
- Each object has a type that defines the kinds of things programs can do to it
- Expressions in Python are composed of objects and operators
- Objects Categories
- Scalar Objects
- Cannot be subdivided
- Examples of scalar objects in Python:
int
, float
, bool
NoneType
- Non-scalar
- Have internal structure that can be accessed
- First Class Objects
- Have a type
- They can be elements of data structures
- It can appear in an expression as part of an assignment statement (right-side) and as an argument to a function
Python Modules
- A module is a
.py
file containing a collection of Python definitions (functions, classes, etc.) and statements (declared objects, variables, etc.)
import <module_name>
reads an execute all statements of that module file and creates a namespace for that module so you can access the variables and objects of the module using <module_name>.<object_name>
from <module_name> import *
reads an execute all statements of that module file and binds the current scope to all the objects defined in the module, this means
it places the object names defined in the module name
directly into the caller's symbol table (the file that has the from <module_name> import *
),
any objects that already exist with the same name will be overwritten
statements in a module are executed only the first time a module is imported
statements in a module will only be executed the first time a module is imported in other words when interpreter reaches the import <module_name>
line
When the interpreter executes an import <module_name>
statement it searches for a module_name.py
in the following directories the all the search directories end up in the Python variable sys.path
:
- The current directory or directory from which the script was run
- The list of directories contained in
PYTHONPATH
environment variable
- An installation-dependent and OS-dependent list of directories configured when python was installed
built-in function dir()
returns a sorted list of names in the current local symbol table
When a .py
file is imported as a module Python sets the special __name__
to the name of the module. However, if a file is run as a standalone script, __name__
is set to __main__
Packages and Package Initialization
Functions
- Functions are first class objects
- Higher order programming; treat functions as data objects
map
applies a function to a collection of suitable elements
list(map(abs, [1, -2, -3, 4, 5]))
list(map(min, [1, -2, -3, 4, 5], [2, 12, 4, 1, 6]))
Collections
- Collections: Group of variables that can be of the same or different type.
- Types of collections:
()
for tuples
[]
for lists
{}
for dictionaries
Tuples
- Immutable = No-changeable lists (constant lists) so the elements on a tuple cannot be changed
- Ordered sequence, this means the elements added stay in the order they were added
- They are a generalization of strings
- Common Operations: Concatenation (
+
creates a new tuple), Indexing ([n]
), Slicing ([s:e:i]
), Repeats (*
)
("one",)
is a tuple, meanwhile ("one")
is not, it evaluates to a string
Lists
- List are mutable, so they can be modified after they are created, a list is also an ordered sequence of information (elements added stay in the order they were added)
- Since lists are mutable when creating a list from other objects what you add to the list are actually references/pointers to the objects.
In this case we have two distinct paths (references) to a data object, this is called aliasing, we can alter objects to either path but effect will be visible through both
- Mutation with iteration usually requires cloning of lists
- Common Operations:
- Concatenation/Addition:
+
creates a new list
- Indexing:
[n]
- Slicing:
[s:e:i]
- Repeats
*
- Append:
mylist.append(element)
mutates/modifies the list by adding only one element to the list, does not create a new list
- Extend:
mylist.extend(olist)
mutates/modifies the list by adding all the elements of another list, does not create a new list
- Delete/Remove:
mylist.remove(element)
, del(mylist[index])
or mylist.pop()
mutates/modifies the list by removing the specified element
- Sorting:
mylist.sort()
mutates/modifies the list or sorted(mylist)
does not mutate list, it return a new sorted list
- Reverse:
mylist.reverse()
mutates/modifies the list
- String Operations:
- Convert string to list:
list(mystr)
- Convert list to string:
''.join(mylist)
or 'charToJoin'.join(mylist)
- Splitting converts string to list
mystr.split(' ')
Avoid mutate during iteration on the same list due to loop iterators that cant be aware of changes on the fly
Dictionaries
- Dict is a generalization of lists because indexes don't have to be integers, the indexes can be any value that is immutable, called keys
- Dict is a collection of
<key, value>
pairs, entries are unordered and can only be accessed by key not an index, because it is unordered it may or may not stay in the order in which you added them
- Keys must be unique, of immutable type (
int
, float
, tuple
, bool
) and hashable
- Operations:
- Insertion:
mydict[newkey] = newvalue
mutates/modifies the dictionary
- Delete/Remove:
del(mydict[key])
mutates/modifies the dictionary
- Get Keys:
mydict.keys()
gets an iterator to the keys (commonly converted to a list for convenience list(mydict.keys())
)
- Get Values
mydict.values()
gets an iterator to the values (commonly converted to a list for convenience list(mydict.values())
)
Testing and Debugging
- Test suite: collection of inputs that has a high likelihood of reveling bugs
- Black-Box Testing:
- Test suite designated without knowledge of the implementation
- The test paths through the speciation
- Consider boundary cases
- Glass-Box Testing:
- Test suite designated with knowledge of code implementation
- Test every potential path in the code
- In loops, test loop not entered, loop executed once, loop executed more than once, an exit scenarios
- Unit testing checks that a module or function works, that it produces the expected output
- Integration testing checks that the system as a whole works
- Drivers: (in a testing context) is a piece of code that does the testing for use. So it sets up an environment or things that are need to run test, invoke the test with predefined inputs, save results and report them
- Stub: Simulates parts of program used for testing
- Regression testing: Go back and re-run tests that passed before
- Bug Qualifiers
- Overt bugs have an obvious manifestations E.g. code crashes or runs forever
- Covert bugs don't have obvious manifestation E.g. code return a values but it is not the expected one
- Persistent bugs occur every time code is run
- Intermittent bugs only occur sometimes even if run on same input
Exceptions and Assertions
- We use exceptions to handle unexpected behavior in a particular way and we use assertions to check for code assumptions
- Use
try
, except
, else
(executed if try completes with no exceptions) finally
(always executed after try, else and except) and raise Exception("message")
to raise exceptions
- Some types of exceptions
SyntaxError
: Python can't parse program
IndexError
: Trying to access beyond list limits
NameError
: local or global name not found, referencing non-existent variable
AttributeError
: attribute reference fails
TypeError
: operand doesn't have correct type, mixing data types inappropriately
ValueError
: operand type okay, but value is illegal
IOError
: IO system reports malfunction
OOP
- A method is another name for a procedural attribute
- Data attributes also known as instance variables, are variables created for each instance (object) of this class, they are specific to an instance and belong to this instance
- Class variables are defined inside the class but outside any class methods and these belong to the class, they are shared among all objects/instances of the class
- When calling a method of an object Python always passes (a reference to) the object as the first argument (
self
)
- The
__init__
method of an object is the constructor of the class
- The basic idea is to wrap together data that belongs together as a unit with methods both to access that data, manipulate that data, change that data, giving us this abstraction of a class as an entity that we can simply use
- Substitution principle any important behavior that we have on a superclass should be supported by all subclasses
- Use
isinstance()
to check for defined class types
- Python does not enforce access to attributes
- Special methods of objects in Python to overwrite:
__add__(self, other)
-> self + other
__sub__(self, other)
-> self - other
__eq__(self, other)
-> self == other
__lt__(self, other)
-> self < other
__gt__(self, other)
-> self > other
__len__(self)
-> len(self)
__str__(self)
-> print(self)
or str(self)
class Point(object):
def __init__(self, x, y):
self.x = x
self.y = y
def __str__(self):
return "("+str(self.x)+" , "+str(self.y)+")"
def __add__(self, o):
return Point(self.x+o.x, self.y+o.y)
A = Point(1, 2)
B = Point(7, 7)
C = A+B
print(A, B, C)
Generators
- Generators are special kinds of objects that return partial results of a method when hitting a
yield
statement.
- A method with
yield
statements is called through a reference to that method and using the next
method to execute the code of the method
yield
suspends the execution and returns a value, returning from a generator raises a StopIteration
exception
- Generator separates the concept of computing a very long sequence/quantity of data from the actual process of computing them explicitly
- Generates a new object/data as needed as part of another computation
def gen_fib():
fibn_1 = 1
fibn_2 = 0
while True:
next = fibn_1 + fibn_2
yield next
fibn_2 = fibn_1
fibn_1 = next
gen_ins = gen_fib()
print(gen_ins.__next__())
for n in gen_ins:
print(n)
Algorithmic Complexity
- It is based on counting the number of operations executed as function of the size of the input, considering mostly the worst case and average case
- Order of Growth: express the growth of program's runtime as input size grow, for this it uses Big OH/O() notation
- Complexity classes: Constant, linear, logarithmic, log-linear, polynomial (quadratic, cubic, etc..), exponential, etc.
Pylab
- Install
pylab
: pip install -U matplotlib
If you don't have a GUI environment you can save the result of the plot to a figure. For example, basic plotting:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
x = []
y = []
for i in range(0,30):
x.append(i)
y.append(x[i]**2)
plt.plot(x, y)
plt.plot(x, y, 'ro')
plt.axis([min(x), max(x), min(y), max(y)])
plt.savefig('fig.png')
plt.show()
plt.close()
Other Python Notes
- Alexandria was the capital of ancient Egypt
- Python has garbage collection system
- In Python, the keyword
None
is frequently used to represent the absence of a value. None
is the only value in Python of type NoneType
- The operators
in
and not in
test for collection membership <element> in <collection>
evaluates to True
if element is a member of the collection when used in a for loop the element is assigned to the members of the collection
range
is a built-in function that generates a sequence of integers
- Amortize: spread out big cost over a period of time
- A good hash function minimizes collisions
References