In this week’s lecture, you learned about selection sort and insertion sort, two iterative sorting algorithms. Today, you’ll get some more practice with these sorting algorithms, do some running-time analysis, and develop a simple sorting visualization tool using pygame.
When you first learn a new algorithm like selection sort or insertion sort, it can be tempting to just memorize the code for that algorithm. However, mindlessly memorizing code isn’t actually a good strategy for gaining deep understanding of what that algorithm is doing, or how to implement it. So in the first part of this tutorial, we’ll use a different strategy to reinforce your learning of these two sorting algorithms: implementing variations of them.
Download tutorial9_part1.py
and open it. Inside, you’ll see our implementations of selection sort
and insertion sort from lecture, as well as a few new functions to
implement. Each of these functions can be implemented using a very
similar approach as one of our two sorting algorithms, but it’ll be up
to you to figure out what changes you need to make.
As you complete each function, we recommend comparing your implementation against the code from lecture, so that you can see both the similarities and differences in the code.
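For reference, here is a typical in-place implementation of selection sort. The versions in the starter file may differ in small details (for example, in variable names or an extra key parameter), so treat this as a sketch rather than the exact lecture code:

def selection_sort(lst: list) -> None:
    """Sort lst in place using the selection sort algorithm.

    A typical implementation; the version in tutorial9_part1.py may differ slightly.
    """
    for i in range(len(lst)):
        # Find the index of the smallest item in the unsorted part lst[i:].
        index_of_smallest = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[index_of_smallest]:
                index_of_smallest = j

        # Swap that item into position i, extending the sorted part by one.
        lst[index_of_smallest], lst[i] = lst[i], lst[index_of_smallest]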
In the prep reading for this week, you learned about the binary search algorithm, which can search for an item in a list faster than the standard linear search algorithm, but only works on sorted lists. Here’s the implementation from Section 18.1:
def binary_search(lst: list, item: Any) -> bool:
    """Return whether item is in lst using the binary search algorithm.

    Preconditions:
        - is_sorted(lst)
    """
    b = 0
    e = len(lst)

    while b < e:
        # Loop invariants
        assert all(lst[i] < item for i in range(0, b))
        assert all(lst[i] > item for i in range(e, len(lst)))

        m = (b + e) // 2
        if item == lst[m]:
            return True
        elif item < lst[m]:
            e = m
        else:  # item > lst[m]
            b = m + 1

    # If the loop ends without finding the item, the item is not in the list.
    return False
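For example, here is how the function behaves on a small sorted list:

>>> binary_search([1, 3, 5, 7, 9], 5)
True
>>> binary_search([1, 3, 5, 7, 9], 4)
False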
The course notes briefly discussed the running time of binary search, but only at a high level. Intuitively, at each loop iteration the size of the sublist being searched decreases by a factor of 2, and so there can be at most (approximately) \(\log_2 n\) iterations in total. But how do we formalize this intuition?
We run into trouble because we don’t have a predictable formula for the values of the variables b and e after \(k\) iterations, since how the search range changes depends on the contents of lst and the item being searched for.
We can reconcile the intuition with our more formal approach by explicitly introducing and analysing the behaviour of a new variable. Specifically: we let \(r = e - b\) be a variable representing the size of the search range (\(r\) stands for “range”).
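If it helps to see this variable in action, here is one way you could instrument the loop to print \(r\) at the start of each iteration. This traced version is just for building intuition and is not part of the starter code:

def binary_search_trace(lst: list, item) -> bool:
    """Like binary_search, but print r = e - b at the start of each iteration."""
    b = 0
    e = len(lst)
    while b < e:
        print('r =', e - b)  # the current size of the search range
        m = (b + e) // 2
        if item == lst[m]:
            return True
        elif item < lst[m]:
            e = m
        else:  # item > lst[m]
            b = m + 1
    return False

For example, binary_search_trace(list(range(16)), -1) prints r = 16, 8, 4, 2, 1, showing the search range at least halving on each iteration.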
Using this definition, answer the questions below. By the end of this part, you’ll find a good upper bound on the worst-case running time of binary_search.
1. Let \(n\) represent the length of the input list lst. What is the initial value of \(r\), in terms of \(n\)?
2. For what value(s) of \(r\) will the loop terminate?
3. Prove that at each loop iteration, if item is not found, then the value of \(r\) decreases by at least a factor of 2. More precisely, let \(r_k\) and \(r_{k+1}\) be the values of \(r\) immediately before and after the \(k\)-th iteration, respectively, and prove that \(r_{k+1} \leq \frac{1}{2} r_k\).

    Hints: do a proof by cases. Use the property that for all \(x \in \R\), \(x - 1 < \floor{x} \leq x\).
4. Find an exact upper bound on the number of iterations that could occur (in terms of \(n\)), and use this to show that the worst-case running time of binary_search is \(\cO(\log n)\). (An optional empirical check of this bound is sketched after these questions.)
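As an optional sanity check on your answer to Question 4 (not something you need to hand in), you could count the number of loop iterations that a worst-case search performs and compare it to \(\log_2 n\). The sketch below assumes a worst case where the item is smaller than everything in the list, so the item < lst[m] branch is always taken; the function name count_iterations is just for illustration:

import math


def count_iterations(n: int) -> int:
    """Return the number of loop iterations binary_search performs when
    searching a sorted list of length n for an item smaller than every element.
    """
    lst = list(range(n))
    item = -1  # smaller than every element of lst, so the search always fails
    b = 0
    e = len(lst)
    iterations = 0
    while b < e:
        iterations += 1
        m = (b + e) // 2
        if item < lst[m]:  # always true here, so the search range shrinks from the right
            e = m
        else:
            b = m + 1
    return iterations


for n in [10, 100, 1000, 10 ** 6]:
    print(n, count_iterations(n), math.log2(n))

For this particular worst case, the printed iteration count exceeds \(\log_2 n\) by at most one, which is consistent with the bound you are asked to prove.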
In the final part of today’s tutorial, you’ll develop a cool way of visualizing our two different sorting algorithms using pygame.1
To visualize a sorting algorithm being performed on a list of length \(n\), the pygame window is divided into an \(n\)-by-\(n\) grid, where:

- each of the \(n\) rows shows the state of the list at one point during the sorting algorithm (reading top to bottom), and
- each square in a row is coloured according to the value of the corresponding list element (a grayscale shade in the grayscale exercises, and a full colour in the colour exercises).
To start, please download the starter file tutorial9_part3.py. We’ve included the sorting algorithms from lecture (the versions including the key parameter), and then the tutorial exercises divided into “grayscale” and “colour” parts.
Under “Tutorial exercises (grayscale version)”, implement the function draw_list, which is responsible for drawing a single row of the grid. You can test your work in our starter implementation of visualize_selection_sort, which draws just the top row of the grid. Try creating a random list of 20 numbers between 0 and 255, and calling visualize_selection_sort on this list. You should see an image similar to the one pictured above, except only the top row is displayed (and the rest of the window is white).
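If you haven’t drawn shapes with pygame before, the following self-contained sketch shows how a single row of grayscale squares could be drawn. The function name draw_row, its signature, and the CELL_SIZE constant are assumptions made for this illustration; your draw_list should follow the signature given in the starter file.

import random

import pygame

CELL_SIZE = 20  # assumed width/height of each grid cell, in pixels


def draw_row(screen: pygame.Surface, lst: list, row: int) -> None:
    """Draw lst as one row of grayscale squares on screen.

    Hypothetical helper for illustration only; the starter file's draw_list
    may take different arguments.
    """
    for i, value in enumerate(lst):
        colour = (value, value, value)  # equal R, G, B components give a grayscale shade
        rect = (i * CELL_SIZE, row * CELL_SIZE, CELL_SIZE, CELL_SIZE)
        pygame.draw.rect(screen, colour, rect)


if __name__ == '__main__':
    pygame.init()
    numbers = [random.randint(0, 255) for _ in range(20)]
    screen = pygame.display.set_mode((len(numbers) * CELL_SIZE, len(numbers) * CELL_SIZE))
    screen.fill((255, 255, 255))  # white background
    draw_row(screen, numbers, 0)  # draw only the top row of the grid
    pygame.display.flip()
    pygame.event.pump()
    pygame.time.wait(3000)  # keep the window open briefly before closing
    pygame.quit()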
Implement the visualization functions visualize_selection_sort and visualize_insertion_sort. Your implementations can copy-and-paste from our algorithms, but should also call your draw_list helper in the appropriate place in the loop body. See visualize_selection_sort’s docstring for some implementation notes that apply to both of these functions.
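As a rough sketch of the overall shape (using the hypothetical draw_row helper from the sketch above in place of your draw_list, and a plain selection sort in place of the lecture version), the pattern is to draw one row of the grid per loop iteration; follow the starter file’s docstrings for the actual signatures and set-up code.

import pygame  # assumes the draw_row sketch from earlier is also available


def visualize_selection_sort_sketch(screen: pygame.Surface, lst: list) -> None:
    """Illustrative sketch only: sort lst in place, drawing one row per iteration."""
    for i in range(len(lst)):
        # Draw the current state of the list as row i of the grid.
        draw_row(screen, lst, i)
        pygame.display.flip()

        # One iteration of selection sort: find the smallest unsorted item...
        index_of_smallest = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[index_of_smallest]:
                index_of_smallest = j

        # ...and swap it into position i.
        lst[index_of_smallest], lst[i] = lst[i], lst[index_of_smallest]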
After you’re done, try calling each function, choosing a random list of size 10, 100, and 200. You should see each sorting algorithm visualized in a top-down manner. Make sure you can identify which part of the visualization is the “sorted” vs. “unsorted” part of each list, and how the two algorithms differ in what makes up the sorted part.
Now repeat Questions 1 and 2 for the “Tutorial exercises (colour
version)” section. For those functions, the input list being sorted is
now a list of tuples storing the RGB values of colours.
Unfortunately, just sorting these RGB tuples directly isn’t very
visually pleasing, and so we’ve provided a helper function
rgb_to_hsv
that converts each tuple to a different colour
representation known as Hue-Saturation-Value (HSV).
So in the visualization functions, you’ll need to run the sorting algorithms on this list of tuples, but also use this helper function as the “sort key”, using the generalized sorting technique we introduced in lecture.
To test your work, try generating some random lists of colours and running your visualization algorithms on them. (Remember, each random colour should be a random tuple with 3 elements in the range 0–255.) You should see a beautiful 🌈 rainbow 🌈 of colours in the “sorted part” of each algorithm!
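If you’d like to experiment with the sort-key idea outside the starter code, Python’s built-in colorsys module also converts RGB to HSV (it expects floats between 0 and 1, so it likely differs from the starter’s rgb_to_hsv helper). The sketch below uses the built-in sorted function purely to illustrate sorting colours by an HSV key; in the tutorial itself, you’ll pass the provided rgb_to_hsv helper to your own visualization functions instead.

import colorsys
import random

# Generate random RGB colours: tuples of three ints in the range 0-255.
colours = [tuple(random.randint(0, 255) for _ in range(3)) for _ in range(100)]

# Sort by each colour's HSV representation, analogous to passing a key
# function to the sorting algorithms from lecture.
sorted_colours = sorted(
    colours,
    key=lambda c: colorsys.rgb_to_hsv(c[0] / 255, c[1] / 255, c[2] / 255)
)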
This tutorial exercise is inspired by https://corte.si/posts/code/visualisingsorting/.↩︎
You might recall grayscale colours from our first tutorial all the way back in CSC110!↩︎
The reason for this is a bit subtle. Normally, we would show both the initial state of the list and the state of the list after each of the \(n\) loop iterations. But this results in \(n + 1\) rows total, which breaks the symmetry of our visualization. However, for insertion sort, the initial list state and the state after the first loop iteration are identical, and so we can skip drawing the initial state. For selection sort, the states of the list at the beginning and end of the last iteration are equal, and so we can skip drawing the very last state. ↩︎