Insertion Sort and the Philosophy of Algorithm Analysis

In the previous posts we explored growth rates, Big-O, and induction as the mathematical backbone of reasoning about algorithms. Today we begin our journey through sorting algorithms, and we start with the most human of them all: Insertion Sort.

Think of a musician arranging sheet music on a stand, sliding each page into the correct position relative to those already placed. Insertion sort is exactly that: an algorithm that orders elements by repeatedly inserting each one into its proper place in an already sorted prefix of the array.


The Algorithm

The problem: given an array $A[1..n]$, rearrange its elements in non-decreasing order.

The pseudocode is as follows:

Insertion-Sort(A, n)
1  for j := 2 to n
2      x := A[j]
3      i := j − 1
4      while i > 0 and A[i] > x
5          A[i+1] := A[i]
6          i := i − 1
7      A[i+1] := x

At each step, the element $A[j]$ is placed into the correct position among $A[1..j-1]$.
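
Here is a direct Python translation of the pseudocode (0-indexed, so the outer loop starts at index 1; the function name insertion_sort is ours, not part of the pseudocode):

def insertion_sort(a):
    """Sort the list a in place, in non-decreasing order."""
    for j in range(1, len(a)):        # a[j] is the next element to insert
        x = a[j]
        i = j - 1
        # Shift elements of the sorted prefix a[0..j-1] that exceed x
        while i >= 0 and a[i] > x:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x                  # drop x into its correct slot
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]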


Correctness and the Invariant

To prove that an algorithm works, we need more than intuition. We rely on loop invariants, which play the same role as the induction hypothesis in mathematical proofs.

Here is the invariant:

Before each iteration of the outer loop (line 1), the subarray $A[1..j-1]$ is already sorted in non-decreasing order.

  • Base case: When $j = 2$, the subarray $A[1..1]$ has one element and is trivially sorted.
  • Inductive step: Suppose the invariant holds for some $j$. The algorithm inserts $A[j]$ into its proper place, shifting larger elements to the right. The result is that $A[1..j]$ is sorted.
  • Termination: When $j = n+1$, the invariant implies that $A[1..n]$ is sorted.

Thus, by induction, the algorithm is correct.

This mirrors the logic of falling dominoes: if the first falls, and each one topples the next, then all must fall.
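
To make the invariant tangible, here is a sketch that asserts it before every iteration of the outer loop (the checked variant is our own illustration, not part of the standard algorithm):

def insertion_sort_checked(a):
    for j in range(1, len(a)):
        # Loop invariant: the prefix a[0..j-1] is sorted in non-decreasing order
        assert all(a[k] <= a[k + 1] for k in range(j - 1)), "invariant violated"
        x = a[j]
        i = j - 1
        while i >= 0 and a[i] > x:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x
    # Termination: with j = len(a), the invariant covers the whole array
    assert all(a[k] <= a[k + 1] for k in range(len(a) - 1))
    return a

insertion_sort_checked([3, 1, 4, 1, 5, 9, 2, 6])   # runs without raising AssertionError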


Time Complexity Analysis

Algorithm analysis is not about counting milliseconds, but about identifying how the number of operations grows with input size $n$.

We assume each line costs 1 unit of time. Let $T(n)$ denote the worst-case running time.

  • Line 1 is tested $n$ times: once for each $j = 2, \dots, n$, plus the final test that exits the loop.
  • Lines 2, 3, and 7 execute $n-1$ times each.
  • Lines 4, 5, and 6 form the inner loop. In the worst case (an array in descending order), the body (lines 5 and 6) runs $1 + 2 + \dots + (n-1)$ times in total, and the test on line 4 runs once more per outer iteration.

This sum is

\[1 + 2 + \dots + (n-1) = \frac{n(n-1)}{2}.\]

So, combining all contributions,

\[T(n) \leq \frac{3}{2}n^2 + \frac{7}{2}n - 4.\]
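
Term by term, the worst-case tally behind this bound is (recall that the test on line 4 runs once more than the loop body on each outer iteration):

\[T(n) = \underbrace{n}_{\text{line 1}} + \underbrace{3(n-1)}_{\text{lines 2, 3, 7}} + \underbrace{\left(\frac{n(n+1)}{2} - 1\right)}_{\text{line 4}} + \underbrace{2 \cdot \frac{n(n-1)}{2}}_{\text{lines 5, 6}} = \frac{3}{2}n^2 + \frac{7}{2}n - 4.\]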

Asymptotically, the quadratic term dominates, and we write:

\[T(n) = \Theta(n^2).\]

  • Best case (already sorted array): the inner loop never shifts elements, and the running time drops to $\Theta(n)$.
  • Worst case (reverse-sorted array): every insertion shifts nearly the entire prefix, and the running time is $\Theta(n^2)$.

Insertion sort thus adapts: it is fast for nearly sorted arrays, but slow for large disordered ones.
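
A quick timing sketch makes this adaptivity visible (it reuses the insertion_sort function from above; absolute times are machine-dependent):

import time

def bench(a):
    t0 = time.perf_counter()
    insertion_sort(a)
    return time.perf_counter() - t0

n = 2000
print(f"sorted:  {bench(list(range(n))):.4f}s")          # near-linear: no shifts
print(f"reverse: {bench(list(range(n, 0, -1))):.4f}s")   # quadratic: maximal shifting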


Counting Comparisons

Another way to analyze the algorithm is to count comparisons between elements. Each execution of the test on line 4 with $i > 0$ evaluates the comparison $A[i] > x$.

  • Worst case: $(n^2 - n)/2$ comparisons.
  • Best case: $n-1$ comparisons.

This perspective reveals the algorithm’s bottleneck: comparisons, not assignments, drive complexity.
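
A counting sketch confirms both figures (the counter-returning variant is our own instrumentation):

def count_comparisons(a):
    """Sort a in place and return how often a[i] > x was evaluated."""
    comparisons = 0
    for j in range(1, len(a)):
        x = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1           # one execution of the test a[i] > x
            if a[i] <= x:
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x
    return comparisons

n = 10
print(count_comparisons(list(range(n, 0, -1))))   # worst case: (n*n - n)//2 = 45
print(count_comparisons(list(range(n))))          # best case: n - 1 = 9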


Recursive Formulation

Though naturally iterative, Insertion Sort can be rewritten recursively:

Insertion-R(A, n)
1  if n > 1
2      Insertion-R(A, n−1)
3      x := A[n]
4      i := n−1
5      while i > 0 and A[i] > x
6          A[i+1] := A[i]
7          i := i−1
8      A[i+1] := x

Here, the prefix $A[1..n-1]$ is sorted recursively, and then $A[n]$ is inserted into it.
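
In Python (0-indexed; the default argument and the recursion-limit note are our additions):

import sys

def insertion_sort_rec(a, n=None):
    """Recursively sort a[0..n-1] in place."""
    if n is None:
        n = len(a)
    if n > 1:
        insertion_sort_rec(a, n - 1)   # sort the first n-1 elements...
        x = a[n - 1]                   # ...then insert the last one
        i = n - 2
        while i >= 0 and a[i] > x:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x
    return a

sys.setrecursionlimit(10_000)   # one stack frame per element: mind Python's default limit
print(insertion_sort_rec([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]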

The recurrence relation for the worst case is:

\[T(n) = T(n-1) + 3n + 2, \quad T(1)=1.\]

Solving this recurrence (by expansion or induction):
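
Unrolling down to the base case $T(1) = 1$ gives

\[T(n) = T(1) + \sum_{k=2}^{n} (3k + 2) = 1 + 3\left(\frac{n(n+1)}{2} - 1\right) + 2(n - 1),\]

which simplifies to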

\[T(n) = \frac{3}{2}n^2 + \frac{7}{2}n - 4.\]

Again, quadratic time. Recursion brings no asymptotic gain, but it illuminates the connection between algorithmic recursion and mathematical induction.


Reflection: Beyond the Quadratic Wall

Insertion Sort is not efficient for large $n$, but its simplicity and adaptability make it fundamental:

  • For small arrays, it is often faster than more complex algorithms.
  • In hybrid approaches (like Timsort, used in Python), Insertion Sort handles small subarrays inside faster algorithms (see the sketch after this list).
  • Its analysis is the gateway to algorithmic thinking: we learn invariants, best/worst cases, and asymptotics.
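
To illustrate the hybrid idea (this is not Timsort itself, which is far more elaborate; the cutoff value 16 is an arbitrary choice for illustration), here is a merge sort that falls back to Insertion Sort on small subarrays:

CUTOFF = 16   # below this size, insertion sort tends to beat the recursion overhead

def hybrid_sort(a):
    if len(a) <= CUTOFF:
        return insertion_sort(a)       # the implementation from earlier in the post
    mid = len(a) // 2
    left, right = hybrid_sort(a[:mid]), hybrid_sort(a[mid:])
    # Standard two-way merge of the sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(hybrid_sort([5, 2, 9, 1, 7, 3, 8, 4, 6, 0] * 5)[:10])   # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]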

In a sense, Insertion Sort is the “etude” of algorithmics: like scales for pianists, it is not meant to impress, but to train the mind and hands.


Exercises for the Reader

  1. Prove formally the loop invariant of Insertion Sort.
  2. Derive the exact number of assignments (line 5) in the worst case.
  3. Show that the number of comparisons in the worst case is exactly $(n^2-n)/2$.
  4. Implement Insertion Sort recursively in Python and measure its performance against the iterative version.
  5. Compare the behavior of Insertion Sort with Selection Sort for arrays that are already sorted or reverse sorted.

Beyond the Algorithm

Insertion Sort is the first glimpse of a larger story. From the quadratic simplicity of $O(n^2)$, we will soon encounter the elegance of divide and conquer methods, where logarithms dance with linear terms to yield $O(n \log n)$ complexity.

Just as in music, where practice scales prepare us for symphonies, Insertion Sort prepares us for the great algorithms to come.

