Insertion Sort and the Philosophy of Algorithm Analysis

In the previous posts we explored growth rates, Big-O, and induction as the mathematical backbone of reasoning about algorithms. Today we begin our journey through sorting algorithms, and we start with the most human of them all: Insertion Sort.

Think of a musician arranging sheet music on a stand, sliding each page into the correct position relative to those already placed. Insertion sort is exactly that: an algorithm that orders elements by repeatedly inserting each one into its proper place in an already sorted prefix of the array.


The Algorithm

The problem: given an array $A[1..n]$, rearrange its elements in non-decreasing order.

The pseudocode is as follows:

Insertion-Sort(A, n)
1  for j := 2 to n
2      x := A[j]
3      i := j − 1
4      while i > 0 and A[i] > x
5          A[i+1] := A[i]
6          i := i − 1
7      A[i+1] := x

At each step, the element $A[j]$ is placed into the correct position among $A[1..j-1]$.
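
Here is a direct Python translation of the pseudocode (0-indexed, so the outer loop starts at index 1; the function name insertion_sort is ours, not part of the pseudocode):

def insertion_sort(a):
    """Sort the list a in place, in non-decreasing order."""
    for j in range(1, len(a)):        # a[j] is the next element to insert
        x = a[j]
        i = j - 1
        # Shift elements of the sorted prefix a[0..j-1] that exceed x
        while i >= 0 and a[i] > x:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x                  # drop x into its correct slot
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]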


Correctness and the Invariant

To prove that an algorithm works, we need more than intuition. We rely on loop invariants, which play the same role as the induction hypothesis in mathematical proofs.

Here is the invariant:

Before each iteration of the outer loop (line 1), the subarray $A[1..j-1]$ is already sorted in non-decreasing order.

  • Base case: When $j = 2$, the subarray $A[1..1]$ has one element and is trivially sorted.
  • Inductive step: Suppose the invariant holds for some $j$. The algorithm inserts $A[j]$ into its proper place, shifting larger elements to the right. The result is that $A[1..j]$ is sorted.
  • Termination: When $j = n+1$, the invariant implies that $A[1..n]$ is sorted.

Thus, by induction, the algorithm is correct.

This mirrors the logic of falling dominoes: if the first falls, and each one topples the next, then all must fall.
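
To make the invariant tangible, here is a sketch that asserts it before every iteration of the outer loop (the checked variant is our own illustration, not part of the standard algorithm):

def insertion_sort_checked(a):
    for j in range(1, len(a)):
        # Loop invariant: the prefix a[0..j-1] is sorted in non-decreasing order
        assert all(a[k] <= a[k + 1] for k in range(j - 1)), "invariant violated"
        x = a[j]
        i = j - 1
        while i >= 0 and a[i] > x:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x
    # Termination: with j = len(a), the invariant covers the whole array
    assert all(a[k] <= a[k + 1] for k in range(len(a) - 1))
    return a

insertion_sort_checked([3, 1, 4, 1, 5, 9, 2, 6])   # runs without raising AssertionError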


Time Complexity Analysis

Algorithm analysis is not about counting milliseconds, but about identifying how the number of operations grows with input size $n$.

We assume each line costs 1 unit of time. Let $T(n)$ denote the worst-case running time.

  • Line 1 is tested $n$ times: once for each $j = 2, \dots, n$, plus the final test that exits the loop.
  • Lines 2, 3, and 7 execute $n-1$ times each.
  • Lines 4, 5, and 6 form the inner loop. In the worst case (an array in descending order), the body (lines 5 and 6) runs $1 + 2 + \dots + (n-1)$ times in total, and the test on line 4 runs once more per outer iteration.

This sum is

\[1 + 2 + \dots + (n-1) = \frac{n(n-1)}{2}.\]

So, combining all contributions,

\[T(n) \leq \frac{3}{2}n^2 + \frac{7}{2}n - 4.\]
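
Term by term, the worst-case tally behind this bound is (recall that the test on line 4 runs once more than the loop body on each outer iteration):

\[T(n) = \underbrace{n}_{\text{line 1}} + \underbrace{3(n-1)}_{\text{lines 2, 3, 7}} + \underbrace{\left(\frac{n(n+1)}{2} - 1\right)}_{\text{line 4}} + \underbrace{2 \cdot \frac{n(n-1)}{2}}_{\text{lines 5, 6}} = \frac{3}{2}n^2 + \frac{7}{2}n - 4.\]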

Asymptotically, the quadratic term dominates, and we write:

\[T(n) = \Theta(n^2).\]

  • Best case (already sorted array): the inner loop never shifts elements, and the running time drops to $\Theta(n)$.
  • Worst case (reverse-sorted array): every insertion shifts nearly the entire prefix, and the running time is $\Theta(n^2)$.

Insertion sort thus adapts: it is fast for nearly sorted arrays, but slow for large disordered ones.
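
A quick timing sketch makes this adaptivity visible (it reuses the insertion_sort function from above; absolute times are machine-dependent):

import time

def bench(a):
    t0 = time.perf_counter()
    insertion_sort(a)
    return time.perf_counter() - t0

n = 2000
print(f"sorted:  {bench(list(range(n))):.4f}s")          # near-linear: no shifts
print(f"reverse: {bench(list(range(n, 0, -1))):.4f}s")   # quadratic: maximal shifting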


Counting Comparisons

Another way to analyze the algorithm is to count comparisons between elements. Each execution of the test on line 4 with $i > 0$ evaluates the comparison $A[i] > x$.

  • Worst case: $(n^2 - n)/2$ comparisons.
  • Best case: $n-1$ comparisons.

This perspective reveals the algorithm’s bottleneck: comparisons, not assignments, drive complexity.
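
A counting sketch confirms both figures (the counter-returning variant is our own instrumentation):

def count_comparisons(a):
    """Sort a in place and return how often a[i] > x was evaluated."""
    comparisons = 0
    for j in range(1, len(a)):
        x = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1           # one execution of the test a[i] > x
            if a[i] <= x:
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x
    return comparisons

n = 10
print(count_comparisons(list(range(n, 0, -1))))   # worst case: (n*n - n)//2 = 45
print(count_comparisons(list(range(n))))          # best case: n - 1 = 9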


Recursive Formulation

Though naturally iterative, Insertion Sort can be rewritten recursively:

Insertion-R(A, n)
1  if n > 1
2      Insertion-R(A, n−1)
3      x := A[n]
4      i := n−1
5      while i > 0 and A[i] > x
6          A[i+1] := A[i]
7          i := i−1
8      A[i+1] := x

Here, the prefix $A[1..n-1]$ is sorted recursively, and then $A[n]$ is inserted into it.
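
In Python (0-indexed; the default argument and the recursion-limit note are our additions):

import sys

def insertion_sort_rec(a, n=None):
    """Recursively sort a[0..n-1] in place."""
    if n is None:
        n = len(a)
    if n > 1:
        insertion_sort_rec(a, n - 1)   # sort the first n-1 elements...
        x = a[n - 1]                   # ...then insert the last one
        i = n - 2
        while i >= 0 and a[i] > x:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = x
    return a

sys.setrecursionlimit(10_000)   # one stack frame per element: mind Python's default limit
print(insertion_sort_rec([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]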

The recurrence relation for the worst case is:

\[T(n) = T(n-1) + 3n + 2, \quad T(1)=1.\]

Solving this recurrence (by expansion or induction):
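
Unrolling down to the base case $T(1) = 1$ gives

\[T(n) = T(1) + \sum_{k=2}^{n} (3k + 2) = 1 + 3\left(\frac{n(n+1)}{2} - 1\right) + 2(n - 1),\]

which simplifies to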

\[T(n) = \frac{3}{2}n^2 + \frac{7}{2}n - 4.\]

Again, quadratic time. Recursion brings no asymptotic gain, but it illuminates the connection between algorithmic recursion and mathematical induction.


Reflection: Beyond the Quadratic Wall

Insertion Sort is not efficient for large $n$, but its simplicity and adaptability make it fundamental:

  • For small arrays, it is often faster than more complex algorithms.
  • In hybrid approaches (like Timsort, used in Python), Insertion Sort handles small subarrays inside faster algorithms (see the sketch after this list).
  • Its analysis is the gateway to algorithmic thinking: we learn invariants, best/worst cases, and asymptotics.
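
To illustrate the hybrid idea (this is not Timsort itself, which is far more elaborate; the cutoff value 16 is an arbitrary choice for illustration), here is a merge sort that falls back to Insertion Sort on small subarrays:

CUTOFF = 16   # below this size, insertion sort tends to beat the recursion overhead

def hybrid_sort(a):
    if len(a) <= CUTOFF:
        return insertion_sort(a)       # the implementation from earlier in the post
    mid = len(a) // 2
    left, right = hybrid_sort(a[:mid]), hybrid_sort(a[mid:])
    # Standard two-way merge of the sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(hybrid_sort([5, 2, 9, 1, 7, 3, 8, 4, 6, 0] * 5)[:10])   # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]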

In a sense, Insertion Sort is the “etude” of algorithmics: like scales for pianists, it is not meant to impress, but to train the mind and hands.


Exercises for the Reader

  1. Prove formally the loop invariant of Insertion Sort.
  2. Derive the exact number of assignments (line 5) in the worst case.
  3. Show that the number of comparisons in the worst case is exactly $(n^2-n)/2$.
  4. Implement Insertion Sort recursively in Python and measure its performance against the iterative version.
  5. Compare the behavior of Insertion Sort with Selection Sort for arrays that are already sorted or reverse sorted.

Beyond the Algorithm

Insertion Sort is the first glimpse of a larger story. From the quadratic simplicity of $O(n^2)$, we will soon encounter the elegance of divide and conquer methods, where logarithms dance with linear terms to yield $O(n \log n)$ complexity.

Just as in music, where practice scales prepare us for symphonies, Insertion Sort prepares us for the great algorithms to come.

