Introduction to DSA & Complexity Analysis
Introduction to DSA & Complexity Analysis
What is DSA and Why Does It Matter?
Every program you write manipulates data. The way you structure that data and the steps you take to process it determines whether your program handles 100 users or 100 million users.
Real-world analogy: Imagine finding a name in a phone book. You could start from page 1 and read every name until you find it — that works for 10 names, but with 10 million names it takes forever. Or you could open the middle, check if the name is before or after, and repeat — finding any name in about 23 steps regardless of size. Same problem, completely different approach. That’s the difference DSA makes.
Data Structure — how you organize data in memory (array, tree, hash table, graph…)
Algorithm — the step-by-step procedure to solve a problem using that data
Part 1 — Complexity Analysis
Why We Measure Complexity
We need a way to compare algorithms that is: - Independent of hardware (a fast computer makes every algorithm faster — that doesn’t help comparison) - Independent of language (Python vs C++ implementation details) - Focused on growth (how does performance change as input grows?)
Big O Notation
Big O describes an algorithm’s worst-case growth rate as input size n grows toward infinity.
Formal definition: f(n) = O(g(n)) means there exist constants c and n₀ such that f(n) ≤ c·g(n) for all n ≥ n₀.
Practical meaning: We drop constants and lower-order terms, keep only the dominant term.
T(n) = 3n² + 5n + 100
= O(n²) ← drop constants, drop lower terms
Why drop constants? Because constants depend on hardware and implementation. O(2n) and O(n) both describe linear growth — they scale identically.
Common Complexity Classes
From fastest to slowest:
| Notation | Name | Example | n=1000 operations |
|---|---|---|---|
| O(1) | Constant | Array access | 1 |
| O(log n) | Logarithmic | Binary search | ~10 |
| O(n) | Linear | Linear search | 1,000 |
| O(n log n) | Linearithmic | Merge sort | ~10,000 |
| O(n²) | Quadratic | Bubble sort | 1,000,000 |
| O(n³) | Cubic | Matrix multiply (naive) | 1,000,000,000 |
| O(2ⁿ) | Exponential | All subsets | 2^1000 (impossible) |
| O(n!) | Factorial | All permutations | 1000! (astronomical) |
Analogy for each:
- O(1): Looking up a word in a dictionary you memorized — instant regardless of size
- O(log n): Finding a word in an actual dictionary by halving pages — 23 steps for a million words
- O(n): Reading every page of a book to find a phrase
- O(n log n): Sorting a deck of cards using merge sort
- O(n²): Checking every pair of people in a room for shared birthdays
- O(2ⁿ): Trying every possible combination on a lock with n dials
Rules for Calculating Complexity
Rule 1 — Drop constants:
O(5n) = O(n)
O(1000) = O(1)
Rule 2 — Drop lower-order terms:
O(n² + n) = O(n²)
O(n + log n) = O(n)
Rule 3 — Sequential steps add:
for (int i = 0; i < n; i++) { ... } // O(n)
for (int i = 0; i < n; i++) { ... } // O(n)
// Total: O(n) + O(n) = O(2n) = O(n)
Rule 4 — Nested steps multiply:
for (int i = 0; i < n; i++) // O(n)
for (int j = 0; j < n; j++) // O(n) each
{ ... }
// Total: O(n × n) = O(n²)
Rule 5 — Different inputs use different variables:
for (int i = 0; i < a; i++) { ... } // O(a)
for (int i = 0; i < b; i++) { ... } // O(b)
// Total: O(a + b) NOT O(n)
Worked Examples
Example 1: What is the complexity?
int sum = 0;
for (int i = 1; i <= n; i++)
for (int j = 1; j <= i; j++)
sum++;
Inner loop runs 1, 2, 3, …, n times → total = n(n+1)/2 = O(n²)
Example 2:
int i = n;
while (i > 1)
i = i / 2;
How many times can you divide n by 2 before reaching 1? log₂(n) times → O(log n)
Example 3:
for (int i = 0; i < n; i++)
for (int j = 0; j < log(n); j++)
{ ... }
Outer loop: n times. Inner loop: log n times each. Total: O(n log n)
Space Complexity
Space complexity measures the extra memory used by an algorithm (not counting input).
// O(1) space — only a few variables
int sum(vector<int>& arr) {
int total = 0;
for (int x : arr) total += x;
return total;
}
// O(n) space — extra array proportional to input
vector<int> duplicate(vector<int>& arr) {
vector<int> result(arr.size()); // n extra space
for (int i = 0; i < arr.size(); i++)
result[i] = arr[i];
return result;
}
// O(n) space — recursion stack depth n
int factorial(int n) {
if (n == 0) return 1;
return n * factorial(n-1); // n frames on stack
}
Other Notations
| Notation | Meaning |
|---|---|
| O(f) | Upper bound — worst case (most common) |
| Ω(f) | Lower bound — best case |
| Θ(f) | Tight bound — both best and worst are same |
Example — linear search: - Best case: Ω(1) — element is first - Worst case: O(n) — element is last or not present - No tight bound Θ — best and worst differ
Example — merge sort: - Always O(n log n) regardless → Θ(n log n)
Part 2 — Asymptotic Analysis in Practice
Amortized Analysis
Sometimes a single operation is expensive, but the average over many operations is cheap.
Example — dynamic array (vector) push_back: - Most pushes: O(1) — just add to end - Occasional resize: O(n) — copy everything to new array - But resizing doubles capacity each time, so it happens rarely
Amortized cost: Over n pushes, at most n copies total → O(1) amortized per push.
Best, Average, Worst Case
Quick sort example: - Best case: O(n log n) — pivot always splits evenly - Average case: O(n log n) — random pivot is usually decent - Worst case: O(n²) — pivot always picks smallest/largest (sorted input)
This is why randomized pivot selection is important in practice.
Practice Problems
What is the time complexity of each?
// a) for (int i = 0; i < n; i += 2) { ... } // b) for (int i = n; i > 0; i /= 3) { ... } // c) for (int i = 0; i < n; i++) for (int j = i; j < n; j++) { ... } // d) int fib(int n) { if (n <= 1) return n; return fib(n-1) + fib(n-2); }An algorithm takes 1 second for n=1000. Estimate time for n=10000 if it is:
- O(n)
- O(n²)
- O(n log n)
Rank these functions from slowest to fastest growth: n!, 2ⁿ, n³, n², n log n, n, log n, 1
Answers
Problem 1:
a) O(n) — loop variable increases by 2, still n/2 iterations = O(n)
b) O(log n) — dividing by 3 each time
c) O(n²) — triangular sum: n + (n-1) + ... + 1 = n(n+1)/2
d) O(2ⁿ) — each call branches into 2 more, depth n
Problem 2:
a) O(n): 10× input → 10× time → 10 seconds
b) O(n²): 10× input → 100× time → 100 seconds
c) O(n log n): 10× input → ~11× time → ~11 seconds
Problem 3 (slowest to fastest):
n! > 2ⁿ > n³ > n² > n log n > n > log n > 1
References
- Cormen et al. — Introduction to Algorithms (CLRS) — Chapter 3
- Sedgewick & Wayne — Algorithms (4th ed.) — Chapter 1
- MIT 6.006 — Lecture 1: Algorithmic Thinking
- Big-O Cheat Sheet — bigocheatsheet.com