🎯 Situation

A BI analyst asked me how to rank customers by revenue within each region in SQL — showing each customer's rank alongside their individual revenue and the total. He'd tried to do it with a subquery and a GROUP BY and had gotten close but couldn't get the per-region ranking right without collapsing the detail rows. One window function replaced the entire subquery approach and ran 3x faster.

👉 Window functions perform calculations across a set of rows related to the current row — without collapsing those rows into one. That's the key distinction from GROUP BY. GROUP BY returns one row per group. Window functions return one row per original row, with an additional calculated column.

⚠️ Challenge

📈 The 4 window function families

  • Ranking: ROW_NUMBER(), RANK(), DENSE_RANK() — assign a sequential number to rows within a partition
  • Aggregation: SUM() OVER(), AVG() OVER(), COUNT() OVER() — calculate a total or average without GROUP BY
  • Navigation: LAG(), LEAD() — access previous or next row values (month-over-month comparisons)
  • Frame-based: SUM() OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) — rolling windows

🔨 The OVER() clause — how window functions work

  • PARTITION BY — defines the groups (like GROUP BY, but doesn't collapse rows)
  • ORDER BY — defines the row order within each partition
  • ROWS/RANGE BETWEEN — defines the frame (how many preceding/following rows to include)
  • Example: SUM(revenue) OVER (PARTITION BY region ORDER BY date) calculates running total by region
  • If PARTITION BY is omitted, the window spans the entire result set

🔍 Analysis

The 4 most useful patterns with real SQL:

-- 1. Rank customers by revenue within each region
SELECT customer_name, region, revenue,
    RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS region_rank
FROM customer_revenue;

-- 2. Running total by month
SELECT month, revenue,
    SUM(revenue) OVER (ORDER BY month) AS running_total
FROM monthly_revenue;

-- 3. Month-over-month change (replaces self-join)
SELECT month, revenue,
    LAG(revenue) OVER (ORDER BY month) AS prev_month,
    revenue - LAG(revenue) OVER (ORDER BY month) AS change
FROM monthly_revenue;

-- 4. 3-month moving average
SELECT month, revenue,
    AVG(revenue) OVER (
        ORDER BY month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3m
FROM monthly_revenue;
These 4 patterns cover the majority of window function use cases in BI work. The ranking pattern alone replaces the most common 'top N per group' subquery pattern that slows down many production reports.

✓️ Best Practice

When to use window functions vs. other approaches:

  • Running totals and moving averages → always use window functions, not subqueries
  • Rankings within groups → RANK() OVER (PARTITION BY) replaces correlated subqueries
  • Month-over-month → LAG() replaces self-joins on the same table
  • Percentage of total → SUM(revenue) OVER () in the denominator (no PARTITION BY = full table sum)
  • Top N per category → ROW_NUMBER() in a CTE, then filter WHERE row_num <= N

💡 Summary

Window functions are the single biggest SQL productivity unlock for BI analysts who already know GROUP BY and JOINs. They eliminate the most convoluted subquery patterns and make calculations like running totals, moving averages, and per-group rankings a one-liner. The OVER() clause syntax feels strange the first time — it makes complete sense after the second.

👉 If you know GROUP BY and JOIN, you're 90% of the way to window functions.

The OVER() clause is the 10% that makes the other 90% twice as powerful.