🎯 Situation
Une analyste junior a rejoint l'équipe d'un client avec de solides compétences Power BI mais presque pas de SQL. En une semaine, elle était bloquée sur trois tâches différentes : joindre deux tables que Power Query gérait mal, calculer une comparaison mois sur mois que DAX compliquait, et identifier les clients sans commandes depuis 90 jours. Les trois étaient des requêtes SQL de 5 lignes.
⚠️ Challenge
📈 Why SQL before Power Query
- SQL runs on the database — transformation happens before data reaches Power BI
- Joins in SQL are faster and more predictable than Power Query merges on large tables
- SQL is readable by any developer — logic documented in one place
- Views can be reused across Power BI, Python, Excel, and any other tool
🔧 What you need to know first
- SELECT, FROM, WHERE, GROUP BY — the absolute basics
- JOINs: INNER, LEFT — 90% of cases use one of these two
- Aliases: AS keyword for clean column names
- NULL handling: IS NULL, COALESCE, ISNULL
🔍 Analyse
The 5 patterns:
1. Aggregation with GROUP BY — revenue, count, average per dimension (customer, region, product). The foundation of every BI report.
2. Month-over-month comparison with LAG() — window function that accesses the previous row. Replaces complex DAX time intelligence for many use cases.
3. LEFT JOIN to find missing records — customers with no orders, products never sold, regions with no data. The anti-join pattern that's impossible to express cleanly in Power Query.
4. CTE (WITH clause) — break a complex query into named steps. Makes 30-line queries readable. Every BI analyst who writes SQL regularly should use CTEs by default.
5. CASE WHEN for conditional columns — the SQL equivalent of IF statements. Segment customers, categorize products, flag anomalies — all in a single SELECT.
✓️ Bonne pratique
Each pattern as a concrete example:
Pattern 1 — Aggregation:
SELECT region, SUM(revenue) AS total_revenue, COUNT(DISTINCT customer_id) AS customers
FROM orders GROUP BY region ORDER BY total_revenue DESC;
Pattern 2 — Month-over-month with LAG:
SELECT month, revenue, LAG(revenue) OVER (ORDER BY month) AS prev_month,
revenue - LAG(revenue) OVER (ORDER BY month) AS mom_change
FROM monthly_revenue;
Pattern 3 — Find missing records (LEFT JOIN):
SELECT c.customer_id, c.name FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
AND o.order_date >= CURRENT_DATE - INTERVAL '90 days'
WHERE o.order_id IS NULL;
Pattern 4 — CTE:
WITH monthly AS (SELECT DATE_TRUNC('month', order_date) AS month, SUM(revenue) AS rev FROM orders GROUP BY 1)
SELECT month, rev, LAG(rev) OVER (ORDER BY month) AS prev FROM monthly;
Pattern 5 — CASE WHEN:
SELECT customer_id, total_revenue,
CASE WHEN total_revenue > 100000 THEN 'Tier 1'
WHEN total_revenue > 50000 THEN 'Tier 2'
ELSE 'Tier 3' END AS customer_tier
FROM customer_revenue;
💡 Synthèse
Five patterns. That's the starting point. Not the end — but the foundation that unlocks 80% of the SQL work a BI analyst encounters in practice. Learn them until writing them is automatic, then build from there.
👉 SQL mastery for BI analysts isn't about knowing everything.
It's about knowing these 5 patterns cold — and building from there.