-- CREATE: Insert new data
INSERT INTO employees (name, salary) VALUES ('John', 50000);
-- READ: Retrieve data
SELECT * FROM employees WHERE id = 1;
-- UPDATE: Modify existing data
UPDATE employees SET salary = 55000 WHERE id = 1;
-- DELETE: Remove data
DELETE FROM employees WHERE id = 1;
Why it matters: CRUD is the foundation of database work. Demonstrating solid CRUD knowledge shows proficiency in fundamental database operations and SQL understanding.
Real applications: Every web application performs CRUD operations—user registration (CREATE), viewing profiles (READ), changing settings (UPDATE), and account deletion (DELETE).
Common mistakes: Forgetting the WHERE clause, so that UPDATE or DELETE affects every row; running SELECT without filtering, which returns unnecessary data; or ignoring the transaction context in which CRUD statements run.
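One guard against the missing-WHERE mistake is MySQL's sql_safe_updates session setting, which rejects UPDATE and DELETE statements that neither filter on a key column nor use LIMIT. A minimal sketch, assuming id is the primary key of employees:

```sql
-- Enable safe-update mode for the current session only
SET SQL_SAFE_UPDATES = 1;

-- Rejected with error 1175 while safe mode is on (no key in WHERE, no LIMIT):
-- UPDATE employees SET salary = 55000;

-- Accepted: the WHERE clause filters on the key column
UPDATE employees SET salary = 55000 WHERE id = 1;

-- Switch safe mode off again when finished
SET SQL_SAFE_UPDATES = 0;
```

The setting catches only the crudest form of the mistake; a WHERE clause that matches too many rows still passes.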
-- Single row INSERT
INSERT INTO users (name, email) VALUES ('Alice', 'alice@example.com');
-- Multiple rows in one INSERT
INSERT INTO users (name, email) VALUES
('Bob', 'bob@example.com'),
('Charlie', 'charlie@example.com');
-- INSERT...SELECT copies from another table
INSERT INTO users_backup SELECT * FROM users WHERE status = 'inactive';
-- INSERT IGNORE: skip on constraint violation
INSERT IGNORE INTO users (id, name) VALUES (1, 'Duplicate');
-- REPLACE: delete & insert if key exists
REPLACE INTO users (id, name) VALUES (1, 'NewName');
Why it matters: Mastering INSERT variations allows efficient bulk operations and data migration. Knowing how each variant reacts to an existing key prevents skipped or overwritten rows in production.
Real applications: User registration uses single INSERT, data imports use bulk INSERT, archiving uses INSERT...SELECT, and overwriting a row by its key uses REPLACE.
Common mistakes: Omitting the column list, which causes positional mismatches when the table definition changes; inserting bulk data one row at a time instead of using multi-row INSERT (much slower); or using REPLACE where UPDATE was intended, causing unnecessary deletions.
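To illustrate the column-list point, here is the archiving INSERT...SELECT written with explicit columns on both sides. This is a sketch that assumes users and users_backup both have id, name, email, and status columns; the source does not show either table definition.

```sql
-- Columns are matched by the two lists, not by table position,
-- so adding or reordering columns in users does not break the copy
INSERT INTO users_backup (id, name, email, status)
SELECT id, name, email, status
FROM users
WHERE status = 'inactive';
```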
-- INSERT IGNORE: Skip row on constraint violation
INSERT IGNORE INTO users (id, name) VALUES (1, 'NewName');
-- If id=1 exists, this row is skipped, existing data untouched
-- REPLACE: Delete old & insert new
REPLACE INTO users (id, name) VALUES (1, 'NewName');
-- If id=1 exists, delete it then insert new row
-- Practical example:
-- INSERT IGNORE for API uploads with potential duplicates
INSERT IGNORE INTO import_logs (id, status) VALUES (100, 'processed');
-- REPLACE for configuration updates where last-write-wins
REPLACE INTO config (setting, value) VALUES ('theme', 'dark');
Why it matters: Choosing the right insertion method prevents data loss or unwanted duplicates. This shows understanding of constraint handling in production scenarios.
Real applications: User imports use INSERT IGNORE to skip duplicates. Configuration tables use REPLACE for last-write-wins updates. Cache invalidation uses REPLACE.
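Common mistakes: Losing data by using REPLACE unintentionally (columns not listed in the statement revert to their defaults); treating INSERT IGNORE as an upsert, when it never modifies the existing row; or not realizing that REPLACE deletes the old row first, which can fire ON DELETE CASCADE on dependent tables.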
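When the goal is a true upsert (insert the row, or update it in place if the key already exists), MySQL provides INSERT ... ON DUPLICATE KEY UPDATE. A minimal sketch, assuming setting is the primary key or a unique key of the config table:

```sql
-- Inserts ('theme', 'dark') if no row has setting = 'theme';
-- otherwise updates that row in place, without deleting it first
INSERT INTO config (setting, value)
VALUES ('theme', 'dark')
ON DUPLICATE KEY UPDATE value = VALUES(value);
```

Unlike REPLACE, the existing row is never deleted, so other columns keep their values and cascading deletes are not triggered. Recent MySQL 8.0 releases deprecate the VALUES() function in this position in favor of a row alias, so check the syntax against the server version in use.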
-- Basic SELECT with filtering
SELECT name, email FROM users WHERE status = 'active';
-- SELECT with calculations
SELECT name, salary * 1.1 as salary_with_bonus FROM employees;
-- SELECT with sorting and limit
SELECT name FROM customers ORDER BY registration_date DESC LIMIT 10;
-- SELECT with aliases, a JOIN, and aggregation
SELECT u.name as customer_name, COUNT(o.id) as total_orders
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id
HAVING COUNT(o.id) > 5
ORDER BY total_orders DESC;
Why it matters: SELECT is the most frequently used SQL statement. Query performance depends on selective WHERE clauses, well-chosen joins, and sensible use of LIMIT.
Real applications: Every dashboard query, report, and search feature uses SELECT. Pagination uses LIMIT with OFFSET. Analytics use GROUP BY aggregations.
Common mistakes: Running SELECT * without LIMIT and returning millions of rows; omitting the WHERE clause, so unneeded rows are fetched; or filtering on columns that have no index, which forces full table scans.
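The indexing point can be checked directly with EXPLAIN. The sketch below assumes users.status has no index yet; the index name idx_users_status is made up for illustration.

```sql
-- Add an index on the filter column
CREATE INDEX idx_users_status ON users (status);

-- Ask MySQL how it plans to run the query; the "key" column of the
-- output shows which index, if any, the optimizer chose
EXPLAIN SELECT name, email FROM users WHERE status = 'active';
```

Whether the optimizer actually uses the index depends on how selective the value is; if most rows are 'active', a full scan may still be the cheaper plan.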
-- Update single column
UPDATE users SET status = 'inactive' WHERE id = 5;
-- Update multiple columns atomically
UPDATE orders
SET status = 'shipped', shipped_date = NOW()
WHERE id = 100;
-- Update based on calculation
UPDATE employees SET salary = salary * 1.05 WHERE department = 'Sales';
-- Update from another table
UPDATE users u
SET u.status = 'verified'
WHERE u.id IN (SELECT user_id FROM email_confirmations);
Why it matters: Proper WHERE usage prevents accidentally updating all records. Understanding UPDATE performance impact shows database design maturity.
Real applications: Status changes (order shipped), bulk updates (annual raises), verification marking (email confirmed), and field calculations (age increments) all use UPDATE.
Common mistakes: Forgetting the WHERE clause and updating the entire table; comparing numeric IDs against strings, which forces implicit type conversion; or running updates outside a transaction, so a mistaken change cannot be rolled back.
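A common way to keep an UPDATE reversible is to run it inside an explicit transaction and inspect the affected-row count before committing. A minimal sketch, assuming employees uses a transactional storage engine such as InnoDB:

```sql
START TRANSACTION;

UPDATE employees SET salary = salary * 1.05 WHERE department = 'Sales';

-- Number of rows changed by the statement immediately above
SELECT ROW_COUNT() AS rows_changed;

-- If the count is what you expected, make the change permanent:
COMMIT;
-- Otherwise undo it instead:
-- ROLLBACK;
```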
-- Delete specific row
DELETE FROM users WHERE id = 5;
-- Delete multiple rows with condition
DELETE FROM orders WHERE status = 'cancelled' AND amount < 10;
-- Delete all rows (use carefully!)
DELETE FROM temporary_data;
-- Delete using a table alias (multi-table DELETE syntax; no JOIN is needed here)
DELETE u FROM users u
WHERE u.status = 'inactive'
AND u.last_login < DATE_SUB(NOW(), INTERVAL 1 YEAR);
Why it matters: Understanding DELETE consequences prevents accidental data loss. Production environments often avoid hard deletes due to audit and recovery needs.
Real applications: Cleanup tasks delete expired sessions. Soft deletes mark records as deleted without actual removal. Archive operations move old data before deletion.
Common mistakes: Forgetting the WHERE clause and emptying the table; not realizing that DELETE can trigger cascading deletes through foreign keys; or deleting production data without a backup, making the loss permanent.
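The soft-delete approach mentioned above can be sketched as follows. The deleted_at column is a hypothetical addition, not part of the schema shown in this document.

```sql
-- One-time schema change: NULL means the row is live
ALTER TABLE users ADD COLUMN deleted_at DATETIME NULL;

-- "Delete" a user by stamping the row instead of removing it
UPDATE users SET deleted_at = NOW() WHERE id = 5;

-- Normal reads must now exclude soft-deleted rows
SELECT id, name FROM users WHERE deleted_at IS NULL;

-- Recovery is a simple update
UPDATE users SET deleted_at = NULL WHERE id = 5;
```

The trade-off is that every query touching the table has to remember the deleted_at IS NULL filter.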
-- TRUNCATE: Fast removal of all rows, resets AUTO_INCREMENT
TRUNCATE TABLE sessions;
-- Next insert will have id = 1 again
-- DELETE: Slower row-by-row, preserves AUTO_INCREMENT
DELETE FROM sessions;
-- Next insert will continue from last id
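-- Comparison in transaction context:
-- TRUNCATE causes an implicit commit in MySQL and cannot be rolled back
-- DELETE can be rolled back within transactions (on transactional engines such as InnoDB)
-- Practical: TRUNCATE when a full, irreversible reset is intended (e.g. test data);
-- DELETE when you need a WHERE filter or the option to roll back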
Why it matters: Understanding TRUNCATE vs DELETE impacts production cleanup scripts and transaction safety. Misuse causes unexpected behavior in sequential operations.
Real applications: Test cleanup uses TRUNCATE. Production uses DELETE with WHERE for targeted removal. Cache tables use TRUNCATE on invalidation.
Common mistakes: Relying on ROLLBACK after running TRUNCATE inside a transaction; running TRUNCATE without confirming intent; or using DELETE to clear an entire table when TRUNCATE would be faster.
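The rollback difference is easy to verify on a scratch copy of the table. A minimal sketch, assuming sessions is an InnoDB table:

```sql
START TRANSACTION;

DELETE FROM sessions;

-- Inside the transaction the table looks empty
SELECT COUNT(*) AS rows_visible FROM sessions;

ROLLBACK;

-- After the rollback the deleted rows are back
SELECT COUNT(*) AS rows_visible FROM sessions;
```

Replacing the DELETE with TRUNCATE TABLE sessions would end the transaction through an implicit commit, and the later ROLLBACK would have nothing to undo.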
-- Get top 10 recent posts
SELECT * FROM posts ORDER BY created_at DESC LIMIT 10;
-- Pagination: Page 2 with 10 per page (OFFSET 10)
SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 10;
-- Alternative syntax: LIMIT offset, count
SELECT * FROM users ORDER BY id LIMIT 10, 10;
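-- Common pagination pattern
-- LIMIT does not accept expressions or user variables directly,
-- so compute the offset first and bind it through a prepared statement
SET @page = 2;
SET @per_page = 20;
SET @offset = (@page - 1) * @per_page;
PREPARE page_stmt FROM 'SELECT * FROM users ORDER BY id LIMIT ?, ?';
EXECUTE page_stmt USING @offset, @per_page;
DEALLOCATE PREPARE page_stmt;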
Why it matters: LIMIT prevents memory exhaustion and improves response times. Pagination is essential for user-facing interfaces handling large datasets.
Real applications: Search results show 10-50 items per page using LIMIT. APIs return paginated data. Reports limit output for readability.
Common mistakes: SELECT without LIMIT on large tables returning millions of rows, using large OFFSET causing performance degradation, or not using ORDER BY making LIMIT results unpredictable.
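The large-OFFSET problem arises because MySQL still reads and discards every skipped row. One widely used alternative is keyset (seek) pagination, which filters on the last key seen instead of counting rows. A minimal sketch, assuming id is the indexed primary key of users; the value 20 stands in for whatever id ended the previous page.

```sql
-- First page
SELECT id, name FROM users ORDER BY id LIMIT 20;

-- Next page: start after the last id returned by the previous page
SELECT id, name
FROM users
WHERE id > 20        -- last id seen on the previous page (illustrative value)
ORDER BY id
LIMIT 20;
```

This keeps each page an index range scan, at the cost of not being able to jump directly to an arbitrary page number.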
-- Get unique customer states
SELECT DISTINCT state FROM customers;
-- Get unique product categories
SELECT DISTINCT category FROM products;
-- Count distinct emails
SELECT COUNT(DISTINCT email) as unique_customers FROM users;
-- Multiple columns: unique combinations
SELECT DISTINCT city, state FROM customers;
-- Returns one row per unique city-state combination
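-- DISTINCT can be expensive: without a usable index MySQL has to sort or build a temporary table
SELECT DISTINCT name FROM customers; -- Slow on large tables without an index on name
-- Better: index the column; GROUP BY name returns the same rows and is usually optimized the same way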
Why it matters: DISTINCT is useful for data exploration but can cause performance issues. Understanding when DISTINCT is necessary vs using GROUP BY shows query optimization knowledge.
Real applications: Reporting unique visitors, finding all states where customers exist, counting unique email subscribers, generating distinct category lists.
Common mistakes: Using DISTINCT on unindexed columns, which forces a full scan plus deduplication; not narrowing the rows with WHERE before DISTINCT is applied; or applying DISTINCT to entire rows when only specific columns need to be unique.
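The relationship between DISTINCT and GROUP BY, and the benefit of filtering first, can be seen in a pair of equivalent queries. A sketch that assumes customers has a country column, which the source does not show for this table:

```sql
-- WHERE reduces the rows before any deduplication happens
SELECT DISTINCT state
FROM customers
WHERE country = 'USA';

-- Same set of states, expressed with GROUP BY
SELECT state
FROM customers
WHERE country = 'USA'
GROUP BY state;
```

GROUP BY becomes the better choice once a per-group aggregate such as COUNT(*) is also needed.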
-- Aggregate functions
SELECT COUNT(*) as total_users FROM users;
SELECT SUM(salary) as total_payroll FROM employees;
SELECT AVG(price) as average_product_price FROM products;
SELECT MIN(salary) as lowest_salary, MAX(salary) as highest_salary FROM employees;
-- COUNT(DISTINCT...) counts unique values
SELECT COUNT(DISTINCT city) as unique_cities FROM customers;
-- GROUP BY with aggregates
SELECT department, COUNT(*) as emp_count, AVG(salary) as avg_salary
FROM employees
GROUP BY department;
Why it matters: Aggregate functions are foundational for reporting and analytics. Many interview problems involve correctly using aggregates with GROUP BY and HAVING.
Real applications: Dashboards show user counts, revenue totals, and average order values using aggregates. Analytics compute metrics per segment using GROUP BY aggregates.
Common mistakes: Selecting non-aggregated columns alongside aggregates without listing them in GROUP BY; forgetting that SUM, AVG, MIN, MAX, and COUNT(column) skip NULLs while COUNT(*) counts every row; or choosing the wrong aggregate (for example, AVG instead of SUM).
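The NULL behavior is easiest to see by putting the variants side by side. A minimal sketch, assuming employees.salary is nullable:

```sql
SELECT
    COUNT(*)                 AS all_rows,          -- every row, NULL salary or not
    COUNT(salary)            AS rows_with_salary,  -- only rows where salary is not NULL
    AVG(salary)              AS avg_known_salary,  -- NULL salaries are skipped
    AVG(COALESCE(salary, 0)) AS avg_null_as_zero   -- NULL salaries counted as 0
FROM employees;
```

If any salary is NULL, the two averages differ, so the query should state which interpretation the report needs.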
-- GROUP BY with aggregates
SELECT department, COUNT(*) as emp_count, AVG(salary) as avg_salary
FROM employees
GROUP BY department;
-- HAVING to filter groups
SELECT city, COUNT(*) as cust_count
FROM customers
GROUP BY city
HAVING COUNT(*) > 5; -- Only cities with more than 5 customers
-- Wrong: Non-aggregated column not in GROUP BY
-- SELECT department, salary, COUNT(*) FROM employees GROUP BY department;
-- Correct: All non-aggregated columns in GROUP BY
SELECT department, COUNT(*) FROM employees GROUP BY department;
Why it matters: GROUP BY with HAVING is essential for complex reporting. Interviewers frequently test understanding of GROUP BY constraints and HAVING vs WHERE.
Real applications: Sales reports broken down by region, customer segmentation by purchase count, and user demographics by age group all use GROUP BY with HAVING.
Common mistakes: Forgetting to include all non-aggregated columns in GROUP BY; using aggregate functions in WHERE instead of HAVING; or confusing the order of evaluation (WHERE filters rows before grouping, HAVING filters groups afterward).
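A single query that uses both clauses makes the WHERE versus HAVING distinction concrete. A sketch that assumes customers has a status column, which is not shown for this table in the source:

```sql
SELECT city, COUNT(*) AS cust_count
FROM customers
WHERE status = 'active'   -- row filter: applied before rows are grouped
GROUP BY city
HAVING COUNT(*) > 5;      -- group filter: applied after COUNT(*) is computed
```

Moving the status condition into HAVING would fail or change the result, because status is neither grouped nor aggregated.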
-- Sort by single column, ascending (default)
SELECT * FROM users ORDER BY registration_date;
-- Sort descending
SELECT * FROM products ORDER BY price DESC;
-- Multiple columns: sort by department, then by salary
SELECT * FROM employees ORDER BY department, salary DESC;
-- Sort by calculated column
SELECT name, salary * 1.1 as updated_salary
FROM employees
ORDER BY updated_salary DESC;
-- Sort by column position (not recommended)
SELECT name, salary FROM employees ORDER BY 2 DESC;
Why it matters: Correct sorting is crucial for user-facing queries showing results in expected order. ORDER BY performance varies significantly with and without indexes.
Real applications: Recent posts sorted by date, products sorted by price, leaderboards sorted by score, search results sorted by relevance.
Common mistakes: Using ORDER BY on non-indexed columns, which forces MySQL to sort the whole result set; sorting numbers stored as text, which puts "10" before "2"; or forgetting LIMIT with ORDER BY, so a massive result is sorted and returned.
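When numbers are stored in a text column, the sort can be made numeric by casting in the ORDER BY clause. In the sketch below, shelf_no is a hypothetical VARCHAR column on products that holds values such as '2' and '10'.

```sql
-- Text sort: compares character by character, so '10' sorts before '2'
SELECT name, shelf_no FROM products ORDER BY shelf_no;

-- Numeric sort: cast the text to an integer first, so 2 sorts before 10
SELECT name, shelf_no FROM products ORDER BY CAST(shelf_no AS UNSIGNED);
```

The cast prevents an index on shelf_no from being used for sorting, so storing the value in a numeric column is the better long-term fix.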
-- Column alias
SELECT salary * 1.1 AS salary_with_bonus FROM employees;
-- Table alias for clarity
SELECT u.name, u.email FROM users AS u WHERE u.status = 'active';
-- Multiple table aliases
SELECT u.name, o.order_date, o.total
FROM users AS u
JOIN orders AS o ON u.id = o.user_id;
-- Alias for calculated fields
SELECT
name,
YEAR(registration_date) as registration_year,
MONTH(registration_date) as registration_month
FROM users;
Why it matters: Aliases greatly improve code readability and maintainability. Well-aliased queries are easier to understand and debug in complex scenarios.
Real applications: Complex joins use table aliases to disambiguate columns. Calculated fields use column aliases for clarity. Reports use meaningful alias names for export headers.
Common mistakes: Overusing unclear single-letter aliases like 'a', 'b', not aliasing calculated fields making output confusing, or forgetting alias context in complex joins causing ambiguity.
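Part of the "alias context" problem is scope: in MySQL a column alias can be referenced in GROUP BY, HAVING, and ORDER BY, but not in WHERE, because WHERE is evaluated before the select list. A minimal sketch; the 60000 threshold is an arbitrary illustrative value.

```sql
-- Fails with "Unknown column": the alias does not exist yet when WHERE runs
-- SELECT name, salary * 1.1 AS salary_with_bonus
-- FROM employees
-- WHERE salary_with_bonus > 60000;

-- Works: repeat the expression in WHERE, use the alias in ORDER BY
SELECT name, salary * 1.1 AS salary_with_bonus
FROM employees
WHERE salary * 1.1 > 60000
ORDER BY salary_with_bonus DESC;
```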
-- Comparison operators
WHERE age > 18
WHERE status = 'active'
WHERE salary <> 40000
-- Logical operators
WHERE status = 'active' AND age > 18
WHERE country = 'USA' OR country = 'Canada'
WHERE NOT status = 'deleted'
-- IN for value lists
WHERE status IN ('active', 'pending', 'verified')
-- BETWEEN for ranges
WHERE salary BETWEEN 40000 AND 60000
-- LIKE for patterns
WHERE name LIKE 'J%' -- Starts with J
WHERE email LIKE '%@gmail.com' -- Ends with @gmail.com
Why it matters: WHERE clause efficiency directly impacts query performance. Using indexed columns and efficient operators prevents full table scans in production.
Real applications: Filtering active users, finding orders in a date range, searching users by name pattern, and locating customers by country all rely on proper WHERE clauses.
Common mistakes: Filtering on an expression or function of a column, which prevents index use; using LIKE '%value%', whose leading wildcard rules out an index, where a prefix match such as LIKE 'value%' would do; or mixing AND and OR without parentheses, causing logic errors.
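Two of these mistakes have straightforward rewrites. The sketch below assumes an index on users.registration_date and a country column on users; the year 2024 is only an example value.

```sql
-- Function applied to the column: the index on registration_date cannot be used
SELECT id FROM users WHERE YEAR(registration_date) = 2024;

-- Same rows, expressed as a range on the bare column: the index can be used
SELECT id
FROM users
WHERE registration_date >= '2024-01-01'
  AND registration_date <  '2025-01-01';

-- AND binds tighter than OR, so this matches all Canadian users, active or not
SELECT id FROM users WHERE status = 'active' AND country = 'USA' OR country = 'Canada';

-- Parentheses state the intended logic: active users from either country
SELECT id FROM users WHERE status = 'active' AND (country = 'USA' OR country = 'Canada');
```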
-- Update from another table
UPDATE employees e
SET e.salary = e.salary * 1.05
WHERE e.department_id IN (SELECT id FROM departments WHERE name = 'Sales');
-- Update with JOIN
UPDATE orders o
JOIN customers c ON o.customer_id = c.id
SET o.customer_level = c.membership_level;
-- Bulk reset based on condition
UPDATE orders
SET status = 'expired'
WHERE order_date < DATE_SUB(NOW(), INTERVAL 30 DAY)
AND status = 'pending';
Why it matters: Driving an UPDATE from a subquery or JOIN enables sophisticated data management patterns. Understanding the implications for locking and performance shows production database experience.
Real applications: Employee salary updates from department budgets, inventory synchronization from orders, and data corrections based on multiple criteria all use UPDATE with a subquery or JOIN.
Common mistakes: Forgetting the WHERE clause and updating all rows unintentionally; not realizing that MySQL's affected-rows count, by default, includes only rows whose values actually changed; or running multi-statement updates outside a transaction, risking partial updates on error.
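Before running an UPDATE with a JOIN, it helps to run the same join as a SELECT and review exactly which rows would change. A minimal sketch using the orders and customers tables from the example above; the NULL-safe comparison is an addition for illustration.

```sql
-- Preview: rows whose level would change (<=> is MySQL's NULL-safe equality)
SELECT o.id,
       o.customer_level   AS current_level,
       c.membership_level AS new_level
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE NOT (o.customer_level <=> c.membership_level);

-- Same join and filter, now applied as the UPDATE
UPDATE orders o
JOIN customers c ON o.customer_id = c.id
SET o.customer_level = c.membership_level
WHERE NOT (o.customer_level <=> c.membership_level);
```

Restricting the UPDATE to rows that differ also keeps the number of rows touched, and therefore locked, as small as possible.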