MySQL Fundamentals

Interview questions and answers

MySQL Fundamentals — MySQL Interview Questions

MySQL is an open-source relational database management system that uses SQL for data manipulation. It stores data in tables with rows and columns, maintaining relationships through keys. Unlike NoSQL databases like MongoDB which store unstructured data in JSON documents, MySQL enforces strict schema and ACID compliance. Compared to PostgreSQL, MySQL is lighter-weight and faster for simple queries, though PostgreSQL offers more advanced features like JSON support and custom data types.

-- Create a simple database
CREATE DATABASE users_db;
USE users_db;

-- Create a table
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100),
    email VARCHAR(100)
);

Why it matters: Understanding MySQL's position in the database ecosystem and its ACID compliance is fundamental for choosing the right database for your application. It tests knowledge of relational vs non-relational databases.

Real applications: WordPress, Magento, Drupal all use MySQL as their default database. E-commerce platforms choose MySQL for transactional integrity and complex joins.

Common mistakes: Confusing MySQL with MariaDB (fork of MySQL), or assuming all relational databases work the same way without understanding InnoDB vs MyISAM storage engines.

Storage engines determine how MySQL stores and retrieves data from tables. InnoDB is the default and recommended engine offering ACID transactions, row-level locking, and foreign key support. MyISAM is older and faster for read-heavy operations but lacks transaction support and uses table-level locking. Memory engine stores data in RAM for ultra-fast access but loses data on shutdown. Selecting the right engine impacts performance, reliability, and feature availability.

-- Create table with specific engine
CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(100)
) ENGINE=InnoDB;

-- Check engine of existing table
SHOW CREATE TABLE products;

-- Convert table engine
ALTER TABLE products ENGINE=MyISAM;

Why it matters: Storage engine choice affects transaction support, locking mechanism, and performance. This determines crucial application behavior like rollback capability and concurrent access.

Real applications: Financial systems use InnoDB for ACID guarantees. Analytics systems might use MyISAM for read performance on non-transactional data.

Common mistakes: Using MyISAM for transactional applications, or creating all tables with different engines in the same database causing inconsistent behavior.

MySQL supports numeric types (INT, FLOAT, DECIMAL), string types (VARCHAR, CHAR, TEXT), date/time types (DATE, DATETIME, TIMESTAMP), and specialized types like BLOB for binary data and JSON for structured data. Choosing the correct type is crucial for performance and correctness — for example, using DECIMAL for prices instead of FLOAT avoids floating-point precision errors. VARCHAR is preferred over CHAR for variable-length strings to save storage, while TEXT types support full-text indexing.

-- Numeric: INT, BIGINT, DECIMAL(10,2)
-- String: VARCHAR(255), CHAR(10), TEXT
-- Date: DATE, DATETIME, TIMESTAMP
-- Special: BLOB, JSON, ENUM

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    total_amount DECIMAL(10,2),
    customer_name VARCHAR(100),
    order_date DATETIME,
    tags JSON
);

Why it matters: Correct data type selection prevents data corruption, improves query performance, and ensures SQL constraints work properly. It reflects database design maturity.

Real applications: E-commerce uses DECIMAL for prices to avoid rounding errors, TIMESTAMP for auto-updated audit fields, and JSON for flexible product attributes.

Common mistakes: Using FLOAT for monetary values, TEXT for small strings, or storing dates as strings instead of DATE type preventing proper ordering and calculations.

A PRIMARY KEY uniquely identifies each row in a table, ensuring no duplicate records exist. It can be a single column (simple pk) or multiple columns (composite pk) and is automatically indexed for fast lookups. Primary keys enforce entity integrity by preventing NULL values and duplicate entries. Every table should have a primary key to optimize queries, maintain data integrity, and enable efficient foreign key relationships. Auto-increment is commonly used for single-column primary keys.

-- Simple primary key
CREATE TABLE users (
    user_id INT PRIMARY KEY AUTO_INCREMENT,
    email VARCHAR(100) UNIQUE
);

-- Composite primary key
CREATE TABLE user_permissions (
    user_id INT,
    permission_id INT,
    PRIMARY KEY (user_id, permission_id)
);

Why it matters: Primary keys are foundational to relational database design. Interviewers assess whether candidates understand uniqueness constraints and data integrity concepts.

Real applications: User management systems use user_id as primary key, linking to all related tables. Order systems use order_id to track every transaction uniquely.

Common mistakes: Using business logic values (email, phone) as primary key instead of surrogate keys, or allowing NULL in primary key columns, or creating tables without any primary key.

A foreign key links records in one table to another table's primary key, enforcing referential integrity — ensuring data consistency across tables. If you delete a user, foreign key constraints can prevent orphaned records or automatically delete related data via CASCADE. This prevents invalid data relationships like an order referencing a non-existent customer. Foreign keys are only enforced with InnoDB engine; MyISAM ignores them silently, leading to data corruption.

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name VARCHAR(100)
);

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

Why it matters: Foreign keys maintain data integrity across related tables. Understanding constraints demonstrates database design maturity and prevents subtle bugs from invalid relationships.

Real applications: User-to-profile relationships, post-to-comment hierarchies, order-to-item mappings all rely on foreign keys to prevent orphaned records.

Common mistakes: Using MyISAM engine where foreign keys are ignored, forgetting to define CASCADE behavior causing deletion failures, or creating circular dependencies between tables.

A unique constraint ensures all values in a column are different, preventing duplicates. Unlike primary keys, unique constraints allow NULL values (multiple NULLs can exist). A table can have only one primary key but multiple unique constraints. Unique constraints are indexed automatically for performance. Use unique constraints for email addresses, usernames, or any field that must be globally unique but isn't the table's main identifier.

CREATE TABLE users (
    user_id INT PRIMARY KEY AUTO_INCREMENT,
    email VARCHAR(100) UNIQUE,
    username VARCHAR(50) UNIQUE NOT NULL,
    phone_number VARCHAR(20) UNIQUE
);

Why it matters: Understanding the distinction between primary and unique constraints is crucial for proper database modeling and understanding constraint enforcement mechanisms.

Real applications: User tables use unique constraints on email and username to prevent account creation with existing credentials while user_id is the primary key.

Common mistakes: Making business fields like email the primary key instead of using a surrogate key, or assuming unique constraints prevent all duplicates when they allow multiple NULLs.

Normalization is a process to organize database structure to minimize data redundancy and dependency, reducing storage waste and preventing update anomalies. It involves breaking large tables into smaller related tables. First normal form (1NF) eliminates repeating groups, second normal form (2NF) removes partial dependencies, and third normal form (3NF) removes transitive dependencies. Over-normalization can cause too many joins degrading performance, so sometimes denormalization is acceptable for analytics.

-- Unnormalized: stores repeat
CREATE TABLE orders (
    order_id INT,
    customer_name VARCHAR(100),
    items VARCHAR(500)  -- multiple items in one field
);

-- Normalized: separate tables
CREATE TABLE order_items (
    order_id INT,
    item_id INT,
    FOREIGN KEY (order_id) REFERENCES orders(order_id)
);

Why it matters: Normalization prevents data anomalies and maintains consistency. Showing normalization understanding demonstrates core database design knowledge fundamental to all developers.

Real applications: E-commerce systems normalize products, orders, and line items separately to prevent data duplication. Booking systems normalize users, accommodations, and reservations.

Common mistakes: Over-normalizing causing excessive joins and performance issues, or under-normalizing leading to data duplication and update anomalies requiring manual synchronization.

NULL represents the absence of a value, distinct from empty string or zero. NULL comparisons always return NULL (not TRUE or FALSE), so use IS NULL or IS NOT NULL operators. Aggregate functions like SUM ignore NULLs, and NULL values in calculations make the entire result NULL. NOT NULL constraints force values to be present. Proper NULL handling is critical because wrong queries like WHERE column = NULL always return no rows, a common bug.

-- Wrong: WHERE column = NULL always returns nothing
SELECT * FROM users WHERE phone = NULL;  -- Returns 0 rows

-- Correct: Use IS NULL
SELECT * FROM users WHERE phone IS NULL;

-- NULL in calculations makes result NULL
SELECT name, salary + bonus FROM employees;
-- If bonus is NULL, entire result for that row is NULL

-- Use COALESCE to provide defaults
SELECT name, COALESCE(bonus, 0) as bonus FROM employees;

Why it matters: Mishandling NULLs causes subtle bugs where queries return unexpected results. This is a foundational SQL concept frequently tested in interviews.

Real applications: Optional fields like middle_name, nickname, or secondary_phone are represented as NULL. Reports must carefully handle NULLs in aggregations to avoid missing data.

Common mistakes: Comparing NULL with = or != operators, forgetting NULL is not equal to NULL, treating NULL as empty string, or not considering NULL in aggregate functions causing wrong totals.

CHAR stores fixed-length strings, always padding with spaces to reach the declared length, using more storage but offering faster access predictability. VARCHAR stores variable-length strings, using only the actual bytes needed plus overhead, saving storage for varying-length data. CHAR is optimal for fixed-length fields like country codes or status flags, while VARCHAR suits names or descriptions. VARCHAR typically performs slightly slower due to variable management, but the storage savings justify its use in most cases.

-- CHAR: Fixed-length, padded with spaces
CREATE TABLE countries (
    country_code CHAR(2),  -- Always 2 bytes
    country_name CHAR(50)  -- Always 50 bytes even if name is 5 chars
);

-- VARCHAR: Variable-length, no padding
CREATE TABLE customers (
    name VARCHAR(100),     -- Uses actual string length
    address VARCHAR(255)
);

Why it matters: String type choice impacts storage efficiency and performance. Understanding the tradeoffs demonstrates attention to database optimization and design principles.

Real applications: User names, addresses should use VARCHAR for storage efficiency. Status codes, country codes, state abbreviations use CHAR for predictable fixed length.

Common mistakes: Overusing CHAR causing wasted space, using VARCHAR for inherently fixed-length values like postal codes, or misunderstanding that CHAR doesn't trim spaces affecting string comparisons.

AUTO_INCREMENT automatically generates sequential numbers for new rows, ideal for surrogate primary keys. Each table has one AUTO_INCREMENT column, incrementing by 1 with each insert. You can specify starting values and increment steps, and it continues from the highest value even after deletions. AUTO_INCREMENT provides clean, simple keys without business logic, avoiding issues from using natural keys like email addresses that might change.

-- Define AUTO_INCREMENT
CREATE TABLE products (
    product_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100),
    price DECIMAL(10,2)
);

-- Insert without specifying AUTO_INCREMENT column
INSERT INTO products (name, price) VALUES ('Laptop', 999.99);
-- product_id automatically becomes 1, 2, 3...

-- Starting from specific value
ALTER TABLE products AUTO_INCREMENT = 1000;

Why it matters: AUTO_INCREMENT is the standard for generating surrogate keys. Understanding when and how to use it demonstrates practical database design patterns.

Real applications: Every user gets a unique user_id via AUTO_INCREMENT, every order gets order_id, every product gets product_id. Universal pattern in enterprise databases.

Common mistakes: Assuming AUTO_INCREMENT values are densely packed without gaps, relying on AUTO_INCREMENT for business logic, or not understanding it resets on table recreation or in certain backup scenarios.

DEFAULT specifies a value automatically inserted when no value is provided during INSERT, reducing the need for NULL handling in application code. Defaults can be literal values (0, 'Active'), functions (CURRENT_TIMESTAMP), or expressions. Using meaningful defaults prevents NULL values, makes queries simpler, and enforces business rules at the database level. Common defaults include status fields defaulting to 'pending', timestamps to current time, and flags to FALSE.

CREATE TABLE posts (
    post_id INT PRIMARY KEY AUTO_INCREMENT,
    title VARCHAR(200),
    content TEXT,
    status VARCHAR(20) DEFAULT 'draft',  -- Default status
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,  -- Auto-set time
    is_published BOOLEAN DEFAULT FALSE
);

-- Insert without specifying defaults
INSERT INTO posts (title, content) 
VALUES ('My First Post', 'Content here');
-- status becomes 'draft', created_at becomes current time

Why it matters: DEFAULT values reduce NULL handling and implement business rules at the database level. This demonstrates understanding of constraint-based design and data consistency.

Real applications: User accounts default to 'active' status, posts default to 'draft', creation timestamps auto-set to now, soft-delete flags default to FALSE, discount percentages default to 0.

Common mistakes: Forgetting to set meaningful defaults for frequently NULL columns, not using CURRENT_TIMESTAMP for audit fields, or creating DEFAULT values in application code instead of database causing inconsistencies.

CHECK constraints enforce business rules by validating data at the database level — rejecting inserts or updates that violate the condition. This prevents invalid data like negative prices, out-of-range ages, or invalid statuses from entering the database. CHECK constraints work across multiple columns and can use comparison operators and logical AND/OR. Implementing validation at the database prevents bypass through direct SQL manipulation while application-only validation can be circumvented.

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    name VARCHAR(100),
    salary DECIMAL(10,2) CHECK (salary > 0),
    age INT CHECK (age >= 18 AND age <= 70),
    status VARCHAR(20) CHECK (status IN ('active', 'inactive', 'on_leave')),
    end_date DATE CHECK (end_date IS NULL OR end_date > CURDATE())
);

Why it matters: CHECK constraints enforce data validity at the database level, preventing invalid data from multiple sources. This demonstrates understanding of constraint-based design and data quality.

Real applications: E-commerce validates prices > 0, inventory > 0. HR systems validate age ranges, salary minimums. Financial systems enforce debit balance limits, percentage ranges.

Common mistakes: Relying only on application-level validation without CHECK constraints, creating overly complex CHECK conditions that become hard to maintain, or forgetting that CHECK constraints don't prevent bypass through direct SQL in some databases.

In MySQL, database and schema refer to the same concept — a logical grouping of tables and objects. Other databases like Oracle or SQL Server distinguish between them where schema is a namespace within a database. A single MySQL server contains multiple databases, each with its own tables, indexes, and user permissions. You select which database to use, and all subsequent queries operate on that database's objects.

-- Create database (same as schema in MySQL)
CREATE DATABASE e_commerce;
CREATE SCHEMA e_commerce;  -- Equivalent

-- Switch to database
USE e_commerce;

-- Create table in selected database
CREATE TABLE products (...);

-- List all databases
SHOW DATABASES;

-- View tables in current database
SHOW TABLES;

Why it matters: Understanding database organization and concepts transfers across SQL database systems. This is fundamental to database administration and multi-tenant application design.

Real applications: Development, testing, and production often use separate databases for isolation. Multi-tenant applications create separate databases per customer for data segregation.

Common mistakes: Confusing MySQL terminology with PostgreSQL or Oracle where schema has different meaning. Not understanding that changing database requires explicit USE statement, causing queries to fail on wrong database.

Comments document SQL code for future maintenance and team understanding. Single-line comments use -- or #, multi-line comments use /* */. Comments are ignored during execution and don't affect performance. Well-commented SQL explains complex business logic, assumptions about data, performance considerations, and the purpose of queries. Comments in stored procedures and triggers are especially important since that code is often revisited long after creation.

-- Single-line comment
SELECT * FROM users WHERE status = 'active';  # Another single-line

/* Multi-line comment
   explaining complex business logic
   across multiple lines */
SELECT u.*, COUNT(o.order_id) as total_orders
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
WHERE u.status = 'active'
GROUP BY u.user_id;

Why it matters: Code readability and maintainability matter in production environments. Understanding documentation practices reflects professional development standards and team collaboration skills.

Real applications: Complex queries with multiple joins get extensive comments explaining the business purpose. Stored procedures include comments about parameters, expected behavior, and assumptions.

Common mistakes: Non-existent or outdated comments that mislead developers, over-commenting obvious code, or using comments to disable code instead of version control, leaving confusing cruft in production SQL.

SHOW DATABASES lists all databases available on the MySQL server that the current user has privileges to access. SHOW TABLES displays all tables within the currently selected database. Both are utility commands for database navigation and discovery. SHOW DATABASES requires no selected database context, while SHOW TABLES requires you to first USE a database, otherwise it throws an error about no database selected.

-- List all databases available to current user
SHOW DATABASES;
-- Output: information_schema, mysql, performance_schema, myapp_db, etc.

-- Must select database first
USE myapp_db;

-- List all tables in selected database
SHOW TABLES;
-- Output: users, products, orders, payments, etc.

-- Show table structure
DESCRIBE users;  -- or SHOW COLUMNS FROM users;

Why it matters: These commands are fundamental navigation tools for database administration. Knowing how to explore database structure demonstrates SQL proficiency and practical experience.

Real applications: Developers use these commands when connecting to unfamiliar databases to understand available data structures. DBAs use them for inventory and permission auditing.

Common mistakes: Trying SHOW TABLES without USE database causing error, assuming SHOW DATABASES includes tables, or not understanding that SHOW DATABASES is limited by current user permissions seeing only allowed databases.

1What is MySQL and how does it differ from other databases?

2What are storage engines in MySQL?

3What are the main data types in MySQL?

4What is a primary key and why is it important?

5Explain foreign keys and referential integrity.

6What is a unique constraint and how does it differ from primary key?

7What is normalization and why is it important?

8What are NULL values and how should they be handled?

9What is the difference between CHAR and VARCHAR?

10What is AUTO_INCREMENT and when should it be used?

11What is the purpose of the DEFAULT keyword?

12What is a CHECK constraint?

13Explain the difference between database and schema.

14What are comments in SQL and why use them?

15What is the difference between SHOW DATABASES and SHOW TABLES?