Month 3, Week 2

Advanced SQL & Database Integration

Giving Your Express API a Permanent Memory

Module 1: Advanced SQL

Unlocking the Power of Relational Data

The Power of Relationships

Last week, we designed tables. The real power of a relational database is not just storing data, but querying data based on the relationships between tables.

To do this, we use JOIN clauses.

INNER JOIN

Returns records that have matching values in both tables. It is the intersection of the two sets.

"Show me all the books that have a valid, existing author."


                        SELECT
                            books.title,
                            authors.name AS author_name
                        FROM
                            books
                        INNER JOIN authors ON books.author_id = authors.id;
                    

LEFT JOIN (or LEFT OUTER JOIN)

Returns all records from the left table (the first one mentioned), and the matched records from the right table. If there is no match, the right side columns are filled with `NULL`.

"Show me all the authors, and if they have any books, show those too."


                        SELECT
                            authors.name AS author_name,
                            books.title
                        FROM
                            authors
                        LEFT JOIN books ON authors.id = books.author_id;
                     

This is useful for finding authors who haven't written any books yet!

GROUP BY & Aggregate Functions

The `GROUP BY` statement groups rows that have the same values into summary rows. It is almost always used with aggregate functions.

Common Aggregate Functions: `COUNT()`, `SUM()`, `AVG()`, `MAX()`, `MIN()`.

"Show me how many books each author has written."


                        SELECT
                            authors.name,
                            COUNT(books.id) AS number_of_books
                        FROM
                            authors
                        LEFT JOIN books ON authors.id = books.author_id
                        GROUP BY
                            authors.name
                        ORDER BY
                            number_of_books DESC;
                    

HAVING: Filtering After Aggregation

The `WHERE` clause filters rows *before* they are grouped. The `HAVING` clause filters groups *after* they have been created by `GROUP BY`.

"Show me only the authors who have written more than 5 books."


                        SELECT
                            authors.name,
                            COUNT(books.id) AS number_of_books
                        FROM authors
                        LEFT JOIN books ON authors.id = books.author_id
                        GROUP BY authors.name
                        HAVING COUNT(books.id) > 5;
                     

Subqueries

A subquery, or inner query, is a query nested inside a larger query.

"Show me all the users who have written at least one post."


                        SELECT *
                        FROM users
                        WHERE id IN (
                            SELECT DISTINCT user_id
                            FROM posts
                        );
                     

Often, a `JOIN` is more performant than a subquery, but subqueries can be very expressive for complex logic.

Mid-Lecture Knowledge Check

Module 2: Integrating Express with PostgreSQL

Giving Your API a Permanent Memory

The Database Driver: `pg`

To talk to PostgreSQL from our Node.js application, we need a driver. The most popular and well-supported one is `pg`.


                        npm install pg
                    

The Problem with Single Connections

You could create a new connection to the database for every single request, but this is incredibly inefficient.

Establishing a database connection is a slow, expensive operation. Doing it for every request would destroy your application's performance under load.

The Solution: Connection Pooling

A connection pool is a cache of database connections maintained so that the connections can be reused for future requests.

When a request needs to talk to the database, it "checks out" a connection from the pool. When it's done, it "releases" the connection back to the pool, where it's available for the next request.

The `pg` library has a built-in `Pool` class for this.

Setting Up the Connection Pool

Create a dedicated file to configure and export your pool. This ensures you only have one pool for your entire application.

`db/index.js`


                        const { Pool } = require('pg');

                        const pool = new Pool({
                            user: 'your_db_user',
                            host: 'localhost',
                            database: 'your_db_name',
                            password: 'your_db_password',
                            port: 5432,
                        });

                        // Export a query function
                        module.exports = {
                            query: (text, params) => pool.query(text, params),
                        };
                    

The #1 Security Threat: SQL Injection

This is a vulnerability that occurs when an attacker can interfere with the queries that an application makes to its database.

VULNERABLE CODE (DO NOT DO THIS):


                        const id = req.params.id; // e.g., '123'
                        // What if an attacker sends an ID like: '123; DROP TABLE users;'
                        const queryText = `SELECT * FROM users WHERE id = ${id}`;
                        db.query(queryText); // DISASTER!
                     

The Solution: Parameterized Queries

You NEVER build a query by concatenating strings with user input. Instead, you use parameterized queries.

You write your SQL with placeholders (`$1`, `$2`, etc.), and you provide the user input as a separate array of parameters. The database driver safely combines them, ensuring the input is treated as a value, not as executable code.


                        const id = req.params.id;
                        
                        // The SQL has a placeholder
                        const queryText = 'SELECT * FROM users WHERE id = $1';
                        
                        // The user input is passed as a separate array
                        const values = [id];

                        // The `pg` library handles the rest safely.
                        const { rows } = await db.query(queryText, values);
                    

This is the only acceptable way to write database queries.

Module 4: Refactoring the CRUD API

From In-Memory to Persistent

Refactoring `getAllUsers`

`controllers/userController.js`


                        const db = require('../db'); // Import our pool

                        const getAllUsers = async (req, res, next) => {
                            try {
                                const { rows } = await db.query('SELECT * FROM users ORDER BY id ASC');
                                res.json(rows);
                            } catch (err) {
                                next(err);
                            }
                        };
                    

Refactoring `createUser`


                        const createUser = async (req, res, next) => {
                            try {
                                const { username, email, password } = req.body;
                                const hashedPassword = await bcrypt.hash(password, 10);
                                
                                const text = 'INSERT INTO users(username, email, password_hash) VALUES($1, $2, $3) RETURNING *';
                                const values = [username, email, hashedPassword];
                                
                                const { rows } = await db.query(text, values);
                                res.status(201).json(rows[0]);
                            } catch (err) {
                                next(err);
                            }
                        };
                     

`RETURNING *` is a PostgreSQL feature that returns the newly created row after an `INSERT`.

In-Class Practical Exercise

Refactoring the `GET /:id` Endpoint

You will be given the Express API project with a configured database pool. Your task is to refactor the `getUserById` controller function to fetch data from the database instead of the in-memory array.

  1. Open the `controllers/userController.js` file.
  2. Locate the existing `getUserById` function that uses `users.find()`.
  3. Rewrite the function to be `async`.
  4. Wrap the logic in a `try...catch` block.
  5. Write the parameterized SQL query to select a user by their ID.
  6. Use `await db.query()` to execute it, passing the `id` from `req.params`.
  7. Check if `rows[0]` exists. If not, send a 404 error.
  8. If a user is found, send it back as a JSON response.
  9. Pass any errors to the `next` function in the `catch` block.

Final Knowledge Check