Giving Your Express API a Permanent Memory
Unlocking the Power of Relational Data
Last week, we designed tables. The real power of a relational database is not just storing data, but querying data based on the relationships between tables.
To do this, we use JOIN clauses.
An INNER JOIN returns only the records that have matching values in both tables. It is the intersection of the two sets.
"Show me all the books that have a valid, existing author."
SELECT
  books.title,
  authors.name AS author_name
FROM
  books
INNER JOIN authors ON books.author_id = authors.id;
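The semantics of this join can be sketched in plain JavaScript, with no database involved. The sample rows below are invented for illustration; they are not from the course project.

```javascript
// Sample data standing in for the authors and books tables (illustrative only).
const authors = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Brian' },
];
const books = [
  { title: 'Notes on the Engine', author_id: 1 },
  { title: 'Orphaned Draft', author_id: 99 }, // no matching author
];

// INNER JOIN: keep only the book/author pairs whose keys match.
const innerJoin = books
  .map((book) => ({
    book,
    author: authors.find((a) => a.id === book.author_id),
  }))
  .filter(({ author }) => author !== undefined)
  .map(({ book, author }) => ({ title: book.title, author_name: author.name }));

// The orphaned book is dropped: it has no matching author row.
console.log(innerJoin);
```

Note how the second book silently disappears from the result, just as a row with a dangling `author_id` would in SQL.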
A LEFT JOIN returns all records from the left table (the first one mentioned), plus the matched records from the right table. If there is no match, the right-side columns are filled with `NULL`.
"Show me all the authors, and if they have any books, show those too."
SELECT
  authors.name AS author_name,
  books.title
FROM
  authors
LEFT JOIN books ON authors.id = books.author_id;
This is also useful for finding authors who haven't written any books yet: add `WHERE books.title IS NULL` to keep only the rows whose right side came back empty.
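The same behavior can be sketched in plain JavaScript (invented sample rows, no database needed):

```javascript
// Sample rows (illustrative only).
const authors = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Grace' }, // has written no books
];
const books = [{ title: 'Notes on the Engine', author_id: 1 }];

// LEFT JOIN: keep every author; fill in null when there is no matching book.
const leftJoin = authors.flatMap((author) => {
  const matches = books.filter((b) => b.author_id === author.id);
  if (matches.length === 0) {
    return [{ author_name: author.name, title: null }]; // NULL-filled right side
  }
  return matches.map((b) => ({ author_name: author.name, title: b.title }));
});

// Grace still appears, but with title: null.
console.log(leftJoin);
```

The `title: null` rows are exactly the ones a `WHERE books.title IS NULL` filter would isolate.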
The `GROUP BY` statement groups rows that have the same values into summary rows. It is almost always used with aggregate functions.
Common Aggregate Functions: `COUNT()`, `SUM()`, `AVG()`, `MAX()`, `MIN()`.
"Show me how many books each author has written."
SELECT
  authors.name,
  COUNT(books.id) AS number_of_books
FROM
  authors
LEFT JOIN books ON authors.id = books.author_id
GROUP BY
  authors.name
ORDER BY
  number_of_books DESC;
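Conceptually, `GROUP BY` plus `COUNT` collapses the joined rows into one summary row per group. A plain-JavaScript sketch (invented rows, illustrative only):

```javascript
// Rows as the LEFT JOIN would produce them (illustrative only).
// Grace's book_id is null because she has no books.
const rows = [
  { name: 'Ada', book_id: 10 },
  { name: 'Ada', book_id: 11 },
  { name: 'Grace', book_id: null },
];

// GROUP BY name + COUNT(books.id): COUNT skips NULLs, so Grace gets 0.
const counts = {};
for (const row of rows) {
  counts[row.name] = (counts[row.name] ?? 0) + (row.book_id !== null ? 1 : 0);
}

// ORDER BY number_of_books DESC
const result = Object.entries(counts)
  .map(([name, number_of_books]) => ({ name, number_of_books }))
  .sort((a, b) => b.number_of_books - a.number_of_books);

console.log(result);
```

This also shows why the query counts `books.id` rather than `*`: `COUNT(books.id)` ignores the NULLs that the LEFT JOIN produced, so bookless authors correctly report 0.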
The `WHERE` clause filters rows *before* they are grouped. The `HAVING` clause filters groups *after* they have been created by `GROUP BY`.
"Show me only the authors who have written more than 5 books."
SELECT
  authors.name,
  COUNT(books.id) AS number_of_books
FROM authors
LEFT JOIN books ON authors.id = books.author_id
GROUP BY authors.name
HAVING COUNT(books.id) > 5;
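The ordering of the two filters can be sketched in plain JavaScript. The rows and the `published` column below are invented purely to show the pipeline: `WHERE` trims rows first, then `GROUP BY` aggregates, then `HAVING` trims groups.

```javascript
// Raw joined rows before any grouping (illustrative data).
const rows = [
  { author: 'Ada', title: 'A', published: 1843 },
  { author: 'Ada', title: 'B', published: 1850 },
  { author: 'Ada', title: 'D', published: 1860 },
  { author: 'Grace', title: 'C', published: 1952 },
];

// WHERE runs first, on individual rows...
const filtered = rows.filter((r) => r.published >= 1850);

// ...then GROUP BY aggregates whatever survived...
const counts = {};
for (const r of filtered) counts[r.author] = (counts[r.author] ?? 0) + 1;

// ...and HAVING filters the resulting groups.
const result = Object.entries(counts)
  .filter(([, n]) => n > 1)
  .map(([author, n]) => ({ author, number_of_books: n }));

console.log(result);
```

Ada's count is 2, not 3, because `WHERE` removed her 1843 book before the count ran; Grace's group is then dropped by the `HAVING`-style filter.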
A subquery, or inner query, is a query nested inside a larger query.
"Show me all the users who have written at least one post."
SELECT *
FROM users
WHERE id IN (
  SELECT DISTINCT user_id
  FROM posts
);
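The inner query runs first and produces a set of IDs; the outer query then keeps only rows whose `id` is in that set. A plain-JavaScript sketch with invented sample data:

```javascript
// Sample tables (illustrative only).
const users = [
  { id: 1, name: 'ada' },
  { id: 2, name: 'grace' },
];
const posts = [
  { id: 100, user_id: 1 },
  { id: 101, user_id: 1 },
];

// Inner query: SELECT DISTINCT user_id FROM posts
const userIdsWithPosts = new Set(posts.map((p) => p.user_id));

// Outer query: SELECT * FROM users WHERE id IN (...)
const usersWithPosts = users.filter((u) => userIdsWithPosts.has(u.id));

console.log(usersWithPosts);
```

Using a `Set` mirrors the `DISTINCT`: duplicate `user_id` values collapse to one entry before the membership test.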
Often, a `JOIN` is more performant than a subquery, but subqueries can be very expressive for complex logic.
Giving Your API a Permanent Memory
To talk to PostgreSQL from our Node.js application, we need a driver. The most popular and well-supported one is `pg`.
npm install pg
You could create a new connection to the database for every single request, but this is incredibly inefficient.
Establishing a database connection is a slow, expensive operation. Doing it for every request would destroy your application's performance under load.
A connection pool is a cache of database connections maintained so that the connections can be reused for future requests.
When a request needs to talk to the database, it "checks out" a connection from the pool. When it's done, it "releases" the connection back to the pool, where it's available for the next request.
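The checkout/release cycle can be sketched as a toy class. This is a simplified model to show the mechanics, not the real `pg` API.

```javascript
// Toy connection pool: a fixed set of reusable "connections" (illustrative only).
class ToyPool {
  constructor(size) {
    // Pretend each object is an expensive, already-established connection.
    this.idle = Array.from({ length: size }, (_, i) => ({ id: i }));
  }
  checkout() {
    if (this.idle.length === 0) throw new Error('pool exhausted');
    return this.idle.pop(); // lend an existing connection out
  }
  release(conn) {
    this.idle.push(conn); // back in the pool for the next request
  }
}

const pool = new ToyPool(2);
const conn = pool.checkout(); // a request borrows a connection...
pool.release(conn); // ...and hands it back when done
```

The expensive step (establishing the connection) happens once per pooled connection, not once per request; requests only pay the cheap cost of borrowing and returning.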
The `pg` library has a built-in `Pool` class for this.
Create a dedicated file to configure and export your pool. This ensures you only have one pool for your entire application.
`db/index.js`
const { Pool } = require('pg');

const pool = new Pool({
  user: 'your_db_user',
  host: 'localhost',
  database: 'your_db_name',
  password: 'your_db_password',
  port: 5432,
});

// Export a query function so the rest of the app never touches the pool directly
module.exports = {
  query: (text, params) => pool.query(text, params),
};
SQL injection is a vulnerability that occurs when an attacker can interfere with the queries that an application makes to its database.
VULNERABLE CODE (DO NOT DO THIS):
const id = req.params.id; // e.g., '123'
// What if an attacker sends an ID like: '123; DROP TABLE users;'
const queryText = `SELECT * FROM users WHERE id = ${id}`;
db.query(queryText); // DISASTER!
You NEVER build a query by concatenating strings with user input. Instead, you use parameterized queries.
You write your SQL with placeholders (`$1`, `$2`, etc.), and you provide the user input as a separate array of parameters. The database driver safely combines them, ensuring the input is treated as a value, not as executable code.
const id = req.params.id;
// The SQL has a placeholder
const queryText = 'SELECT * FROM users WHERE id = $1';
// The user input is passed as a separate array
const values = [id];
// The `pg` library handles the rest safely.
const { rows } = await db.query(queryText, values);
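The difference between the two approaches can be seen in plain JavaScript, with no database involved (the attack string is the same one from the vulnerable example above):

```javascript
// The attacker's input from the earlier example.
const userInput = '123; DROP TABLE users;';

// BAD: concatenation embeds the payload in the SQL text itself.
const concatenated = `SELECT * FROM users WHERE id = ${userInput}`;

// GOOD: parameterization keeps the SQL text fixed; the value travels
// separately, and the driver guarantees it is treated as data.
const parameterized = {
  text: 'SELECT * FROM users WHERE id = $1',
  values: [userInput],
};

console.log(concatenated); // the query text now contains DROP TABLE
console.log(parameterized.text); // the query text is unchanged
```

However hostile the input, the parameterized query's text never changes; only the contents of the values array do, and those are never executed as SQL.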
This is the only acceptable way to write database queries.
From In-Memory to Persistent
`controllers/userController.js`
const db = require('../db'); // Import our pool
const bcrypt = require('bcrypt'); // Needed to hash passwords in createUser

const getAllUsers = async (req, res, next) => {
  try {
    const { rows } = await db.query('SELECT * FROM users ORDER BY id ASC');
    res.json(rows);
  } catch (err) {
    next(err);
  }
};

const createUser = async (req, res, next) => {
  try {
    const { username, email, password } = req.body;
    const hashedPassword = await bcrypt.hash(password, 10);
    const text = 'INSERT INTO users(username, email, password_hash) VALUES($1, $2, $3) RETURNING *';
    const values = [username, email, hashedPassword];
    const { rows } = await db.query(text, values);
    res.status(201).json(rows[0]);
  } catch (err) {
    next(err);
  }
};

module.exports = { getAllUsers, createUser };
`RETURNING *` is a PostgreSQL feature that returns the newly created row after an `INSERT`.
You will be given the Express API project with a configured database pool. Your task is to refactor the `getUserById` controller function to fetch data from the database instead of the in-memory array.