Optimizing SQL Queries
Passionate Software Developer with a strong enthusiasm for data, technology, and entrepreneurship to solve real-world problems. I enjoy building innovative digital solutions and currently exploring new advancements in data, and leveraging my skills to create impactful software solutions. Beyond coding, I have a keen interest in strategic thinking in business and meeting new people to exchange ideas and collaborate on exciting projects.
Sub-queries
Subquery, also know as nested query is a query embedded within another SQL statement such as SELECT, WHERE, FROM, JOIN, INSERT, UPDATE, or DELETE to enable retrieval of data from one or more tables based on the results of an inner query.
Even though they come at the cost of lower readability and lower performance, they offer alternative options to JOINS, functions, and window functions.
When there are subqueries present, the inner query is executed first, then the outer query is evaluated.
Subquery in SELECT
A subquery in the SELECT section of a query always returns a scalar value as shown below;

A correlated subquery is a type of SELECT query that uses values from the outer query. The inner query executes, referencing these value(s), and returns the result to the outer query, and this happens row by row.
Example;

For row1, Sample_name = Alpha and Analysis_type = ‘Basic’, The subquery then references Analysis_type, which is ‘Basic’ for this row. The subquery then executes, using ‘Basic’ as the condition to filter the Analysis_costs table on, and returns a single value of 50. Finally, Cost is updated with the subquery result, and the next row is evaluated.
Subquery in FROM/JOIN
Subqueries can be used in FROM and JOIN to create intermediate or derived tables, which can be queried again. This is useful when we want to use aggregated data along with column data.
Example: Let’s calculate the total cost of all samples analyzed. We can use the previous query where we calculated the cost and SUM() the column.
We are using the previous query result as derived table, which the outer query uses to sum with. If we use a subquery in FROM, we have to alias the derived table using AS.
SELECT
SUM(Cost)
FROM
(SELECT
ws_out.Sample_name,
ws_out.Analysis_type
(SELECT
Cost
FROM
Analysis_ocsts
WHERE
Analysis_type = ws_out.Analysis_type
) AS Cost
FROM
Water_sample AS ws_out) AS Total_cost;
Subqueries can add complexity to SQL statements often making them harder to understand. However, there are cases where they offer a more concise and efficient solution.
Subqueries in WHERE/HAVING
Subqueries can be used in WHERE and HAVING to enable customized or advanced filtering of results.
For instance; If we are to get records in the Water_samples table that have above average purity, we can write a query like this:
SELECT
Sample_name,
Purity,
AVG(Purity)
FROM
Water_samples
WHERE
Purity > (SELECT
AVG(Purity))
FROM
Water_samples
They can also return scalar values if used with comparison operators or can return a single column of values if used with the IN() operator.
Common Table Expressions
Also referred to as CTE, Common Table Expression is a query that exists within the context of a larger query and the result can be stored such that they can be referenced later by other queries.
CTEs simplify difficult queries by breaking them down into smaller more understandable components and temporarily storing the result for later reference by other queries.
A CTE is accessible within the larger query in which it is defined.
Below is the syntax of a CTE:
WITH
CTE_name (Column_list) AS (
Query
)
SELECT
Column_list
FROM
CTE_name;

Role of CTE in code optimization
Readability: CTEs simplify complex queries to smaller intuitive logical parts making them readable. They’re also comprehensible by giving descriptive name, making it easier to understand the context and purpose.
Maintainability: CTEs help modularize queries making them easier for changes to be made.
Reusability: Once a CTE has been defined, it can be reused as many times as possible.
Use cases of CTEs
Suppose we want to compare total salary with average salary in each department for the table below:

Our query will be like this:
WITH
Salary_total AS (
SELECT
Department,
SUM(Salary) AS Total_salary,
COUNT(Department) AS No_of_employees,
FROM
Employees
GROUP BY
Department
)
SELECT
Department,
No_of_employees,
Total_salary,
Total_salary / No_of_employees AS Avg_salary
FROM
Salary_total;
In this case, we create a CTE named Salary_total, which calculates the total salary and the number of employees in each department. The main query then uses the result of the CTE to calculate the average salary for each department by dividing the Total_salary by the No_of_employees.
This simplifies the query by separating the total salary and number of employees calculations from that of the average salary.
The output is as shown below:

We can also combine multiple CTEs using a comma.
For instance, suppose we want to view the employees posted at the main offices in Kenya and South Africa based on the previous table above, our combined CTEs query will be as shown;
WITH
Employees_from_kenya AS (
SELECT
First_name,
Gender,
Department,
FROM Employees
WHERE Country = "Kenya" AND Office = "Nairobi"),
Employees_from_sa AS (
SELECT
First_name,
Gender,
Department,
FROM Employees
WHERE Country = "South Africa" AND Office = "Johannesburg")
SELECT *
FROM Employees_from_kenya
UNION
SELECT *
FROM Employees_from_sa;
In this case, we have Employees_from_kenya which returns results for employees working in the “Nairobi“ office in Kenya, while Employees_from_sa returns results of employees working in the “Johannesburg” office in South Africa. The main query then combines these two result sets using the UNION operator into a single unified result containing all the employees working in the company’s offices in Kenya and South Africa.



