Skip to main content

Command Palette

Search for a command to run...

Optimizing SQL Queries

Published
4 min read
L

Passionate Software Developer with a strong enthusiasm for data, technology, and entrepreneurship to solve real-world problems. I enjoy building innovative digital solutions and currently exploring new advancements in data, and leveraging my skills to create impactful software solutions. Beyond coding, I have a keen interest in strategic thinking in business and meeting new people to exchange ideas and collaborate on exciting projects.

Sub-queries

Subquery, also know as nested query is a query embedded within another SQL statement such as SELECT, WHERE, FROM, JOIN, INSERT, UPDATE, or DELETE to enable retrieval of data from one or more tables based on the results of an inner query.

Even though they come at the cost of lower readability and lower performance, they offer alternative options to JOINS, functions, and window functions.

When there are subqueries present, the inner query is executed first, then the outer query is evaluated.

Subquery in SELECT

A subquery in the SELECT section of a query always returns a scalar value as shown below;

A correlated subquery is a type of SELECT query that uses values from the outer query. The inner query executes, referencing these value(s), and returns the result to the outer query, and this happens row by row.

Example;

For row1, Sample_name = Alpha and Analysis_type = ‘Basic’, The subquery then references Analysis_type, which is ‘Basic’ for this row. The subquery then executes, using ‘Basic’ as the condition to filter the Analysis_costs table on, and returns a single value of 50. Finally, Cost is updated with the subquery result, and the next row is evaluated.

Subquery in FROM/JOIN

Subqueries can be used in FROM and JOIN to create intermediate or derived tables, which can be queried again. This is useful when we want to use aggregated data along with column data.

Example: Let’s calculate the total cost of all samples analyzed. We can use the previous query where we calculated the cost and SUM() the column.

We are using the previous query result as derived table, which the outer query uses to sum with. If we use a subquery in FROM, we have to alias the derived table using AS.

SELECT
    SUM(Cost)
FROM
    (SELECT
        ws_out.Sample_name,
        ws_out.Analysis_type
        (SELECT
            Cost
        FROM
            Analysis_ocsts
        WHERE
            Analysis_type = ws_out.Analysis_type
            ) AS Cost
      FROM
        Water_sample AS ws_out) AS Total_cost;

Subqueries can add complexity to SQL statements often making them harder to understand. However, there are cases where they offer a more concise and efficient solution.

Subqueries in WHERE/HAVING

Subqueries can be used in WHERE and HAVING to enable customized or advanced filtering of results.

For instance; If we are to get records in the Water_samples table that have above average purity, we can write a query like this:

SELECT
    Sample_name,
    Purity,
    AVG(Purity)
FROM
    Water_samples
WHERE
    Purity > (SELECT
                  AVG(Purity))
              FROM
                  Water_samples

They can also return scalar values if used with comparison operators or can return a single column of values if used with the IN() operator.

Common Table Expressions

Also referred to as CTE, Common Table Expression is a query that exists within the context of a larger query and the result can be stored such that they can be referenced later by other queries.

CTEs simplify difficult queries by breaking them down into smaller more understandable components and temporarily storing the result for later reference by other queries.

A CTE is accessible within the larger query in which it is defined.

Below is the syntax of a CTE:

WITH
    CTE_name (Column_list) AS (
        Query
)
SELECT
    Column_list
FROM   
    CTE_name;

Role of CTE in code optimization

  1. Readability: CTEs simplify complex queries to smaller intuitive logical parts making them readable. They’re also comprehensible by giving descriptive name, making it easier to understand the context and purpose.

  2. Maintainability: CTEs help modularize queries making them easier for changes to be made.

  3. Reusability: Once a CTE has been defined, it can be reused as many times as possible.

Use cases of CTEs

Suppose we want to compare total salary with average salary in each department for the table below:

Our query will be like this:

WITH
    Salary_total AS (
        SELECT 
            Department,
            SUM(Salary) AS Total_salary,
            COUNT(Department) AS No_of_employees,
        FROM
            Employees
        GROUP BY
            Department
)
SELECT 
    Department,
    No_of_employees,
    Total_salary,
    Total_salary / No_of_employees AS Avg_salary
FROM
    Salary_total;

In this case, we create a CTE named Salary_total, which calculates the total salary and the number of employees in each department. The main query then uses the result of the CTE to calculate the average salary for each department by dividing the Total_salary by the No_of_employees.

This simplifies the query by separating the total salary and number of employees calculations from that of the average salary.

The output is as shown below:

We can also combine multiple CTEs using a comma.

For instance, suppose we want to view the employees posted at the main offices in Kenya and South Africa based on the previous table above, our combined CTEs query will be as shown;

WITH
    Employees_from_kenya AS (
        SELECT
            First_name,
            Gender,
            Department,
        FROM Employees
        WHERE Country = "Kenya" AND Office = "Nairobi"),
    Employees_from_sa AS (
        SELECT
            First_name,
            Gender,
            Department,
        FROM Employees
        WHERE Country = "South Africa" AND Office = "Johannesburg")
SELECT *
FROM Employees_from_kenya
UNION
SELECT *
FROM Employees_from_sa;

In this case, we have Employees_from_kenya which returns results for employees working in the “Nairobi“ office in Kenya, while Employees_from_sa returns results of employees working in the “Johannesburg” office in South Africa. The main query then combines these two result sets using the UNION operator into a single unified result containing all the employees working in the company’s offices in Kenya and South Africa.