SQL: Structured Query Language

SQL (Structured Query Language) is the standard language used to interact with relational databases. It allows users to retrieve, manipulate, and manage data through various commands and clauses. SQL operates on the foundation of first-order predicate logic, making it a powerful tool for querying data stored in relational tables. SQL is not a general-purpose programming language like Python. Strictly speaking, SQL is a declarative, domain-specific language designed to handle data retrieval and manipulation in relational databases rather than a full-fledged programming language. In this way, SQL is the de facto standard for querying and manipulating data in a Relational Database and for filtering data in GIS like QGIS.

Example Tables

We’ll use the following tables for the examples throughout this note. These tables represent data relevant to geography and planning in Danish towns.

Towns Table

TownIDTownNamePopulationArea (sq km)
1Copenhagen79412888.25
2Aarhus28527391.00
3Odense180760304.34
4Aalborg120194139.95
5Esbjerg71683221.00

LandUsePlans Table

PlanIDTownIDPlanTypeAreaCovered (sq km)
1011Residential40.00
1022Commercial20.00
1033Industrial60.00
1044Residential30.00
1055Commercial100.00

Queries and Results

1. Basic Predicate with WHERE

Let’s start by selecting towns with a population greater than 150,000:

SELECT TownName, Population
FROM Towns
WHERE Population > 150000;

Result:

TownNamePopulation
Copenhagen794128
Aarhus285273
Odense180760

2. Combining Predicates with AND

Next, we’ll select towns with a population greater than 150,000 and an area of less than 100 sq km:

SELECT TownName, Population, Area
FROM Towns
WHERE Population > 150000 
AND Area < 100;

Result:

TownNamePopulationArea (sq km)
Copenhagen79412888.25
Aarhus28527391.00

3. Combining Predicates with OR

Now, we’ll select towns that either have a population greater than 150,000 or an area greater than 100 sq km:

SELECT TownName, Population, Area
FROM Towns
WHERE Population > 150000 
OR Area > 100;

Result:

TownNamePopulationArea (sq km)
Copenhagen79412888.25
Aarhus28527391.00
Odense180760304.34
Aalborg120194139.95
Esbjerg71683221.00

4. Using AND and OR with Parentheses

We can combine AND and OR to create more complex conditions. For example, select towns with a population greater than 150,000 and either a residential or industrial land use plan:

SELECT Towns.TownName, Towns.Population, LandUsePlans.PlanType
FROM Towns
JOIN LandUsePlans ON Towns.TownID = LandUsePlans.TownID
WHERE Population > 150000 
AND (PlanType = 'Residential' OR PlanType = 'Industrial');

Result:

TownNamePopulationPlanType
Odense180760Industrial

5. Spatial Query Example

Let’s incorporate spatial data. Suppose we want to find towns where the area covered by land-use plans is greater than 50 sq km. This could be particularly useful in urban planning where large-scale developments are planned.

SELECT Towns.TownName, LandUsePlans.PlanType, LandUsePlans.AreaCovered
FROM Towns
JOIN LandUsePlans ON Towns.TownID = LandUsePlans.TownID
WHERE LandUsePlans.AreaCovered > 50;

Result:

TownNamePlanTypeAreaCovered (sq km)
OdenseIndustrial60.00
EsbjergCommercial100.00

6. Joining Tables: Combining Towns and LandUsePlans

Let’s join the Towns and LandUsePlans tables to show the names of the towns along with their corresponding land use plans. We’ll use the TownID field, which is the foreign key in the LandUsePlans table, to join it with the Towns table.

SELECT Towns.TownName, Towns.Population, LandUsePlans.PlanType, LandUsePlans.AreaCovered
FROM Towns
JOIN LandUsePlans ON Towns.TownID = LandUsePlans.TownID;

Result:

TownNamePopulationPlanTypeAreaCovered (sq km)
Copenhagen794128Residential40.00
Aarhus285273Commercial20.00
Odense180760Industrial60.00
Aalborg120194Residential30.00
Esbjerg71683Commercial100.00

7. Joining Tables with Filtering

We can also combine the join with a WHERE clause to filter the results. For example, let’s find towns with residential land-use plans:

SELECT Towns.TownName, Towns.Population, LandUsePlans.PlanType, LandUsePlans.AreaCovered
FROM Towns
JOIN LandUsePlans ON Towns.TownID = LandUsePlans.TownID
WHERE LandUsePlans.PlanType = 'Residential';

Result:

TownNamePopulationPlanTypeAreaCovered (sq km)
Copenhagen794128Residential40.00
Aalborg120194Residential30.00

Variants of SQL

There are especially two variants of SQL you come across when working with geospatial data, namely the simplified version used to filter a feature class based on some attributes and the addition of spatial operators found in spatial SQL.

Simplified SQL for filtering in desktop GIS

In many desktop GIS applications like QGIS, SQL is often used to filter features based on attributes. For example, to filter out all towns where the population is greater than 100,000 for further analysis in QGIS:

SELECT TownName, Population
FROM Towns
WHERE Population > 100000;

Result:

TownNamePopulation
Copenhagen794128
Aarhus285273
Odense180760
Aalborg120194

Here’s an additional example where the two tables, Towns and LandUsePlans, are joined. This will show how a JOIN works to combine data from related tables based on a common key.

Spatial SQL

While this note introduces SQL and its use in relational and spatial contexts, a separate note will cover the specifics of how SQL can be extended to support geospatial data management, including OGC Simple Features and their implementation in PostGIS, which enables more advanced spatial queries and operations.