SQL tables getting joined together

Why is SQL weird?

10 May 2022

If you make a search for "SQL” on Seek, you’ll find thousands of jobs. IBM developed SQL in the early 70s and won the war of relational database management languages. There are millions of SQL databases running in data centres around the world. It’s an important tool to learn.

Moving logic out of the application layer and into SQL can improve performance and make systems easier to understand. SQL gets used by business analysts, data scientists and other non-developers to access vaults of information that would otherwise be hidden. Some trendy Node.js apps avoid SQL by using MongoDB, but even the node ecosystem sometimes has to choose boring technology.

SQL is a weird language if you’re used to C-like languages. For a many of us it will be one of the first domain-specific languages we learn. It’s completely different in syntax, scope, and mindset from imperative and object-oriented programming languages. In some ways it feels more similar to functional programming, as you need to change your thinking from step-by-step to declarative.

SQL Order of Operations #

The first thing I found strange in SQL was comparing the order of operations to the order you write an instruction with. In a C-like language, we’re used to things going from top to bottom: A series of instructions ending in semicolons that start at the top and finish at the bottom. Within lines, orders of operations can get a little more complicated, however, I’ve always found C-like languages intuitive. A lot of the order matches intermediate-school algebra we’re mostly familiar with.

But in SQL, the order of operations are more idiosyncratic. Look at this query:

SELECT [name],[breed],[age] FROM [dbo].[dogs] WHERE [age] > 5 ORDER BY [name];

SQL operations follow this order:

  1. FROM
  2. JOIN
  3. WHERE
  4. GROUP BY
  5. HAVING
  6. SELECT
  7. ORDER BY

This means in the previous query, operation starts from the middle of the line (FROM), then moves right to WHERE, then jumps back to the start of the line with SELECT then bounces back over to the end of the line with ORDER BY.

The other strange thing is that you can’t change the order you write instructions. You can't start with order by then write FROM. You have to follow the correct order to write the instructions, and it’s a different order from the execution order?

I find this quite stressful to keep in my head, but an SQL professional would probably argue that it’s a human-friendly way to phrase the language. "Plus!" They scream from the rooftops. "It’s declarative! You shouldn’t have to worry about the order of operations anyway! Let the engine take care of it!"

The Dunning Kruger query language #

Learning introductory SQL is easy. There are a lot of introductory courses that will tell you the simple stuff: SQL databases are made up of tables, which have typed columns and hold a number of rows in which you store data, and you can hook up different tables up with foreign keys and joins to define relationships.

But as soon as you get into intermediate SQL the difficulty curve steepens. This programming language was made to give you Dunning Kruger syndrome. “I know how to create a table with the columns I need, and do CRUD operations on it!” you’ll excitedly say.

But then you find out about about Views, Subqueries, Stored Procedures, indexing, coalescing, user-defined functions, not to mention the biggest topic of them all: performance. You fall off a complexity cliff into an “I know nothing” zone. This might happen in a lot of languages, but most resources seem to tell you "well done, you know SQL well enough to be a developer" after teaching a few surface-level concepts.

Meta-SQL #

The thing about SQL is you do everything with SQL. There are a lot of different GUI apps that allow you to look into a database’s structure and data without running SQL queries. But the more you use these apps to gather information about your database, the more you realise that you’re doing it wrong! What’s the best way to get a list of all the table names? Don’t try and use your GUI client. Instead, run this query:

SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES;

What’s the best way of searching for a view you made? Don’t use the GUI! Run this:

SELECT * FROM sys.views WHERE name like '%MyView%';

What’s the best way to see how many records are in a table? It’s this:

SELECT count(*) FROM [dbo].[dogs];

Okay maybe that’s not the best way to find the number of records in a table - but it’s good for most cases.

SQL isn't only for accessing, deleting, and changing data. It's also for describing data and building databases. And sometimes it feels weird to do both of these using the same quote unquote “domain specific language”, as they're actually quite different tasks.

You can set database table names based on data in another table. You can store SQL configuration variables in the same database that you're configuring. You can use SQL to drive other SQL. That's weird.

Injection #

One of my first blog posts on here was about how to avoid SQL injection using PHP data objects. We would never have needed this sort of thing if we didn’t have to mix SQL into other programming languages. The fact that developers use strings to build out executable queries puts us all in danger, as the Computerphile YouTube channel shows us in this vid.

Who knows how many apps are running right now with an opening for us to type drop table [dbo].[users]; and watch the world burn. Who knows how many passwords have been claimed. Well, Hold Security made an estimate that at least 1.2 billion unique email-password pairs had been stolen using SQL injection. That was 7 years ago now, so hopefully everyone’s moved on to MongoDB since then. Problem solved.

Part of the reason SQL feels weird is that we're always calling it from another language. Plus we're always scared of the potential repercussions of doing this incorrectly.

Maybe I'm the weird one #

I haven't spent as much time with SQL as I have with HTML, JavaScript, CSS, PHP, or Java. This could be the real reason I'm asking why it's weird. Not because of any flaw with the language, but with me.

There you go. The real domain-specific-database-management-language was the friends we made along the way.

Back to blog