Skip to content

Read Connected Data

Now that we have some data in both tables, let's select the data that is connected together.

The users table has this data:

We have both of the tables new_roots and users.

users

id email first last username
1 colrc@gmail.com Lawrence Nicodemus original
2 jdoe@gmail.com John Doe jdoe
3 jadoe@gmail.com Jane Doe jadoe
null rdoe@gmail.com Robert Doe rdoe

new_roots

id userId english nicodemus root
1 1 hello. a a
null 1 so. (lit. Is that so?), a? a
null 1 cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), aaaa...! a
2 2 hey. aye a
3 2 boots. buts bc
4 3 boots (to be wearing...). (lit. He is wearing boots) etsbutsbutsshn bc
5 3 boot. (lit. a borrowed root) sbutsbutsshn bc
6 1 rubber boots (putting on...). sbutsbutsshnmsh bc
7 1 bull. buuli bl
... ... ... ... ...

SELECT Connected Data with SQL

Let's start seeing how SQL works when selecting connected data. This is where SQL databases actually shine.

If you don't have a database.db file, run that previous program we had written (or copy it from the preview above) to create it.

Now open DB Browser for SQLite and open the database.db file.

To SELECT connected data we use the same keywords we have used before, but now we combine the two tables.

Let's get each root with the id, root, nicodemus, and the english gloss, along with the associated email for the user specified in the userId:

SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots, users
WHERE new_roots.userid = users.id;

Info

Because we have two columns called id, one for roots and one for users, we can specify them with the prefix of the table name and the dot to make it explicit what we refer to.

Notice that now in the WHERE part we are not comparing one column with a literal value (like new_roots.root = "a"), but we are comparing two columns.

It means, more or less:

Hey SQL database 👋, please go and SELECT some data for me.

I'll first tell you the columns I want:

Let's get each root with the id, root, nicodemus, and the english gloss, along with the associated email for the user specified in the userId: * id of the new_roots table * userId of the users table * root of the new_roots table * nicodemus of the new_roots table * english of the new_roots table * email of the users table

I want you to get that data FROM the tables new_roots and users.

And I don't want you to combine each root with each possible user. Instead, for each root word, go and check each possible user, but give me only the ones WHERE the roots.userid is the same as the users.id. Note that SQLite isn't case sensitive in terms of the column names, so either userid or userId will work.

If we execute that SQL, it will return this table, except for id=4:

id userId english nicodemus root email
1 1 hello. a a colrc@gmail.com
null 1 so. (lit. Is that so?), a? a colrc@gmail.com
null 1 cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), aaaa...! a colrc@gmail.com
2 2 hey. aye a jdoe@gmail.com
3 2 boots. buts bc jdoe@gmail.com
4 3 boots (to be wearing...). (lit. He is wearing boots) etsbutsbutsshn bc jadoe@gmail.com
... ... ... ... ... ...
11 4 bm bamp speeded (He...), tipsy (He became...). NULL
... ... ... ... ... ...

You can go ahead and try it in DB Browser for SQLite:

Note

Wait, what about bamp/speeded? 😱

He has an invalid userId of 4 in the database. And this SQL is comparing that 4 from the userId with all the id fields in the rows in the user table.

As there's no team with an id of 4, it doesn't find a match.

But we'll see how to fix that later with a LEFT JOIN from the users table to the new_roots table, or by applying constraints so that illegal data can't be put in columns that reference other columns, the PRIMARY KEY and FOREIGN KEY constraints.

JOIN Tables with SQL

There's an alternative syntax for that SQL query from above using the keyword JOIN instead of WHERE. Let's also add in userId so we can understand better why these two methods return different results.

This is the same version from above, using WHERE:

SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots, users
WHERE new_roots.userid = users.id;

And this is the alternative version using LEFT JOIN:

SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots
JOIN users 
ON new_roots.userid = users.id;

These SQL statements return equivalent results. The differences in the SQL code are that instead of passing the users to the FROM part (also called FROM clause) we add a JOIN (same as an INNER JOIN) and put the users table there.

And then, instead of putting a WHERE with a condition, we put an ON keyword with the condition, because ON is the one that comes with JOIN. 🤷

So, this second version means, more or less:

Hey SQL database 👋, please go and SELECT some data for me.

I'll first tell you the columns I want:

  • id of the new_roots table
  • userId of the users table
  • root of the new_roots table
  • nicodemus of the new_roots table
  • english of the new_roots table
  • email of the users table

...up to here it's the same as before, LOL.

Now, I want you to get that data starting FROM the table new_roots.

And to get the rest of the data, I want you to JOIN it with the table users.

And I want you to join those two tables ON the combinations of rows that have the new_roots.userid with the same value as the users.id.

Did I say all this before already? I feel like I'm just repeating myself. 🤔

That will return the same table than as before; in DB Browser for SQLite:

Tip

Why bother with all this?

This JOIN acts like the WHERE statement, where we only get rows that exist in both the users table and the new_roots table where the userId's match.

Soon we will see how to get ALL the rows from new_roots, even where there is no match.

JOIN Tables with SQL and LEFT (or LEFT OUTER)

When working with a JOIN, you can imagine that you start with a table on the FROM part and put that table in an imaginary space on the left side.

And then you want another table to JOIN the result.

And you put that second table in the right side on that imaginary space.

And then you tell the database ON which condition it should join those two tables and give you the results back.

But by default, only the rows from both left and right that match the condition will be returned.

table relationships

In this example of tables above 👆, it would NOT return all the new_roots, because not every new_root has a userId, so every new_root cannot be joined with the users table 😱:

id userId english nicodemus root email
1 1 hello. a a colrc@gmail.com
null 1 so. (lit. Is that so?), a? a colrc@gmail.com
null 1 cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), aaaa...! a colrc@gmail.com
2 2 hey. aye a jdoe@gmail.com
3 2 boots. buts bc jdoe@gmail.com
4 3 boots (to be wearing...). (lit. He is wearing boots) etsbutsbutsshn bc jadoe@gmail.com
... ... ... ... ... ...
11 4 bm bamp speeded (He...), tipsy (He became...). NULL
... ... ... ... ... ...

Foreign Keys with NULL

But in the database that we are working with in the code above, bamp doesn't have any user, the value of userId is 4 in the database. It will also not return any row where the value of userId is NULL. Any mismatches are discarded.

So there's no way to join the bamp row with some row in the users table:

table relationships

Include Everything on the LEFT OUTER

In this case, that we want to include all new_roots in the result even if they don't have a user, we can extend that same SQL using a JOIN from above and add a LEFT OUTER (or simply LEFT) right before JOIN:

SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots
LEFT JOIN users 
ON new_roots.userid = users.id;

This LEFT OUTER part tells the database that we want to keep everything on the first table, the one on the LEFT in the imaginary space, even if those rows would be left out, so we want it to include the OUTER rows too. In this case, every root word with or without a user.

And that would return the following result, including bamp 🎉:

id userId english nicodemus root email
1 1 hello. a a colrc@gmail.com
null 1 so. (lit. Is that so?), a? a colrc@gmail.com
null 1 cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), aaaa...! a colrc@gmail.com
2 2 hey. aye a jdoe@gmail.com
3 2 boots. buts bc jdoe@gmail.com
4 3 boots (to be wearing...). (lit. He is wearing boots) etsbutsbutsshn bc jadoe@gmail.com
... ... ... ... ... ...
11 🎉 4 bm bamp speeded (He...), tipsy (He became...). NULL
... ... ... ... ... ...

Tip

The only difference between this query and the previous is that extra LEFT OUTER.

And here's another of the SQL variations, you could write LEFT OUTER JOIN or just LEFT JOIN, it means the same.

Relationship Attributes

Here we have been using relationships in a somewhat ad-hoc manner, but in the next tutorial we will see how PRIMARY KEY and FOREIGN KEY constraints can enforce some data integrity rules and make our lives simpler. ✨

Let's learn more about primary and foreign keys🚀