Read Connected Data¶
Now that we have some data in both tables, let's select the data that is connected together.
The users table has this data:
We have both of the tables new_roots and users.
users
id | first | last | username | |
---|---|---|---|---|
1 | colrc@gmail.com | Lawrence | Nicodemus | original |
2 | jdoe@gmail.com | John | Doe | jdoe |
3 | jadoe@gmail.com | Jane | Doe | jadoe |
null | rdoe@gmail.com | Robert | Doe | rdoe |
new_roots
id | userId | english | nicodemus | root |
---|---|---|---|---|
1 | 1 | hello. | a | a |
null | 1 | so. (lit. Is that so?), | a? | a |
null | 1 | cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), | aaaa...! | a |
2 | 2 | hey. | aye | a |
3 | 2 | boots. | buts | bc |
4 | 3 | boots (to be wearing...). (lit. He is wearing boots) | etsbutsbutsshn | bc |
5 | 3 | boot. (lit. a borrowed root) | sbutsbutsshn | bc |
6 | 1 | rubber boots (putting on...). | sbutsbutsshnmsh | bc |
7 | 1 | bull. | buuli | bl |
... | ... | ... | ... | ... |
SELECT
Connected Data with SQL¶
Let's start seeing how SQL works when selecting connected data. This is where SQL databases actually shine.
If you don't have a database.db file, run that previous program we had written (or copy it from the preview above) to create it.
Now open DB Browser for SQLite and open the database.db file.
To SELECT
connected data we use the same keywords we have used before, but now we combine the two tables.
Let's get each root with the id, root, nicodemus, and the english gloss, along with the associated email for the user specified in the userId:
SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots, users
WHERE new_roots.userid = users.id;
Info
Because we have two columns called id, one for roots and one for users, we can specify them with the prefix of the table name and the dot to make it explicit what we refer to.
Notice that now in the WHERE
part we are not comparing one column with a literal value (like new_roots.root = "a"), but we are comparing two columns.
It means, more or less:
Hey SQL database 👋, please go and
SELECT
some data for me.I'll first tell you the columns I want:
Let's get each root with the id, root, nicodemus, and the english gloss, along with the associated email for the user specified in the userId: *
id
of thenew_roots
table *userId
of theusers
table *root
of thenew_roots
table *nicodemus
of thenew_roots
table *english
of thenew_roots
table *users
tableI want you to get that data
FROM
the tablesnew_roots
andusers
.And I don't want you to combine each root with each possible user. Instead, for each root word, go and check each possible user, but give me only the ones
WHERE
the roots.userid is the same as the users.id. Note that SQLite isn't case sensitive in terms of the column names, so either userid or userId will work.
If we execute that SQL, it will return this table, except for id=4:
id | userId | english | nicodemus | root | |
---|---|---|---|---|---|
1 | 1 | hello. | a | a | colrc@gmail.com |
null | 1 | so. (lit. Is that so?), | a? | a | colrc@gmail.com |
null | 1 | cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), | aaaa...! | a | colrc@gmail.com |
2 | 2 | hey. | aye | a | jdoe@gmail.com |
3 | 2 | boots. | buts | bc | jdoe@gmail.com |
4 | 3 | boots (to be wearing...). (lit. He is wearing boots) | etsbutsbutsshn | bc | jadoe@gmail.com |
... | ... | ... | ... | ... | ... |
11 | 4 | bm | bamp | speeded (He...), tipsy (He became...). | NULL |
... | ... | ... | ... | ... | ... |
You can go ahead and try it in DB Browser for SQLite:
Note
Wait, what about bamp/speeded? 😱
He has an invalid userId of 4 in the database. And this SQL is comparing that 4 from the userId with all the id fields in the rows in the user table.
As there's no team with an id of 4, it doesn't find a match.
But we'll see how to fix that later with a LEFT JOIN
from the users table to the new_roots table, or by applying constraints so that illegal data can't be put in columns that reference other columns, the PRIMARY KEY
and FOREIGN KEY
constraints.
JOIN
Tables with SQL¶
There's an alternative syntax for that SQL query from above using the keyword JOIN
instead of WHERE
. Let's also add in userId so we can understand better why these two methods return different results.
This is the same version from above, using WHERE
:
SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots, users
WHERE new_roots.userid = users.id;
And this is the alternative version using LEFT JOIN
:
SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots
JOIN users
ON new_roots.userid = users.id;
These SQL statements return equivalent results. The differences in the SQL code are that instead of passing the users to the FROM
part (also called FROM
clause) we add a JOIN
(same as an INNER JOIN
) and put the users table there.
And then, instead of putting a WHERE
with a condition, we put an ON
keyword with the condition, because ON
is the one that comes with JOIN
. 🤷
So, this second version means, more or less:
Hey SQL database 👋, please go and
SELECT
some data for me.I'll first tell you the columns I want:
id
of thenew_roots
tableuserId
of theusers
tableroot
of thenew_roots
tablenicodemus
of thenew_roots
tableenglish
of thenew_roots
tableusers
table...up to here it's the same as before, LOL.
Now, I want you to get that data starting
FROM
the table new_roots.And to get the rest of the data, I want you to
JOIN
it with the table users.And I want you to join those two tables
ON
the combinations of rows that have the new_roots.userid with the same value as the users.id.Did I say all this before already? I feel like I'm just repeating myself. 🤔
That will return the same table than as before; in DB Browser for SQLite:
Tip
Why bother with all this?
This JOIN
acts like the WHERE
statement, where we only get rows that exist in both the users table and the new_roots table where the userId's match.
Soon we will see how to get ALL the rows from new_roots, even where there is no match.
JOIN
Tables with SQL and LEFT
(or LEFT OUTER
)¶
When working with a JOIN
, you can imagine that you start with a table on the FROM
part and put that table in an imaginary space on the left side.
And then you want another table to JOIN
the result.
And you put that second table in the right side on that imaginary space.
And then you tell the database ON
which condition it should join those two tables and give you the results back.
But by default, only the rows from both left and right that match the condition will be returned.
In this example of tables above 👆, it would NOT return all the new_roots, because not every new_root has a userId, so every new_root cannot be joined with the users table 😱:
id | userId | english | nicodemus | root | |
---|---|---|---|---|---|
1 | 1 | hello. | a | a | colrc@gmail.com |
null | 1 | so. (lit. Is that so?), | a? | a | colrc@gmail.com |
null | 1 | cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), | aaaa...! | a | colrc@gmail.com |
2 | 2 | hey. | aye | a | jdoe@gmail.com |
3 | 2 | boots. | buts | bc | jdoe@gmail.com |
4 | 3 | boots (to be wearing...). (lit. He is wearing boots) | etsbutsbutsshn | bc | jadoe@gmail.com |
... | ... | ... | ... | ... | ... |
11 | 4 | bm | bamp | speeded (He...), tipsy (He became...). | NULL |
... | ... | ... | ... | ... | ... |
Foreign Keys with NULL
¶
But in the database that we are working with in the code above, bamp doesn't have any user, the value of userId is 4 in the database. It will also not return any row where the value of userId is NULL
. Any mismatches are discarded.
So there's no way to join the bamp row with some row in the users table:
Include Everything on the LEFT OUTER
¶
In this case, that we want to include all new_roots in the result even if they don't have a user, we can extend that same SQL using a JOIN
from above and add a LEFT OUTER
(or simply LEFT
) right before JOIN
:
SELECT new_roots.id, userId, root, nicodemus, english, email
FROM new_roots
LEFT JOIN users
ON new_roots.userid = users.id;
This LEFT OUTER
part tells the database that we want to keep everything on the first table, the one on the LEFT
in the imaginary space, even if those rows would be left out, so we want it to include the OUTER
rows too. In this case, every root word with or without a user.
And that would return the following result, including bamp 🎉:
id | userId | english | nicodemus | root | |
---|---|---|---|---|---|
1 | 1 | hello. | a | a | colrc@gmail.com |
null | 1 | so. (lit. Is that so?), | a? | a | colrc@gmail.com |
null | 1 | cut out, knock off!, quit, stop. (lit. Cut it out!, Knock it off, quit it, Stop it!), | aaaa...! | a | colrc@gmail.com |
2 | 2 | hey. | aye | a | jdoe@gmail.com |
3 | 2 | boots. | buts | bc | jdoe@gmail.com |
4 | 3 | boots (to be wearing...). (lit. He is wearing boots) | etsbutsbutsshn | bc | jadoe@gmail.com |
... | ... | ... | ... | ... | ... |
11 🎉 | 4 | bm | bamp | speeded (He...), tipsy (He became...). | NULL |
... | ... | ... | ... | ... | ... |
Tip
The only difference between this query and the previous is that extra LEFT OUTER
.
And here's another of the SQL variations, you could write LEFT OUTER JOIN
or just LEFT JOIN
, it means the same.
Relationship Attributes¶
Here we have been using relationships in a somewhat ad-hoc manner, but in the next tutorial we will see how PRIMARY KEY
and FOREIGN KEY
constraints can enforce some data integrity rules and make our lives simpler. ✨
Let's learn more about primary and foreign keys🚀