
What's a Database? The Basics

If you're wondering what a database is (and if it's the same thing as a spreadsheet), you're exactly where you should be! 📍

The good news is that you're already familiar with complex databases in real life: 📚 libraries! 📚 We'll use this to wrap our heads around the concepts central to databases.

Keeping things organized

A library's content is the books (& other media, but we'll focus on books to keep things simple). But if all the books of a library were thrown in a massive pile, it would hardly be helpful to anyone! 😬

Organization is the key to making a library useful.

Let's imagine we're in charge of keeping a library organized. What information about each book do we need to keep track of to maintain an organized and useful library?

Jot down some ideas! Then see if we thought of the same things.

Here's what we thought of:

- title
- author
- genre
- publication date
- publisher
- description
- shelf/location
- language

If your list looks a bit different, don't worry! As long as you had something similar, you're on the right track.

Books in a database

If we put our books into a database with the information we listed, it might look something like the image below.

Tip

Take a look at the terminology, as we'll come back to those terms often.

At this point you might be thinking, "That looks a lot like a spreadsheet" and you'd be right. But hang tight! We'll introduce one way spreadsheets and databases differ next.
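If you'd like to see that idea as code, here is a minimal sketch of a books table written as plain Python data. The titles, names, and shelf codes are made up for illustration, and calling each entry a "record" (row) and each key a "field" (column) is just the common vocabulary, not necessarily the exact wording of every database tool.

```python
# Each dictionary is one record (row); each key is one field (column).
# All values below are made-up sample data.
books = [
    {"title": "Corkboards and You", "author": "Alex Pinner", "genre": "Crafts",
     "publication_year": 1998, "language": "English", "shelf": "C-12"},
    {"title": "Fjord Stories", "author": "Kari Nordmann", "genre": "Fiction",
     "publication_year": 1955, "language": "Norwegian", "shelf": "F-03"},
    {"title": "Shelving for Beginners", "author": "Sam Stack", "genre": "Reference",
     "publication_year": 2010, "language": "English", "shelf": "R-27"},
]

# Print the table one record at a time.
for record in books:
    print(record["title"], "|", record["author"], "|", record["shelf"])
```

So far this really is just a spreadsheet in disguise: one table, one row per book.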

Let's imagine...

One day someone comes to our library trying to find books written by Norwegian authors born between 1910 and 1930. It turns out having author information is also a part of a useful library! ✍️

What information about each author do we need to keep track of to maintain an organized and useful library?

Again, jot down your ideas! Then see if we thought of the same things.

Here's what we thought of:

- first name
- last name
- date of birth
- nationality
- biography
- books written

If you thought of even more, that's great!

But...

Does it really make sense to include author date of birth (and all the other details) in the book's information? 🤔 That would mean that for an author with 17 books, we would need to repeat their biographical information 17 times! (And it would be 17 chances for us to make a mistake that could throw a wrench 🔧 in our database!)
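To make the repetition problem concrete, here is a small sketch, assuming a single flat table where the author's details are copied into every book row. The author name and dates are invented sample data.

```python
# Hypothetical flat table: the author's details are copied into every book row.
flat_books = [
    {"title": f"Book {n}", "author": "Kari Nordmann",
     "author_birth_date": "1921-03-04", "author_nationality": "Norwegian"}
    for n in range(1, 18)  # 17 books by the same author
]

# One typo in one copy is enough to make the data disagree with itself.
flat_books[9]["author_birth_date"] = "1912-03-04"

birth_dates = {row["author_birth_date"] for row in flat_books}
print(len(flat_books))   # 17 copies of the author's details
print(len(birth_dates))  # 2 different birth dates for one author
```

With 17 copies, nothing tells us which birth date is the right one.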

There's got to be a better way...

A better way: separate tables!

What if we make one table for books and a separate table for authors? That way we only have to list each set of information once.

So we've solved the redundancy problem. Now we just need to find a way to make sure the book information and the author information can reference each other.

The catch: titles and names aren't unique

If we try to link an author's books via the titles, there's a problem if more than one book in our library shares a title. What if we have two books titled Corkboards and You? The same is true for authors who might share a name.
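Here is a tiny sketch of that catch, assuming an author record that points at its books by title alone. The data is made up.

```python
# Made-up data: two different books share a title.
books = [
    {"title": "Corkboards and You", "publication_year": 1998},
    {"title": "Corkboards and You", "publication_year": 2015},
]

# An author record that tries to point at its book by title alone.
author = {"name": "Alex Pinner", "books_written": ["Corkboards and You"]}

# Look up the author's books by title.
matches = [b for b in books if b["title"] in author["books_written"]]
print(len(matches))  # 2: we can't tell which of the two books is meant
```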

How can we fix this?

The solution: unique IDs!

If we give each book and each author a unique ID that's separate from their names, we don't have to worry about our database getting its wires crossed. 🔌

Info

The unique IDs could be integers (1, 2, 3, ...) or some other set of characters, whatever works best for our system.
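As a rough sketch of how unique IDs tie two tables together, the example below uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and sample rows are assumptions made for illustration, and `birth_year` stands in for a full date of birth to keep things short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database, gone when the program ends
conn.executescript("""
    CREATE TABLE authors (
        author_id   INTEGER PRIMARY KEY,  -- unique ID for each author
        first_name  TEXT,
        last_name   TEXT,
        birth_year  INTEGER,
        nationality TEXT
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,    -- unique ID for each book
        title     TEXT,
        author_id INTEGER REFERENCES authors(author_id)  -- points at one author
    );
""")

# Two authors share a name and two books share a title; the IDs keep them apart.
conn.executemany("INSERT INTO authors VALUES (?, ?, ?, ?, ?)", [
    (1, "Kari", "Nordmann", 1921, "Norwegian"),
    (2, "Alex", "Pinner", 1960, "Canadian"),
    (3, "Alex", "Pinner", 1985, "British"),
])
conn.executemany("INSERT INTO books VALUES (?, ?, ?)", [
    (1, "Fjord Stories", 1),
    (2, "Corkboards and You", 2),
    (3, "Corkboards and You", 3),
])

# Books by Norwegian authors born between 1910 and 1930.
rows = conn.execute("""
    SELECT books.title, authors.first_name, authors.last_name
    FROM books
    JOIN authors ON authors.author_id = books.author_id
    WHERE authors.nationality = 'Norwegian'
      AND authors.birth_year BETWEEN 1910 AND 1930
""").fetchall()
print(rows)  # [('Fjord Stories', 'Kari', 'Nordmann')]
```

Each author's details are stored once, and every book simply carries the ID of its author.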

Guess what? You've seen bookIDs before! 😁

Remember Dewey Decimal Numbers? Those work like unique IDs in real life! For our analogy, we'll treat each book as having its own unique Dewey Decimal Number. Libraries also use them to keep track of the books' locations in the library.

In an actual database, we can see our different tables and how they relate to each other with an entity relationship diagram (ERD). This is what our book and author tables would look like as an ERD.

Tip

Again, take a look at the terms used, as they'll be important for understanding the following lessons!

Our databases can have as many tables and relations as we need. In fact, a true library might have a database system that looks more like this:

The ability to have separate tables that all connect to each other is what makes databases so powerful!

Technically...

The collection of data (i.e., the books) is the database. The library as an institution is better labeled a database management system (DBMS). A DBMS allows us to interact with and use the data without accidentally messing it up (or deleting it!). 😱

How is a library a DBMS?

Content is managed:

- Books are assigned Dewey Decimal Numbers and shelved.
- Books are checked out, checked back in, and reshelved.
- The card catalog directs you to the correct books.

Content is protected:

- Experts assign the IDs to books.
- Experts shelve the books.
- Experts maintain the card catalog (index).
- No one can randomly remove a book from the library.
- Experts know how the library works and can guide you in using it.

Um, actually...

Libraries are relational DBMSs.

How is a library a relational DBMS?

A relational database is one where multiple tables can reference each other. We saw that happening already in our library analogy above, but now we're putting a label on it. 🏷️

In other words, the new points at the end of each list below are what make a library a relational database.

Content is managed:

- Books are assigned Dewey Decimal Numbers and shelved.
- Books are checked out, checked back in, and reshelved.
- The card catalog directs you to the correct books.
- Reference books for subject areas reference the book authors and titles, which in turn can be found in the card catalog, which in turn points to the books on the shelves.

Content is protected:

- Experts assign the IDs to books.
- Experts shelve the books.
- Experts maintain the card catalog (index).
- No one can randomly remove a book from the library.
- Experts know how the library works and can guide you in using it.
- If a card is added referencing a book, the book is immediately shelved.
- If a card is removed from the card catalog, the book is also immediately removed from the shelves.
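The last two points are close to what relational databases do with foreign key rules. Below is a loose sketch using Python's built-in `sqlite3` module; the `catalog_cards` and `shelved_books` tables are hypothetical names invented for this example. It doesn't shelve a book automatically when a card is added. Instead it shows two related guarantees: a shelved book can't exist without a card, and removing a card removes its shelved book too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
    CREATE TABLE catalog_cards (
        card_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL
    );
    CREATE TABLE shelved_books (
        shelf_slot TEXT PRIMARY KEY,
        card_id    INTEGER NOT NULL
                   REFERENCES catalog_cards(card_id) ON DELETE CASCADE
    );
""")

conn.execute("INSERT INTO catalog_cards VALUES (1, 'Corkboards and You')")
conn.execute("INSERT INTO shelved_books VALUES ('C-12', 1)")

# Shelving a book that has no card (card 99 doesn't exist) is rejected.
try:
    conn.execute("INSERT INTO shelved_books VALUES ('C-13', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Removing the card also removes the shelved book that depends on it.
conn.execute("DELETE FROM catalog_cards WHERE card_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM shelved_books").fetchone()[0]
print(rejected, remaining)  # True 0
```

In both cases the database itself enforces the rule, so we don't have to remember to keep the card catalog and the shelves in sync by hand.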

So now you know the basic concepts that databases are built on. You're ready to go deeper in the next lesson!