Relational Database Essentials

Martin Ganchev 25 Apr 2023 7 min read

What is a Database?

For many people, this question is more challenging than it might seem at first.

An answer like “a big file where a lot of information is stored” is not satisfactory and would not please potential employers. However, reading this article may help you get a better understanding of what a database actually is.


How Many Types of Databases are there?

You should remember there are two main types: relational databases and non-relational databases. The former will be the focus of our tutorials, while the latter concern more complex systems.

relational databases and non-relational databases

Although working with non-relational databases can call for a stronger mathematical and programming background, some of the logic applied in querying them is the same as in SQL. Relational databases, in turn, have a few advantages of their own. A small bit of theory will explain why they are still the preferred choice in many companies and institutions.

Author's note: To find out how relational databases can help you become a workplace superhero, check out Actiondesk's super cool article.

Why make a Relational Database?

Databases’ main goal is to organize huge amounts of data that can be quickly retrieved upon users’ request. Therefore, they must be compact, well-structured, and efficient in terms of speed of data extraction.

Relational databases are compact well structured and efficient

How do Relational Databases Solve the Problem with Efficiency?

Efficiency matters because data occupies memory and storage space. The bigger a table is, the more sluggish the database becomes and the slower retrieval gets. If a database keeps everything in one multi-million-row table with many columns, then, in the worst case, every request forces the server to read all the records with all their fields, and the task takes too long to complete.

relational bases are connected to each other

Don’t forget that every character is a piece of information and requires bytes of storage space. Hence, loading that much data will not be an easy job for the computer.

The Solution

So, what allows us to keep so much data on the server and yet efficiently use only the portions we need for our analysis? The secret lies in the use of mathematical logic originating from relational algebra. Don’t worry, we will not bother you with math.

Relational algebra is the key

Imagine each table with data is represented by a transparent circle that contains all the data values of the table, categorized by columns or, as we will often call them, fields.

Fields contained within a circle

The Bad Way to Approach it

Now, if our database consists of only one table, a giant circle would represent the entire database, something like this huge table. As you can see in the picture below, it consists of the customers’ names, their IDs, their emails, the number of the purchase, the date, and many other fields.

A large table doesn't fit in the circle

And when we need a piece of information from the database, for example, if we wish to see who has bought something on a certain date, we will have to lift this whole big circle and then search for what we need. This is cumbersome, and the process of data extraction will not be efficient.

trying to find information in the table becomes a challenge
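To make the single-table approach concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `all_data`, its columns, and the sample rows are assumptions made up for illustration; they are not taken from the picture above. The sketch shows two things: customer details get repeated on every purchase row, and a question about one date still has to be asked of the one wide table that holds everything.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical single wide table that mixes purchase, customer, and item data.
conn.execute("""
    CREATE TABLE all_data (
        purchase_number  INTEGER PRIMARY KEY,
        date_of_purchase TEXT,
        customer_id      INTEGER,
        first_name       TEXT,
        last_name        TEXT,
        email            TEXT,
        item_code        TEXT,
        item_name        TEXT
    )
""")

# Made-up sample rows: Ann appears twice, so her details are stored twice.
rows = [
    (1, "2023-04-24", 1, "Ann", "Lee", "ann@example.com", "A1", "Lamp"),
    (2, "2023-04-25", 1, "Ann", "Lee", "ann@example.com", "B2", "Desk"),
    (3, "2023-04-25", 2, "Bob", "Ray", "bob@example.com", "A1", "Lamp"),
]
conn.executemany("INSERT INTO all_data VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Count how many times the same customer's email is stored.
ann_copies = conn.execute(
    "SELECT COUNT(*) FROM all_data WHERE email = 'ann@example.com'"
).fetchone()[0]

# Who bought something on a given date? The only place to look is the wide table.
buyers = conn.execute(
    "SELECT first_name, last_name FROM all_data "
    "WHERE date_of_purchase = ? ORDER BY purchase_number",
    ("2023-04-25",),
).fetchall()

print(ann_copies)
print(buyers)
```

With only three rows the cost is invisible, but the same repetition applies to every purchase in a multi-million-row table.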

The Good Way to Approach it

Let’s see what will happen if we split the circle into 3 smaller circles. One circle will stand for the “Sales” table, the other for “Customers”, and the last one for “Items”.

splitting the circle into three - items, customers and sales

There are various theoretical combinations between 3 or more circles, but in our database, we have the following model. “Sales” and “Customers” have the same customer ID column, and “Sales” and “Items” have the same item code column.

The circles are connected by specific information such as customer ID or item code

This way, we can see the circles overlap as they have common fields.
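The three-circle model could be sketched as a schema like the one below, again with Python's built-in `sqlite3` module. The column names (`customer_id`, `item_code`, `purchase_number`, and so on) are assumed for illustration; the article only states that “Sales” shares a customer ID column with “Customers” and an item code column with “Items”. The small helper at the end checks which fields each pair of tables has in common.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical three-table schema; column names are illustrative assumptions.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        first_name  TEXT,
        last_name   TEXT,
        email       TEXT
    );
    CREATE TABLE items (
        item_code TEXT PRIMARY KEY,
        item_name TEXT
    );
    CREATE TABLE sales (
        purchase_number  INTEGER PRIMARY KEY,
        date_of_purchase TEXT,
        customer_id      INTEGER REFERENCES customers(customer_id),
        item_code        TEXT REFERENCES items(item_code)
    );
""")


def columns(table):
    """Return the set of column names of a table."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


# The "overlap" between circles is the set of fields two tables share.
shared_with_customers = columns("sales") & columns("customers")
shared_with_items = columns("sales") & columns("items")

print(shared_with_customers)
print(shared_with_items)
```

Each customer and each item is now described once, and “Sales” only stores the identifiers that point to them.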

How does an RDBMS work?

So, if we’d like to extract the same information, the names of the customers who have purchased something on a given date, we will only need the date of purchase column from the “Sales” table and the first and last name from the “Customers” table.

For certain requests you will only need particular circles

So, to satisfy this request, we will not need to lift the third circle from our database, “Items”. This way, we can save energy or, more technically, increase efficiency. Less data, represented by only two of the three circles, will be involved in this operation.
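The request described above might look like the following sketch, which uses Python's built-in `sqlite3` module. Table names, column names, and sample rows are assumptions for illustration. Notice that the query names only `sales` and `customers`; the `items` table exists in the database but is never mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample data, made up for this example.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        first_name  TEXT,
        last_name   TEXT
    );
    CREATE TABLE items (
        item_code TEXT PRIMARY KEY,
        item_name TEXT
    );
    CREATE TABLE sales (
        purchase_number  INTEGER PRIMARY KEY,
        date_of_purchase TEXT,
        customer_id      INTEGER,
        item_code        TEXT
    );
    INSERT INTO customers VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO items VALUES ('A1', 'Lamp'), ('B2', 'Desk');
    INSERT INTO sales VALUES
        (1, '2023-04-24', 1, 'A1'),
        (2, '2023-04-25', 2, 'B2');
""")

# Names of customers who purchased something on a given date.
# Only two of the three tables take part, linked by the shared customer_id.
query = """
    SELECT c.first_name, c.last_name
    FROM sales AS s
    JOIN customers AS c ON c.customer_id = s.customer_id
    WHERE s.date_of_purchase = ?
"""
buyers = conn.execute(query, ("2023-04-25",)).fetchall()

print(buyers)  # [('Bob', 'Ray')]
```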

The Mathematical Trick

The mathematical trick lies in relating the tables to one another. The relationships are formed through the common fields.

Relationships are formed through common fields
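For readers curious about what relating tables through common fields means mechanically, here is a small pure-Python sketch of three relational algebra operations: selection, projection, and natural join. It is a teaching illustration under simplifying assumptions (rows are plain dictionaries, sample data is made up, and duplicate rows are not removed as strict relational algebra would require), not a description of how any particular database engine works internally.

```python
# Hypothetical sample relations: each row is a dict keyed by field name.
customers = [
    {"customer_id": 1, "first_name": "Ann", "last_name": "Lee"},
    {"customer_id": 2, "first_name": "Bob", "last_name": "Ray"},
]
sales = [
    {"purchase_number": 10, "customer_id": 1, "date_of_purchase": "2023-04-25"},
    {"purchase_number": 11, "customer_id": 2, "date_of_purchase": "2023-04-26"},
]


def select(relation, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, fields):
    """Projection: keep only the named fields of every row."""
    return [{field: row[field] for field in fields} for row in relation]


def natural_join(left, right):
    """Natural join: pair up rows that agree on every shared field name."""
    joined = []
    for left_row in left:
        for right_row in right:
            shared = left_row.keys() & right_row.keys()
            if all(left_row[key] == right_row[key] for key in shared):
                joined.append({**left_row, **right_row})
    return joined


# Sales on one date, joined to customers via customer_id, names only.
on_date = select(sales, lambda row: row["date_of_purchase"] == "2023-04-25")
buyers = project(natural_join(on_date, customers), ["first_name", "last_name"])

print(buyers)  # [{'first_name': 'Ann', 'last_name': 'Lee'}]
```

The only field the two sample relations share is `customer_id`, so that is the field the join matches on.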

That’s why some professionals may refer to the tables, or the circles in our plot, as relations: theoretically, they are the smallest units in the entire system that can carry integral logical meaning. And the three circles are all part of the same database.

The three circles are all part of the same database

So, when we combine the database and its existing relations, we obtain a relational database. The software used to manage it is known by the famous term relational database management system, frequently abbreviated as RDBMS.

RDBMS relational database management system

What Are the Benefits of Using a Relational Database?

In summary, to be a good programmer, you should fully embrace the idea of working with relational databases. The tool that will allow you to do that is SQL, a language designed for working with relational database management systems: defining tables, relating them to one another, and retrieving the data they hold.

So, if you want to continue learning new concepts, feel free to jump into the tutorial where we outline the most substantial differences between a database and a spreadsheet.


Martin Ganchev

Instructor at 365 Data Science

Martin holds an MSc degree in Economic and Social Sciences from Bocconi University. His diverse academic and research experience combined with his friendly and explanatory approach to teaching have made him one of the most beloved instructors on our team. Some of the courses he has authored include: SQL, SQL + Tableau, SQL+Tableau+Python, Introduction to Python, Introduction to Jupyter, to name a few.
