Jumping into the topic of the relational database, it is essential to have an idea what database means. What is the first thing that comes to your mind when you hear the word database?
For many people, this question is more challenging than it might seem at first. An answer like “a big file where a lot of information is stored” is not satisfactory and would not please potential employers. However, reading this article may help you get a better understanding of what a database actually is.
Types of Databases
You should remember there are two main types: relational database and non-relational database. The former will be the focus of our tutorials, while the latter regards more complex systems.
Although understanding non-relational databases requires a serious mathematical and programming background, some of the logic applied in its coding is the same as SQL. Likewise, relational databases have a few advantages on their own. A small bit of theory will explain why they are still the preferred choice in many companies and institutions.
Why make a Relational Database
Databases’ main goal is to organize huge amounts of data that can be quickly retrieved upon users’ request. Therefore, they must be compact, well-structured, and efficient in terms of speed of data extraction.
The Problem with Efficiency
Today, people need extra efficiency because data occupies memory space. Moreover, the bigger its size, the more sluggish the database is and the slower the retrieval process becomes. If we have a database containing a multi-million-row table, with many columns, then every time a request has been received, the server must load all the records, with all fields, and it would take too much time for a task to be completed.
Don’t forget every symbol is a container of information and requires bytes of storage space. Hence, loading that much data will not be an easy job for the computer.
So, what allows us to contain so much data on the server and yet lets us efficiently use only the portions we need for our analysis? The secret lies behind the use of mathematical logic originating from relational algebra. Please, don’t worry – we will not bother you with math.
Imagine each table with data is represented by a transparent circle that contains all the data values of the table, categorized by columns or, as we will often call them, fields.
The Bad Way to Approach it
Now, if our database consists of only one table, a giant circle would represent the entire database -something like this huge table. As you can see in the picture below, it consists of names of customers, their id, their email, the number of the purchase, the date, and many other fields.
And when we need a piece of information from the database, for example, if we wish to see who has bought something on a certain date, we will have to lift this whole big circle and then search for what we need. This challenge seems vague and the process of data extraction will not be efficient.
The Good Way to Approach it
Let’s see what will happen if we split the circle into 3 smaller circles. One circle will stand for the “Sales” table, the other for “Customers”, and the last one for “Items”.
There are various theoretical combinations between 3 or more circles, but in our database, we have the following model. “Sales” and “Customers” have the same customer ID column, and “Sales” and “Items” have the same item code column.
This way, we can see the circles overlap as they have common fields.
How it Works
So, if we’d like to extract the same information, the names of the customers who have purchased something on a given date, we will only need the date of purchase column from the “Sales” table and the first and last name from the “Customers” table.
So, to satisfy this request, we will not need to lift the third circle from our database, “Items”. This way, we can save energy or, more technically, increase efficiency. Less data, represented by only two of the three circles, will be involved in this operation.
The Mathematical Trick
The mathematical trick lies in relating the tables to one another. Relationships were formed namely through the common fields.
That’s why some professionals may refer to the tables, or the circles in our plot, as relations because theoretically, they are the smallest units in the entire system that can carry integral logical meaning. Likewise, the three circles are all part of the same database.
So, when we combine the database and its existing relations, we obtain the famous term relational database management system, frequently abbreviated as RDBMS.
The Benefits of Using a Relational Database
In summary, to be a good programmer, you should fully embrace the idea of working with relational databases. The tool which will allow you to do that is SQL – a language that was designed to manage relational database management systems by creating relations between the different tables in a database.
So, if you want to continue learning new concepts, feel free to jump into the tutorial where we outline the most substantial differences between a database and a spreadsheet.
Next Tutorial: Databases Vs Spreadsheets