Today, organizations increasingly rely on information gathering and insights to further their business growth, make strategic business decisions, and bring in a wider set of audiences. In other words, they need data. But where does it come from? It might seem like the term data science only became trendy in the last decade. However, the history of data is much longer than that. Although it looked different from what we know now, data has existed for millennia, and people have spent centuries inventing new ways to benefit from it.
In this explorative article, we’ll take a time machine back to 19,000 BC, the date of the earliest known evidence of data collection. Next, we’ll fast-forward to the 1600s and the first known concept of data. Then, we’ll visit the 1800s and 1900s to track how data evolved over time to provide different services and solutions. Back to the future, we’ll briefly touch upon the modern-day applications of data analysis and data science, and what role they play in different industries.
Table of Contents
- What is Data?
- The History of Data Timeline
- Understanding Data Analysis
- Understanding Data Science
- Real-life Data Use Cases
- History of Data: Next Steps
What is Data?
The word “data” derives from the Latin word “datum” (singular), which means the “thing given”. If you look at the Merriam-Webster definition, you might find it easier to understand the meaning.
Technically speaking, data is any information that has been translated into different forms to be processed, analyzed, managed, and transferred. The word itself is an enormous umbrella term for many concepts and scientific branches, such as statistics and mathematics. There are different data types according to their classification; information can be numerical and textual, or even in audio, video, and image form. The scope is big and continues to evolve as our environment changes around us.
The History of Data Timeline
While data is all around us, it’s important to understand where it comes from. What were its origins and how did it come to be as we know it now?
We might not have access to a working time machine like the Doctor’s TARDIS, but we can still take a trip along the data timeline. Allons-y!
History of Data in 19,000 BC: The Great Baboon
The first use of data goes back to around 19,000 BC, when our Palaeolithic ancestors used the Ishango bone, a tool made from a baboon’s fibula, to perform simple calculations. Back then, there were no calculators, pens, or even paper; as we know, those came much later. Thus, we consider this prehistoric artifact the first evidence of tallying or recording information.
History of Data in the 1640s: The Father of Public Health Statistics
In the 1640s, John Graunt, a London haberdasher, started collecting information regarding deaths in the city. He noted down statistics such as:
- The number of deaths
- The mortality rate among age groups
- The causes of death
In 1662, he published his book, Natural and Political Observations Mentioned in a Following Index, and Made upon the Bills of Mortality, as the collective result of his research. In it, he organized information in tables and even attempted to predict life expectancy.
As the title suggests, Graunt based his research on the “Bills of Mortality”, the weekly reports of deaths that London was issuing. His analysis showed that knowing the number of deaths and births could better prepare people for a possible plague outbreak.
History of Data in the 1880s: The Era of Data Processing
One day back in the 1880s, the German-American statistician Herman Hollerith saw a train conductor punching passengers’ tickets. That’s how the idea of using punch cards to record and process data was born. Hollerith started designing a tabulating machine that used punch cards, building on the punch-card-controlled loom invented by the silk weaver Joseph Marie Jacquard in the early 1800s.
A punch card is a piece of stiff paper in which a machine punches holes at specific locations. The cards are moved between brass rods so that the data is read electrically. With this breakthrough, Hollerith helped the American government process the 1890 US census in a fraction of the usual time; tabulating the previous census had taken nearly a decade.
History of Data in 1928: The Concept of Storage
Pfleumer’s Magnetic Tape
In 1928, the German engineer Fritz Pfleumer patented a magnetic tape that he used to replace wire recording for storing data. Essentially, he coated very thin paper with iron oxide powder and glued it with lacquer. The idea of storing information on magnetic tapes actually inspired the invention of floppy disks and hard-disk drives later on.
Codd’s Relational Model
In the 1960s, the computer scientist Edgar F. Codd started working on a model, published in 1970, that describes data attributes in columns and their values in rows.
Structurally speaking, each column has a header with the attribute or feature name, while the rows are full of values. Meanwhile, the entire table is named after the content within it.
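To make the column-and-row structure concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table name `products`, its columns, and the sample values are hypothetical and chosen only for illustration; they are not taken from Codd’s work.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The table is named after its content; each column header is an attribute name.
conn.execute(
    "CREATE TABLE products (name TEXT, category TEXT, units_sold INTEGER)"
)

# Each row holds one value per attribute (made-up sample data).
rows = [
    ("notebook", "stationery", 120),
    ("pen", "stationery", 340),
    ("lamp", "home", 75),
]
conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)

# Query by attribute: which products sold more than 100 units?
cursor = conn.execute(
    "SELECT name, units_sold FROM products "
    "WHERE units_sold > 100 ORDER BY units_sold DESC"
)
column_names = [description[0] for description in cursor.description]
best_sellers = cursor.fetchall()

print(column_names)
print(best_sellers)
```

Because every value sits under a named attribute, a query can refer to data by column name rather than by its physical position in storage.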
Present-Day History of Data: The Internet Era
With the launch of Google in the late 1990s, data became even more widely available to everyone with access to a computer or mobile device.
While this is the last stop in our time-traveling journey today, it’s certainly not the final destination on the history of data timeline. With every innovation in technology, data science, machine learning, or AI comes a new way of creating and spreading information.
Data has changed the way we look at the world and continues to shape it in different ways. Nowadays, industries such as finance, astronomy, and many more benefit from employing analytical techniques to improve their operations.
Understanding Data Analysis
Now that we’ve gone through the history of data, it’s time to learn how to use it to its fullest potential today. As a starting point, we’re going to cover data analysis. The term, as defined by the Cambridge Dictionary, is “the process of examining information, especially using a computer, in order to find something out, or to help with decision making”.
In our context, data analysis is the full process of collecting, exploring, preprocessing, and analyzing data in order to draw insights that can be used by companies in almost every industry, from commerce to healthcare and transportation.
How to Conduct Data Analysis?
Step 1: Identify the Objective
The first step for any task is to question why it’s necessary in the first place. Why do you need to do this analysis? Perhaps you need to find which of your company’s products has sold the most, or why a specific production line has the biggest losses. You might even want to find out why a hospital’s patients spend too much time in the waiting room. The important thing is to have a clear objective in mind before you begin.
Step 2: Collect the Necessary Information
Now it’s time to gather your samples. There are multiple ways to collect data according to the analysis purpose. Based on the industry and available time to solve the problem, collection methods include but are not limited to:
- Surveys and questionnaires
- Online tracking
- Online and offline interviews
- Documents and historical records
Step 3: Clean the Data
A commonly cited figure in the data community is that professionals spend 80 to 90% of their time cleaning data before analyzing it. Since real-world data is rather messy and disorganized, you must get your hands dirty first.
This step can be broken into a few smaller ones, including:
- Inspecting the integrity and structure of your data
- Searching for outliers
- Checking for missing and duplicate values
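The cleaning steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the records are made-up hospital waiting times, the `clean` helper is hypothetical, and the 1.5 × IQR rule is just one common convention for flagging outliers.

```python
from statistics import quantiles

# Hypothetical raw records: (patient_id, waiting_minutes).
# None marks a missing value; patient 2 appears twice.
raw = [
    (1, 12), (2, 15), (2, 15), (3, None), (4, 14),
    (5, 18), (6, 11), (7, 240), (8, 16), (9, 13),
]

def clean(records):
    """Return (kept, outliers) after removing duplicates and missing values."""
    # 1. Drop exact duplicate rows while preserving the original order.
    deduplicated = list(dict.fromkeys(records))

    # 2. Drop rows whose measurement is missing.
    complete = [row for row in deduplicated if row[1] is not None]

    # 3. Flag outliers with the 1.5 * IQR rule (an assumed convention).
    values = [minutes for _, minutes in complete]
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    kept = [row for row in complete if low <= row[1] <= high]
    outliers = [row for row in complete if not low <= row[1] <= high]
    return kept, outliers

kept, outliers = clean(raw)
print("kept:", kept)
print("outliers to inspect:", outliers)
```

Note that the sketch sets outliers aside instead of silently deleting them; whether an extreme value is an error or a genuine observation is a judgment call that depends on the objective from Step 1.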
Step 4: Start the Analysis
After cleaning comes a significant step: the actual analysis process. This can be done in several ways, depending on the purpose and the data type.
Each type serves a different purpose. You need to know beforehand which analysis method will be best suited for the results you want to obtain.
Step 5: Communicate the Findings
Now, the most important step is to communicate your findings, which ultimately answer your original question.
In this step, it’s critical to keep everything neat and simple. The best way to achieve this is through a dashboard of visualizations and a summary report that illustrates the insights drawn from the analysis step.
Understanding Data Science
According to IBM, “data science combines the scientific method, math, statistics, specialized programming, advanced analytics, AI, and even storytelling to uncover and explain the business insights hidden in data.”
This explains why studying it takes much longer than studying data analysis: the latter is considered just one phase of the data science process. To be a successful data scientist, you need to understand analysis first. On top of that, you need basic knowledge of programming, statistics, and machine learning.
If you’re still new to data analysis, then the 365 Data Science Data Analyst Career Track would be a perfect start. But if you’re already proficient and ready to advance, then you can challenge yourself with the Data Scientist Career Track, sit the exams, and earn your industry-recognized certificate to enhance your resume and boost your chances of success at data science job interviews.
Real-life Data Use Cases
There are countless applications of data in real life. As we’ve already mentioned, almost any industry that comes to mind employs data techniques within its organizational workflow.
Two decades ago, most people were unfamiliar with the concept of data, even though it had been used for a really long time in different ways. Today, we’re much more aware of it and know how to manipulate it to extract meaningful insights.
In this part of the article, we’ll delve into some real-life applications of data that many companies rely on to leverage their potential and improve results.
Web Traffic Forecasting
In 2017, Google hosted a competition on Kaggle to predict future values of web traffic to help improve the infrastructure and solve issues like sudden outages. They offered a $25,000 prize for the best predictive models. More than 1,000 participating teams had to work on a set of roughly 145,000 Wikipedia articles.
To do a similar project, you must first understand time series analysis. To get started, you can check out 365 Data Science’s Time Series Analysis with Python course. And while the prize is no longer up for grabs, you still can head over to the competition page on Kaggle and practice your newfound skills.
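To give a rough feel for what forecasting web traffic involves, here is a minimal sketch of a seasonal-naive baseline: it simply repeats the most recent week of page views. The function names and the page-view numbers are hypothetical, and this is a starting baseline rather than a competitive model. The error measure used here, SMAPE (symmetric mean absolute percentage error), is a common choice for this kind of forecast.

```python
def seasonal_naive_forecast(history, horizon, season_length=7):
    """Forecast by repeating the last full season (default: one week)."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]

def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (0 = perfect)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denominator = (abs(a) + abs(p)) / 2
        # Treat the 0-vs-0 case as zero error to avoid dividing by zero.
        total += 0.0 if denominator == 0 else abs(a - p) / denominator
    return 100 * total / len(actual)

# Hypothetical daily page views for two weeks (Monday to Sunday).
views = [100, 120, 130, 125, 110, 60, 50,
         105, 118, 135, 128, 112, 62, 55]

train, test = views[:7], views[7:]
forecast = seasonal_naive_forecast(train, horizon=len(test))
error = smape(test, forecast)

print("forecast:", forecast)
print(f"SMAPE: {error:.1f}%")
```

A baseline like this is useful mainly as a yardstick: a more sophisticated time series model is only worth its complexity if it scores a noticeably lower error on held-out data.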
Fake News Detection
Data scientist Johnny Wales built an AI tool that can analyze a news article’s URL to predict whether the information is real or fake based on the words written in the article. He added the tool to the Unslanted website so that anyone can use it.
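How the tool above works internally is not described here, so the following is not its implementation. It is only a minimal sketch of one classic way to classify text “based on the words” it contains: a naive Bayes classifier with add-one smoothing. The function names and the four training headlines are made up for illustration, and a real system would need far more data.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocabulary = set().union(*word_counts.values())
    total_documents = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from the prior: how common the label is in the training data.
        score = math.log(label_counts[label] / total_documents)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set.
examples = [
    ("shocking miracle cure doctors hate", "fake"),
    ("you will not believe this shocking secret", "fake"),
    ("city council approves new budget", "real"),
    ("central bank publishes quarterly report", "real"),
]
word_counts, label_counts = train(examples)

print(predict("shocking secret cure", word_counts, label_counts))
print(predict("council publishes budget report", word_counts, label_counts))
```

The model never checks facts; it only learns which words tend to appear under each label, which is why the quality and size of the training data matter so much.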
Amazon’s Use of Big Data
While we know Amazon as a multinational American company founded by Jeff Bezos in 1994, it started out humbly as a marketplace for books. Later, it expanded to offer the wide variety of categories we see now. Millions of customers purchase items from the website every day. How did it achieve such global reach? Well, the company is a perfect case study of how data is used to transform businesses.
Amazon uses the power of big data and business analytics to analyze consumer behavior. Having millions of users from all over the world enables the data teams to sort through vast amounts of information, including:
- Purchase preferences
- Browsing habits
In case you’re eager to learn more about data use cases and their importance within a company, check out the AI Applications for Business Success to understand how any organization can benefit from data and AI.
Apple’s Wearable Technology
Apple is one of the leading organizations that use technology to improve people’s health. Their Apple Watch gives you access to a selection of features such as:
- Tracking your sleeping pattern
- Monitoring your heart rate
- Setting an alarm if your heart rate exceeds a certain pre-defined threshold
- Receiving reminders to drink water and exercise
The app collects data and analyzes it to predict, for example, whether you’ll wake up relaxed or when a new menstrual cycle may begin.
Perhaps you’re an aspiring product manager, a data professional building a product with AI, or just curious how data improves the technology you use? In any case, the 365 Product Management for AI and Data Science course will give you the knowledge and skills necessary to make a product a success.
History of Data: Next Steps
Data has been around since the dawn of time, and it has continuously changed our lives in more ways than we can imagine. Becoming part of this evolutionary process is an exciting and rewarding career that helps shape the future.
Whether you’ve just discovered the world of data science or you’ve already taken some steps towards a career in this field, the 365 Data Science Program has what you need.
It offers self-paced courses led by renowned industry experts. Starting from the very basics all the way to advanced specialization, you will learn by doing with a myriad of practical exercises and real-world business cases. If you want to see how the training works, start with a selection of free lessons by signing up below.