Pandas 1.0.0 is Out!!! What Are The New Features?

Iliya Valchanov, 20 Oct 2021

As a data science enthusiast, you probably know that Pandas is Python’s primary library for data analysis and manipulation. What you may not have heard yet is that Pandas 1.0.0 has been officially released!

The New Pandas 1.0.0

At first sight, this version does not look very different to the user from the previous release, Pandas 0.25.3. Still, it brings plenty of enhancements that boost performance and lay a better foundation for the long run. The developers present 1.0.0 as a stable version of pandas with a strengthened API, from which many features deprecated in earlier versions have been removed.

So, what are the most notable improvements that come with Pandas 1.0.0?

StringDtype and BooleanDtype

The dedicated string and Boolean data types are still “experimental”, which means that further improvements are expected in the near future. So, as of yet, pandas will not automatically assign “string” or “boolean” to your data. This only happens if you explicitly specify dtype=pd.StringDtype() or dtype=pd.BooleanDtype() (or the aliases dtype="string" and dtype="boolean") while creating a new structure. However, in the future, this may become the default way in which pandas treats data of this type. We’ll just have to wait and see.
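As a minimal sketch of the opt-in described above, the new types can be requested when a Series is created. The sample values and variable names are made up for illustration.

```python
import pandas as pd

# Opt in to the experimental extension dtypes explicitly.
names = pd.Series(["ann", "bob", None], dtype=pd.StringDtype())
flags = pd.Series([True, False, None], dtype=pd.BooleanDtype())

print(names.dtype)  # string
print(flags.dtype)  # boolean

# Missing entries are kept as missing values, not turned into text or False.
print(names.isna().sum())  # 1
print(flags.isna().sum())  # 1
```

The string aliases dtype="string" and dtype="boolean" should give the same result as the two constructor calls.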

Also, consider the benefit of having the new “string” data type. Until now, pandas stored text in columns of the generic “object” type, which can also hold other Python objects, such as unparsed dates. Using “string” allows you to distinguish between the two, so now you can select and manipulate string data much more easily. This leads us to the second point worth mentioning.

DataFrame.select_dtypes() performance improvement

The DataFrame.select_dtypes() method is much quicker now! It relies on vectorization instead of iterating over a loop. So, you can run df.select_dtypes(include='string') to pull all string columns, or df.select_dtypes(include='bool') to retrieve the Boolean columns from a DataFrame, provided that you have set them as such beforehand.
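A small sketch of selecting columns by the new types, assuming a toy DataFrame whose columns were explicitly created as “string” and “boolean” (the column names are invented for the example):

```python
import pandas as pd

df = pd.DataFrame({
    "name": pd.Series(["ann", "bob"], dtype="string"),
    "active": pd.Series([True, None], dtype="boolean"),
    "age": [31, 45],
})

# Pick columns by dtype; no column-by-column loop is needed on the user side.
string_cols = df.select_dtypes(include="string")
bool_cols = df.select_dtypes(include="bool")

print(list(string_cols.columns))  # ['name']
print(list(bool_cols.columns))    # ['active']
```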

Experimental NA scalar

We can now enjoy the pd.NA scalar that denotes missing values. Using pd.NA is a new concept in the scientific ecosystem of Python, and its goal is to provide an indicator for missing values that can be used consistently across data types.

That said, this feature is currently “experimental”, too. Its behavior may still change, and it remains to be seen how it will interact with other packages, such as NumPy.
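To illustrate how pd.NA behaves, here is a short sketch using a nullable integer Series with made-up values. Since the feature is experimental, treat the behavior shown as a snapshot, not a guarantee.

```python
import pandas as pd

# A nullable integer Series stores its missing entry as pd.NA.
ints = pd.Series([1, None, 3], dtype="Int64")
print(ints.iloc[1] is pd.NA)  # True

# Arithmetic and comparisons propagate the missing value.
print(pd.NA + 1)   # <NA>
print(pd.NA == 1)  # <NA>

# Logical operations return a definite answer only when the
# result does not depend on the missing value.
print(True | pd.NA)   # True
print(False & pd.NA)  # False

# pd.isna() is the reliable way to test for a missing value.
print(pd.isna(pd.NA))  # True
```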

DataFrame.convert_dtypes()

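A new method, DataFrame.convert_dtypes(), has been introduced. It converts the columns of a DataFrame to the best possible data types that support pd.NA, such as the new “string” and “boolean” types and the nullable integer types.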
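A rough sketch of what this looks like in practice, using an invented DataFrame with one missing value per column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1.0, 2.0, np.nan],          # whole numbers stored as float because of NaN
    "city": ["Sofia", None, "Varna"],  # text with a missing value
    "paid": [True, False, None],       # booleans with a missing value
})

converted = df.convert_dtypes()

# Compare the inferred types before and after the conversion.
print(df.dtypes)
print(converted.dtypes)
```

With this data, the “id” column should come out as a nullable integer and “paid” as “boolean”, while “city” becomes a dedicated string type.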

DataFrame.info()

The well-known DataFrame.info() has been improved. Its output is much more readable: the columns summary now shows a line number for each column, which helps you explore your data more quickly and efficiently.
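To see the layout for yourself, a quick sketch on a made-up DataFrame is enough. Here the summary is captured in a text buffer so it can be inspected or saved; calling df.info() directly simply prints it.

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["ann", None], "age": [31, 45]})

# info() writes to stdout by default; buf redirects it to any text buffer.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()

print(summary)
```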

to_markdown()

Now we also have .to_markdown(). This new method renders a Series or DataFrame object as a Markdown table. Note that it relies on the optional tabulate package.
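A minimal sketch with invented data. The ImportError guard is only there because tabulate is an optional dependency and may not be installed; the exact table layout depends on the tabulate version.

```python
import pandas as pd

df = pd.DataFrame({"library": ["pandas", "numpy"], "rating": [5, 4]})

try:
    table = df.to_markdown()
except ImportError:
    # tabulate is missing; install it with: pip install tabulate
    table = None

print(table)
```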

So overall, a lot has been done, but mainly on the backend. For everyday users like us, the development of clear data types that are consistent with other libraries is surely the most prominent improvement.

In any case, it is worth checking the official release notes for more information before you start using Pandas 1.0.0. There you can find out more about the changes to methods such as .sort_index() and .sort_values(), and much more.

Finally, note that you need at least Python 3.6.1 to use this new version.

With that in mind, 

pip install pandas --upgrade

and have fun!

***

If you’re enthusiastic about boosting your Python knowledge, check out our super practical tutorials! And, if you're new to Pandas and Python, you'll benefit greatly from our guide: Learning Python Programming - Everything You should Know.  

Iliya Valchanov

Co-founder of 365 Data Science

Iliya is a finance graduate with a strong quantitative background who chose the exciting path of a startup entrepreneur. He demonstrated a formidable affinity for numbers during his childhood, winning more than 90 national and international awards and competitions through the years. Iliya started teaching at university, helping other students learn statistics and econometrics. Inspired by his first happy students, he co-founded 365 Data Science to continue spreading knowledge. He authored several of the program’s online courses in mathematics, statistics, machine learning, and deep learning.