Resolved: Error on using groupby mean()
Why is this not working? It says the aggregate function failed.
sales_data.groupby('TimeZone').mean()
Hi Olanike!
Thanks for reaching out!
The error message suggests that there may be non-numeric columns in your DataFrame that are causing the issue when trying to calculate the mean for each group.
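One way to check this is to list the columns that pandas does not treat as numeric. The snippet below is only a sketch: the small DataFrame is a hypothetical stand-in for sales_data, since the real columns are not shown in this thread.

```python
import pandas as pd

# Hypothetical stand-in for sales_data; the column names are assumptions.
sales_data = pd.DataFrame({
    "TimeZone": ["EST", "EST", "PST"],
    "Product": ["Cable", "Monitor", "Cable"],  # text column, cannot be averaged
    "Price": [11.95, 149.99, 11.95],
})

# Every column whose dtype is not numeric
non_numeric = sales_data.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)
```

Any column listed here, other than the one you group by, is a candidate for the failure.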
Have you completed all the data preparation and manipulation steps outlined in the notebook? I recommend revisiting and executing all the cells within the provided resource file. If the issue persists, please don't hesitate to share a screenshot of the earlier portions of your code for further assistance.
Looking forward to your answer.
Best,
Ivan
The issue persists. I have now run the provided resource file directly, and all the previous code ran well.
Hi Olanike!
After investigating the problem, it appears that this error is associated with recent releases of pandas: from version 2.0 onward, groupby(...).mean() no longer silently skips non-numeric columns. I suggest considering a downgrade of the pandas package, for example to version 1.3.5. With that version, the code will likely function as expected.
So, open Anaconda Prompt and execute:
pip install pandas==1.3.5
After that, apply Kernel >> Restart & Clear Output in Jupyter and re-run all code cells in the Notebook.
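To confirm which pandas version the restarted kernel actually picked up, a quick check along these lines may help. It is a minimal sketch and assumes the version string starts with the major version number.

```python
import pandas as pd

# Version of pandas that the current kernel has imported
print(pd.__version__)

# The first component of the version string is the major version
major = int(pd.__version__.split(".")[0])
if major >= 2:
    print("pandas 2.x or later: groupby().mean() does not skip non-numeric columns by default")
else:
    print("pandas 1.x: groupby().mean() skips non-numeric columns by default")
```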
Hope this helps.
Best,
Ivan
Will downgrading the version not affect other code that requires the latest version?
The version of pandas I currently have is 2.1.2 and the latest pandas version is 2.1.4.
Or would you suggest I upgrade back once I can confirm this code runs?
You can proceed with downgrading to pandas version 1.3.5 for the current project. If the code works as expected and there are no compatibility issues with other projects, you can leave it at that version. However, if you encounter issues in other projects, you may consider upgrading back to the latest version for those projects.
Alternatively, you can create a virtual environment for the specific project with pandas version 1.3.5. This allows you to isolate the environment and install the required version of pandas without affecting the global Python installation. Other projects that require the latest pandas version can continue to use it in their separate environments.
For now, I suggest you go with option 1, as you can upgrade pandas again at any time.
Best,
Ivan
Downgrading was unsuccessful.
Hi Olanike!
Newer versions of pandas often have pre-built wheels available for various platforms, which is likely why you were able to install version 2.1.2 without needing to compile anything. For older versions like 1.3.5, if a pre-built wheel is not available for your specific setup, Python's pip will attempt to build the package from source, which requires the C++ build tools.
What I suggest you do is the following:
1. Uninstall your current version of pandas.
pip uninstall pandas
2. Install Microsoft Visual C++ Build Tools.
You can download the Build Tools for Visual Studio from the Microsoft website. Make sure to select the C++ build tools during installation.
3. Update setuptools and wheel.
pip install --upgrade setuptools wheel
4. Install pandas.
pip install pandas==1.3.5
Alternatively, let me remind you of the option to create a clean environment with this version of pandas. You can create a new environment in Anaconda with a specific pandas version using the following command:
conda create -n new_env_name pandas=1.3.5
This creates a new environment named new_env_name with pandas 1.3.5. You can then activate this environment and work within it:
conda activate new_env_name
I apologize for the challenges you're facing and I'm hopeful that we'll be able to resolve them together shortly.
Best,
Ivan
Here is a solution that does not require downgrading pandas as Ivan suggested. Pass numeric_only=True so that only the numeric columns are averaged:
sales_data.groupby('TimeZone').mean(numeric_only=True)
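For anyone who wants to try this outside the course notebook, here is a small self-contained sketch. The DataFrame is made up for illustration (the column names other than TimeZone are assumptions), and the second call shows an alternative that selects the numeric columns explicitly.

```python
import pandas as pd

# Made-up stand-in for sales_data; only 'TimeZone' comes from the thread.
sales_data = pd.DataFrame({
    "TimeZone": ["EST", "EST", "PST"],
    "Product": ["Cable", "Monitor", "Cable"],  # text column
    "Price": [10.0, 20.0, 30.0],
    "Quantity": [1, 3, 2],
})

# Average only the numeric columns within each group
means = sales_data.groupby("TimeZone").mean(numeric_only=True)

# Alternative: name the columns to average explicitly
explicit = sales_data.groupby("TimeZone")[["Price", "Quantity"]].mean()

print(means)
```

Both calls should give the same table here. Naming the columns explicitly can be the safer choice when you want to be sure which columns end up in the result.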