Building machine learning models is fun. But sharing them with the rest of the world is even more fun. In this tutorial, we will learn how to build a simple ML model and then deploy it using Streamlit. In the end, you will have a web application running your model which you can share with all your friends or customers.
This exercise assumes that you have a bit of experience with Python and the sklearn library. But if that’s not the case, don’t worry!
Here’s a tutorial that covers everything you need to know to get started with predictive models in Python.
What Kind of ML Model Are We Building?
Our goal is to build a classification app that predicts the type of flower depending on characteristics like petal and sepal length/width. To do this, we will be working with a simple yet famous dataset called Iris. Check out a demo of the final app that we will deploy to get a better sense of our objective.
Table of Contents
- Part I: Building Your Machine Learning Model
- Part II: Building the Streamlit App
- Deploying Machine Learning Models with Python and Streamlit: Next Steps
Part I: Building Your Machine Learning Model
Setting Up Your Project Structure
We begin with an outline of the project structure and the required files. To successfully deploy the project to Streamlit, you'll need the data file (in our case, the Iris dataset) and three Python scripts: one for the app, one for the model, and one for the prediction (a sketch of the resulting layout follows this list):
- data: the folder where we store our iris.csv dataset. Since the dataset is small, we can safely keep it locally alongside the code.
- app.py: the file where we will code the Streamlit app.
- model.py: the file where we will train our model.
- prediction.py: the file where we will code functions that will allow us to run predictions every time a user triggers them.
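For reference, here is roughly what the project folder will look like by the end of the tutorial (the root folder name is up to you; rf_model.sav and requirements.txt are created in later steps):

iris-streamlit/
├── data/
│   └── iris.csv
├── app.py
├── model.py
├── prediction.py
├── rf_model.sav
└── requirements.txt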
In the following sections, we will discuss how to build each of the three Python scripts, starting with model.py.
Building Your Model
Since this tutorial is focused on how to deploy machine learning models, we will not spend a lot of time perfecting every feature of our model. Instead, we will directly implement a random forest classifier which will give us an accurate enough model. But be aware that for a higher degree of accuracy, you will have to spend some time fine-tuning this or another classifier.
If you’re interested in learning more about how to make the best of the random forest classifier, check out 365’s course Machine Learning with Decision Trees and Random Forests.
# import the required libraries
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

seed = 42  # any fixed integer works; it keeps the results reproducible

# Read original dataset
iris_df = pd.read_csv("data/iris.csv")
iris_df = iris_df.sample(frac=1, random_state=seed)  # shuffle the rows

# selecting features and target data
X = iris_df[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
y = iris_df['Species']

# split data into train and test sets
# 70% training and 30% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=seed, stratify=y)

# create an instance of the random forest classifier
clf = RandomForestClassifier(n_estimators=100)

# train the classifier on the training data
clf.fit(X_train, y_train)

# predict on the test set
y_pred = clf.predict(X_test)

# calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")  # Accuracy: 0.91
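A single train/test split can be lucky or unlucky, so if you want an extra sanity check before saving the model, here is a minimal cross-validation sketch that reuses the X, y, and clf defined above:

from sklearn.model_selection import cross_val_score

# 5-fold cross-validation on the full dataset as an extra sanity check
scores = cross_val_score(clf, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")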
Saving Your Model
After making sure that our model performs well on the train and test set, we can use the joblib library to save our progress.
import joblib

# save the model to disk
joblib.dump(clf, "rf_model.sav")
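If you want to double-check that the file was written correctly, you can reload it and score it on the test set; this is purely an optional sanity check:

# optional: reload the saved model and confirm it still scores well on the test set
loaded_clf = joblib.load("rf_model.sav")
print(loaded_clf.score(X_test, y_test))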
And that’s it! This is how easy it is to build working machine learning models in Python!
But before we let users run predictions on their own measurements, we need to create an intuitive interface that people without a machine learning background can easily navigate. For this, we will be using Streamlit.
Part II: Building the Streamlit App
Creating the Streamlit UI
Let's start with the most immediately visible part of the app: the user interface.
- Install Streamlit using
pip install streamlit
- Go to the "app.py" file and import the Streamlit library
Here is the relevant code for that:
import streamlit as st
import pandas as pd
import numpy as np
from prediction import predict
Then we can use some of Streamlit's helper functions, which make it easy to create a beautiful user interface.
To build an app just like the one above, first we need to add a title and description:
st.title('Classifying Iris Flowers')
st.markdown('Toy model to classify iris flowers into \
setosa, versicolor, virginica')
Next, we need to add sliders for the four plant features:
st.header("Plant Features")
col1, col2 = st.columns(2)
with col1:
    st.text("Sepal characteristics")
    sepal_l = st.slider('Sepal length (cm)', 1.0, 8.0, step=0.5)
    sepal_w = st.slider('Sepal width (cm)', 2.0, 4.4, step=0.5)
with col2:
    st.text("Petal characteristics")
    petal_l = st.slider('Petal length (cm)', 1.0, 7.0, step=0.5)
    petal_w = st.slider('Petal width (cm)', 0.1, 2.5, step=0.5)
To make new predictions, we also need a prediction button, which we can add with the following line:
st.button("Predict type of Iris")
Lastly, to run the app locally, you need to run the following command from your terminal:
streamlit run app.py
This will fire up a browser window where you can see your current Streamlit app.
Our app looks awesome but still doesn't carry out any predictions. Why? Because we need to tell the predict button to call our saved model and run a prediction with the user's selected values. We can do this by creating a predict function in the prediction.py file and then calling it in the app.py file.
Loading Your Saved Model & Making Real-Time Predictions
Let's go to the prediction.py file and create this basic prediction function:
import joblib

def predict(data):
    clf = joblib.load("rf_model.sav")
    return clf.predict(data)
This function loads our previously trained model and runs a prediction with the data that we will pass from the app.py file. Now, let’s go back and import our predict function:
from prediction import predict
Finally, the only thing we need to alter is the predict button. We'll change it so that clicking it runs a prediction.
if st.button("Predict type of Iris"):
    result = predict(np.array([[sepal_l, sepal_w, petal_l, petal_w]]))
    st.text(result[0])
To dig a little deeper into the code we’re using here, check out this GitHub entry.
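One optional refinement, not required for the app to work: because predict() reloads the model from disk every time the button is clicked, you can cache the load instead. Here is a minimal sketch of prediction.py with caching, assuming the st.cache decorator available in the Streamlit 1.x version used here (newer releases offer st.cache_resource for the same purpose):

import joblib
import streamlit as st

@st.cache(allow_output_mutation=True)  # on newer Streamlit versions, use @st.cache_resource instead
def load_model():
    # load the trained classifier from disk once and reuse it across reruns
    return joblib.load("rf_model.sav")

def predict(data):
    clf = load_model()
    return clf.predict(data)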
Deploying the Streamlit App
We’re now ready to power up our app and try out its classification capabilities. Here’s how we’ll do that:
Step 1: Create a requirements.txt file at the root of your folder with the libraries that we used
joblib==0.14.1
streamlit==1.7.0
scikit-learn==0.23.1
pandas==1.0.5
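If you are not sure which versions you have installed locally, one way to generate this file is to run pip freeze and then trim the output down to just the libraries the app actually uses:

pip freeze > requirements.txt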
Step 2: Create a GitHub repository (if you haven't already) and push your code
Step 3: Create a Streamlit account and connect your GitHub profile to it
Step 4: Now, on the Streamlit dashboard click the “New app” button
Step 5: Link your Streamlit app with your GitHub repository.
Step 6: Click “Deploy” and you're all done!
Now you can share your new app with the world!
You can find the full code on the GitHub repository. Here is a demo app for you to see how it works in real-time.
To recap how Streamlit fits into all of this, here is the general workflow for using it in Python:
1. Install Streamlit. You can do so via pip with: pip install streamlit
2. Spend some time learning about the main concepts in Streamlit – if you've never worked with Streamlit before, it's worth investing a little time in how an app is created and how it's run. You can do so directly on their site.
3. Create the app. Once you know how Streamlit works, you can proceed with creating your app. To do so:
a. First, create a Python script. For example
my_app.py
b. You also need to import your required libraries like so:
import streamlit as st
import pandas as pd
import numpy as np
c. You can add a title to your app with:
st.title('My first streamlit app')
4. To run the app you can simply run the script from the command line with: streamlit run my_app.py
5. You can improve your script by loading actual data and creating a machine learning model. You can further add options to your Streamlit app, like buttons, filters, charts, and much more – it's up to you. A minimal example tying these steps together follows below.
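Putting those steps together, a minimal my_app.py could look something like the sketch below; the random data is just a placeholder so there is something to chart:

import streamlit as st
import pandas as pd
import numpy as np

st.title('My first streamlit app')

# placeholder data, just to have something to display
df = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])

# render the data as an interactive line chart
st.line_chart(df)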
As for building the machine learning model itself, the typical workflow looks like this:
2. Load the data file into Python – if your data is in a separate file, such as a CSV, you need to read it into Python. You can use pandas' read_csv() function to read a CSV file as a pandas DataFrame.
3. Preprocess the data – depending on the data and the algorithm, this might include:
a. Splitting the data into train and test sets – you should always test your algorithm on data that is completely separate from the data it was trained on
b. Encoding categorical values as numbers – some algorithms have difficulty working with non-numerical values
c. Standardizing the data – so that no feature carries more weight than the rest
d. Any algorithm-specific data modification. For example, for an SVM it is common to rescale all feature values to the [-1, 1] interval.
4. Choose your machine learning algorithm and create a machine learning model. This will depend on your data and your problem: whether it is a classification, regression, or clustering task, and whether the dataset is large or small, will determine your choice of method.
5. Train your model on the training data using the fit() method.
6. After the fitting process is complete, your model is trained and ready to be tested on your test data.
7. After you test your model, you can evaluate its performance and try to improve it with hyperparameter tuning or other techniques (see the sketch after this list).
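To make these steps concrete, here is a rough, generic sketch (not tied to the Iris app): the file name your_data.csv and the target column are placeholders, and the SVM with a small grid search simply stands in for whichever algorithm and tuning approach you choose.

import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

df = pd.read_csv('your_data.csv')                     # step 2: load the data
X = df.drop(columns=['target'])
y = LabelEncoder().fit_transform(df['target'])        # step 3b: encode the categorical target

# step 3a: split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

pipeline = Pipeline([
    ('scaler', StandardScaler()),                     # step 3c: standardize the features
    ('svm', SVC())                                    # step 4: choose an algorithm
])

# step 7: hyperparameter tuning via a small grid search
grid = GridSearchCV(pipeline, {'svm__C': [0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)                            # step 5: train on the training data
print(grid.score(X_test, y_test))                     # step 6: evaluate on the test data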
Deploying Machine Learning Models with Python and Streamlit: Next Steps
The Python + Streamlit machine learning model for identifying flowers based on their characteristic features is a fairly popular exercise among budding machine learning specialists. However, its hidden value lies in the fact that it gives you a hands-on understanding of random forests – a topic that frequently crops up during machine learning interviews.
If ML is your data career of choice, you're in for a treat. With machine learning set to overtake data science and analytics as the most sought-after profession on the market in the next 10 years, you'll do well to keep honing your skills and expanding your toolkit of predictive analytics models.
The 365 Data Science Program offers a wide range of self-paced courses on topics such as Machine Learning with Random Forests and Decision Trees, K-Nearest Neighbors, Naïve Bayes, and more – all led by renowned industry experts. Even if you have no clue about statistics and linear algebra or their foundational role in ML, you will start from the very basics and progress all the way to advanced considerations. Along the way, you will learn by doing with a myriad of practical exercises and real-world business cases. If you want to see how the training works, start with a selection of free lessons by signing up below.