Resolved: Aren't we supposed to have unlabelled data here?

Question

Aren't we supposed to have unlabelled data here as Clustering is an unsupervised learning technique?
Or is it the case that unsupervised learning works on both labelled and unlabelled data?

Also, I would like to know why we did not choose Classification over Clustering for this data set, since it is labelled data.
Sorry if the question is too trivial.

Answer 1

Hey Kiran,

Thank you for your question, I think it is great!

You are correct: the dataset presented in the video contains both Country and Language as labels. However, notice that the kmeans object, defined at 3:45, doesn't know that :) The dataset was sliced so that the fit is performed only on the latitude and the longitude. In principle, we wouldn't have the Country and Language labels, and we would have to figure out the groupings on our own. The labels were shown in the video solely for illustrative purposes.
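
To make the slicing concrete, here is a minimal sketch of the idea, assuming pandas and scikit-learn. The DataFrame below is a hypothetical stand-in for the dataset in the video (the values are illustrative only), and the variable names and the choice of 3 clusters are my assumptions, not necessarily what the lecture uses.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-in for the dataset in the video; values are illustrative only.
data = pd.DataFrame({
    "Country": ["USA", "Canada", "France", "UK", "Germany", "Australia"],
    "Latitude": [44.97, 62.40, 46.75, 54.01, 51.15, -25.45],
    "Longitude": [-103.77, -96.80, 2.40, -2.53, 10.40, 133.11],
    "Language": ["English", "English", "French", "English", "German", "English"],
})

# Keep only the coordinates: Country and Language never reach the model.
x = data[["Latitude", "Longitude"]]

# The number of clusters (3) is an assumption made for this sketch.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(x)

# Attach the cluster ids to a copy so they can be compared with the labels afterwards.
result = data.copy()
result["Cluster"] = clusters
print(result)
```

Because the fit only ever sees x, the Country and Language columns could be deleted from the file and the clusters would come out the same.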

Hope this answers your question!

Kind regards,
365 Hristina
