Resolved: Aren't we supposed to have unlabelled data here?
Aren't we supposed to have unlabelled data here as Clustering is an unsupervised learning technique?
Or is it the case that unsupervised learning works on both labelled and unlabelled data?
Also, I would like to know why we did not choose Classification over Clustering for this data set, given that it is labelled data.
Sorry if the question is too trivial.
Thank you for your question. I think it is a great one!
You are correct: the dataset presented in the video contains both Country and Language as labels. However, notice that the kmeans object, defined at 3:45, doesn't know that :) The dataset was sliced so that the fit is performed only on the latitude and the longitude. In principle, we wouldn't have the Country and Language labels and we would have to figure out the groupings on our own. The labels were shown in the video solely for illustrative purposes. That is also why this is treated as a clustering problem rather than a classification one: the model is never given a target column to predict.
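To make the slicing step concrete, here is a minimal sketch of the idea, assuming pandas and scikit-learn's KMeans. The column names, the toy values, and the choice of 3 clusters are my own assumptions for illustration and are not taken from the video's file.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data; the coordinates are rough and for illustration only.
data = pd.DataFrame({
    "Country": ["USA", "Canada", "France", "UK", "Germany", "Australia"],
    "Latitude": [44.97, 62.40, 46.75, 54.01, 51.15, -25.45],
    "Longitude": [-103.77, -96.80, 2.40, -2.53, 10.40, 133.11],
    "Language": ["English", "English", "French", "English", "German", "English"],
})

# Slice out only the numeric features. Country and Language stay behind,
# so the model never sees any labels.
x = data[["Latitude", "Longitude"]]

# n_clusters=3 is an arbitrary choice for this sketch.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(x)

# Attach the cluster ids to a copy of the full table purely for inspection.
result = data.copy()
result["Cluster"] = clusters
print(result)
```

The labels only come back in the last step, so that we humans can read the result. The fit itself used nothing but latitude and longitude.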
Hope this answers your question!