Resolved: How to define the number of labels?
Hi Hristina! First of all, thanks very much for this excellent course. I'm just wondering where the property that defines the number of labels is set. Are 0 to 2 the default labels? What if you want 4 or more distinct labels in the data? Thank you.
Hey San,
Thank you for reaching out and thank you a lot for the positive feedback!
If the parameter n_samples is an integer, as is the case in the lecture, then, by default, make_blobs splits the samples into 3 classes of equal size (or as close to equal as possible when n_samples is not divisible by 3). Additionally, again by default, the classes are labeled with integers, starting from 0. To see this in action, please study the snippet below:
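A minimal sketch of this default behavior, assuming 300 samples and an arbitrary random_state (both values are chosen here for illustration and are not taken from the lecture):

```python
import numpy as np
from sklearn.datasets import make_blobs

# n_samples is an integer and centers is left unset,
# so make_blobs falls back to its default of 3 classes.
X, y = make_blobs(n_samples=300, random_state=42)

# Count how many samples ended up in each class.
labels, counts = np.unique(y, return_counts=True)
print(labels)  # [0 1 2]
print(counts)  # [100 100 100]
```

The labels come out as 0, 1, and 2 because no other number of classes was requested.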
If you instead wish to have 4 classes of equal size, then you can introduce an additional parameter called centers:
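For example, a sketch assuming 400 samples so that the split is exact (the sample count and random_state are illustrative choices):

```python
import numpy as np
from sklearn.datasets import make_blobs

# centers=4 asks for 4 classes; the 400 samples are divided evenly among them.
X, y = make_blobs(n_samples=400, centers=4, random_state=42)

labels, counts = np.unique(y, return_counts=True)
print(labels)  # [0 1 2 3]
print(counts)  # [100 100 100 100]
```

Any integer works for centers, so 5 or more classes can be requested the same way.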
If instead the parameter n_samples is an array storing integers, then each integer corresponds to the number of samples in the respective class. Please study the snippet below, where I've defined 4 classes of different sizes:
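As a sketch, assuming class sizes of 50, 100, 150, and 200 (values picked here purely for illustration):

```python
import numpy as np
from sklearn.datasets import make_blobs

# Each entry of n_samples is the size of one class, so this yields 4 classes.
# centers is left unset; the number of classes is inferred from the list length.
X, y = make_blobs(n_samples=[50, 100, 150, 200], random_state=42)

labels, counts = np.unique(y, return_counts=True)
print(labels)  # [0 1 2 3]
print(counts)  # [ 50 100 150 200]
```

The class labeled 0 takes the first size in the list, the class labeled 1 the second, and so on.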
Lastly, if you wish to give your classes other names, you can do it in various ways. I've demonstrated this using pandas' replace method:
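One possible way to do this is sketched below. The column names and the class names "A", "B", and "C" are hypothetical and chosen only for illustration:

```python
import pandas as pd
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=300, random_state=42)

# Put the features and the integer labels into a DataFrame.
# The column names here are illustrative, not prescribed by make_blobs.
df = pd.DataFrame(X, columns=["feature_1", "feature_2"])
df["label"] = y

# Map each integer label to a custom class name.
df["label"] = df["label"].replace({0: "A", 1: "B", 2: "C"})

print(df["label"].value_counts().sort_index())
```

Passing a dictionary to replace keeps the old-to-new mapping in one place, which makes it easy to extend when you have more classes.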
For more information, don't hesitate to visit the official documentation on the sklearn website. There, you will find detailed explanations accompanied by examples.
Hope this helps! Let me know if something has remained unclear.
Kind regards,
365 Hristina
That's exactly what I needed to know. Thanks for the prompt response and for going to great lengths to explain it. Much appreciated! :)