The following Python 3 snippet demonstrates a simple K-Means clustering that automatically divides input data into groups based on the given features.
The example first loads a tab-separated text file containing three feature columns. A K-Means clustering model is then fitted to this input data. Afterwards, new samples can be assigned to one of the learned clusters using the predict() method.
The sample code requires the scikit-learn and pandas libraries to be installed (pip install scikit-learn, pip install pandas).
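To try the example without real data, a matching input file can be generated first. The following is only a sketch: the column names, the group centers and the helper make_sample_file() are assumptions made for illustration and are not part of the original example. Note that pd.read_csv() treats the first line of the file as column names by default, so the generated file starts with a header row.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the original example does not name its three columns.
COLUMNS = ["feature_1", "feature_2", "feature_3"]


def make_sample_file(path="input_data.txt", rows_per_group=25, seed=0):
    """Write a tab-separated file with three numeric columns and a header row."""
    rng = np.random.default_rng(seed)
    # Four made-up group centers, so that n_clusters=4 has something to find.
    centers = np.array([[1, 10, 15], [20, 5, 3], [8, 30, 12], [15, 18, 40]])
    # Draw rows_per_group noisy points around each center and stack them.
    samples = np.vstack(
        [rng.normal(loc=c, scale=1.5, size=(rows_per_group, 3)) for c in centers]
    )
    frame = pd.DataFrame(samples, columns=COLUMNS)
    # sep="\t" matches the separator used when the file is read back in.
    frame.to_csv(path, sep="\t", index=False)
    return frame


if __name__ == "__main__":
    make_sample_file()
```

With such a file in the working directory, the sample code can be run unchanged.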
from sklearn.cluster import KMeans
import pandas as pd
import numpy as np
import pickle
# read csv input file
input_data = pd.read_csv("input_data.txt", sep="\t")
# initialize KMeans object specifying the number of desired clusters
kmeans = KMeans(n_clusters=4)
# learn the clustering from the input data
kmeans.fit(input_data.values)
# output the labels for the input data
print(kmeans.labels_)
# predict the cluster for a new data sample (one value per input column)
predicted_class = kmeans.predict([[1, 10, 15]])
print(predicted_class)
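
The snippet imports pickle without using it. One plausible use is to store the fitted model so that it can be reused later without refitting. The following is a minimal sketch under stated assumptions: the file name kmeans_model.pkl and the synthetic training data are made up for illustration, and n_init and random_state are set only to make the run reproducible.

```python
import pickle

import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the input data: two well-separated groups, three columns.
rng = np.random.default_rng(0)
data = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(20, 3)) for c in ([0, 0, 0], [10, 10, 10])]
)

# Fit the model; n_init and random_state are fixed for reproducibility.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)

# Hypothetical file name for the stored model.
model_path = "kmeans_model.pkl"

# Serialize the fitted model to disk.
with open(model_path, "wb") as fh:
    pickle.dump(kmeans, fh)

# Later, possibly in another script: load the model and predict without refitting.
with open(model_path, "rb") as fh:
    restored = pickle.load(fh)

print(restored.predict([[1, 10, 15]]))
print(restored.cluster_centers_)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code. A stored model is also best loaded with the same scikit-learn version that was used to create it.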