The following Python 3 snippet shows a simple K-Means clustering that automatically divides input data into groups based on the given features.

The example first loads a tab-separated file containing three input columns, one per feature. A K-Means model is then fitted to this data. Afterwards, the predict() method assigns new samples to the nearest learned cluster.

The sample code requires the scikit-learn and pandas libraries to be installed (pip install scikit-learn, pip install pandas).

from sklearn.cluster import KMeans
import pandas as pd
import numpy as np
import pickle

# read csv input file
input_data = pd.read_csv("input_data.txt", sep="\t")

# initialize KMeans object specifying the number of desired clusters
kmeans = KMeans(n_clusters=4)

# learn the clustering from the input data
kmeans.fit(input_data)

# output the labels for the input data
print(kmeans.labels_)
# predict the cluster for a new data sample (one value per input column)
predicted_class = kmeans.predict([[1, 10, 15]])
print(predicted_class)
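
The snippet above depends on an external input_data.txt file. As a rough, self-contained sketch of the same workflow, the variant below builds a small DataFrame in memory instead. The column names, the data values, the choice of two clusters, and the fixed random_state are all assumptions made for illustration and are not taken from the original input file. The final pickle round trip is likewise only one plausible use of the pickle import, which the snippet above loads but never uses.

```python
import pickle

import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-in for input_data.txt: three feature columns
# forming two clearly separated groups of rows.
sample_data = pd.DataFrame(
    {
        "feature_a": [1, 2, 1, 50, 51, 52],
        "feature_b": [10, 11, 9, 80, 82, 81],
        "feature_c": [15, 14, 16, 90, 91, 89],
    }
)

# A fixed random_state and an explicit n_init keep the run reproducible.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
model.fit(sample_data)

# One cluster index per input row; the numbering itself is arbitrary.
labels = model.labels_
print(labels)

# Wrap the new sample in a DataFrame with the same column names used for fitting.
new_sample = pd.DataFrame([[1, 10, 15]], columns=sample_data.columns)
predicted_cluster = model.predict(new_sample)[0]
print(predicted_cluster)

# Assumed use of pickle: serialize the fitted model and restore it later.
restored_model = pickle.loads(pickle.dumps(model))
restored_cluster = restored_model.predict(new_sample)[0]
```

Because cluster indices carry no meaning of their own, it is safer to compare whether two samples share a label than to rely on a specific index value.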