class: center, middle
## Clustering Segmentation
By: Eslam Adel

email: eslam.a.mahmoud@eng1.cu.edu.eg

Inspired by slides of TA. Mohamed Hisham, 2016

---
class: top, left
## Clustering Segmentation
--

* Clustering is the grouping of similar data points.
--

* It is used in machine learning, data analysis, and data mining.
--

* Segmentation can be treated as a clustering problem.

---
class: top, left
## Feature Space
--

* Feature
  * Identifies a characteristic of the subject.
  * Must discriminate between different subjects.
  * Good features vs. a robust classifier.
--

* Feature space is an alternative space.
--

* Coordinates are the values of the features.

---
class: top, left
## Feature Space cont'd
* For example, color as a feature
--

* Coordinates will be R, G, and B.
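---
class: top, left
## Feature Space cont'd

A minimal sketch of building a color feature space, assuming NumPy and an image stored as a `(height, width, 3)` RGB array. The helper name `rgb_feature_space` and the tiny synthetic image are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def rgb_feature_space(image):
    """Flatten a (height, width, 3) RGB image into an (n_pixels, 3) array.

    Each row is one pixel; its coordinates in feature space are R, G, and B.
    """
    image = np.asarray(image, dtype=float)
    return image.reshape(-1, 3)

# Hypothetical 2x3 image: all black except one red pixel at the top-left
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0] = [255, 0, 0]

features = rgb_feature_space(image)
print(features.shape)  # (6, 3): one data point per pixel
```

Every pixel becomes one data point, so the number of data points equals the number of pixels.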
---
class: top, left
## K means Clustering
* K is the number of clusters
--

* Assumes spherical clusters
--

* Sensitive to outliers
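---
class: top, left
## K means Clustering cont'd

A small sketch of why a mean-based center is sensitive to outliers, assuming NumPy. The points and the outlier are made-up values chosen only for illustration.

```python
import numpy as np

# Four points forming a compact cluster (hypothetical values)
cluster = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [2.0, 2.0]])
# One far-away point that ends up assigned to the same cluster
outlier = np.array([[21.5, 21.5]])

center_clean = cluster.mean(axis=0)
center_with_outlier = np.vstack([cluster, outlier]).mean(axis=0)

print(center_clean)         # [1.5 1.5]
print(center_with_outlier)  # [5.5 5.5]
```

A single outlier drags the center well outside the compact group, because the center is the mean of all points assigned to the cluster.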
---
## Basic Algorithm
```python
import numpy as np

def kmeans(features, k, max_iterations=100, seed=0):
    # features: feature space constructed from your image,
    # shape (number of pixels, number of features)
    features = np.asarray(features, dtype=float)
    # Get k random points in the feature space (initial centers)
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(max_iterations):
        # Cluster the data points to the centers according to distance
        distances = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Calculate the new centers (an empty cluster keeps its old center)
        new_centers = centers.copy()
        for c in range(k):
            if np.any(labels == c):
                new_centers[c] = features[labels == c].mean(axis=0)
        centers = new_centers
    return labels, centers
```

---
class: top, left
## Mean shift Clustering
--

* Non-parametric technique for clustering.
--

* Robust and not sensitive to outliers.
--

* Clusters can take a non-linear shape.
---
class: top, left
## Basic algorithm
--

Pseudo code of the basic mean shift algorithm with a uniform kernel
--

```python
Extract feature space from image
While number of unvisited points > 0:
    Select a random unvisited point in feature space (initial mean)
    While true:
        Get distance between mean and all points in feature space
        For uniform window, select points within the specified bandwidth and track those points
        Get the new mean: the mean of the points within the bandwidth
        if distance between new and old means < threshold:
            cluster all tracked points to new mean
            update number of visited points
            break
        otherwise, set mean = new mean
```

---
class: top, left
## Basic algorithm Cont'd
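
A minimal runnable sketch of the pseudo code above, assuming NumPy and a feature space stored as an `(n_points, n_features)` array. The name `mean_shift_basic`, its parameters, the `max_iterations` guard, and the toy data are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def mean_shift_basic(features, bandwidth, threshold=1e-3, max_iterations=100, seed=0):
    """Basic mean shift with a uniform (flat) kernel and no cluster merging."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    labels = np.full(len(features), -1)  # -1 marks an unvisited point
    centers = []
    while np.any(labels == -1):
        # Select a random unvisited point as the initial mean
        start = rng.choice(np.flatnonzero(labels == -1))
        mean = features[start]
        tracked = np.zeros(len(features), dtype=bool)
        tracked[start] = True
        for _ in range(max_iterations):
            # Uniform window: every point within the bandwidth counts equally
            in_window = np.linalg.norm(features - mean, axis=1) <= bandwidth
            tracked |= in_window
            new_mean = features[in_window].mean(axis=0)
            shift = np.linalg.norm(new_mean - mean)
            mean = new_mean
            if shift < threshold:
                break
        # Assumption: only points not yet visited join the new cluster
        labels[tracked & (labels == -1)] = len(centers)
        centers.append(mean)
    return labels, np.array(centers)

# Two well-separated toy groups of four points each
features = np.array([[0, 0], [0.5, 0], [0, 0.5], [0.5, 0.5],
                     [10, 10], [10.5, 10], [10, 10.5], [10.5, 10.5]])
labels, centers = mean_shift_basic(features, bandwidth=2.0)
print(len(centers))  # 2
```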
--

**Wait, we are not finished yet!**

---
class: top, left
## Merging clusters
--

* With a uniform kernel, the center may not move.
--

* Merging is the solution.
--

* Merge when the distance between the two centers < 0.5 * bandwidth.
--

```python
Extract feature space from image
While number of unvisited points > 0:
    Select a random unvisited point in feature space (initial mean)
    While true:
        Get distance between mean and all points in feature space
        For uniform window, select points within the specified bandwidth and track those points
        Get the new mean: the mean of the points within the bandwidth
        if distance between new and old means < threshold:
            merged = false
            for c in clusters:
                # Check merge condition
                if distance(c, new mean) < 0.5 * bandwidth:
                    mean of cluster c = 0.5 * (c + new mean)
                    cluster all tracked points to cluster c
                    merged = true
                    break
            # No merge
            if not merged:
                cluster all tracked points to new mean
            # Update visited points
            update number of visited points
            break
        otherwise, set mean = new mean
```

---
## Bandwidth selection
--

* A larger bandwidth gives fewer clusters.
--

* A smaller bandwidth gives more clusters.
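---
## Bandwidth selection cont'd

A hedged sketch of the effect of bandwidth, assuming NumPy and a uniform-kernel mean shift that merges centers closer than half the bandwidth. The function `mean_shift`, its parameters, the midpoint merge, and the 1D toy data are illustrative assumptions rather than a reference implementation.

```python
import numpy as np

def mean_shift(features, bandwidth, threshold=1e-3, max_iterations=100, seed=0):
    """Uniform-kernel mean shift with merging of nearby centers."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    labels = np.full(len(features), -1)  # -1 marks an unvisited point
    centers = []
    while np.any(labels == -1):
        start = rng.choice(np.flatnonzero(labels == -1))
        mean = features[start]
        tracked = np.zeros(len(features), dtype=bool)
        tracked[start] = True
        for _ in range(max_iterations):
            in_window = np.linalg.norm(features - mean, axis=1) <= bandwidth
            tracked |= in_window
            new_mean = features[in_window].mean(axis=0)
            shift = np.linalg.norm(new_mean - mean)
            mean = new_mean
            if shift < threshold:
                break
        # Merge condition: an existing center closer than 0.5 * bandwidth
        target = None
        for index, center in enumerate(centers):
            if np.linalg.norm(center - mean) < 0.5 * bandwidth:
                centers[index] = 0.5 * (center + mean)  # assumed: midpoint
                target = index
                break
        if target is None:  # no merge: start a new cluster
            centers.append(mean)
            target = len(centers) - 1
        labels[tracked & (labels == -1)] = target
    return labels, np.array(centers)

# Toy 1D feature space with groups near 0, 4, and 20
features = np.array([[0.0], [0.2], [0.4], [4.0], [4.2], [4.4],
                     [20.0], [20.2], [20.4]])
_, small_centers = mean_shift(features, bandwidth=1.0)
_, large_centers = mean_shift(features, bandwidth=6.0)
print(len(small_centers), len(large_centers))  # 3 2
```

With the larger bandwidth, the groups near 0 and 4 fall inside one window and end up in the same cluster.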
---
class: center, middle
# Thanks