Monday, January 24, 2011


Hierarchical Mapping – An Overview -12169 - SwapneelShetkar

Cluster Analysis: Important Points as covered in class
Clustering – grouping similar objects together, e.g., customers into segments
Types – hierarchical clustering and k-means clustering
Hierarchical clustering: used when the number of objects to be clustered is less than about 50. If there are more than 50 objects we use k-means instead.
We have 2 types of hierarchical clustering:
Divisive clustering: all objects are considered as one big group, which is divided into sub-clusters, and we keep dividing until each object is its own group
Agglomerative clustering: all objects start as individual clusters and are combined step by step until we get one big cluster
SPSS follows agglomerative clustering (a small illustrative sketch follows below)
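The class demo used SPSS; purely as an illustration, here is a minimal sketch of agglomerative clustering in Python with SciPy. The customer data, the variable names and the choice of three segments are made up for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Made-up customer data: each row is [annual spend (thousands), store visits per month]
customers = np.array([
    [10, 2],
    [12, 3],
    [30, 8],
    [32, 9],
    [55, 1],
], dtype=float)

# Agglomerative clustering: every customer starts as its own cluster and the two
# closest clusters are merged at each step until only one big cluster remains.
merge_tree = linkage(customers, method="average", metric="euclidean")

# Cut the tree into three customer segments (an arbitrary choice for the example).
segments = fcluster(merge_tree, t=3, criterion="maxclust")
print(segments)  # cluster label for each customer
```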

How to cluster:
Selection of variables to cluster – e.g., all people from the same specialisation go into one group
Distance measurement – objects that are at a smaller distance get clustered first, then the farther ones. E.g., we form a group of every third person in the class. In business applications we use Euclidean distance, i.e., the shortest (straight-line) distance between two points.
Clustering criteria – distance measurement gives the distance between two objects; a clustering criterion gives the distance between two clusters, or between a cluster and an object, e.g., taken from the midpoint of the cluster. Different linkage methods are used:
· Between-group linkage
· Nearest neighbour
· Furthest neighbour
· Centroid
Statistically, we cut off clustering when the next object to join does so at a relatively larger distance than the previous merges (see the sketch below).
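To make the distance measurement and clustering criteria concrete, here is a small sketch with SciPy, again only as a stand-in for SPSS. Mapping the names above to SciPy's linkage methods is my assumption (nearest neighbour ≈ "single", furthest neighbour ≈ "complete", between-group linkage ≈ "average", centroid ≈ "centroid"), and the sample points are made up.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage

# Made-up 2-D objects to be clustered.
points = np.array([[1.0, 1.0], [1.5, 1.2], [5.0, 5.0], [5.2, 4.8], [9.0, 1.0]])

# Distance measurement: Euclidean (straight-line) distance between every pair of objects.
print(np.round(squareform(pdist(points, metric="euclidean")), 2))

# Clustering criteria: the same data merged under different linkage methods.
# Column 2 of the linkage matrix is the distance at which each merge happens;
# a sudden jump in that distance is a natural place to cut off the clustering.
for method in ("single", "complete", "average", "centroid"):
    tree = linkage(points, method=method, metric="euclidean")
    print(method, np.round(tree[:, 2], 2))
```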

Dendrogram: a visual representation of how the clusters form – which objects combine first and which combine later.
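As a sketch of what a dendrogram looks like in practice, here is one way it could be drawn with SciPy and Matplotlib (SPSS produces its own dendrogram output; the customers here are the made-up data from the earlier sketch).

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

customers = np.array([[10, 2], [12, 3], [30, 8], [32, 9], [55, 1]], dtype=float)
merge_tree = linkage(customers, method="average", metric="euclidean")

# The height of each join is the distance at which those clusters combined:
# low joins combine first, high joins combine later.
dendrogram(merge_tree, labels=["C1", "C2", "C3", "C4", "C5"])
plt.ylabel("Merge distance")
plt.show()
```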

Literature Review
http://www.norusis.com/pdf/SPC_v13.pdf
Although both cluster analysis and discriminant analysis classify objects (or cases) into categories, discriminant analysis requires you to know group membership for the cases used to derive the classification rule. The goal of cluster analysis is to identify the actual groups.
Identifying groups of individuals or objects that are similar to each other but different from individuals in other groups can be intellectually satisfying, profitable, or sometimes both. Using your customer base, you may be able to form clusters of customers who have similar buying habits or demographics. You can take advantage of these similarities to target offers to subgroups that are most likely to be receptive to them. Based on scores on psychological inventories, you can cluster patients into subgroups that have similar response patterns. This may help you in targeting appropriate treatment and studying typologies of diseases. By analyzing the mineral contents of excavated materials, you can study their origins and spread.

Hierarchical clustering is one of the most straightforward methods. It can be either agglomerative or divisive. Agglomerative hierarchical clustering begins with every case being a cluster unto itself. At successive steps, similar clusters are merged. The algorithm ends with everybody in one jolly, but useless, cluster. Divisive clustering starts with everybody in one cluster and ends up with everyone in individual clusters. Obviously, neither the first step nor the last step is a worthwhile solution with either method. In agglomerative clustering, once a cluster is formed, it cannot be split; it can only be combined with other clusters. Agglomerative hierarchical clustering doesn’t let cases separate from clusters that they’ve joined. Once in a cluster, always in that cluster.
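A tiny sketch (made-up one-dimensional data, SciPy again as a stand-in) of the "once in a cluster, always in that cluster" behaviour: the linkage matrix records one irreversible merge per step, going from every object as its own cluster down to a single cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

points = np.array([[0.0], [0.1], [0.9], [1.0], [5.0]])  # five 1-D objects
tree = linkage(points, method="single", metric="euclidean")

n = len(points)  # clusters 0..n-1 are the original objects
for step, (a, b, dist, size) in enumerate(tree, start=1):
    # Each row merges two existing clusters into a new one; the merge is never undone.
    print(f"step {step}: merge {int(a)} and {int(b)} at distance {dist:.2f} "
          f"-> new cluster {n + step - 1} with {int(size)} objects")
```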