Included with this assignment is an Excel spreadsheet that contains data with two dimension values. The purpose of this assignment is to demonstrate the steps performed in a K-Means Cluster analysis. Review the "k-MEANS CLUSTERING ALGORITHM" section in Chapter 4 of the textbook for additional background. Use Excel to perform the following data analysis. You will use Excel to help with calculations, but only standard functions should be used (i.e., don't use a plug-in to perform the analysis for you). You need to show your work doing this analysis the long way. If you were to repeat steps 4 through 6, what will likely happen with the cluster centroids? The rubric for this assignment can be viewed by clicking on the assignment link. Here is a link to an example spreadsheet using a smaller data set. It contains two tabs. The first tab is the raw data. The second tab contains the analysis that was performed. Make sure that you use different starting center points from the example. The attached file has 43 data points. Please complete all.

K-Means Cluster analysis is a widely used technique in data mining and machine learning that aims to classify data points into distinct groups or clusters. The purpose of this assignment is to demonstrate the steps involved in performing a K-Means Cluster analysis using Excel.

To start with, I have reviewed the “k-MEANS CLUSTERING ALGORITHM” section in Chapter 4 of the textbook to gain additional background knowledge. The algorithm involves the following steps:

1. Select the number of clusters (k) that you want to identify in the data. In this assignment, the number of clusters is not specified, so you will need to decide based on your analysis.

2. Initialize the centroids randomly or by using a specific method. In this case, you are asked to use a different set of starting centroids from the example provided in the spreadsheet.

3. Assign each data point to the nearest centroid based on the Euclidean distance. This step calculates the distance between each data point and each centroid and assigns the point to the cluster with the closest centroid.

4. Recalculate the centroids by taking the mean of all the points assigned to each cluster. This step calculates the average position of all the data points in each cluster to update the centroids.

5. Repeat steps 3 and 4 until convergence is achieved. Convergence is reached when the centroids no longer change significantly, or when a specified number of iterations is reached.
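The five steps above can be sketched in a few lines of Python. This is only an illustration of the loop, not the assignment's required Excel workflow. The function name `kmeans`, the tolerance `tol`, the iteration cap, and the six sample points are all assumptions made up for the example; they are not taken from the assignment's spreadsheet.

```python
import math

def kmeans(points, centroids, max_iters=100, tol=1e-9):
    """Plain K-Means on 2-D points; returns (centroids, labels, iterations)."""
    centroids = list(centroids)
    labels = []
    for iteration in range(1, max_iters + 1):
        # Step 3: assign each point to the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # empty cluster: keep the old centroid
        # Step 5: stop once no centroid moved more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, labels, iteration

# Hypothetical sample data (not the 43 points from the assignment).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
start = [(1.0, 1.0), (2.0, 1.0)]  # step 2: chosen starting centroids, k = 2

final_centroids, labels, iterations = kmeans(points, start)
print(labels)      # [0, 0, 0, 1, 1, 1]
print(iterations)  # 3
```

With these sample points the centroids settle near (1.33, 1.33) and (8.33, 8.33). The third pass changes nothing, which is the stopping condition described in step 5.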

Now, let’s discuss the specific instructions for this assignment. You have been provided with an Excel spreadsheet containing 43 data points, each with two dimension values. Your task is to perform a K-Means Cluster analysis on this data using Excel, showing your work the long way, without plug-ins or shortcuts.

To do this, you will need to follow the steps outlined above. Firstly, you need to decide on the number of clusters (k) you want to identify in the data. This decision could be based on the characteristics of the data or any prior knowledge you have.

Next, you will initialize the centroids, choosing starting points that differ from those in the provided example. The choice matters because K-Means can converge to different clusterings from different starting centroids.

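Then, you will assign each data point to the nearest centroid based on the Euclidean distance. In Excel, the distance from a point to a centroid can be computed with standard functions such as SQRT and SUMXMY2 (or SQRT of the sum of the squared differences written out by hand). MIN then gives the smallest of a point's distances, and MATCH identifies which centroid it belongs to.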
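The distance calculation for a single data point can be written out as follows. This is a minimal sketch of what one spreadsheet row would hold: one distance column per centroid, plus the index of the smallest. The helper name `nearest_centroid` and the sample coordinates are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
import math

def nearest_centroid(point, centroids):
    """Return (index of the closest centroid, list of all distances)."""
    x, y = point
    # Euclidean distance: square root of the summed squared differences.
    distances = [math.sqrt((x - cx) ** 2 + (y - cy) ** 2) for cx, cy in centroids]
    return distances.index(min(distances)), distances

# Hypothetical point and three hypothetical centroids.
index, distances = nearest_centroid((4.0, 5.0), [(1.0, 1.0), (4.0, 3.0), (10.0, 13.0)])
print(distances)  # [5.0, 2.0, 10.0]
print(index)      # 1
```

The point is assigned to the second centroid (index 1) because 2.0 is the smallest of its three distances.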

After that, you will recalculate the centroids by taking the mean of all the points assigned to each cluster. In Excel, AVERAGEIF can average the x values and the y values separately for the rows that carry a given cluster label.
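The centroid update amounts to a conditional average per cluster. The sketch below assumes the points have already been labeled. The function name `update_centroids`, the choice to return `None` for an empty cluster, and the sample data are illustrative assumptions.

```python
def update_centroids(points, labels, k):
    """Mean x and mean y of the points carrying each cluster label 0..k-1."""
    centroids = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            centroids.append(None)  # no points assigned; the caller must handle this
            continue
        mean_x = sum(x for x, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        centroids.append((mean_x, mean_y))
    return centroids

# Hypothetical labeled points: the first two in cluster 0, the last two in cluster 1.
points = [(1.0, 2.0), (3.0, 4.0), (10.0, 10.0), (12.0, 14.0)]
labels = [0, 0, 1, 1]
new_centroids = update_centroids(points, labels, 2)
print(new_centroids)  # [(2.0, 3.0), (11.0, 12.0)]
```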

You will then repeat steps 3 and 4 until convergence is achieved. This means that you will assign data points to clusters and recalculate centroids iteratively until the centroids no longer change significantly or until a specified number of iterations is reached. With each repetition the centroids typically move less, until the cluster assignments stop changing and the centroids stay fixed.

Throughout the analysis, make sure you document all the steps and calculations in Excel. This will help you show your work the long way, as required by the assignment.

In conclusion, this assignment requires you to perform a K-Means Cluster analysis on the given data set using Excel. You need to follow the steps outlined in the k-MEANS CLUSTERING ALGORITHM section of the textbook, showing your work the long way without using any plug-ins.
