demnag

Azure Machine Learning Studio is a great place to experiment with analytics solutions. Building, testing and deploying a solution in Azure ML Studio is a breeze.

We had to group US counties based on the cost of home insurance in them. Here is a brief description of how we used Azure ML Studio to cluster counties by their home insurance values.

You have to log in before you can create and run experiments in Azure ML Studio. Creating an account is easy: you can start with a free account, which does not require a credit card.

Once you are logged in, you can create experiments. A large variety of modules for tasks like regression, clustering and classification are readily available and can simply be dragged and dropped into the workspace. You can also run custom Python and R scripts on the data.

Data for the experiments can be imported from your local computer. To import a dataset, click the “New” button at the bottom left of the screen, then select the files and fill in the meta information about them in the popup that appears. Once the data is imported, it can also be dragged and dropped onto the workspace.

To create an experiment, click the “Experiments” tab on the left side of the screen. Create a new experiment there and give it an appropriate title. Once the experiment is created, you can add modules to it.

The K-means clustering algorithm is available as a module in Azure ML Studio. To use it, add the “K-Means Clustering” module to the experiment. You can find the available modules by navigating the drop-down lists or by typing in the search box.

Before training the model, you must configure the module. You can find detailed documentation on K-Means Clustering here.
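If you want a feel for what the module does before configuring it, here is a minimal plain-Python sketch of K-means (Lloyd's algorithm). It is not Azure ML Studio's implementation. The `kmeans` function, its `k`, `iterations` and `seed` parameters, the county names and the premium figures are all made up for illustration.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100, seed=0):
    """Cluster a list of numeric tuples into k groups (illustrative sketch only)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen input points
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_assignments == assignments:
            break  # nothing changed, so the clustering has converged
        assignments = new_assignments
    return assignments, centroids

# Hypothetical yearly home insurance premiums per county (not real data).
premiums = {
    "County A": 650.0, "County B": 700.0, "County C": 720.0,
    "County D": 1500.0, "County E": 1550.0, "County F": 1620.0,
    "County G": 3100.0, "County H": 3250.0,
}
points = [(value,) for value in premiums.values()]
assignments, centroids = kmeans(points, k=3)
for county, cluster in zip(premiums, assignments):
    print(county, "-> cluster", cluster)
```

The number of clusters and the iteration limit are the kind of settings you choose when configuring a K-means module. Because the starting centroids are picked at random, a different `seed` can give a different grouping, so it is worth checking that the clusters make sense.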

Finally, add a “Convert to CSV” module and pass the result dataset of “Train Clustering Model” to it as input, so that the clusters can be downloaded from the experiment.

Run the experiment by clicking Run. This will take a few minutes, depending on the dataset and the modules used. Once the experiment has finished, download the result via:

Convert to CSV >> Result Dataset >> Download
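Once the file is on your machine, a quick summary per cluster helps to check the result. The sketch below uses only the Python standard library. The column names `County`, `Premium` and `Assignments`, the sample rows and the `summarize_clusters` helper are assumptions for illustration, so adjust them to match the header of your own download.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the downloaded file (made-up rows). With a real download,
# pass open("your_file.csv", newline="") to the function instead.
sample_csv = """County,Premium,Assignments
County A,650,0
County B,700,0
County C,1500,1
County D,1620,1
County E,3100,2
"""

def summarize_clusters(handle, label_col="Assignments", value_col="Premium"):
    """Return {cluster label: (county count, mean value)} from a CSV file handle."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[row[label_col]].append(float(row[value_col]))
    return {label: (len(vals), sum(vals) / len(vals)) for label, vals in groups.items()}

summary = summarize_clusters(io.StringIO(sample_csv))
for label, (count, mean) in sorted(summary.items()):
    print(f"cluster {label}: {count} counties, mean premium {mean:.2f}")
```

Cluster labels are kept as strings here, exactly as they appear in the file, so no assumption is made about how the labels are encoded.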

Here is a visualization of the clusters we got.