Monday, 16 May 2016

Watson Analytics 2.0

Continuing from where we left off last time: we covered two of the five sections of the software, indicated in grey in the diagram below. So this time we are discussing the remaining three, 'Assemble' through 'Refine'.


Assemble: With 'Assemble', making a dashboard has never been easier. The software has reduced chart making to a simple 'drag and drop' process. The section has templates and layouts that users can use to make all kinds of dashboards. Once the template and layout are selected, fields are simply dragged onto axes and filters to make graphs, as shown in the diagram below.

There are several chart types to choose from. It is as simple as clicking on the chart that meets the objective, and the job is done. Below is a diagram showing the types of charts available.

Social Media: This section scrapes social media for data based on criteria specified by the user. The criteria include fields like language and source. The data collected can be analyzed by topic, theme, geography, sentiment, active authors and demographics. This tool is useful for finding out the buzz around a service (or a product, or maybe even a topic) in social media. The software gives clues from the data generated around the chosen topic in the form of a word cloud, just like in the diagram.
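For anyone wondering what sits behind a word cloud, it is essentially a count of how often each term turns up. Below is a minimal Python sketch of that idea. It is not how Watson Analytics does it; the function name `top_terms`, the tiny stop-word list and the sample posts are all made up for illustration.

```python
from collections import Counter
import re

# Hypothetical stop-word list; a real tool would use a far longer one.
STOP_WORDS = {"the", "a", "is", "and", "to", "of", "in", "on", "for", "it", "this"}

def top_terms(posts, n=5):
    """Return the n most frequent non-stop-words across a list of posts."""
    counts = Counter()
    for post in posts:
        # Lowercase the text and keep only runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", post.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Made-up sample posts, for illustration only.
sample_posts = [
    "The new phone battery is great",
    "Battery life on this phone is poor",
    "Great camera, great battery",
]

print(top_terms(sample_posts, 3))
```

The bigger the count, the bigger the word would be drawn in the cloud.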




Users can include these clues as terms closely related to the search topic to get better results. They can also break down the topic by specifying a criterion of some sort; this is known as a 'Theme' in Watson Analytics. Once the search is set up, a dataset is generated which can then be analyzed.


The diagram above shows a 'Sentiment' chart from the data collected. The dataset fields are displayed at the bottom of the chart. Other information like topics, themes, languages, dates, documents and mentions is displayed at the top. All these details specify the search and scrape criteria for the data. Users can switch charts by using the chart criteria to display charts for the criteria that were specified earlier.

Refine: Watson Analytics assesses data before use. The results of the assessment are usually displayed on the tile that links to the data, as shown in the diagram below.

It assesses data for missing values and inconsistencies that may affect use of the data and then awards marks, with 100 going to data that is consistent and has no missing values. The 'Refine' section does exactly what it says on the box. It is used for cleaning, munging and editing data. The section lets users group data, form hierarchies and do calculations. Users can also search data and include or exclude rows and columns. Each column is assessed, and marks and charts are displayed showing the quality of the data.
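To give a feel for what a data quality mark could look like, here is a small Python sketch. Watson Analytics' actual scoring method is not described here, so this is only an assumption: each column is scored by the share of values that are not missing, and the dataset score is the average of the column scores. The names `column_quality` and `dataset_quality` and the sample data are hypothetical.

```python
def column_quality(values):
    """Score a column from 0 to 100 by the share of non-missing values.

    Assumption: a value counts as missing if it is None or a blank string.
    """
    if not values:
        return 0.0
    present = sum(1 for v in values if v is not None and str(v).strip() != "")
    return 100.0 * present / len(values)

def dataset_quality(columns):
    """Average the per-column scores; columns maps a name to a list of values."""
    if not columns:
        return 0.0
    scores = [column_quality(vals) for vals in columns.values()]
    return sum(scores) / len(scores)

# Made-up sample data, for illustration only.
sample = {
    "Item": ["Pen", "Pencil", "", "Binder"],
    "Units": [10, None, 5, 7],
    "Rep": ["Jones", "Kivell", "Gill", "Smith"],
}

for name, vals in sample.items():
    print(name, column_quality(vals))
print("Overall", round(dataset_quality(sample), 1))
```

A fuller version would also check for inconsistencies, such as mixed types in one column, which this sketch leaves out.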

There is still much to learn in Watson Analytics; this blog is just a head start. As always, there's more information on the internet.

Give Watson Analytics a try. Thanks for reading.

Monday, 2 May 2016

Introducing Watson Analytics

Watson Analytics

So you have 100,000 rows of data to analyze, you cannot afford to hire a Data Scientist and you do not have the required skills. Have no fear, IBM Watson Analytics has come to the rescue.

Watson Analytics is IBM's attempt to do for data science what Apple has done for computing. They have managed to make the subject simple enough for mere mortals like you and me (with little or no knowledge of Data Science) to understand.


The software has sections that it uses to work on data, as shown in the image above. This post covers two of them:

Explore: This section deals with data exploration, which is just getting a feel for the data in an attempt to find out what it is all about. The section offers an array of chart options that paint a good picture of the data and let users probe it for insights that will inform decisions. Once the data is uploaded and refined, Watson dissects it and comes up with questions that users might be interested in finding answers to.


Clicking on the question that best matches your objectives will reveal detailed answers with charts. The software will produce lots of graphs pertaining to the question asked. Clicking on a graph reveals more detailed information. The image below shows graphs made from the data we tested with.


Predict: This section is basically linear regression in a graphical user interface. It lets users predict values using variables in the dataset. Dependent variables are predicted using independent variables. Watson examines the quality of the dataset and scores it. This enables users to make changes that ensure accurate predictions. The software then explains the dataset using all sorts of graphs and tables showing things like skewness of the data, outliers, box plots etc. For users who are novices in linear regression (which is most of us), Watson is able to point out variables that predict other variables. In my dataset, the Item variable drives Unit cost, as shown below.
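To make the idea of predicting one variable from another concrete outside the tool, here is a minimal Python sketch of simple linear regression using ordinary least squares. It is a textbook illustration with one numeric predictor, not a description of what Watson Analytics runs internally. The function names `fit_line` and `predict` and the numbers are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict the dependent variable for a new value of x."""
    return slope * x + intercept

# Made-up numbers, for illustration only: total is exactly 2 * units + 1.
units = [1, 2, 3, 4, 5]
total = [3, 5, 7, 9, 11]

slope, intercept = fit_line(units, total)
print(slope, intercept)
print(predict(6, slope, intercept))
```

Here `units` plays the independent variable and `total` the dependent one. Real data will not sit on a perfect line, which is why the tool also reports how strong a prediction is.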


It also shows field associations, which is another way of showing which variables are correlated, as shown below.
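One common way to put a number on how strongly two numeric fields move together is the Pearson correlation coefficient. The Python sketch below shows that calculation. Whether Watson Analytics uses this exact measure for its field associations is not stated here, so treat it as an assumption. The function name `pearson` and the sample numbers are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists.

    Returns a value from -1 (move in opposite directions) to 1 (move together).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("each list needs at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Made-up numbers, for illustration only.
units = [1, 2, 3, 4, 5]
total = [3, 5, 7, 9, 11]     # rises with units
discount = [10, 8, 6, 4, 2]  # falls as units rise

print(pearson(units, total))
print(pearson(units, discount))
```

Values close to 1 or -1 point to a strong association, while values near 0 suggest the two fields have little linear relationship.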




It also shows the degree and accuracy of predictability in a circular diagram, just like the one shown below. It offers alternatives where necessary too. In my dataset, for example, Rep provides a better prediction for Item, with a predictive strength of 99.6%.



With this kind of information, users now know which variables to tweak to get results from other variables. I will discuss the other sections in subsequent posts.


 Enjoy Watson.