IBM Watson Analytics preps the data so you don't have to

The new IBM Watson Analytics promises to remove many of the headaches around data cleansing and analysis

Given a set of data, with descriptive column headers, IBM Watson Analytics can generate a set of visualizations showing possible trends of interest

With the newest commercial service to spring from its highly publicised Watson cognitive computing initiative, IBM is attempting to streamline the process of analysing data so business managers can pull insights from data sets without the help of IT experts.

"Often what business users do is rely on a data scientist or business analysts to help them, which can be too slow, or those people may not be available. So they make a decision without any analysis," said Eric Sall, IBM vice president of marketing for business analytics.

The new service, IBM Watson Analytics, launched as an invitational beta trial, can walk the user through data analysis by way of a series of interactive steps. At an event in New York on Tuesday, the company demonstrated how the service would work.

The service can upload data, cleanse it, understand the user's requirements through an interactive natural language query session and then produce a range of analysis and accompanying visualisations that may be of interest.

"It's the full range of analytical capabilities integrated together in one user experience," Sall said in an interview before the launch event.

The service is powered by a set of different IBM analysis technologies, such as Cognos and SPSS, and adds some machine learning capabilities from the Watson project.

In a series of presentations, IBM showed how Watson Analytics could be used. None of the demonstrations were live; the audience was shown screen shots taken before the presentation. But they offered a glimpse of how the technology, which IBM is still refining, would work.

As a first step, the user uploads a Comma Separated Values (CSV) or Excel Binary File Format (XLS) flat file to the service. A CSV file can easily be extracted from a spreadsheet or a database table. The data values are separated by commas and each column of data has a name, usually given to it by the administrator who created the table or spreadsheet.

Watson first analyses the file to estimate the quality of the data. A data set may have missing values, or be too small to draw meaningful conclusions from. Certain columns may not have any statistical interest -- they all may have the same value for instance. Watson offers an overall score of the cleanliness of the data, as well as how much potentially interesting analysis could be derived from the set.
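To make the idea of a data-quality check concrete, here is a minimal sketch in Python. It is not IBM's method: the function names, the sample data and the scoring formula are all assumptions made for illustration. It flags missing values and constant columns, the two problems described above, and folds them into one hypothetical score.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative only.
SAMPLE = """department,region,overtime_hours,country
sales,east,5,US
testing,west,,US
testing,east,40,US
sales,,2,US
"""


def profile_columns(csv_text):
    """Report the share of missing values and a constant-column flag per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    report = {}
    for name in rows[0]:
        values = [(row[name] or "").strip() for row in rows]
        present = [value for value in values if value]
        report[name] = {
            "missing_share": 1 - len(present) / len(values),
            # A column whose present values are all identical has no statistical interest.
            "constant": len(set(present)) <= 1,
        }
    return report


def quality_score(report):
    """Assumed scoring rule: mean share of present values, with constant columns scored 0."""
    if not report:
        return 0.0
    per_column = [
        0.0 if stats["constant"] else 1 - stats["missing_share"]
        for stats in report.values()
    ]
    return sum(per_column) / len(per_column)
```

Calling `profile_columns(SAMPLE)` and passing the result to `quality_score` gives a single number between 0 and 1. A real service would presumably weigh many more signals, such as the size of the data set.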

The service will then examine the data set for any interesting correlations that may be occurring, and look for more obscure second-order effects that someone looking at a spreadsheet may miss.

In effect, Watson Analytics runs through the data and prepares visualisations for all the possible relations among the data elements, emphasising those that might be of most interest to the user. The user can then click on the charts of interest, and Watson makes notes of what its users are looking for.
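One simple way to picture this ranking step is to score every pair of numeric columns by the strength of their correlation and surface the strongest pairs first. The sketch below does that with a plain Pearson coefficient. It is an assumption about how such a ranking could work, not a description of Watson's internals, and it covers only pairwise relations, not the second-order effects IBM mentioned. The column names and values are invented.

```python
from itertools import combinations
from math import sqrt

# Hypothetical numeric columns; left_company uses 1 for departed, 0 for still employed.
COLUMNS = {
    "overtime_hours": [2, 5, 30, 40],
    "left_company": [0, 0, 1, 1],
    "tenure_years": [3, 1, 4, 2],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_pairs(columns):
    """Return ((col_a, col_b), r) tuples, strongest absolute correlation first."""
    scored = [
        ((a, b), pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

With this invented data, `rank_pairs(COLUMNS)` puts the overtime and departure columns at the top of the list, which is the kind of pairing a tool might choose to chart first.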

It offers both basic analysis and predictive analysis, where it can make guesses at future behaviour based on metrics of past behaviour.

This is not to say that Watson does all the work of analysing the data. Helping Watson understand the data requires that all the columns be descriptively named. The set-up process also requires domain experts to identify those columns that they are most interested in learning more about, to prevent Watson from endlessly churning up relations of little interest.

During a customer panel at the event, a number of data scientists and business intelligence system managers expressed optimism for the new offering.

Justin Croft, manager for brand platforms and analytics for U.S. wireless communication provider C Spire, noted that this service could be immediately useful to the company's sales and marketing staff, who could use it to ferret out reasons why there may be a spike or dip in sales volumes. Typically, the staff may have "far more questions than there is time to track down the answers," he said.

Though IBM is marketing the service directly to end users, it can also be embedded in other enterprise software. SugarCRM, electronics distributor AVNET, and Truven Health Care analytics have all shown an interest in folding Watson capabilities into their own offerings.

IBM itself is folding Watson Analytics into another, as-yet-unnamed cloud service for its Smarter Workforce portfolio that it plans to launch later this month for human resource (HR) departments. Among the features offered by this service is a Watson Analytics-powered service allowing HR professionals to ask questions about information typically captured by HR offices.

A demo showed how someone, for instance, could look for drivers of attrition -- why employees are leaving a company.

A company may keep a spreadsheet of workers employed in 2014 and what departments they belong to, along with a column indicating whether or not each employee is still with the company. Other columns may capture additional factors such as the geographic location of the employees, number of hours each employee worked and so on.

Watson will automatically create a chart of the number of people who have left per department, such as sales or testing, as well as charts that show the number of departed employees by other factors, such as all the departures in each geographic region.

Watson Analytics also has a learning component that will be able to build an ontology, or thesaurus, based on user input. If users enter the phrase "employees who quit or were fired" and then click on the data sets built around the attrition column, the service will come to equate the term "attrition" with the terms "quit" and "fired," and will offer the attrition information more readily in future searches with the words "quit" or "fired."
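The click-driven learning described here can be pictured as a running tally of which query words led users to which columns. The following sketch is a deliberately naive illustration under that assumption. The class name, the stop-word list and the whitespace tokenising are all hypothetical, and nothing in it reflects how IBM actually builds its ontology.

```python
from collections import Counter, defaultdict

# Assumed stop words; a real system would use a far richer language model.
STOP_WORDS = {"a", "an", "and", "or", "the", "who", "were", "was", "of", "in"}


class TermIndex:
    """Learns which query words users associate with which data columns."""

    def __init__(self):
        # Maps each query word to a count of the columns clicked after it was used.
        self._votes = defaultdict(Counter)

    @staticmethod
    def _terms(query):
        # Naive tokenising: lower-case, split on whitespace, drop stop words.
        return [word for word in query.lower().split() if word not in STOP_WORDS]

    def record_click(self, query, column):
        """Note that a user who typed `query` went on to click `column`."""
        for word in self._terms(query):
            self._votes[word][column] += 1

    def suggest(self, query):
        """Return columns previously clicked for any word in `query`, most clicked first."""
        tally = Counter()
        for word in self._terms(query):
            tally.update(self._votes.get(word, {}))
        return [column for column, _ in tally.most_common()]
```

After `index.record_click("employees who quit or were fired", "attrition")`, a later call such as `index.suggest("people fired in 2014")` would return the attrition column, mirroring the behaviour IBM described.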

In the hypothetical example IBM provided, the data showed that one department, testing, had a higher turnover of employees, proportionally, than any other department. While this information could be fairly easily garnered from the spreadsheet itself, Watson Analytics was also able to call up information on how many overtime hours each employee in testing worked, a correlation not so readily observable by a visual scan of the spreadsheet.

In this example, testing employees clocked far more overtime hours than the average for the company, and the heavy workload could be seen as a prime culprit for the departures.
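The attrition example boils down to two group-by calculations: the share of departed employees per department, and the average overtime per department. The sketch below reproduces that reasoning on invented records. The field names and figures are assumptions, not data from IBM's demo.

```python
from collections import defaultdict

# Invented records mirroring the spreadsheet described in the demo.
EMPLOYEES = [
    {"department": "sales", "still_employed": True, "overtime_hours": 4},
    {"department": "sales", "still_employed": True, "overtime_hours": 6},
    {"department": "sales", "still_employed": True, "overtime_hours": 2},
    {"department": "sales", "still_employed": False, "overtime_hours": 8},
    {"department": "testing", "still_employed": False, "overtime_hours": 30},
    {"department": "testing", "still_employed": False, "overtime_hours": 40},
    {"department": "testing", "still_employed": False, "overtime_hours": 35},
    {"department": "testing", "still_employed": True, "overtime_hours": 25},
]


def attrition_by_department(rows):
    """Share of employees in each department who have left the company."""
    totals = defaultdict(int)
    departed = defaultdict(int)
    for row in rows:
        totals[row["department"]] += 1
        if not row["still_employed"]:
            departed[row["department"]] += 1
    return {dept: departed[dept] / totals[dept] for dept in totals}


def mean_overtime_by_department(rows):
    """Average overtime hours per employee in each department."""
    hours = defaultdict(list)
    for row in rows:
        hours[row["department"]].append(row["overtime_hours"])
    return {dept: sum(values) / len(values) for dept, values in hours.items()}
```

Placing the two results side by side shows the pattern from the demo: the department with the highest attrition rate is also the one with the heaviest overtime. That is a correlation worth investigating, not proof of cause.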

Watson Analytics is the latest move in IBM's US$1 billion initiative to commercialise Watson technologies.

IBM Research developed Watson to compete with human contestants on the "Jeopardy" game show in 2011, using natural language processing and analytics, as well as many sources of structured and unstructured data, to formulate responses to the show's questions.

In the years since, the company has been working to commercialise the Watson technology by identifying industries that could benefit from this form of cognitive computing, such as health care, law enforcement and finance.

IBM expects that Watson Analytics will go live by the end of November. Then, the service can be used at no cost, for up to a certain number of data sources (the exact number IBM has still to determine). IBM will charge for using the service with additional data sources.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
