Google service analyzes live streaming data

Google Cloud Dataflow can analyze both streaming and batched data with the same programming model

Google's Dataflow can analyze data as it crosses the wire

Taking what many see as the next step in big data analysis, Google is previewing a service called Google Cloud Dataflow that analyzes live data, potentially giving users the ability to view trends and be alerted to events as they happen.

"There's an enormous amount of data being created, and so you need a way to ingest that in a more intelligent way," said Brian Goldfarb, Google Cloud Platform head of marketing. With big data, "the program models are different. The technologies are different. It requires developers to learn a lot and manage a lot to make it happen."

"It is a fully managed service that lets you create data pipelines for ingesting, transforming and analyzing arbitrary amounts of data in both batch or streaming mode, using the same programming model," Goldfarb said.

Google Cloud Dataflow is designed so the user can focus on devising proper analysis, without worrying about setting up and maintaining the underlying data piping and processing infrastructure.

It could be used for live sentiment analysis, for instance, where an organization estimates the popular sentiment around a product by scanning social networks such as Twitter. It could also be used as a security tool to watch activity logs for unusual activity.

"There are a bunch of different business applications in which it could apply. In a lot of data-centric verticals, like retail or oil and gas, a technology like this could open the door to getting analytics," Goldfarb said.

It could also be used as an alternative to commercial ETL (extract, transform and load) programs, which are widely used to prepare data for analysis by business intelligence software.

Google Cloud Dataflow is based on technologies that the company built internally for its own use, following up on work it did on the MapReduce programming model, which is used in Apache Hadoop.

Live data stream analysis appears to be the next logical step in big data analysis, a field pioneered by Hadoop. Hadoop provides a way to analyze massive amounts of unstructured data spread across multiple servers. Originally, Hadoop used MapReduce as the platform to write programs that analyze the data.

MapReduce's limitation is that it can only analyze data in batch mode, which means all the data must be collected before it can be analyzed. A number of new software programs have been developed to get around the limitation of batch processing, such as Twitter Storm and Apache Spark, which are both available as open source and can run on Hadoop.
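The difference between the two modes can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class and method names (`BatchVersusStreaming`, `countBatch`, `RunningCounter`) are hypothetical and are not part of MapReduce, Storm, Spark or Cloud Dataflow. The batch method needs the complete list of events before it can return anything, while the running counter updates its state one event at a time and can be read at any point.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from a real framework.
class BatchVersusStreaming {

    // Batch style: the whole input must be collected before counting starts.
    static Map<String, Integer> countBatch(List<String> allEvents) {
        Map<String, Integer> counts = new HashMap<>();
        for (String event : allEvents) {
            counts.merge(event, 1, Integer::sum);
        }
        return counts;
    }

    // Streaming style: state is updated per event and can be queried at any time.
    static class RunningCounter {
        private final Map<String, Integer> counts = new HashMap<>();

        void onEvent(String event) {
            counts.merge(event, 1, Integer::sum);
        }

        int countOf(String event) {
            return counts.getOrDefault(event, 0);
        }
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("login", "error", "login");

        // Batch: one answer, available only after all events are in hand.
        System.out.println("batch login count: " + countBatch(events).get("login"));

        // Streaming: an up-to-date answer after every single event.
        RunningCounter counter = new RunningCounter();
        for (String event : events) {
            counter.onEvent(event);
            System.out.println("after " + event + ": login=" + counter.countOf("login"));
        }
    }
}
```

Both paths produce the same final counts; what changes is when an answer becomes available.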

Google's own approach to live data analysis uses a number of technologies built by the company, notably Flume and MillWheel. Flume, which Google has described in research papers as FlumeJava, is a library for building data-parallel pipelines, and MillWheel provides a platform for low-latency data processing.

The service provides a software development kit that can be used to build complex pipelines and analysis. Like MapReduce, Cloud Dataflow will initially use the Java programming language. In the future, other languages may be supported.

The pipelines can ingest data from external sources and put it to a variety of uses. The service provides a library of transformations to prepare and reformat data for further analysis, and users can write their own transformations.
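As a rough sketch of what chaining ready-made and user-written transformations can look like, the following plain Java example composes three steps with the standard `java.util.function.Function` interface. It does not use the Cloud Dataflow SDK, whose API is not described here. The names `TransformPipeline`, `TRIM`, `LOWER_CASE` and `MASK_DIGITS` are hypothetical, with the first two standing in for library-provided steps and the third for a user-written one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical illustration only; this is not the Cloud Dataflow SDK.
class TransformPipeline {

    // Stand-ins for transformations a library might supply.
    static final Function<String, String> TRIM = String::trim;
    static final Function<String, String> LOWER_CASE = s -> s.toLowerCase(Locale.ROOT);

    // Stand-in for a user-written transformation: hide every digit.
    static final Function<String, String> MASK_DIGITS = s -> s.replaceAll("[0-9]", "#");

    // Applies the composed pipeline to each record in turn.
    static List<String> run(List<String> input, Function<String, String> pipeline) {
        List<String> output = new ArrayList<>();
        for (String record : input) {
            output.add(pipeline.apply(record));
        }
        return output;
    }

    public static void main(String[] args) {
        Function<String, String> pipeline = TRIM.andThen(LOWER_CASE).andThen(MASK_DIGITS);
        List<String> raw = Arrays.asList("  Order 42 SHIPPED ", "Refund 7");
        System.out.println(run(raw, pipeline)); // prints [order ## shipped, refund #]
    }
}
```

Because each step is an ordinary function, a new transformation can be added to the chain without changing the ones already in place.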

The treated dataset can be queried using Google's BigQuery service. Alternatively, the user can write modules that examine the data as it crosses the wire, looking for aberrant behavior or trends in real time.
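One simple way such a module could flag aberrant behavior is to compare each new reading with the average of the readings just before it. The plain Java sketch below assumes that approach. The `SpikeDetector` class, its window size and its threshold factor are all hypothetical choices made for illustration, not anything Google has described.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not part of any Google service.
class SpikeDetector {
    private final int windowSize;
    private final double factor;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    SpikeDetector(int windowSize, double factor) {
        this.windowSize = windowSize;
        this.factor = factor;
    }

    // Returns true when the value exceeds factor times the mean of the
    // previous windowSize readings. Nothing is flagged until the window is full.
    boolean observe(double value) {
        boolean aberrant = window.size() == windowSize
                && value > factor * (sum / windowSize);

        // Slide the window forward so it always holds the latest readings.
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();
        }
        return aberrant;
    }

    public static void main(String[] args) {
        SpikeDetector detector = new SpikeDetector(3, 2.0);
        double[] readings = {10, 12, 11, 50, 12};
        for (double reading : readings) {
            if (detector.observe(reading)) {
                System.out.println("aberrant reading: " + reading); // prints only for 50.0
            }
        }
    }
}
```

The detector keeps only the last few readings in memory, so it can run indefinitely on an unbounded stream.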

Google announced Cloud Dataflow at the company's Google I/O user conference in San Francisco. A small number of Google customers are testing it and the company plans to open it up as a public preview later this year.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
