Databricks takes on Google data streaming analysis with Spark

Databricks Cloud will provide Spark-based streaming analysis as a service

Taking on Google, Databricks plans to offer its own cloud service for analyzing live data streams, one based on the Apache Spark software.

Databricks Cloud is designed to provide a platform for analyzing streaming data, much like the Google DataFlow service announced last week.

Like Google DataFlow, Databricks Cloud promises to offer a single programming model that cuts across different approaches to data analysis, including support for batch programming and live data streaming. And like Google DataFlow, Databricks Cloud will first be offered in preview mode, with full commercial support due by the end of the year.
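
As a rough illustration of what a single programming model means, the sketch below applies one transformation to both a finite batch and a simulated live feed. It is plain Python, not the Spark or Google DataFlow API, and the names count_events and live_feed are hypothetical, chosen only for this example.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def count_events(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (event, running count) for each record as it is consumed."""
    counts: Counter = Counter()
    for event in events:
        counts[event] += 1
        yield event, counts[event]


def live_feed() -> Iterator[str]:
    """Hypothetical stand-in for a live source that emits records one by one."""
    for event in ["login", "purchase", "login"]:
        yield event


# Batch: the whole dataset is already available as a list.
batch_result = list(count_events(["login", "purchase", "login"]))

# Streaming: the same function consumes records as the feed produces them.
stream_result = list(count_events(live_feed()))
```

The analysis logic is written once; only the source of the records changes between the batch and streaming cases.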

The two services are aimed at different markets, according to Ion Stoica, CEO of Databricks.

"Google DataFlow is really targeted to developers. We also have higher-level interfaces for data scientists and data engineers," Stoica said.

Databricks also guarantees application portability. Because the entire stack is based on open source software, users can move their workloads to other Apache Spark installations should they need to, according to Stoica. "You can take your application and run it in another cloud," he said.

Such a service could be used by enterprises for tasks such as churn analysis, which can determine why a customer stops using a product, or for fraud detection, where malicious activity can be spotted while it is still taking place.

The University of California, Berkeley's AMP (Algorithms, Machines and People) Lab originally developed Spark as a unified processing engine, one able to provide a platform for a variety of data analysis tasks, including interactive queries, streaming data analysis, machine learning and graph computation.

A number of developers behind Spark went on to form Databricks. The software itself, designed to run on a cluster of servers, is now managed as an open source project under the guidance of the Apache Software Foundation.

Offering Spark as a service eliminates the arduous task of setting up and maintaining an in-house implementation of Spark, Stoica noted.

"Clusters are hard to set up and maintain. To build a data pipeline, you need to stitch together multiple tools, and the tools are still hard to use. So extracting value out of the data is still a struggle," Stoica said.

Initially, Databricks Cloud will run on Amazon Web Services, though eventually it will also run on other cloud providers such as Google.

In addition to the Spark platform itself, Databricks will provide a set of built-in applications that can do common data analysis tasks. Users can build their own workflows, or issue queries and interact with the data directly. Output can be piped to a dashboard or a report.

Databricks is not the only company making use of Spark's capabilities. ClearStory offers an analytics software package based on Spark that allows organizations to aggregate dozens of unstructured data sources for analysis, far more than can be easily done through traditional business intelligence tools, said ClearStory CEO Sharmila Mulligan.

Databricks also announced Monday that it has received US$33 million in Series B funding led by venture capital firm New Enterprise Associates, with follow-on investment from Andreessen Horowitz.
