CMO

eBay employs big data platforms to drive competitive edge

eBay analytics chief outlines how the online retailer is using analytics to cope with big data overload and drive value

Trying to make sense of the estimated 100 terabytes of new data received every day to meet changing customer expectations and keep a competitive edge has led eBay to start using big data platforms.

Speaking at the Teradata Big Data Analytics summit in Sydney, eBay director of data and data infrastructure, Alex Liang, told delegates that the company's website has more than 50,000 product categories, with more than US$3500 worth of goods sold every second.

“We know almost everyone is using a smartphone to browse listings on eBay, which means we get more data. This also means we need to process more data,” he explained.

Liang said his team was also under pressure from the finance department to provide better systems for increased analytics.

“For eBay, data is about value so if you cannot get value from big data you should not even work on it,” he said.

However, getting the value proved difficult because eBay’s integrated analytics environment has more than 100,000 data elements, 90 petabytes of stored data and tables containing 3.5 trillion rows of data. According to Liang, this environment was not easy to navigate for the 12,000 internal business intelligence (BI) users, ranging from data scientists to sales directors who want regular reports.

In 2011, the company began rolling out three platforms, each supporting a particular type of analytics. The company uses its enterprise data warehouse (EDW) platform for corporate BI reporting and a 40-petabyte Discovery platform called Singularity for website behaviour analytics. Its 40-petabyte Hadoop cluster is used for technical analytics such as counterfeit detection and image classification. eBay has also built a Data Hub to provide a central information platform for access to all analytics and information.

“Because the business environment is much more complex, you cannot have one analyst working independently. People must be working with each other to get deep data insight,” Liang said.

This information portal is configured to drive collaboration between analysts by sharing analytics built by anyone in the company. It provides definitional information about each report and can be searched or browsed by category.

According to Liang, the portal's Web design borrowed heavily from the eBay website to make it easier for analysts to find the reports they are looking for.

“We are facing very aggressive competition from other sites so data is the biggest advantage for eBay. Every business initiative we make is based on data,” he said.

Finally, eBay developed an integrated dashboard hub called QuickStrike. Liang said the company was now considering machine learning techniques to drive more value from stored data.

“You don’t need to spend so much time finding different algorithms because once you have a big volume of data, machine learning will offer a higher rate of accuracy,” he said.

According to Liang, the future will be “live”, meaning real-time data loading and analytics. “Coupled with forecasting and predicting future events, this will lead to even higher value being delivered by the analytics platforms,” he said.
