If there’s one takeaway President Trump’s lead data scientist, Matt Oczkowski, can offer brands after working on the Republican candidate’s election campaign, it’s that they need their own walled garden.
“I encourage all corporate entities to move more towards a walled garden approach to their data and modelling,” the Cambridge Analytica data scientist said at this year’s ADMA Data Day, as well as during an exclusive interview with CMO. “This is something Google and Facebook have mastered for a long time. They’re never going to give you data outputs from their pools, you have to buy it through marketing.
“If I am a large consumer brand, I should be working heavily to have my own walled garden of first-party data sources.”
By contrast, if there’s one thing Oczkowski is certain will disappear from the digital marketing landscape, it’s programmatic advertising.
“I fundamentally believe programmatic media buying is dying and will go away,” he claimed, noting more than half the total digital dollars spent on media by the Trump campaign was on Facebook alone.
“The idea we’re still going to buy 250 x 250 squares on a website is ludicrous; the chances of you being bitten by a shark are higher than someone clicking on those units. We’re moving towards native advertising that’s all about the user experience, and programmatic is not working anymore.”
Building up internal first-party data sources is key to following customer experiences all the way through, Oczkowski said. “And that’s not just online, append it offline, because it gives you a much more robust view of the customer,” he said.
“I often share this quote with clients: If you want to learn how to hunt, you go to the jungle, not the zoo. A lot of commercial clients are hunting in the zoo – all using the exact same information and data, which makes it null. If you are Coke and Pepsi, both buying X’s data to influence your media buys, you’ve no market difference. Cultivate your own data, do your own research.”
How data decided the Trump campaign
Cambridge Analytica was brought in less than six months before the US presidential elections to help the Trump campaign secure a win. The big data consultancy firm has a history in political campaigning, but also works in the corporate sector and defence.
Oczkowski said politics is a great vertical because it’s “forced innovation”. “There’s no truer startup than a political campaign. You have between six months and two years to build a multi-person organisation, spend hundreds of millions of dollars, but at a definitive end date you need to win 51 per cent market share in your vertical,” he said.
In the case of the Trump campaign, Cambridge Analytica’s first task was to build the appropriate database infrastructure to collect, ingest, match and de-duplicate data from disparate sources.
The focus then shifted to three business objectives: Fundraising; persuading voters to vote for their candidate; and driving more people to the polls (known in the US as ‘GOTV’, or get out the vote).
To do this, the team leveraged three pots of data: Political data, such as voter history; publicly available data from Experian, Acxiom and others on consumer purchase trends, demographics, geography and so on; then its own growing pool of first-party data based on polling, market research and modelling results. In all, between 1000 and 5000 data points were used for each US adult.
“The most important thing is the quality and type of data you have. Data analysis, artificial intelligence and machine learning are great and they’re advancing, but you have to have good data in and out,” Oczkowski said. “One thing I’ve learnt working with big corporate companies is their problems are very basic. Data silos are an issue, and one team can’t get access to another’s set of data. Companies have gotten too bureaucratic, so the problems we solve early on are simplistic – funnelling data into one location, de-duping, modelling for analysis.”
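The match-and-de-duplicate step Oczkowski describes can be pictured with a minimal Python sketch. It is an illustration only, not Cambridge Analytica’s pipeline: the field names, the sample records and the name-plus-postcode match key are all assumptions made for the example.

```python
def normalise(record):
    """Build a simple match key from name and postcode (a hypothetical rule)."""
    name = " ".join(record["name"].lower().split())
    return (name, record["postcode"].strip())


def merge_sources(*sources):
    """Merge records from several sources, keeping one entry per match key."""
    merged = {}
    for source in sources:
        for record in source:
            combined = merged.setdefault(normalise(record), {})
            # Keep the first value seen for each field; later sources
            # only fill in fields that earlier ones lacked.
            for field, value in record.items():
                combined.setdefault(field, value)
    return list(merged.values())


# Hypothetical sample data from two disparate sources.
voter_file = [{"name": "Jane  Doe", "postcode": "12345", "voted_2012": True}]
consumer_file = [
    {"name": "jane doe", "postcode": "12345 ", "owns_home": True},
    {"name": "John Roe", "postcode": "54321", "owns_home": False},
]

people = merge_sources(voter_file, consumer_file)
```

Here the two “Jane Doe” rows collapse into one profile carrying both the voting and the home-ownership fields. Real record matching is far fuzzier than an exact key, which is part of why the early problems are described as basic but time-consuming.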
With Trump, the data science approach then needed to be based on the type of candidate Cambridge was working for, Oczkowski said.
“I’ve never had a candidate who, with one Tweet, can change the entire attitude of an electorate. So our data program had to be very reactive, in terms of tracking ripples in the water. Any time something happened, we needed to see what mass effect it was having on the electorate.”
A cornerstone of building the Trump campaign’s first-party data sets was research and polling, and the team conducted 800,000 live and online surveys across 17 battleground states during the campaign period. Results influenced data modelling, which informed segmentation and media buying. Insights from these activities were then fed back into the machine.
“We had a cyclical approach – we’d go in to the field with research, use data to conduct models, use those models to influence media buying, then those would influence more research,” Oczkowski explained. “It was a closed loop process.”
Through this, the team built custom tools like ‘path to victory’ calculators, showing the optimised paths to reach the 270 electoral college votes needed to win the US presidency, as well as ‘cities to visit’ calculators.
“We ranked every city in all 17 battleground states we were focused on, and the ops team used that list to figure out where Trump should be flying to,” Oczkowski said. “A lot of people thought Trump’s travel schedule looked very erratic in the last month or two of the election. It was really very intelligently designed; we just weren’t talking about it.”
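The article does not say how the ‘path to victory’ calculators worked. As a rough sketch of the idea, the Python below lists every minimal combination of battleground states that lifts a candidate to 270 electoral college votes. The state labels, their vote counts and the `safe_votes` figure are hypothetical placeholders, not campaign data.

```python
from itertools import combinations

TARGET = 270  # electoral college votes needed to win


def paths_to_victory(safe_votes, battlegrounds):
    """Return every minimal set of battleground states that reaches TARGET."""
    needed = TARGET - safe_votes
    paths = []
    # Work from small combinations to large so shorter paths are found first.
    for size in range(1, len(battlegrounds) + 1):
        for combo in combinations(sorted(battlegrounds), size):
            total = sum(battlegrounds[state] for state in combo)
            # Skip combinations that merely extend a path already found.
            extends_known = any(set(path) <= set(combo) for path in paths)
            if total >= needed and not extends_known:
                paths.append(combo)
    return paths


# Hypothetical inputs: 230 votes assumed safe, four made-up battlegrounds.
battlegrounds = {"A": 29, "B": 20, "C": 18, "D": 10}
paths = paths_to_victory(230, battlegrounds)
```

With these made-up numbers the candidate needs 40 more votes, so the paths are A plus B, A plus C, or B, C and D together. A production tool would presumably weight each path by modelled win probability rather than simply enumerating them.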
Using data for action
One of the challenges many organisations are grappling with is actioning data insights. Oczkowski said this wasn’t too much of a problem for Cambridge Analytica, because the Trump campaign was run as a business.
“It was ROI driven, and if you performed, they’d invest more in your strategy,” he said. “The campaign was a fully data-driven operation, with almost everything outside of Trump himself, who is his own boss at the end of the day, being data-driven. Media spend allocation through to our candidate’s travel schedule, to words and issues used in surrogate speeches – you name it, big data was involved.”
A key reason Oczkowski believes he was able to see the pending outcome of the election, when the wider media and Hillary Clinton camp did not, was because of the data collected on people donating to the Trump cause. Many were individuals who hadn’t donated to the Republican party before. Notably, Trump raised online low-dollar contributions of US$240 million, the most any presidential candidate has raised from low-dollar sources.
“The most difficult thing a data scientist has to do is predict who is going to turn out in an election,” Oczkowski continued. “To do this, the only data most people have is past election voting history – because you showed up at the last three elections, people make an assumption based on basic modelling that you’ll show up again and vote.
“There was a massive, hidden tranche of Trump voters that came out and no one, besides a few people, had the data to show that was going to happen, or to reinforce it.
“We were building up first-party data inside the campaign. Once people started donating to Trump that had never donated to political candidates before, and we started gathering absentee/early voter data… that allowed us to see the eventual outcome of the election much earlier than other people did.”
Oczkowski said he’s often asked why Hillary Clinton’s massive data science team saw something different to what Trump’s team saw. The answer is in the data.
“Hillary had her own data, which showed typical Democrats who come out to vote at this time. We were collecting interesting data points that told us a different story,” he said.
It was also data modelling that allowed Trump’s data science team to recognise consumers lying publicly about their intention to vote for the controversial candidate.
“We noticed specifically with live people calling cell phones that 3 per cent of people were lying because they were embarrassed to admit they were going to vote for Trump to a live caller,” he said. “We’d have a score of 90 out of 100 in our database for them, they’d say they’d vote for Hillary and we’d just know that wasn’t the case, the signal was too strong. So we reweighted our modelling to account for that.
“We also saw major demographic trends: A massive increase in rural, white voters, a massive decrease in younger African American voters in cities, and a slight increase in Hispanic vote depending on the state. When you take all that and redo your modelling, it gives you a different picture of America.”
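The reweighting Oczkowski mentions is not spelled out in detail. One simple way to picture it is the Python sketch below, which counts a respondent as a supporter when the model score is very strong, even if the stated answer says otherwise. The 90-point threshold echoes the quote, but the override rule and the sample responses are assumptions, not the campaign’s method.

```python
def adjusted_support(responses, score_threshold=90):
    """Share of respondents counted as supporters after overriding stated
    answers that contradict a very strong model score (hypothetical rule)."""
    supporters = 0
    for stated_support, model_score in responses:
        if stated_support or model_score >= score_threshold:
            supporters += 1
    return supporters / len(responses)


# Hypothetical poll rows: (said they would support the candidate, score 0-100).
responses = [(True, 80), (False, 95), (False, 20), (False, 40)]

raw = sum(1 for stated, _ in responses if stated) / len(responses)
adjusted = adjusted_support(responses)
```

In this toy sample the stated support is 25 per cent, while the adjusted figure is 50 per cent because one respondent’s score contradicts their answer. A real model would more likely shift probabilities by a small amount than flip individual answers outright.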
None of this, however, takes away from the importance of brand and creative. Oczkowski pointed out that for every piece of content produced during the campaign, the team also pitched a big overarching brand narrative next to it.
“Trump wasn’t a policy candidate talking about intricacies of healthcare policy. It was about widespread appeal,” he said. “The data wouldn’t show or imply that so we had to test it.
“You can do a data program while still being creative. Trump sold hundreds of thousands of baseball hats, we were also playing in earned media, and he had loyal advocates. It was cultural.”
The privacy question
He might be a big data advocate, but Oczkowski said he’s always been concerned about consumer privacy. In the case of the Trump campaign, data used was either publicly available or collected via an opt-in process, such as polling.
“When it comes to doing things like US elections, I’ve never run into a point where I think we’ve hit some creepy line, as most info is self-reported,” he said. “But I also think it’s up to the consumer to take protective measures – there needs to be a massive education campaign. We have to be very careful of privacy but it’s also up to us as the end consumer to push back. There has been very little pushback, people always go for convenience. There needs to be an open conversation and dialogue.”
Another problem the industry has is that it’s “very data rich and content poor”, Oczkowski said. “Even if I could produce creative specifically to 230 million people, the cost benefit would never actually be there,” he said. “The most we’re trying to do is cluster people into sub-segments.”
Moving forward, Oczkowski said he’s most interested in the immersive types of experiences data can foster at the crossroads of digital and physical experience.
“What I’m going to find most interesting is taking and translating concepts in the digital space, into more immersive things, such as user experiences at sporting venues or a concert,” he said. “With things like Disney’s [Magic Wrist] bands… we’re getting to the stage where we’re truly talking about big data. How we actually talk to people is where the innovation is going to be.
“We are pretty good at working out who to talk to, but we’re still figuring out what to say to them. That’s where we want to fill the void.”