A basic truth exists for investors looking to extract value from alternative data, said Sequentum CEO Sarah McKenna: they need clean, valuable data feeds to start with.
The New York-based company provides software and services for web data collection at scale.
“Your artificial intelligence, your machine learning and your analytics are really dependent on the quality of your data,” McKenna said Tuesday during a panel discussion at the fifth annual Benzinga Global Fintech Awards in New York City.
CBOE’s Clay: No Holy Grails In Data
When Tesla Inc (NASDAQ: TSLA) reported third-quarter earnings Oct. 23, the options market had priced the stock to move only about 7% after the print, said Catherine Clay, senior vice president of information solutions at CBOE.
“I’m typically a believer that you can extract pretty good price info out of the options market.”
In reality, the stock moved 18%, she told the Global Fintech Awards audience.
“That was a big miss in looking at that type of volatility data,” Clay said. “There’s never going to be a holy grail data set. It’s not out there.”
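For readers unfamiliar with how an expected move is read from options prices, one common rule of thumb divides the cost of an at-the-money straddle (a call plus a put at the same strike) by the share price. The sketch below is a minimal illustration of that rule of thumb, not necessarily the method Clay or CBOE uses. The function name and the option quotes are hypothetical, chosen only so the result matches the roughly 7% figure Clay cited.

```python
def implied_move(call_price: float, put_price: float, spot: float) -> float:
    """Approximate the expected move as ATM straddle cost / share price."""
    if spot <= 0:
        raise ValueError("spot must be positive")
    return (call_price + put_price) / spot


# Hypothetical quotes for illustration only, not actual Tesla option prices.
expected = implied_move(call_price=10.5, put_price=10.5, spot=300.0)

realized = 0.18  # the post-earnings move Clay cited
miss_ratio = realized / expected  # how far the realized move exceeded the implied one

print(f"implied {expected:.1%}, realized {realized:.1%}, ratio {miss_ratio:.1f}x")
```

Under these assumed quotes, the realized move is more than twice what the options market implied, which is the kind of miss Clay described.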
CBOE’s first venture into alternative data was about eight years ago, when the exchange built an index using signals from social media sentiment on Twitter Inc (NYSE: TWTR) and StockTwits, she said.
“When you think about adding alternative data to your investment process or product suite, you really do have to get right to the point of: what is it you’re looking to accomplish or what is it you’re looking to do?”
IBM’s Eck: 3 Alternative Data Challenges
Model bias, data lineage and governance, and model explainability are three problems that come with alternative data, said Tom Eck, chief technology officer of IBM Watson, part of IBM (NYSE: IBM).
Giving the example of models that predict credit risk for loans, Eck said they can be biased toward or against certain demographics.
“It’s a side effect of the data,” he said.
A population can be underrepresented in the data because its members don’t apply for many loans, and a model can interpret this as a population that's rarely approved for a loan, Eck said.
“But the model doesn’t know any differently.”
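Eck's point can be sketched with made-up numbers. In the hypothetical example below, two groups of equal size are approved at the same rate per application, but one group applies far less often. A measure that counts approvals against the whole population makes that group look rarely approved. The class, function names and figures are all assumptions for illustration, not data from IBM.

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    population: int    # people in the group
    applications: int  # loan applications filed
    approvals: int     # applications approved


def approval_rate(g: GroupStats) -> float:
    """Approvals per application: how applicants in the group actually fare."""
    return g.approvals / g.applications


def approvals_per_capita(g: GroupStats) -> float:
    """Approvals per person: falls when the group simply applies less often."""
    return g.approvals / g.population


# Hypothetical figures, chosen so both groups have a 60% approval rate.
groups = {
    "A": GroupStats(population=10_000, applications=2_000, approvals=1_200),
    "B": GroupStats(population=10_000, applications=200, approvals=120),
}

for name, g in groups.items():
    print(name, f"{approval_rate(g):.0%} of applications,",
          f"{approvals_per_capita(g):.1%} of population")
```

Checking how many records each group contributes before training is one simple way to catch this kind of skew early.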
Data lineage must be tracked just like source code, the CTO said.
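One way to picture tracking lineage "like source code" is to identify each version of a data set by a hash of its contents and link it to the version it was derived from, much as a version-control commit points to its parent. The sketch below is a minimal illustration under that assumption, not a description of IBM's tooling. The function name, record fields and sample data are hypothetical.

```python
import hashlib
import json
from typing import Optional


def lineage_record(data: bytes, source: str, parent: Optional[str] = None) -> dict:
    """Describe a data set snapshot by content hash, source and parent record."""
    record = {
        "content_sha256": hashlib.sha256(data).hexdigest(),
        "source": source,
        "parent": parent,  # id of the record this snapshot was derived from
    }
    # The record id covers the content hash and metadata, so any change
    # to the data or its ancestry yields a different id.
    encoded = json.dumps(record, sort_keys=True).encode("utf-8")
    return {**record, "id": hashlib.sha256(encoded).hexdigest()}


# Hypothetical raw feed and a cleaned version derived from it.
raw = lineage_record(b"ticker,sentiment\nXYZ,0.4\n", source="hypothetical-feed")
cleaned = lineage_record(
    b"ticker,sentiment\nXYZ,0.40\n", source="cleaning-step", parent=raw["id"]
)

print(cleaned["parent"] == raw["id"])
```

Because each record names its parent, any model input can be traced back step by step to the feed it came from.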
And model explainability “is a really tough problem that the industry as a whole is trying to deal with right now.”
Sequentum CEO Sarah McKenna, left, IBM Watson CTO Tom Eck and Catherine Clay, CBOE's senior vice president of information solutions, speak Tuesday at the Benzinga Global Fintech Awards in New York City. Photo by Mandar Parab.