How Hedge Funds Use Alternative Data

by Josh Howarth
December 6, 2023

Companies and consumers produce a large amount of data as they go about their business: transaction receipts, customer logs, and location data are a few examples.

This “data exhaust”, often called alternative data, is now being used to give investors an edge in optimizing their investment strategies.

In the fiercely competitive world of hedge funds, alternative data emerges as the key to achieving that much-needed edge. Let’s dive in.

Source: CB Insights

Alternative Data and Alpha Generation

Consistently generating returns that beat the market (also known as alpha) is no easy feat.

If clients are going to pay high fees, they want near certainty that the hedge fund is going to outperform a passive benchmark. Otherwise, what’s the point?
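To make the fee hurdle concrete, here is a minimal Python sketch. It uses the article's simple definition of alpha (fund return minus market return) rather than a risk-adjusted measure, and the function names and the figures in the usage lines are illustrative assumptions, not numbers from any real fund.

```python
def excess_return(fund_return: float, benchmark_return: float) -> float:
    """Simple alpha: fund return minus benchmark return over the same period."""
    return fund_return - benchmark_return


def net_excess_return(gross_return: float, benchmark_return: float, fee: float) -> float:
    """Excess return the client actually keeps after fees are deducted."""
    return excess_return(gross_return - fee, benchmark_return)


# Illustrative inputs only (assumed, not from the article):
# a 10% gross return, a 7% benchmark return and fees worth 2% of assets.
gross, benchmark, fee = 0.10, 0.07, 0.02
print(f"Gross excess return: {excess_return(gross, benchmark):.2%}")
print(f"Net excess return:   {net_excess_return(gross, benchmark, fee):.2%}")
```

In this made-up case the fund beats the benchmark by three points before fees but by only about one point after them, which is why clients care about net performance.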

This pursuit of superior returns, therefore, is the driving force of hedge fund strategies. This means research departments staffed with PhDs and quantitative analysts (quants), and equipped with cutting-edge tech and in-house tools.

Source: Bloomberg

Hedge funds use two types of data to generate outsized returns: traditional data and alternative data.

Traditional data comprises standard sources like SEC filings and government economic data, known for their accuracy and reliability.

Yet, the catch is their universal accessibility; if every player in the market has access to the same information, how can you outperform?

This is where alternative data comes into play. It draws insights from unconventional sources, providing hedge funds with the crucial informational edge they need to outperform and thrive.

Source: Financial Times

Alternative Data Uses for Hedge Funds

According to the Alternative Investment Management Association (AIMA), the three most popular uses of alternative data by the world’s leading hedge funds are:

1. To help source new investment opportunities

Hedge fund Man GLG had a hunch about the luxury goods sector: while growth in the industry usually comes from Chinese buyers, most research analysts covering luxury companies are in the US or Europe.

The fund sought to exploit this information asymmetry. They used natural language processing (NLP) to analyze Chinese news sentiment, and found the following pattern:

  • June 2019: Versace is endorsed by actress Yang Mi; headlines had a positive NLP score of 0.4.
  • Early July 2019: A Versace t-shirt suggesting that Hong Kong and Macau are sovereign territories is released.
  • Late July 2019: Versace stops selling the t-shirts and destroys remaining inventory.
  • August 2019: Yang Mi revokes her endorsement; headlines had a negative NLP score of 0.7.

Source: WWD

Investors armed with this information could have profited from the 14% drop in the parent company’s stock price in that time period.
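A sentiment reversal like the one in the Versace timeline can be flagged with very little code once the scores exist. The sketch below is only an illustration, not Man GLG's method: the function name and the swing threshold are hypothetical, the day of the month is a placeholder (the article gives only months), and it assumes the "negative NLP score of 0.7" corresponds to -0.7 on a signed scale.

```python
from datetime import date

# Monthly headline sentiment from the timeline above.
# Assumptions: days are placeholders; the August score is read as -0.7.
scores = [
    (date(2019, 6, 1), 0.4),
    (date(2019, 8, 1), -0.7),
]


def find_reversals(series, min_swing=0.5):
    """Return (previous_date, current_date, swing) for each consecutive pair
    of observations where the score changes sign and moves by at least
    min_swing in absolute terms."""
    ordered = sorted(series)
    reversals = []
    for (d0, s0), (d1, s1) in zip(ordered, ordered[1:]):
        swing = s1 - s0
        if s0 * s1 < 0 and abs(swing) >= min_swing:
            reversals.append((d0, d1, swing))
    return reversals


for prev_day, day, swing in find_reversals(scores):
    print(f"Sentiment reversed between {prev_day:%b %Y} and {day:%b %Y} (swing {swing:+.1f})")
```

In practice the scores would come from an NLP model run over many headlines per period, and the threshold would need tuning against historical data.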

2. To improve investment decisions

Investors have tapped into Yodlee’s credit card data to study spending patterns at theme parks operated by SeaWorld.

This in-depth analysis covers purchases from tickets to merchandise and even parking. In contrast, SeaWorld’s financial statements only disclose aggregate revenue figures such as ‘food, merchandise, and others’.

Such granular transaction information is crucial in the theme park industry, where stock values are heavily influenced by quarterly revenues.

Source: Wall Street Journal

3. To help generate outperformance

Most established companies buy from their vendors by receiving goods without making upfront payments, in a widespread practice known as trade credit.

Until recently, the extent of investment alpha contained within trade credit data remained unknown.

A trade credit dataset revealed that companies that kept large trade credit balances but paid their invoices on time outperformed relevant market indices by a substantial 6-7%.
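The screen described above (large trade credit balances combined with on-time payment) could be sketched as follows. Everything here is a hypothetical illustration: the record fields, the thresholds and the placeholder tickers are assumptions, and the article does not say how the actual dataset defines "large" balances or "on time".

```python
from dataclasses import dataclass


@dataclass
class TradeCreditRecord:
    ticker: str
    balance_ratio: float   # trade credit balance relative to purchases (assumed measure)
    on_time_ratio: float   # share of invoices paid on or before the due date


def screen(records, min_balance=0.25, min_on_time=0.95):
    """Return tickers with a large trade credit balance AND reliable payment.
    Both thresholds are arbitrary placeholders."""
    return [
        r.ticker
        for r in records
        if r.balance_ratio >= min_balance and r.on_time_ratio >= min_on_time
    ]


# Placeholder records, not real companies or real figures.
sample = [
    TradeCreditRecord("AAA", balance_ratio=0.40, on_time_ratio=0.98),  # passes both tests
    TradeCreditRecord("BBB", balance_ratio=0.40, on_time_ratio=0.80),  # pays late
    TradeCreditRecord("CCC", balance_ratio=0.05, on_time_ratio=0.99),  # small balance
]

print(screen(sample))
```

A real strategy would then compare the returns of the screened companies against a relevant index, which is the comparison behind the 6-7% figure quoted above.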

Challenges in Using Alternative Data

The two major challenges hedge funds face in leveraging alternative data are the sheer volume of datasets available in the market, and the intensive resources required to extract actionable insights.

The alternative data industry is experiencing breakneck growth, with projections indicating a surge from 1,200 to about 5,000 datasets by the end of 2024.

Source: Hedge Week

This rapid growth means hedge fund managers struggle to pinpoint which datasets are valuable amidst a sea of information. To make matters worse, many datasets lack the historical depth for effective backtesting.

Once datasets are purchased, they still require wrangling to produce investment alpha. Hedge funds are channeling significant resources into hiring data scientists and investing in technologies like natural language processing, machine learning, and AI.

This scenario demands data providers adept at both obtaining datasets and providing clients with essential alpha. Exploding Topics, for example, scours the web for rapidly growing trends before they become widely known.

Source: Exploding Topics

Hedge funds are able to directly leverage that data to capitalize on products and companies that are outperforming.

Top Alternative Datasets Used by Hedge Funds

According to AIMA, the top alternative data sets used by the world’s top-performing hedge funds are:

  • Web crawled data
  • Data sourced from expert networks
  • Consumer spending data
  • Business performance metrics
  • Online reviews and social media sentiment

Source: AIMA

Let’s break each of these down:

  • Web crawled data: the first data type to qualify as alternative data. Information is scraped from the web and aggregated into a single source of truth.

  • Data sourced from expert networks: expert networks are services that offer highly specialized senior talent, often in niche markets and industries. Their offering includes tailored research that can draw on data from non-traditional sources.

  • Consumer spending data: this is the highest grossing dataset, and includes credit and debit card data, as well as email receipts.

  • Business performance metrics: this involves reinterpreting traditional data in innovative ways, such as creating unique valuation metrics that diverge from standard methods of evaluating asset risks and rewards.

    For example, equity research firm New Constructs employs NLP technology to comb through thousands of companies’ financial filings, particularly footnotes.

    While most analysts focus on the main financial statements, New Constructs delves into the less scrutinized sections of filings to uncover items that can obscure a company’s real performance.

  • Online reviews and social media sentiment: a company is only as good as the loyalty of its clients. Social media sentiment allows hedge funds to peek behind a company’s marketing to see how its customers really feel.

The Future of Alternative Data

When it comes to hedge funds, alternative data has become table stakes: 69% of funds are using alternative data to outperform the market, and 23% utilize it for risk management.

Looking ahead, key trends include a shift towards high-quality data, and the development of a comprehensive alternative data ecosystem.

The top-performing hedge funds are likely to choose quality over quantity. Large, tech-savvy firms will be supported by data science teams that direct the integration of alternative data into the fund’s investment process.

Source: Deloitte

For smaller players, third-party firms offer a means to scale and compete, providing advisory services and mitigating the trial-and-error process.

In this context, the race is on for data providers to offer efficient solutions and deliver the most alpha.