What is Alternative Data and How Can You Use It?

Hi there! Alternative data is one of the most exciting developments I've seen during my 5 years working in web scraping and proxies for hedge funds and investment institutions. In this guide, I'll explain everything you need to know to take advantage of this powerful new data source for investing.

The Origin Story of Alternative Data

Alternative data refers to information coming from non-traditional sources outside of regular company financials and earnings reports. While investment research historically relied on financial statements, macroeconomic trends, and demographics, alternative data opens up new possibilities.

Back in the mid-2000s, some pioneering hedge funds began testing new unconventional data sources. One of the first successes came from funds using satellite images of retail parking lots to estimate customer traffic ahead of earnings announcements. The satellite photos provided valuable insights into revenue trends before the official reports came out.

The use of satellite parking lot images spread quickly and was soon considered a go-to alternative data source. Within seven years, adoption jumped from 10% of institutional investment firms in 2009 to over 60% by 2016. The COVID-19 pandemic further accelerated adoption as traditional data failed to provide timely insights into rapidly changing consumer behavior.

Categories of Alternative Data

Since those early parking lot images, alternative data sources have expanded dramatically. Here are some of the most popular categories and examples I've seen clients use:

Web Scraped & Crowdsourced Data

  • Product reviews on Amazon, Walmart, app stores
  • Google Trends search volume
  • Social media metrics and sentiment analysis
  • Job listings and resumes
  • Wikipedia page views
  • Weather data from crowdsourced apps

Location-Based Data

  • Satellite/drone photos of facilities, crops, shipping containers
  • Phone tracking for foot traffic estimates
  • GPS tracking of shipping trucks and vessels
  • Local event and attendance data

Transactional Data

  • Credit, debit, and app purchase data
  • Email receipt information
  • Shipping and supply chain data

Mobile App Usage Data

  • Installs, engagement and usage metrics
  • Ad performance and auction data

New alternative datasets are emerging all the time. Based on my experience, I expect location-based data and web-scraped reviews to continue growing rapidly, helped by advances in satellite imagery and image recognition technology.

Real World Examples and Use Cases

Alternative data comes in handy across many industries but is especially beneficial for consumer-facing companies. Here are some real-world examples I've seen clients use successfully:

  • Examining satellite photos of retailer parking lots from Orbital Insight to estimate customer traffic ahead of earnings. This can validate or question revenue expectations.

  • Using anonymized credit card data from Cardify to analyze customer spend at restaurants and track it against competitor promotions. This reveals which companies' offers are attracting more business.

  • Partnering with email receipt aggregators like Rakuten to monitor ecommerce order rates, average order values, and buying activity by product category. This identifies changes in consumer discretionary spending.

  • Checking Google Trends data to compare consumer search interest for two upcoming movies from rival studios to estimate which film may have the stronger opening weekend.

  • Using GPS tracking of oil tankers from companies like Kpler to estimate supply and demand trends for crude oil that may impact prices.

The use cases are nearly endless, but finding the alternative data that actually correlates with business performance takes creativity and testing.
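To make the testing step a bit more concrete, here is a minimal Python sketch of checking whether a signal moves with a business metric. It computes a Pearson correlation between quarterly parking lot car counts and reported revenue. The numbers are made up purely for illustration, and `pearson` is a helper written for this example, not part of any vendor's toolkit.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: average daily car counts per quarter and reported revenue ($M)
car_counts = [1200, 1350, 1280, 1500, 1420, 1610]
revenue_musd = [310, 342, 330, 385, 360, 402]

r = pearson(car_counts, revenue_musd)
print(f"correlation between car counts and revenue: {r:.2f}")
```

A high correlation on six data points proves very little on its own. In practice you would want a longer history and an out-of-sample check before trusting the relationship.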

Challenges of Using Alternative Data

While alternative data is a game changer, it also comes with some unique obstacles to overcome compared to traditional financial data:

– Accuracy and bias – Unlike regulated financial statements, alternative data doesn't conform to strict reporting standards. Users must carefully evaluate accuracy and potential biases.

– Complex analysis – The right baselines, context, and correlation with business metrics may be unclear, requiring rigorous testing and modeling.

– Access limitations – Alternative datasets often have usage limits, restrictive contracts, and expensive subscription fees from data providers.

– Regulatory uncertainty – Privacy laws and website terms of service keep evolving, so the legality of a given web scraping project can be ambiguous and needs ongoing monitoring.

Here are some best practices I recommend for mitigating these challenges:

  • Consult with legal counsel to understand regulations and liability for any web scraping

  • Start with a small sample of each new data source and expand slowly after validating quality

  • Compare alternative datasets to each other or traditional sources to identify outliers

  • Clean and process data thoroughly before analysis to remove abnormalities

  • Keep detailed records of data provenance and monitor for Terms of Service changes

  • Build rigorous models to test for spurious correlations unrelated to business performance
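As one way to approach the cross-checking and cleaning steps above, the sketch below compares two sources that measure the same thing and flags periods where they disagree unusually strongly. It uses the median absolute deviation (MAD), a common robust statistic. The function name, the 3.5 threshold, and the sample figures are assumptions chosen for illustration, and the code assumes all values are positive.

```python
from statistics import median

def flag_disagreements(source_a, source_b, threshold=3.5):
    """Return indices of periods where two sources disagree unusually strongly.

    Compares the per-period ratio a/b against the median ratio, scaled by the
    median absolute deviation (MAD), which is not distorted by the very
    outliers we are trying to find. Assumes all values in source_b are non-zero.
    """
    if len(source_a) != len(source_b):
        raise ValueError("sources must cover the same periods")
    ratios = [a / b for a, b in zip(source_a, source_b)]
    mid = median(ratios)
    mad = median([abs(r - mid) for r in ratios])
    if mad == 0:
        # At least half the ratios equal the median: flag any that differ
        return [i for i, r in enumerate(ratios) if r != mid]
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    return [i for i, r in enumerate(ratios)
            if 0.6745 * abs(r - mid) / mad > threshold]

# Hypothetical weekly foot-traffic estimates from two different providers
provider_a = [1000, 1040, 980, 1100, 1900, 1020]
provider_b = [950, 1000, 940, 1050, 990, 980]

suspect_weeks = flag_disagreements(provider_a, provider_b)
print("weeks to investigate:", suspect_weeks)
```

A flagged period is a prompt to investigate, not proof of bad data. One provider may simply have captured a real event that the other missed.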

With careful implementation, the value gained from alternative data can far outweigh the costs.

Methods to Access Alternative Data

If you're interested in tapping into alternative data, here are three typical approaches along with their pros and cons:

Method | Description | Pros | Cons
Data Providers | Companies like YipitData sell access to curated datasets | Removes the hassle of acquiring data | Expensive fees, limited flexibility
Internal Data Teams | Build in-house pipelines for collection and processing | Complete control and customization | High technical skills required
External API Services | APIs like ScraperAPI enable on-demand data extraction | Simple and scalable access | Some limitations on customization

For many investors, a combination of partners is optimal. For example, you could have an internal team handle processing and analysis but use a web data API to simplify data collection.

When evaluating partners, focus on data quality, flexibility, and compliance rigor. I've found services like Oxylabs, which offer proxies for web scraping, invaluable for speeding up data collection.

Best Practices for an Alternative Data Strategy

Based on my experience, here are 5 recommendations when developing an alternative data strategy:

Start with a specific use case in mind – Don't just collect data for data's sake. Have a clear vision for how it will improve decision making.

Turn data into actionable insights quickly – Alternative data has a short shelf life. Analyze it and act on it quickly, before the signal goes stale.

Combine alternative data with traditional data – Alternative data is most powerful when supplementing, not replacing, traditional data.

Monitor for diminishing returns – The more competitors use the same alternative data source, the less edge it may provide.

Stay on the right side of the law – Regularly consult counsel to avoid regulatory violations and keep meticulous records.

The Future of Alternative Data

Alternative data has already moved from a fringe idea to a mainstream pillar of investment research. New data sources will keep emerging as economic activity becomes more digitized. For instance, I expect drone imagery and IoT sensors to be a boon for supply chain insights.

Meanwhile, methods for extracting insights will grow more advanced through AI, machine learning, and natural language processing. For example, analysts can run sentiment analysis on the text of product reviews rather than relying only on review volume and star ratings.
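To illustrate why review text can say more than the star rating alone, here is a deliberately simple Python sketch of lexicon-based sentiment scoring. The word lists, the `sentiment_score` helper, and the sample reviews are all invented for this example. Production systems typically use much larger lexicons or trained language models.

```python
import re

# Tiny illustrative lexicons (assumptions for this sketch, not a real vocabulary)
POSITIVE = {"great", "love", "excellent", "fast", "reliable"}
NEGATIVE = {"broken", "slow", "terrible", "refund", "disappointed"}

def sentiment_score(review):
    """Return (positive - negative) / matched words, in [-1, 1]; 0.0 if no word matches."""
    words = re.findall(r"[a-z']+", review.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0

# Hypothetical reviews as (star rating, text) pairs
reviews = [
    (4, "Arrived broken and support was slow, but I got a refund."),
    (5, "Love it, great battery and fast shipping."),
]

for stars, text in reviews:
    print(stars, sentiment_score(text))
```

The first review carries a high star rating but clearly negative wording. That kind of gap is what text analysis can surface and a ratings average would hide.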

The companies benefiting most from alternative data today will likely be the market leaders of tomorrow. While challenges exist, those able to tap into this information treasure trove stand to gain a significant competitive advantage.

I hope this guide has provided a helpful introduction to the world of alternative data! Let me know if you have any other questions.
