Biases of Hedge Fund Data
Biases in fund databases present another serious problem, and they arise for two reasons. First, databases cover only part of the whole hedge fund population, somewhere between 30 percent and 50 percent. Second, some biases derive from the peculiar ways in which hedge funds and their returns are counted. The typical biases associated with hedge fund data providers are as follows:
  • Survivorship bias. When calculating indices, data vendors exclude defunct funds from the computation, although the funds’ performance before exclusion remains counted in the indices. In other words, this practice artificially inflates reported performance in favor of the surviving funds. Various studies have estimated the bias at 2 percent to 3 percent.
  • Selection bias. Since listing is not obligatory, only some hedge funds are represented in databases, so the databases do not reflect the true trend of the whole hedge fund universe. A hedge fund manager may have several reasons for staying out of databases: disappointing performance, problems with the track record, or no need for new investors. The last reason may stem from the founders’ intention to keep a low profile, or from the fund having reached its critical size.
  • Back reporting bias. Upon including a new fund in its database, a vendor back-fills the fund’s return data and recalculates the related indices. This practice contrasts with the way indices are computed for mutual funds and stocks. Simply put, if a fund with a five-year history is included in the database, the vendor will recalculate the appropriate index for the five preceding years.
  • Instant history bias. Often, once a fund is included in a database, its performance deteriorates compared with its performance before listing. There are a few reasons behind this bias. On the one hand, a hedge fund manager may be tempted to shuffle returns, since a database listing is commonly viewed as a promotional tool for attracting new investors. On the other hand, a new fund tends to show better performance during its incubation period.
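The survivorship effect described in the list above can be illustrated with a small sketch. It assumes one common way of measuring the bias, namely the gap between the average return of the surviving funds and the average return of all funds that reported in each year. The fund names and return figures are hypothetical and chosen only for illustration; they are not taken from any database.

```python
from statistics import mean

# Hypothetical annual returns in percent; None marks a year after the fund closed.
fund_returns = {
    "A": [8.0, 6.0, 7.0],
    "B": [5.0, 9.0, 4.0],
    "C": [-3.0, -10.0, None],  # defunct after year 2
    "D": [2.0, None, None],    # defunct after year 1
    "E": [10.0, 3.0, 6.0],
}


def yearly_average(returns_by_fund, funds):
    """Equal-weighted average return per year over the funds that reported that year."""
    n_years = len(next(iter(returns_by_fund.values())))
    averages = []
    for year in range(n_years):
        reported = [
            returns_by_fund[f][year]
            for f in funds
            if returns_by_fund[f][year] is not None
        ]
        averages.append(mean(reported))
    return averages


all_funds = list(fund_returns)
# Survivors are the funds still reporting in the final year.
survivors = [f for f, r in fund_returns.items() if r[-1] is not None]

full_universe = mean(yearly_average(fund_returns, all_funds))
survivors_only = mean(yearly_average(fund_returns, survivors))

# Positive value: an index built from survivors only overstates performance.
survivorship_bias = survivors_only - full_universe
print(f"All funds: {full_universe:.2f}%  Survivors: {survivors_only:.2f}%  "
      f"Bias: {survivorship_bias:.2f} points")
```

Because the two defunct funds had the weakest returns, dropping their history raises the average, which is the inflation the survivorship bias refers to.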

Short data series
Since most hedge funds report performance on a monthly basis, the average available series ranges from twenty to sixty observations. This differs completely from the situation for stocks and bonds, whose return series are virtually unlimited. From the practitioner’s point of view, it calls for analytical methods that provide an adequate confidence level on short data series. For example, using Conditional Value-at-Risk (CVaR) becomes highly problematic for hedge funds, because it would have to be computed from only a few historical observations.
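To make the CVaR point concrete, the following sketch computes a plain historical CVaR, assuming the simple definition "average loss over the worst (1 - confidence) share of observations". The function name `historical_cvar` and the 24 monthly returns are hypothetical, made up for illustration; other estimators (parametric or interpolated) would give different figures.

```python
import math


def historical_cvar(returns, confidence=0.95):
    """Return (CVaR, tail size) using the worst (1 - confidence) share of observations.

    CVaR is reported as a positive loss figure in the same units as the returns.
    """
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    # round() guards against floating-point noise before taking the ceiling
    tail_size = max(1, math.ceil(round(len(losses) * (1 - confidence), 9)))
    tail = losses[:tail_size]
    return sum(tail) / tail_size, tail_size


# Hypothetical monthly returns in percent: two years of data.
monthly_returns = [
    1.2, 0.8, -0.5, 2.1, -3.4, 0.9, 1.5, -1.1, 0.3, 2.4, -6.2, 1.0,
    0.7, 1.9, -0.8, 0.4, 1.1, -2.0, 0.6, 1.3, 0.2, -0.3, 1.6, 0.5,
]

cvar, tail_size = historical_cvar(monthly_returns, confidence=0.95)
print(f"95% CVaR: {cvar:.2f}% based on {tail_size} of {len(monthly_returns)} observations")
```

With twenty-four monthly observations, the 95 percent tail holds only two data points, so a single unusual month largely determines the estimate. Even sixty observations leave just three points in the tail.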