In this post, I want to introduce local effects in the cross-section of stock returns. Unlike risk factors and market anomalies, which are calculated using raw (global) measures, local measures that relate single stocks to their peers are largely underdeveloped. There are many ways to construct local measures in the cross-section of stock returns. Here, I focus on local outliers: stocks that behave unexpectedly given their underlying fundamental characteristics.
Computing Stock Neighbors using KNN
To obtain local measures, I first need to create localities, namely regions of stocks that are fundamentally similar. I use stock characteristics to measure the similarity between two assets 𝑖 and 𝑗. Factor returns and stock-level characteristic signals are taken from Dacheng Xiu’s website. To find suitable neighborhoods, I apply the k-nearest neighbors (KNN) algorithm.
The nearest neighbors algorithm finds the closest neighbors of an observation 𝑎, given a sample of feature realizations 𝑋 in Euclidean space $\mathbb{R}^p$ (with $p$ denoting the number of features), by minimizing the Euclidean distance:
$$d(a, b) = \sqrt{\sum_{m=1}^{p} (a_m - b_m)^2}$$
The algorithm proceeds as follows:
- Calculate the distance between observation 𝑎 and every observation 𝑏 ≠ 𝑎 in the data
- Add the distance and the label of 𝑏 to an ordered collection
- Sort the collection in ascending order of distance
- Pick the closest 𝑘 neighbors
These four steps are repeated for all 𝑎 in the sample.
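The four steps above can be sketched as a brute-force search. This is a minimal illustration using only the Python standard library, not the implementation used for the results in this post; the function name `k_nearest_neighbors`, the tickers, and the two toy characteristics are hypothetical.

```python
import math


def k_nearest_neighbors(features, k):
    """Return, for each label, the labels of its k closest peers.

    `features` maps a label (e.g. a ticker) to a list of characteristics.
    """
    neighbors = {}
    for a, x_a in features.items():
        # Steps 1 and 2: collect (distance, label) pairs for every b != a.
        distances = [
            (math.dist(x_a, x_b), b)
            for b, x_b in features.items()
            if b != a
        ]
        # Step 3: sort in ascending order of distance (ties broken by label).
        distances.sort()
        # Step 4: keep the labels of the k closest observations.
        neighbors[a] = [b for _, b in distances[:k]]
    return neighbors


# Hypothetical toy cross-section: two characteristics per stock.
toy = {
    "A": [0.0, 0.0],
    "B": [0.1, 0.0],
    "C": [0.0, 0.3],
    "D": [2.0, 2.0],
    "E": [2.1, 2.0],
}
toy_neighbors = k_nearest_neighbors(toy, k=2)
print(toy_neighbors)
```

The brute-force version compares every pair of stocks, so its cost grows quadratically with the size of the cross-section. For large samples, a tree-based or library implementation would likely be preferable.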
In my case, each observation 𝑎 corresponds to one realization of a stock return $r_{i,t+1}$ of stock 𝑖 at time 𝑡+1. Distances are computed on the feature realizations 𝑋, which are the stock characteristics at time 𝑡. Steps 1 to 4 are performed separately for each cross-section. Once all neighbors for the cross-section at time 𝑡 are found, I can calculate local measures. These local measures can take many forms. They can be related to return distributions, similarity or dissimilarity between stocks (given the underlying characteristics), the spread in the cross-sectional region, or local idiosyncratic volatility. To begin with, as stated in the introduction, I focus on local correlation and local outliers.
Finding Local Outliers
Local outliers are stocks that behave differently from their neighbors. I define unexpected behavior as low correlation with the ten closest peers. Since I am mainly interested in systematic covariation, I compute the factor-based covariance matrix using a Fama-French five-factor model. If a stock correlates poorly with fundamentally similar stocks, it is considered an outlier. The higher the correlation, the more a stock behaves as expected.
Let there be 𝑛 assets and 𝑓 factors in the market, with the 𝑓×𝑓 factor covariance matrix $\Sigma_f$. Conditional on the factor returns $r_f$, the return $r_i$ of asset 𝑖 is normally distributed with mean $\mu_{i|r_f} = \beta_i^\top r_f = \beta_{i,1} r_1 + \dots + \beta_{i,f} r_f$ and residual return variance $\sigma^2_{i,\epsilon}$. The factor-based covariance between assets 𝑖 and 𝑗 is then:
$$\mathrm{Cov}(r_i, r_j) = \beta_i^\top \Sigma_f \beta_j$$
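The factor-based covariance matrix across all 𝑛 assets follows by stacking the loadings $\beta_i^\top$ as the rows of an 𝑛×𝑓 matrix $B$:
$$\Sigma = B \Sigma_f B^\top$$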
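To make the stacking of pairwise covariances concrete, here is a small dependency-free sketch that computes the systematic part, i.e. the loadings matrix times the factor covariance matrix times the transposed loadings. The function name `factor_covariance` and all numbers are hypothetical; the toy example uses three assets and two factors rather than the five Fama-French factors, and it ignores residual variances.

```python
def factor_covariance(betas, sigma_f):
    """Systematic covariance B * Sigma_f * B^T using plain lists.

    `betas` is an n x f list of factor loadings (one row per asset),
    `sigma_f` is the f x f factor covariance matrix.
    """
    n, f = len(betas), len(sigma_f)
    # First compute B * Sigma_f, an n x f matrix.
    b_sigma = [
        [sum(betas[i][k] * sigma_f[k][m] for k in range(f)) for m in range(f)]
        for i in range(n)
    ]
    # Then multiply by B^T to obtain the n x n covariance matrix.
    return [
        [sum(b_sigma[i][m] * betas[j][m] for m in range(f)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical loadings (three assets, two factors) and factor covariances.
toy_betas = [[1.0, 0.5], [0.8, -0.2], [1.2, 0.0]]
toy_sigma_f = [[0.04, 0.01], [0.01, 0.09]]
toy_cov = factor_covariance(toy_betas, toy_sigma_f)
print(toy_cov[0][1])  # covariance between the first and second asset
```

Each entry of the result equals the pairwise expression for the corresponding two assets, so the matrix is symmetric by construction.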
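Now let us zoom in on a local neighborhood of ten stocks. Local outliers are stocks that covary differently (load differently on systematic risk factors) from what one would expect given their fundamental characteristics. A local outlier induces uncertainty in evaluating its conditional factor dependence $r_i | r_f$. We thus compute the average correlation between stock 𝑖 and its ten fundamental peers, writing $N_i$ for the set of those peers and $\rho_{i,j}$ for the factor-implied correlation between 𝑖 and 𝑗:
$$\bar{\rho}_i = \frac{1}{10} \sum_{j \in N_i} \rho_{i,j}, \qquad \rho_{i,j} = \frac{\mathrm{Cov}(r_i, r_j)}{\sqrt{\mathrm{Var}(r_i)\,\mathrm{Var}(r_j)}}$$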
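As an illustration of the averaging step, the sketch below takes a covariance matrix and a list of peer indices per asset and returns the mean correlation of each asset with its peers. The function name `local_correlation`, the toy covariance matrix, and the two-peer neighborhoods are hypothetical (the post uses ten peers). Whether the diagonal of the covariance matrix includes residual variance is left open here; the function simply uses whatever matrix is passed in.

```python
import math


def local_correlation(cov, neighbors):
    """Average correlation of each asset with its peers.

    `cov` is an n x n covariance matrix (list of lists) and `neighbors[i]`
    lists the indices of asset i's peers.
    """
    vol = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    result = []
    for i, peers in enumerate(neighbors):
        # Convert covariances with each peer into correlations.
        corrs = [cov[i][j] / (vol[i] * vol[j]) for j in peers]
        result.append(sum(corrs) / len(corrs))
    return result


# Hypothetical 3 x 3 covariance matrix and two peers per asset.
toy_cov_matrix = [
    [0.04, 0.03, -0.01],
    [0.03, 0.09, 0.00],
    [-0.01, 0.00, 0.01],
]
toy_peers = [[1, 2], [0, 2], [0, 1]]
toy_local_corr = local_correlation(toy_cov_matrix, toy_peers)
print(toy_local_corr)
```

In this toy example the third asset has the lowest average correlation with its peers, so it would be the local outlier of the group.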
Portfolios sorted on Local Correlation
Now that we have our local correlation measure, we test whether the parameter uncertainty of outlier stocks is priced in the cross-section of stock returns. Using the CRSP database, I investigate around 20,000 stocks from 1963 to 2018. Each month, I sort stocks into five portfolios according to their local correlations (calculated with an expanding window of a minimum length of 10 years). Value-weighting the returns within each portfolio yields the following mean annual returns:
As you can see, the effect unexpectedly runs in the opposite direction. Outlier stocks in portfolio 1 have significantly negative returns, while stocks that behave similarly to their fundamental peers have the highest returns. This is hard to reconcile with the parameter-uncertainty argument made earlier in the post, which would suggest a premium for outlier stocks.
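For concreteness, the monthly sort with value weighting described above could look like the following sketch for a single cross-section. It assumes equal-count breakpoints and at least as many stocks as portfolios; the breakpoints used for the results in this post are not specified. The function name `value_weighted_quintiles` and all numbers are hypothetical toy values, not results from the CRSP sample.

```python
def value_weighted_quintiles(stocks, n_portfolios=5):
    """Sort one cross-section into portfolios and value-weight the returns.

    `stocks` is a list of (local_corr, market_cap, next_return) tuples.
    Returns one value-weighted return per portfolio, ordered from the
    lowest local correlation (P1) to the highest (P5).
    """
    ranked = sorted(stocks, key=lambda s: s[0])
    n = len(ranked)
    returns = []
    for p in range(n_portfolios):
        # Integer breakpoints give portfolios of (almost) equal size.
        bucket = ranked[p * n // n_portfolios:(p + 1) * n // n_portfolios]
        total_cap = sum(cap for _, cap, _ in bucket)
        returns.append(sum(cap * ret for _, cap, ret in bucket) / total_cap)
    return returns


# Hypothetical month: (local correlation, market cap, next-month return).
toy_month = [
    (0.05, 100.0, -0.02), (0.10, 300.0, 0.00),
    (0.20, 50.0, 0.01), (0.25, 150.0, 0.01),
    (0.40, 200.0, 0.02), (0.45, 200.0, 0.00),
    (0.60, 400.0, 0.01), (0.65, 100.0, 0.03),
    (0.80, 250.0, 0.02), (0.90, 250.0, 0.04),
]
toy_returns = value_weighted_quintiles(toy_month)
# Long P5, short P1: the local correlation factor return for this month.
toy_long_short = toy_returns[-1] - toy_returns[0]
print(toy_returns, toy_long_short)
```

Repeating this for every month and averaging the resulting time series would give mean portfolio returns of the kind reported above.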
A considerable return spread exists between low- and high-local-correlation stocks. From a risk perspective, it is of particular interest whether common stock risk factors can explain this spread. For this purpose, I run time-series factor spanning tests for the local correlation factor (built with a long position in P5 and a short position in P1), shown in the figure below:
First, the local correlation factor has a positive, statistically significant, and economically large alpha of 7.1% per year. Second, the factor loads negatively on the size (SMB) factor. Thus, the strategy tilts toward large rather than small stocks, which facilitates implementation. Third, the local correlation factor loads strongly on the value (HML) and profitability (RMW) factors. Thus, these popular investment strategies can likely be extended with local correlation considerations.
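A spanning test of this kind amounts to a time-series regression of the long-short factor return on the common factor returns, where the intercept is the alpha. The sketch below is a dependency-free illustration that estimates the coefficients by ordinary least squares via the normal equations. It does not compute standard errors, and in practice a statistics library would be the natural choice. The function name `ols` and the toy data (two hypothetical factors, six periods, no noise) are assumptions for illustration only.

```python
def ols(y, X):
    """Ordinary least squares with an intercept via the normal equations.

    `y` is a list of T returns and `X` a list of T rows of factor returns.
    Returns [alpha, beta_1, ..., beta_f].
    """
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    k = len(rows[0])
    # Build the augmented system [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yt for r, yt in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical factor returns and a noise-free long-short return series
# constructed with alpha = 0.005 and loadings 0.5 and -0.3.
toy_factors = [
    [0.01, 0.02], [-0.02, 0.01], [0.03, -0.01],
    [0.00, 0.00], [0.02, 0.03], [-0.01, -0.02],
]
toy_y = [0.005 + 0.5 * f1 - 0.3 * f2 for f1, f2 in toy_factors]
toy_coefs = ols(toy_y, toy_factors)
print(toy_coefs)  # [alpha, loading on factor 1, loading on factor 2]
```

Because the toy series is built without noise, the regression recovers the constructed alpha and loadings up to floating-point error. With real data, the significance of the alpha would additionally require standard errors.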
Conclusion
In this post, I introduced local effects in the cross-section of stock returns. With a straightforward example, namely local correlation, I showed that investors could improve their investment strategies by considering local effects. From here, there are almost unlimited possibilities for how stock neighbor-based measures can be extended and calculated. One could consider local versions of established global factors. For example, investors could blend a momentum strategy with local momentum measures. I plan to build a whole series on local effects in the cross-section of stock returns. Stay tuned.