Showing posts with label markets efficiency. Show all posts

Friday, June 16, 2017

16/6/17: Replicating Scientific Research: Ugly Truth


Continuing with the theme on 'What I've been reading lately?', here is a smashing paper on 'accuracy' of empirical economic studies.

The paper, authored by Hou, Kewei and Xue, Chen and Zhang, Lu, and titled "Replicating Anomalies" (most recent version is from June 12, 2017, but it is also available in an earlier version via NBER) effectively blows the whistle on what is going on in empirical research in economics and finance. Per the authors, the vast literature that detects financial markets anomalies (or deviations away from the efficient markets hypothesis / economic rationality) "is infested with widespread p-hacking".

What's p-hacking? Well, it's a shady practice whereby researchers manipulate sample criteria (selectively including or excluding data points from estimation) and test procedures (including model specifications and selective reporting of favourable test results) until insignificant results become significant. In other words, under p-hacking, researchers attempt to artificially maximise the significance of their models and explanatory variables, or, put differently, they attempt to achieve results that confirm their intuition or biases.
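To see why this matters, here is a minimal sketch in Python. It is my own toy simulation, not anything from the paper. It assumes pure-noise data, where no true relationship exists, and compares a researcher who runs one pre-specified test with one who tries 20 candidate regressors and reports only the best. The function names, sample sizes and the |t| > 2.0 cutoff are all illustrative assumptions.

```python
import math
import random

def t_stat(x, y):
    """t-statistic for the slope in a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))

def false_positive_rate(n_specs, n_studies=400, n_obs=60, seed=0):
    """Share of pure-noise 'studies' that report a significant result
    when the researcher tries n_specs regressors and keeps the best one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        y = [rng.gauss(0, 1) for _ in range(n_obs)]  # outcome: pure noise
        best = 0.0
        for _ in range(n_specs):
            x = [rng.gauss(0, 1) for _ in range(n_obs)]  # unrelated regressor
            best = max(best, abs(t_stat(x, y)))
        if best > 2.0:  # roughly the 5% two-sided critical value for 58 d.f.
            hits += 1
    return hits / n_studies

honest = false_positive_rate(n_specs=1)   # one pre-specified test
hacked = false_positive_rate(n_specs=20)  # try 20 specifications, report the best
print(f"honest: {honest:.1%}, p-hacked: {hacked:.1%}")
```

With 20 independent tries at a 5% threshold, the chance of at least one spurious 'discovery' is about 1 - 0.95^20, or roughly 64%, and the simulated rate should land in that neighbourhood.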

What are anomalies? Anomalies are departures in the markets (e.g. in share prices) from the predictions generated by the models consistent with rational expectations and the efficient markets hypothesis. In other words, anomalies occur when market efficiency fails.

There are scores of anomalies detected in the academic literature, prompting many researchers to advocate abandonment (in all its forms, weak and strong) of the idea that markets are efficient.

Hou, Xue and Zhang put these anomalies to the test. They compile "a large data library with 447 anomalies". The authors then control for a key problem with data across many studies: microcaps. Microcaps - or small capitalization firms - are numerous in the markets (accounting for roughly 60% of all stocks), but represent only 3% of total market capitalization. This is true for key markets, such as NYSE, Amex and NASDAQ. Yet, as the authors note, evidence shows that microcaps "not only have the highest equal-weighted returns, but also the largest cross-sectional standard deviations in returns and anomaly variables among microcaps, small stocks, and big stocks." In other words, this is a higher-risk, higher-return class of securities. Yet, despite this, "many studies overweight microcaps with equal-weighted returns, and often together with NYSE-Amex-NASDAQ breakpoints, in portfolio sorts." Worse, many (hundreds) of studies use a 1970s regression technique that actually assigns more weight to microcaps. In simple terms, microcaps are the most common outlier, yet they are given either the same weight in analysis as non-outliers or a weight that is actually elevated relative to normal assets, despite the fact that microcaps matter little in driving the actual markets (their weight in the total market is just about 3%).
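A toy illustration of the weighting problem, in Python. The numbers are made up and chosen only to mirror the 60% of stocks / 3% of market capitalization split: six hypothetical microcaps with high returns and four hypothetical big caps with low returns. Nothing here comes from the paper's data.

```python
# Hypothetical cross-section: (market cap, period return). All numbers are invented.
microcaps = [(0.5, 0.10)] * 6    # 6 of 10 stocks, 3% of total market cap
big_caps = [(24.25, 0.01)] * 4   # 4 of 10 stocks, 97% of total market cap
stocks = microcaps + big_caps

def equal_weighted(stocks):
    """Every stock counts the same, regardless of size."""
    return sum(r for _, r in stocks) / len(stocks)

def value_weighted(stocks):
    """Each stock counts in proportion to its market capitalization."""
    total_cap = sum(cap for cap, _ in stocks)
    return sum(cap * r for cap, r in stocks) / total_cap

ew = equal_weighted(stocks)
vw = value_weighted(stocks)
micro_share = sum(cap for cap, _ in microcaps) / sum(cap for cap, _ in stocks)

print(f"microcap share of market cap: {micro_share:.0%}")
print(f"equal-weighted return: {ew:.2%}")
print(f"value-weighted return: {vw:.2%}")
```

In this made-up example the equal-weighted return is 6.4% while the value-weighted return is 1.27%: the microcaps, at 3% of the market, drive almost the entire equal-weighted figure.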

So the study corrects for these problems and finds that, once microcaps are accounted for, a grand total of 286 anomalies (64% of all anomalies studied), and under a stricter statistical significance test 380 (or 85% of all anomalies), "including 95 out of 102 liquidity variables (93%) are insignificant at the 5% level." In other words, the original studies' claims that these anomalies were significant enough to warrant rejection of market efficiency do not hold once one recognizes one basic and simple problem with the data. Worse, per the authors, "even for the 161 significant anomalies, their magnitudes are often much lower than originally reported. Among the 161, the q-factor model leaves 115 alphas insignificant (150 with t < 3)."
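The arithmetic behind these figures is easy to check. The snippet below uses only the counts quoted from the paper; the variable names are mine.

```python
total = 447                  # anomalies in the data library
insignificant_5pct = 286     # insignificant at the 5% level
insignificant_strict = 380   # insignificant under the stricter test
liquidity_insig, liquidity_total = 95, 102

significant = total - insignificant_5pct
share_5pct = round(100 * insignificant_5pct / total)
share_strict = round(100 * insignificant_strict / total)
share_liquidity = round(100 * liquidity_insig / liquidity_total)

print(significant, share_5pct, share_strict, share_liquidity)
```

The counts are internally consistent: 447 less 286 leaves the 161 significant anomalies the authors refer to, and the three percentages round to 64, 85 and 93.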

This is pretty damning for those of us who believe, based on empirical results published over the years, that markets are bounded-efficient, and it is outright savaging for those who claim that markets are perfectly inefficient. But, this tendency of researchers to silverplate statistics is hardly new.

Hou, Xue and Zhang provide a nice summary of research on p-hacking and non-replicability of statistical results across a range of fields. It is worth reading, because it significantly dents one's confidence in the quality of peer review and the quality of scientific research.

As the authors note, "in economics, Leamer (1983) exposes the fragility of empirical results to small specification changes, and proposes to “take the con out of econometrics” by reporting extensive sensitivity analysis to show how key results vary with perturbations in regression specification and in functional form." The latter call was never implemented in the research community.

"In an influential study, Dewald, Thursby, and Anderson (1986) attempt to replicate empirical results published at Journal of Money, Credit, and Banking [a top-tier journal], and find that inadvertent errors are so commonplace that the original results often cannot be reproduced."

"McCullough and Vinod (2003) report that nonlinear maximization routines from different software packages often produce very different estimates, and many articles published at American Economic Review [highest rated journal in economics] fail to test their solutions across different software packages."

"Chang and Li (2015) report a success rate of less than 50% from replicating 67 published papers from 13 economics journals, and Camerer et al. (2016) show a success rate of 61% from replicating 18 studies in experimental economics."

"Collecting more than 50,000 tests published in American Economic Review, Journal of Political Economy, and Quarterly Journal of Economics, [three top rated journals in economics] Brodeur, Lé, Sangnier, and Zylberberg (2016) document a troubling two-humped pattern of test statistics. The pattern features a first hump with high p-values, a sizeable under-representation of p-values just above 5%, and a second hump with p-values slightly below 5%. The evidence indicates p-hacking that authors search for specifications that deliver just-significant results and ignore those that give just-insignificant results to make their work more publishable."

If you think this phenomenon is encountered only in economics and finance, think again. Here are some findings from other 'hard science' disciplines where, you know, lab coats do not lie.

"...replication failures have been widely documented across scientific disciplines in the past decade. Fanelli (2010) reports that “positive” results increase down the hierarchy of sciences, with hard sciences such as space science and physics at the top and soft sciences such as psychology, economics, and business at the bottom. In oncology, Prinz, Schlange, and Asadullah (2011) report that scientists at Bayer fail to reproduce two thirds of 67 published studies. Begley and Ellis (2012) report that scientists at Amgen attempt to replicate 53 landmark studies in cancer research, but reproduce the original results in only six. Freedman, Cockburn, and Simcoe (2015) estimate the economic costs of irreproducible preclinical studies amount to about 28 billion dollars in the U.S. alone. In psychology, Open Science Collaboration (2015), which consists of about 270 researchers, conducts replications of 100 studies published in top three academic journals, and reports a success rate of only 36%."

Let's get down to real farce: everyone in sciences knows the above: "Baker (2016) reports that 80% of the respondents in a survey of 1,576 scientists conducted by Nature believe that there exists a reproducibility crisis in the published scientific literature. The surveyed scientists cover diverse fields such as chemistry, biology, physics and engineering, medicine, earth sciences, and others. More than 70% of researchers have tried and failed to reproduce another scientist’s experiments, and more than 50% have failed to reproduce their own experiments. Selective reporting, pressure to publish, and poor use of statistics are three leading causes."

Yeah, you get the idea: you need years of research, testing, re-testing and, more often than not, you find that the results are not significant or only weakly significant. Which means that after years of research you end up with an unpublishable paper (no journal would welcome a paper without significant results, even though absence of evidence is as important in science as evidence of presence), no tenure, no job, no pension, no prospect of a career. So what do you do then? Ah, well... p-hack the shit out of data until the editor is happy and the referees are satisfied.

Which, for you, the reader, should mean the following: when we say that 'scientific research established fact A' based on reputable journals publishing high quality peer reviewed papers on the subject, know that around half of the findings claimed in these papers, on average, most likely cannot be replicated or verified. And then remember, it takes one or two scientists to turn the world around from believing (based on scientific consensus at the time) that the Earth is flat and is the centre of the Universe, to believing in the world as we know it to be today.


Full link to the paper: Charles A. Dice Center Working Paper No. 2017-10; Fisher College of Business Working Paper No. 2017-03-010. Available at SSRN: https://ssrn.com/abstract=2961979.

Wednesday, February 19, 2014

18/2/2014: Have Financial Markets Become More Informative since the 1960s?


In strongly efficient markets, prices of shares transmit strong information about company fundamentals, such as productivity and demand for and risk of investment. As Fama (1970) wrote: "The primary role of the capital market is allocation of ownership of the economy's capital stock. In general terms, the ideal is a market in which prices provide accurate signals for resource allocation: that is, a market in which firms can make production/investment decisions... under the assumption that security prices at any time `fully reflect' all available information."

In recent years, quality of information in the financial markets has significantly improved, while analysis costs have fallen, suggesting that informational content of prices in the markets should have risen as well.

A new paper, titled "HAVE FINANCIAL MARKETS BECOME MORE INFORMATIVE?" by Jennie Bai, Thomas Philippon, and Alexi Savov (Working Paper 19728: http://www.nber.org/papers/w19728, December 2013) measures "the information content of prices by using them to predict earnings and investment. We trace the evolution of price informativeness in the U.S. over the last five decades."

The period of analysis is not ad hoc. "During this period, a revolution in computing has transformed finance: Lower trading costs have led to a flood of liquidity. Modern information technology delivers a vast array of data instantly and at negligible cost. Concurrent with these trends, the finance industry has grown, its share of GDP more than doubling."

In this context, the authors ask "Have market prices become more informative?"

To answer this question, the authors first develop measures of informativeness in the financial markets. They do so by combining Tobin's (1969) q-theory of investment with the noisy rational expectations framework of Grossman and Stiglitz (1980). "When more information is produced, prices become stronger predictors of earnings. We define price informativeness to be the standard deviation of the predictable component of earnings and we show that it is directly related to welfare, as in Hayek (1945): information promotes the efficient allocation of investment, which leads to economic growth."

For empirical testing, the authors regress "future earnings on current valuation ratios, controlling for current earnings. We look at both equity and corporate bond markets. We include one-digit industry-year fixed effects to absorb time-varying cross-sectional differences in the cost of capital. This regression compares firms in the same sector and asks whether firms with higher market valuations tend to produce higher earnings in the future than firms with lower valuations."
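For readers who want to see the mechanics, below is a rough Python sketch of this kind of regression on simulated data. It is not the authors' code or data. The coefficients, sample sizes and variable names are all assumptions of mine, and the industry-year fixed effects are handled by demeaning every variable within its industry-year cell. 'Informativeness' is computed as the standard deviation of the fitted valuation component, which is my reading of the definition quoted earlier.

```python
import random
import statistics
from collections import defaultdict

def demean_by_group(values, groups):
    """Subtract the group mean from each value (absorbs group fixed effects)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def ols_two_regressors(y, x1, x2):
    """OLS slopes of y on x1 and x2 (inputs already demeaned, so no intercept)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Simulated panel: 5 hypothetical industries x 4 years x 50 firms per cell.
rng = random.Random(42)
TRUE_B, TRUE_C = 0.04, 0.6  # assumed "true" slopes, chosen for illustration only
valuation, earnings_now, earnings_future, cell = [], [], [], []
for industry in range(5):
    for year in range(4):
        fe = rng.gauss(0, 0.05)              # industry-year effect (e.g. cost of capital)
        for _ in range(50):
            v = rng.gauss(0, 0.5) + 2 * fe   # valuation ratio, correlated with the effect
            e = rng.gauss(0, 0.05)           # current earnings
            f = TRUE_B * v + TRUE_C * e + fe + rng.gauss(0, 0.03)
            valuation.append(v)
            earnings_now.append(e)
            earnings_future.append(f)
            cell.append((industry, year))

# Remove industry-year means, then regress future earnings on valuation and current earnings.
y_dm = demean_by_group(earnings_future, cell)
v_dm = demean_by_group(valuation, cell)
e_dm = demean_by_group(earnings_now, cell)
b_hat, c_hat = ols_two_regressors(y_dm, v_dm, e_dm)

# Standard deviation of the part of future earnings predicted by valuations.
informativeness = abs(b_hat) * statistics.pstdev(v_dm)
print(f"b_hat={b_hat:.4f}, c_hat={c_hat:.4f}, informativeness={informativeness:.4f}")
```

Tracking how this last number moves when the regression is re-run year by year is, as I understand it, the spirit of the exercise: if prices had become more informative, it should drift up over time.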

Conclusion: "...the amount of informativeness has not changed since 1960."

Such a surprising result means there is some room for potential mis-specification of the tests. As the authors note: "By itself, constant price informativeness does not imply constant information production in markets. It is possible that information production has simply migrated from inside firms to markets. Hirshleifer (1971) first noted the dual role of prices in revealing new information and reflecting existing information. Bond, Edmans, and Goldstein (2012) call the revelatory component of price informativeness real price efficiency (RPE), and the forecasting component forecasting price efficiency (FPE). The financial sector adds value only to the extent that it reveals information that would otherwise be unavailable to decision makers. …the distinction between RPE and FPE is fundamental, and we seek to disentangle them."

The model provides a solution. "When managers rely on prices, they import the price noise into their investment policies. When markets reveal no new information, managers ignore them and prices remain noisy but investment does not. In the opposite case, when all information is produced in markets, managers use prices and both investment and prices are equally noisy. Information increases the predictive power of both prices and investment, but a rise in the revelatory component of prices increases price informativeness disproportionately."

Complicated thinking? You bet. But still, intuitive and testable. "To see if the constant price informativeness could mask a substitution from forecasting (FPE) to revealing (RPE) information, we check to see if the predictable component of earnings based on investment has changed."

Conclusion: the authors find that the predictable component of earnings based on investment has not changed over time. "…this implies that neither FPE nor RPE has risen over the last five decades." And furthermore, "…results show that discount rate variation has also remained stable" over time.

But there is more to the study. It turns out that informativeness is different for different types of investment. "Our strongest positive finding is that a higher equity valuation is more closely associated with R&D investment now than in the past. The same is not true of capital expenditure. However, the increased predictability of R&D is not related to increased predictability of earnings, so we cannot conclude that informativeness has increased."

And the results discussed above are sensitive to the sample of stocks studied. "For most of the paper, we examine S&P 500 stocks whose characteristics have remained stable. In contrast, running the same tests on the universe of stocks appears to show a decline in informativeness. We argue, however, that this decline is consistent with changing firm characteristics: the typical firm today is more difficult to value."

The top conclusion, therefore, is that, having examined "the extent to which stock and bond prices predict earnings", the authors find that "the informativeness of financial market prices has not increased in the past fifty years".

In the words of Herbert Simon (1971), "An information processing subsystem (a computer) will reduce the net demand on attention of the rest of the organization only if it absorbs more information, previously received by others, than it produces -- if it listens and thinks more than it speaks."