New Research: "Is Belief Superiority Justified by Superior Knowledge?"

The title of this post is taken from a recently published research paper by Michael Hall and Kaitlin Raimi. The authors observe that it is not hard to find people who exhibit "the belief that their own views are more correct than other viewpoints," or "belief superiority." They also note a strong positive correlation between belief superiority and belief confidence, the degree to which people are certain their views are correct. The important distinction is that belief superiority results from a comparative judgment, while belief confidence reflects the strength of one's convictions.

The focus of Hall and Raimi's research was the extent to which individuals' views about the superiority of their beliefs were justified. As they note, "for a belief to be superior — or more correct — than other beliefs, it should have a superior basis in relevant factual information. Following this logic, belief-superior individuals should possess more accurate knowledge than their more modest peers, or at least better recognize relevant facts when presented with them."

When confronted with this research question, and calling to mind people they have encountered with a "belief superiority complex," most people reading this probably have a strong intuition about what the authors' research found.

"Belief superior people exhibited the greatest gaps between their perceived and actual knowledge."

However, Hall and Raimi go on to note that, "even if belief superiority is not supported by superior knowledge, belief superiority could be justified by another process: Superior knowledge acquisition. That is, they may seek out information on a topic in an even-handed manner that exposes them to a diversity of viewpoints. As a result, their belief superiority may reflect a reasoned conclusion after comparing multiple viewpoints."

Unsurprisingly, that is not what the authors found. Instead, "belief superior people were most likely to exhibit a preference for information that supported their pre-existing views."

In sum, "belief superior people are not only the least likely to recognize their own knowledge shortcomings, but also the least likely to remedy them."

In our research into the root causes of corporate failures, we have frequently noted that organizational failures to anticipate threats, accurately assess them, and adapt to them in time are driven by fundamental individual and group cognitive and emotional factors that are extremely difficult to change (because, for centuries in our past they were beneficial in the evolutionary sense, and provided an advantage when it came to survival, resource acquisition, and mating).

Hall and Raimi's research findings are yet another example of the deeply rooted individual factors (which are frequently reinforced by group processes) that are the deepest root causes of corporate failure.

As we repeatedly emphasize, the chances of altering these factors through training or incentives are somewhere between slim and none. Instead, organizations' best hope for survival rests on designing processes, systems, and structures that deliberately seek to offset individual and group factors' predictably negative effects.

Comments

Three Techniques for Weighing Evidence to Reach a Conclusion

In a radically uncertain world, the ability to systematically weigh evidence to reach a justifiable conclusion is undoubtedly a critical skill. Unfortunately, it is one that too many schools fail to teach. Hence this short note, which will cover some basic aspects of evidence, and quickly review three approaches to weighing it.

Evidence has been defined as “any factual datum which in some manner assists in drawing conclusions, either favorable or unfavorable, regarding a hypothesis.”

Broadly, there are at least four types of evidence:

  • Corroborating: Two or more sources report the same information, or one source reports the information and the other attests to the first’s credibility;

  • Convergent: Two or more sources provide information about different events, all of which support the same hypothesis;

  • Contradictory: Two or more pieces of information that are mutually exclusive, and cannot both (or all) be true;

  • Conflicting: Pieces of information that support different hypotheses, but are not mutually exclusive.

Regardless of its type, all evidence has three fundamental properties:

  • Relevance: “Relevant evidence is evidence having any tendency to make [a hypothesis] more or less probable than it would be without the evidence” (from the US Federal Rules of Evidence);

  • Believability: A function of the credibility and competence of the source of the evidence;

  • Probative Force or Weight: The incremental impact of a piece of evidence on the probabilities associated with one or more of the hypotheses under consideration.

There are three systematic approaches to weighing evidence in order to reach a conclusion.

In the 17th century, Sir Francis Bacon developed a method for weighing evidence. Bacon believed the weight of evidence for or against a hypothesis depends both on how much relevant and credible evidence you have, and on how complete your evidence is with respect to the matters you believe are relevant to evaluating the hypothesis.

Bacon recognized that we can be “out on an evidential limb” if we draw conclusions about the probability that a hypothesis is true based on our existing evidence, without also taking into account the number of relevant questions that are still not answered by the evidence in our possession. We typically fill these gaps with assumptions, about which we have varying degrees of uncertainty.
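Bacon's two ingredients (the amount of credible evidence in hand, and its completeness relative to the questions we believe matter) can be sketched as a toy scoring function. The question names, scores, and the multiplicative way of combining them below are our own illustrative assumptions, not Bacon's actual method.

```python
# Hypothetical sketch of a Baconian weighing of evidence: strong evidence
# counts for less when many relevant questions remain unanswered.

def baconian_weight(evidence, relevant_questions):
    """Weight = credible evidence in hand, discounted by evidential gaps.

    evidence: list of (question, credibility) pairs, credibility in [0, 1]
    relevant_questions: all questions we believe bear on the hypothesis
    """
    answered = {q for q, _ in evidence}
    completeness = len(answered & set(relevant_questions)) / len(relevant_questions)
    support = sum(cred for _, cred in evidence)
    return support * completeness  # discount for being "out on an evidential limb"

questions = ["motive", "opportunity", "capability", "track_record"]
evidence = [("motive", 0.9), ("opportunity", 0.8)]  # two of four questions answered
print(baconian_weight(evidence, questions))  # 1.7 * 0.5 = 0.85
```

The same evidence would score twice as high if it answered all four questions, which is exactly Bacon's point about completeness.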

In the 18th century, Reverend Thomas Bayes invented a quantitative method for using new information to update a prior degree of belief in the truth of a hypothesis.

“Bayes’ Theorem” says that given new evidence (E), the updated (posterior) belief that a hypothesis is true (p(H|E)) is a function of the conditional probability of observing the evidence given the hypothesis (p(E|H)), times the prior probability that the hypothesis is true (p(H)), divided by the probability of observing the new evidence (p(E)).

In qualitative terms, we start with a prior belief in the probability a hypothesis is true or false. When we receive a new piece of evidence, we use it to update our prior probability to a new, posterior probability.

The “Likelihood Ratio” is a critical concept in this process of Bayesian updating. It is the probability of observing a piece of evidence if a hypothesis is true, divided by the probability of observing the evidence if the hypothesis is false. The greater the Likelihood Ratio for a piece of new evidence (i.e., the greater its information value), the larger the difference should be between our prior and posterior probabilities that a given hypothesis is true.

In the 20th century, Arthur Dempster and Glenn Shafer developed a new theory of evidence.

Assume a set of competing hypotheses. For each of these hypotheses, a new piece of evidence is assigned to one of three categories: (1) It supports the hypothesis; (2) It disconfirms the hypothesis (i.e., it supports “Not-H”); or (3) it neither supports nor disconfirms the hypothesis.

The accumulated and categorized evidence can then be used to calculate a lower bound on the belief that each hypothesis is true (based on the number of pieces of evidence that support it, and the quality of that evidence), as well as an upper bound (equal to one minus the probability that the hypothesis is false, based, again, on the evidence that disconfirms the hypothesis, and its quality). This upper bound is also known as the plausibility of each hypothesis.

The difference between the upper (plausibility) and lower (belief) probabilities for each hypothesis is the degree of uncertainty associated with it. Hypotheses are then ranked based on their degrees of uncertainty.
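The belief and plausibility bounds described above can be sketched with a toy calculation. Note that the combination step below is a simplified quality-weighted tally of our own devising, not Dempster's full rule of combination, and the evidence categories and quality scores are invented for illustration.

```python
# A toy sketch of Dempster-Shafer-style belief and plausibility bounds:
# belief(H) is the lower bound, plausibility(H) = 1 - belief(not-H) is the
# upper bound, and the gap between them is the remaining uncertainty.

def belief_plausibility(evidence):
    """evidence: list of (category, quality) pairs, with category in
    {'supports', 'disconfirms', 'neutral'} and quality in [0, 1]."""
    total = sum(q for _, q in evidence) or 1.0
    belief = sum(q for c, q in evidence if c == "supports") / total
    disbelief = sum(q for c, q in evidence if c == "disconfirms") / total
    plausibility = 1.0 - disbelief  # upper bound on belief in the hypothesis
    return belief, plausibility

ev = [("supports", 0.8), ("supports", 0.6), ("disconfirms", 0.4), ("neutral", 0.2)]
bel, pl = belief_plausibility(ev)
print(bel, pl, pl - bel)  # lower bound, upper bound, and uncertainty interval
```

For this illustrative evidence set the belief is 0.7 and the plausibility is 0.8, leaving an uncertainty interval of 0.1 for ranking against competing hypotheses.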

While there are quantitative methods for applying all of these theories, they can also be applied qualitatively, to quickly and systematically produce an initial conclusion about which of a given set of hypotheses is most likely to be true.

Comments

How Conceptual Elegance Can Lead to Risk Blindness

We’ve spent a lot of time over our careers working with risks that are, at least in theory, easy to quantify, price, and transfer. These include hazard risks for which there is substantial historical data on the frequency of their occurrence, as well as market risks where historical data sets are also very large.

In these cases, the traditional way of mitigating unwanted risk exposure is to transfer it, via insurance or financial derivative contracts. This also makes it apparently straightforward to calculate an organization’s residual or retained risk after mitigation actions are taken. In turn, this makes it apparently easy to compare the total amount of residual/retained/net risk to a board’s “risk appetite” – for example, the maximum reduction in cash flow or equity market value to which it desires to be exposed over a given period of time (with, for example, a 95% degree of confidence).
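The board-level comparison just described can be sketched as a toy Monte Carlo value-at-risk calculation. The distribution, its parameters, and the appetite threshold below are all invented for illustration; a real model would be calibrated to the organization's exposures.

```python
# A stylized sketch of comparing retained risk to a board's risk appetite:
# estimate the 95%-confidence worst-case cash-flow shortfall from simulated
# outcomes, then compare it to the appetite threshold.
import random

random.seed(42)
simulated_cash_flow_changes = [random.gauss(0, 10.0) for _ in range(100_000)]

# The 5th percentile of outcomes is the loss exceeded only 5% of the time,
# i.e., the 95% value-at-risk.
outcomes = sorted(simulated_cash_flow_changes)
var_95 = -outcomes[int(0.05 * len(outcomes))]

risk_appetite = 20.0  # maximum tolerable shortfall, in the same units
print(f"95% VaR: {var_95:.1f}; within appetite: {var_95 <= risk_appetite}")
```

The apparent precision of this comparison is exactly what the complications listed below undermine in practice.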

Especially after the events surrounding the 2008 global financial crisis (or the collapse of Long Term Capital Management in 1998), we are all painfully aware that in practice, things are not this easy, even in the case of risks that are apparently easy to quantify, price, and transfer.

Some real-world complications include:

  • Use of historical data sets that do not include extreme downside losses that a given system can produce;

  • Evolution in the nature of the system over time that makes historical data an increasingly inaccurate guide to what may occur in the future;

  • Use of inaccurate models to forecast future risks;

  • Risks whose covariance changes, both over time and as a function of conditions (e.g., remember the saying that as conditions deteriorate and uncertainty increases, correlations move towards 1.0);

  • The ability of risk transfer counterparties to make good on the payments they have contractually agreed to make should a risk materialize (e.g., the case of AIG and credit default swaps in 2008).

If conceptually elegant approaches to retained risk and risk appetite are this challenging in practice for hazard and financial risks, they are exponentially more so in the case of operational and strategic risks.

Consider the case of Carillion, the UK facilities management and construction services company that recently went into liquidation with almost GBP 7 billion in liabilities.

One of the principal causes of the company’s failure was cost overruns on major projects. The potential for such overruns had previously been recognized by the company’s management as a potentially existential risk.

However, in the company’s risk management process, the size of the residual/retained risk exposure was apparently much smaller than the gross exposure. But this wasn’t because most of the risk had been transferred to a counterparty via insurance or financial derivative contracts. Rather, it was because of the assumption that internal mitigation actions would significantly reduce the risk.

Thus, the board’s apparent belief that Carillion had a small exposure to existential project cost overrun risk seems to have been based on a series of implicit assumptions that critical mitigation actions (a) would be implemented; (b) in time; and (c) would have their expected risk reducing effects.

It is also critical to recognize the enormous difference in the accuracy with which transferable risks (e.g., hazard and market) and non-transferable risks (e.g., operational and strategic) can be quantified, in order to integrate them into an overall calculation of an organization’s retained risks relative to its risk appetite.

As we have shown, the quantification of risks for which large historical data sets are available is still problematic in many ways, and subject to an unknown degree of error that grows over time.

But for many reasons, this challenge pales in comparison with those that confront us when we try to quantify operational and strategic risks, and the potential impact of actions taken to mitigate either their probability of occurrence or their potential negative impact should they materialize. Some of the most important challenges include:

  • We can’t be confident that we have identified all the relevant risks, mainly for two reasons. On the operational front, organizations tend to become more complex as they grow, which gives rise to both new risks and new causal pathways for ones already identified. On the strategic front, the nature of the interacting complex adaptive systems within which a company exists (e.g., technological, economic, social, and political) guarantees that new risks will continuously emerge.

  • In many cases, reference case/base rate data on which we can ground our risk and mitigation impact quantification processes either don’t exist or, if they do, are inevitably incomplete.

  • The subjective estimates we are usually forced to use when attempting to quantify operational and strategic risks and the potential impact of mitigation actions are almost always affected by at least five individual, group, and organizational biases, including:
    1. Over-optimism (e.g., the level of the mean or median estimate);
    2. Overconfidence (e.g., the width of the range of possible outcomes);
    3. Confirmation/Desirability (we pay more attention and give more weight to information that supports our view, or the outcome we desire, and less to information that does not);
    4. Conformity (we hesitate to deviate from the prevailing group view); and
    5. A strong organizational desire to avoid errors of commission (i.e., false alarms about potential risks that don’t materialize), even though this automatically increases the likelihood of errors of omission (i.e., missed alarms about potential risks that actually occur).

  • Complete quantification of the relationships between operational and strategic risks, and between them and hazard and market risks, and how these relationships could vary over different situations and over time is, from both an estimation and a computational perspective, a practical impossibility.

With these observations in mind, let us return to Carillion.

In reviewing what we know so far about this failure (and we will know much more when various inquiries and litigation cases are completed), two critical points stand out for us.

First, it is not as though the risk of large project cost overruns sinking a company is not well-recognized or well-documented. For example, Professor Bent Flyvbjerg has extensively documented the regularity with which cost overruns occur on large projects (e.g., see his paper, “Over Budget, Over Time, Over and Over Again: Managing Major Projects”), and project revenue recognition has for years been a major preoccupation of professional accounting standards bodies.

This leads us to infer (perhaps incorrectly) that Carillion’s management and board must have been very confident that these well-known risks were adequately mitigated by the plans the company had put in place to address them. This raises questions about the evidence that provided the basis for this high degree of confidence, as well as the actions taken to confirm that these plans were being implemented (we look forward to internal audit and compliance reports eventually being publicly disclosed).

Second, the Carillion failure highlights yet again the danger of putting too much trust in enterprise risk management models that attempt to quantify and aggregate very different hazard, market, operational, and strategic risks into a unified measure of “residual/retained risk” exposure that can be compared to an equally neat “risk appetite” number.

We continue to stress that when it comes to managing and governing risk, a desire for conceptual elegance is too often achieved at the cost of dangerous risk blindness that only becomes apparent when it is too late to avoid organizational failure.

Of course, this raises the question of what constitutes a better approach to the management and governance challenges posed by various types of risk. Here's a short summary of our view:

  • Use of quantitative Enterprise Risk models that aggregate gross and net exposures to hazard and market risks still makes sense, with the caveats noted above. Given the limitations of these models, their use should be complemented with other techniques, like scenario-based stress testing.

  • The general category of "operational risk" encompasses a very wide range of "things that could go wrong." Where such risks can be readily quantified, priced, and transferred, they should be included in the quantitative Enterprise Risk Management models and systems. Where this is not the case, risk management should focus on establishing plans, processes, and systems that are robust to potential operational failures under a wide range of scenarios, while also building in various sources of resilience for when robust design falls short and failures occur. There are many techniques that can be used to analyze and manage these risks, such as failure mode and effects analysis. And key actions to mitigate operational risks should be assessed and verified at regular intervals. A final focus should be on building an adaptive organization that can constantly identify and adjust to new operational risks created by increasing internal complexity and/or a changing external environment.

  • When it comes to balancing risk exposure with a board's risk appetite, strategic risks present the most vexing challenge. As we have repeatedly noted, attempts at quantifying these risks are at best highly uncertain. It must therefore be the case that a board's decisions about strategic risk exposure versus risk appetite ultimately depend on directors' subjective judgment. But that does not mean such judgments must be unstructured. Consciously or not, they will usually reflect an assessment of the degree of imbalance between the goals being pursued, the resources available, and the strategy for employing those resources in light of the uncertainties facing the organization. The greater the degree of imbalance between goals, resources, and strategy, and the higher the external uncertainty, the greater an organization's strategic risk exposure.

Comments

Modeling -- Not as Easy as it Looks!

No, we’re not talking about a catwalk in stilettos. We’re talking about an activity that, especially since VisiCalc first ran on an Apple II in 1979, has become an integral part of management.

For all its current ubiquity, what too many people fail to appreciate is the amount of uncertainty inherent in quantitative modeling. With that in mind, we offer this quick review.

Level 1: Choice of Theory

Explicitly or implicitly, models intended to explain or predict observed effects begin with a causal theory or theories. The accuracy of the conceptual theory that underlies a quantitative model is an important, but rarely acknowledged, source of uncertainty.

Level 2: Choice of Modeling Method

The next step is choosing a modeling method that accurately captures the major features of the theory. For example, where theory states that the target effects to be modeled emerge from the bottom up via the interaction of agents with varying information and beliefs, agent-based modeling may be the method chosen. Alternatively, where theory states that the target effect is heavily driven by feedback loops, a top-down system dynamics modeling approach may be used.

The extent of the match between theory and the modeling approach chosen is another potential source of modeling uncertainty.

Level 3: The Structure of the Model(s)

Yet another source of uncertainty is the set of structural choices made when implementing a given modeling method. These include the variables that are included in the model, and the nature of the relationships between them (e.g., are they related to each other, and, if they are, is the relationship linear or non-linear, and constant or dependent on other variables?).

In some cases, uncertainties about the correct structure of a model can be resolved through the use of “ensemble” methods, which involve the construction of multiple models and the aggregation of their outputs.
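The ensemble idea can be sketched in a few lines: when we are unsure which structure is right, we run several plausible models and aggregate their outputs, with the spread across models indicating structural uncertainty. The three toy model forms and their parameters below are our own illustrative assumptions.

```python
# A minimal illustration of an "ensemble" approach to structural uncertainty:
# three competing model structures for the same target effect.

def model_linear(x):      return 2.0 * x + 1.0
def model_saturating(x):  return 10.0 * x / (1.0 + x)
def model_quadratic(x):   return 0.5 * x ** 2 + 2.0

def ensemble_forecast(x, models):
    """Aggregate the models' outputs; the min-max spread signals how much
    the structural choice matters at this input."""
    forecasts = [m(x) for m in models]
    return sum(forecasts) / len(forecasts), min(forecasts), max(forecasts)

mean, low, high = ensemble_forecast(3.0, [model_linear, model_saturating, model_quadratic])
print(f"ensemble mean {mean:.2f}, structural spread [{low:.2f}, {high:.2f}]")
```

A wide spread across structurally different models is itself useful information: it tells the decision maker that conclusions are sensitive to an unresolved modeling choice.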

Level 4: The Values of Model Variables

“Parameter uncertainty” refers to doubts about the accuracy of the values that are attached to a model’s variables, including the dependency relationships between them (e.g., their degree of correlation). In simple deterministic models, this involves disagreements over values for individual variables, or the values to be used in “best-case, worst-case, most-likely case” scenarios.

In more complex Monte Carlo models, values for key variables are specified as distributions of possible outcomes. In this case, sources of uncertainty include the type of distribution used to describe the possible range of values for a variable (e.g., a normal/Gaussian or power law/Pareto distribution), and the specification of key values for the selected distribution (e.g., will rare but potentially critical events be captured?).
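The consequence of the distribution choice can be shown with a toy comparison: the same "typical" variable modeled as Gaussian versus Pareto (power-law) produces radically different extreme outcomes. The parameters below are invented for illustration.

```python
# A sketch of how distribution choice drives tail risk in a Monte Carlo model:
# both samples have similar "typical" values, but the heavy-tailed Pareto
# draws produce far larger extreme events.
import random

random.seed(0)
N = 100_000
gaussian = [random.gauss(10.0, 2.0) for _ in range(N)]
pareto = [10.0 * random.paretovariate(2.5) for _ in range(N)]  # heavy tail

worst_gauss = max(gaussian)
worst_pareto = max(pareto)
print(f"worst Gaussian draw: {worst_gauss:.0f}; worst Pareto draw: {worst_pareto:.0f}")
```

A model built on the Gaussian assumption would conclude that outcomes beyond a few standard deviations are effectively impossible, which is precisely how rare but critical events get left out of risk estimates.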

Level 5: Recognizing Randomness

For most variables, there is an irreducible level of uncertainty that cannot be reduced through more data or better knowledge about the variable in question. Sources of this randomness can include sensor and measurement errors, or small fluctuations caused by a complex mix of other factors. Whether and how this randomness is included in potential variable values is another source of model uncertainty.

Level 6: Mathematical Errors

We’ve all done it – wrongly specified an equation or variable value when building a model late at night (and/or under time pressure). And most of us are usually lucky enough to catch those errors the next morning before someone else does, when, after a few cups of coffee, we test our model before finalizing it and say, “that doesn’t look right.” Like it or not, mathematical errors are yet another – and very common – source of model uncertainty.

The discipline of model verification and validation is used to assess these six sources of model uncertainty. Verification focuses on the accuracy with which a model implements the theory upon which it is based, while validation assesses the accuracy with which a model represents the target system.

Level 7: Non-Stationarity

This brings us to the final source of uncertainty. Validation usually involves assessing the extent to which a model can reproduce the target system’s historical results. However, if the system itself is evolving or “non-stationary” – and particularly if that evolution is driven by a complex process that cannot be fully understood (i.e., it is “emergent”), then a final source of uncertainty is how long a model’s predictions will remain accurate (within certain bounds).

Computer models have substantially increased business productivity as they have come into widespread use over the past forty years. Yet they have also introduced new sources of uncertainty into decisions that are made using their outputs. It is for this reason that wise decision makers always test model results against their intuition, and when the two disagree, take the time to further explore and understand the root causes at work. Both modeling methods and decision makers’ intuition usually benefit from the time invested in this discussion.

Comments

Toys R Us' Most Important Lesson for Boards

If you are a parent of a certain age, part of you was probably a bit saddened by the recent demise of Toys R Us, a store where you probably spent a lot of time back in the day.

However, as this once great retailer prepares to go into liquidation, it also offers us valuable reminders about the causes of organizational failure – and how they can be avoided.

External Causes of Failure

In our client education courses, Britten Coyne makes the point that the external trends which give rise to strategic threats often follow a common causal pattern of four phases (albeit with multiple feedback loops between them).

The first is technological change. In the case of Toys R Us the most important included the birth of the internet, the development of online shopping businesses, the penetration of broadband, and the arrival of advanced gaming consoles in 2001, Facebook in 2004, YouTube in 2005, smart phones in 2007, Instagram in 2010, and Snapchat in 2011.

Technological change eventually leads to economic changes.

In the case of the technologies described above, these changes have been jarring. The launch of new business models has sharply increased competition and uncertainty in many industries, including toys. With their margins under increasing downward pressure, companies have had to continuously cut costs, which has left fewer employees with much more to do and less free time for non-work tasks.

Technology changes also triggered repeated shifts in children’s interest away from traditional toys and towards online entertainment, gaming, and social media that did not require a physical distribution network (the exception was toys tied to major movies, like the Star Wars and Marvel franchises).

Economic change eventually produces social changes.

In the case of Toys R Us, critical social changes included time-short parents increasingly turning to online and superstore (e.g., Walmart and Target) shopping, where in one stop they could purchase groceries and other items, including toys. This further increased the downward pressure on margins at traditional toy retailers, even as social changes reinforced the falling economic demand for traditional toys.

At the end of this causal chain comes political change.

In the case of Toys R Us, perhaps the most important has been the taxation of online sales. For many years, these were effectively tax free, which, in addition to convenience, created another incentive to avoid purchases at physical toy stores.

Internal Causes of Failure

Our research has found three critical organizational sources of strategic failure: (1) the failure to anticipate new threats; (2) the failure to accurately assess the dangers they potentially represent, and how fast they could materialize; and (3) the failure to adequately adapt to them in time.

The available evidence suggests that Toys R Us management anticipated the new threats that they faced. It was not as though the environment was failing to provide clear signals, including the disruption of bookselling by Amazon’s arrival, increasing toy sales at Walmart and Target superstores, the closure of many independent toy retailers, and the bankruptcy of FAO Schwarz in 2003.

Whether Toys R Us managers accurately assessed the danger posed by these emerging threats is hard to say, as much of the evidence on this point is located in documents that remain company confidential. However, Toys R Us’ online alliance with Amazon in 2000 suggests that they appreciated the danger posed by at least some of the new threats they faced.

Unfortunately, the history of organizational failure is filled with stories of timely anticipation of new threats and accurate assessments that came to naught because of inappropriate or poorly implemented adaptations, or initiatives that were too long delayed. The public record suggests that this may well have been the case for Toys R Us.

The Amazon alliance was not successful and in 2004 Toys R Us sued Amazon to force its termination. Toys R Us launched its own website in 2006, by which time Amazon’s dominance and growing economies of scope were well-established. Toys R Us also continued to maintain a relatively large number of traditional “big box” stores, often in malls in which many other retailers were failing (which decreased shopper visits).

While media coverage has focused on the firm’s recent bankruptcy and impending liquidation, perhaps the most interesting chapter in the Toys R Us story played out in 2005 and ended with the company being sold to a trio of private equity firms for $6.6 billion, an 8% premium over its stock price.

In our work with clients, we emphasize the critical importance of boards and management teams maintaining their situation awareness of evolving time dynamics – specifically, the relationship between the remaining time before an evolving strategic risk reaches one or more thresholds and becomes existentially dangerous, and the time still required to implement adequate adaptations to it.

One underappreciated aspect of this approach is that it reveals the possibility of a situation in which no more “safety margin” is left, and it is clear that an evolving strategic risk will become an existential danger before an adequate response can be implemented.

At this point, the rational choice for a board is to sell or merge the company, to maximize the value of its shareholders’ investment. This approach can be very successful (in hindsight if not always foresight), if it is undertaken while there is still considerable market uncertainty about future developments, and widely varying beliefs about the potential effectiveness of various options for responding to them.

While never an easy choice, cases like Toys R Us can help management teams and boards to better appreciate that it is sometimes the right one to make.

Comments