Behavioral finance FAQ / Glossary (Distribution)


This is a separate page of the D section of the Glossary

 

Dates of related message(s) in the
Behavioral-Finance group (*):

Year/month, d: developed / discussed,
i: incidental

(statistical) Distribution curve anomalies


03/6i -04/2i,7i + see clusters,
fat tails, kurtosis, asymmetry,

extreme, quant, probability,
random... + bfdef3

Ordering data into ranks and files for the parade,
and taking a picture that shows
if one head (or tail?) sticks out.

Can looking at that picture help predict
where all those heads and tails are going?

The quest for order in a sea of data

A statistical distribution is made by sorting numerical data
(related to physical or economic events, or to populations,
for example ages, income...) according to increasing or
decreasing values.


The distribution shows how many times the same values are found.

In other words, a data distribution indicates the frequency
of each value, or group of close values.

Those frequencies are usually converted into percentages.

For example, the percentage of people who are 0-4 years old.

     Same thing for 5-9, for 10-14, and so on.

Not to forget Methuselah, said to have reached 969, in the 965-969
class: a typical, albeit mythical, "rare event" (see that phrase).
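As a minimal sketch of the age-class example above, frequencies per 5-year class can be counted and converted into percentages. The ages are made-up illustration data and `class_percentages` is a hypothetical helper name, not something taken from an official statistic.

```python
from collections import Counter

def class_percentages(values, width=5):
    """Return {(low, high): percentage} for classes of the given width."""
    # Map each value to the lower bound of its class, e.g. 7 -> 5 for width 5.
    counts = Counter((v // width) * width for v in values)
    total = len(values)
    return {
        (low, low + width - 1): 100.0 * n / total
        for low, n in sorted(counts.items())
    }

# Hypothetical ages, for illustration only.
ages = [1, 3, 4, 6, 7, 9, 12, 13, 2, 8]
for (low, high), pct in class_percentages(ages).items():
    print(f"{low}-{high} years: {pct:.0f}%")
```

Each printed line is one "class" of the distribution with its share of the whole population.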

From distribution to probabilities

Assuming that future dice will behave like good old past dice.


Statistical distributions, which give the frequency of past values, are
widely used as one of the main tools to determine probabilities (see
that word).

In a set of 100 future events, how often will the same one be found?
Never? Once? 10 times? 90 times? Always?


The purpose is to help make forecasts and decisions, for example
financial decisions.

But their use might be complicated when data are distributed in an
irregular fashion (anomaly, singularity...).

Therefore the first thing to analyze is whether that "data distribution"

* either is totally erratic and uncommon,

* or fits a well-known usual pattern (distribution law).

This analysis is a three-step process:

1) The data are gathered, processed and put in order on a scale
of numbers (usually in the form of a chart).

For example, applied to statistics on people's heights,
the analysis compares how many people are about as tall as
the average, how many are shorter, and how many are taller.

Those past frequencies are used as probabilities.

A garment shop can thus structure its inventories to match
the sales probability for every size, taken from
official professional statistics.

 

OK, but statistics cannot fully replace (see numeracy bias) the
shopkeeper's neurons, experience and some knowledge of what is
happening in town (an arrival of oversized giants?) and
of competitors' inventories.

Mind you, this should not be taken as advice to sneak
into their premises at night ;-)


2) The next step is to see how, under what pattern, those data
are "distributed":

Are they random? (the next occurrence cannot be
predicted from the last one or last ones)

What is their range?

What is the shape of the curve, and where is its peak (statisticians
call the most frequent value the mode)?

Is it regular (see below "typical distributions": bell curve,
L curve...)?

What is the central value: the mean (the arithmetic average) or the
median (the value for which there are as many smaller values as
higher ones)?
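The questions of this second step (range, most frequent value, central value) can be sketched with Python's standard `statistics` module. The heights are hypothetical values chosen for illustration, not real survey data.

```python
import statistics

# Hypothetical heights in centimeters, for illustration only.
heights = [158, 162, 165, 165, 168, 170, 170, 170, 175, 181, 196]

data_range = max(heights) - min(heights)   # spread between the extremes
mode = statistics.mode(heights)            # most frequent value (peak of the curve)
mean = statistics.mean(heights)            # arithmetic average
median = statistics.median(heights)        # as many values below as above

print(data_range, mode, mean, median)
```

In this sample the mean lies slightly above the median because one very tall value pulls the average up, a first hint of asymmetry.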

3) Then this distribution is compared with classical sets / patterns:

Standard distribution curves, distribution functions and
distribution laws (Gauss, Pareto...: see below).


In many cases, and when well applied, this gives useful
information to infer probabilities.

It is used for stochastic calculations (see stochastic), for
forecasts and decisions.


When a distribution is distorted compared to those classical
laws, we can call that a distribution "anomaly".

Also, when a distribution has no similarity whatever with
any law, we have a distribution "singularity".

Kinds of statistical distributions

Photograph or movie?

The statistical distribution of events is either:

A static distribution. It shows how things are at a precise time.

It can be applied in many fields.

A typical example is the situation of a population, like the one
described above.

Another one is comparing financial ratios (profit margin, debt /
equity...) between various businesses so as to determine management
"benchmarks".

Or a time-distribution. It shows how things evolve during a period.

This is the distribution of data found in historical data
series.

Time distributions are used intensively in financial markets,
applied to price, return and volatility data.

Visualization of distributions

Join the dots!

Statistical distributions are often shown visually on charts (as shown
below) in which:

The horizontal scale gives the values, e.g. market prices or
whatever numerical data.

It shows for example the range of a stock's daily returns in a
market.

The scale will be graduated, say, from minus 19-20% to plus
19-20%, as anything can happen.

The vertical scale shows how many times (number of days in the
above example) each value occurs.

Unless the distribution is fully non-linear (irregular broken
lines), the graph obtained by joining the dots shows a
distribution curve.

Typical distributions: A) The normal law

Classical bell ding dongs

Distributions fit typical curves most of the time.

A highly common one, shown below, is the Gauss / Laplace
"normal law", also called "bell curve".

It is one of the main laws that apply to various kinds of random
events.

Randomness applies to events that cannot be predicted
individually but whose occurrences follow some precise
frequency that matches a "law of large numbers", as seen
when looking at the broad picture.

In the normal law (see the graph below),

* Most data concentrate on or near the mean (the central value),
   forming the top of the bell,

* They are spread symmetrically around it,

* They end in slim "distribution tails" on both sides.


If the data obey the normal law, their dispersion around the mean
can be calculated as the "standard deviation".

This is mathematically explained in the "volatility" article.

But to give a simpler approach, if:

* the mean is 50

* 68 % of the data are between 46 and 54,

* then the standard deviation is 4, on both sides, as it equals
   (54 - 50) and (50 - 46).

Obviously, if the standard deviation of the wave sizes is 2
centimeters, you are in a farmyard looking at the ducks'
wakes in the pond; if it is 5 meters, you are at the seaside
observing a surfer's paradise.
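The "68% between 46 and 54" example can be checked with a short sketch, assuming the data follow the normal law exactly. It uses `NormalDist` from Python's standard `statistics` module; the two-standard-deviation line is an addition for comparison.

```python
from statistics import NormalDist

# Values from the example above: mean 50, standard deviation 4.
law = NormalDist(mu=50, sigma=4)

# Share of the data expected between 46 and 54 (one standard deviation each side).
within_one_sd = law.cdf(54) - law.cdf(46)
# Share expected between 42 and 58 (two standard deviations each side).
within_two_sd = law.cdf(58) - law.cdf(42)

print(f"{within_one_sd:.1%} within 46-54, {within_two_sd:.1%} within 42-58")
```

The first share comes out close to 68%, which is why one standard deviation is read directly from the 46-54 interval.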

Gaussian distribution

              X
            X   X
          X       X
          X       X
        X           X
      X               X
   X                     X
X                           X

Horizontal scale = values (e.g. shoe sizes)
Vertical scale = frequencies (number of times each value occurs)

Also, by using (legitimate) mathematical tricks, some asymmetric
distributions might be presented as symmetric ones.

It is the case of the "log-normal" law, which takes into account
the tendency of economic values to rise exponentially in many
time-distributions.

A price change from 100 to 120, and one from 200 to 240,
express an identical 20% growth rate.
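A small sketch of why the logarithm helps here: the two price changes above differ in absolute size but give the same logarithmic return. The function name `log_return` is an illustrative choice.

```python
import math

def log_return(start, end):
    """Logarithmic return between two prices."""
    return math.log(end / start)

small = log_return(100, 120)   # absolute change of 20
large = log_return(200, 240)   # absolute change of 40

# Different absolute changes, same growth rate, same log return.
print(math.isclose(small, large))
```

Working on log returns rather than raw price changes is what lets an exponentially growing series be examined against a symmetric curve.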

Typical distributions: B) The power law

The Pareto "power law", or "law of extremes", is also called
the "L curve" (or in some cases inverted L curve), or the
"20/80 law" (20% of events / cases make 80% of the volumes or
variations at play).

In that distribution, there is an asymmetric concentration of data
near one extreme value (the vertical branch of the L).

This occurs for example, in time-distributions, when the data
start snowballing towards extremes.
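A sketch of the "20/80" rule, assuming the data follow an exact Pareto law with shape parameter alpha greater than 1. That parametrization and the name `top_share` are assumptions made for illustration, not taken from the text. Under that assumption, the share of the total held by the top fraction p of cases is p raised to the power (1 - 1/alpha).

```python
import math

def top_share(p, alpha):
    """Share of the total held by the top fraction p under a Pareto law.

    Assumes an exact Pareto distribution with shape alpha > 1.
    """
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the total to be finite")
    return p ** (1 - 1 / alpha)

# Shape for which the top 20% hold exactly 80%: alpha = log(5) / log(4).
alpha_80_20 = math.log(5) / math.log(4)
print(round(alpha_80_20, 3), round(top_share(0.20, alpha_80_20), 3))
```

A larger alpha gives a less concentrated distribution, so the top 20% then hold less than 80% of the total.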


Some consider that the Pareto law applied to incomes represents
an optimum (sometimes loosely linked to "Pareto efficiency", although
that term strictly means a situation in which nobody can be made
better off without making somebody else worse off), in which:

* A "flattening" of the curve would certainly bring less inequality,
   but would also lower the income of the whole population.

* An increase in the asymmetry would also bring lower general
   results to the population.


Distribution anomalies in finance and economics

Dents and cracks in the bell.

Use probabilities but with a lot of precautions


Many classical economists considered that the bell curve and other
classical (also!) curves closely matched the time-distribution of
prices, returns, risk / volatilities.

This is far from being fully the case.

Economic and financial realities differ rather often, and sometimes
widely, from what this idealized paradigm predicts.


It is said in such cases that there are distribution anomalies
or "defects".

For example,

The bell is often slightly flattened at the top, while showing
at its low extremities two "fat tails" (see that phrase).

Often also, it is leptokurtic (see that word), with a higher
    peak around the mean, thinner sides but fatter tails.

In other cases the curve is decidedly asymmetric (skewed):
    see that word.

It also often shows several data clusters (see that word),
    looking like camel humps.

Those deviations from the "normal" law (or the log-normal law) can
have two origins. They signal:

1) Either behavioral biases.
    This glossary describes extensively many of them.

2) Or structural market imperfections (poor liquidity).

=> There is a danger here when the normal law is applied
      in prediction models involving, for example,
      financial markets:

They might underestimate the real risks, by ignoring those distribution
anomalies.

There is also the danger of shirking the effort needed to make plans
for some extreme scenarios.
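To illustrate the danger noted above, here is a sketch of how rare extreme moves would be if daily moves followed the normal law exactly. The function name `two_sided_tail` is an illustrative choice; the point is only what the normal law predicts, not what markets actually do.

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

def two_sided_tail(k):
    """Probability, under the normal law, of a move beyond k standard deviations."""
    return 2 * (1 - standard.cdf(k))

for k in (3, 4, 5):
    p = two_sided_tail(k)
    # Average number of days between two such moves, if the law held exactly.
    print(f"beyond {k} sigma: p = {p:.2e}, about one day in {1 / p:,.0f}")
```

If real data have fat tails, such moves occur more often than these figures suggest, which is how a model built on the normal law can understate risk.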

The human mind can be better able than a stochastic law to foresee them.

Never forget also that situations might change.

Because of that divergence, old statistics might not apply to
the new state of affairs.

On the other hand, ignoring or neglecting statistical distributions can
have dramatic consequences: see base rate fallacy.

Quantitative analysis tools


Kurtosis (concentration / dispersion of data) is a mathematical
measurement of how far from the bell curve a statistical series is,
notably in the weight of its tails.
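A minimal sketch of one common kurtosis measure, the population "excess kurtosis", which equals 0 for the normal law and is positive for fat-tailed series. The returns are hypothetical values, and other estimators (for example with small-sample corrections) exist.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: fourth moment / variance squared, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical daily returns in percent: mostly calm, with two extreme days.
returns = [0.1, -0.2, 0.3, -0.1, 0.2, -0.3, 0.1, -0.1, 6.0, -6.0]
print(excess_kurtosis(returns) > 0)
```

A positive result signals tails heavier than the bell curve's; a negative one signals lighter tails.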

There are mathematical models (see GARCH, heteroskedasticity) that
take into account the instability ("non-constant variance") in
historical data series.

Distribution / Accumulation phase

See accumulation / distribution + congestion / (price) cluster,
percolation + investors type

Amassing fresh goodies inside the castle
and throwing stale ones at the barbarians.

Accumulation and distribution in financial markets are periods in which
prices seem to stabilize at the end of a long and massive bull or bear
period, and which can precede a trend reversal.

At the end of a bull market, "big hands" (investors who own a large
quantity of assets) are the first to sell (distribution phase).

They take advantage of the fact that small investors keep on buying.

Prices tend to stop rising, as if blocked by an invisible ceiling,
and to get more volatile.

After that - unless the big hands were wrong - the trend
might become bearish.

At the end of a bear market, before a recovery, the opposite
phase, accumulation by big hands, often happens.

A kind of invisible floor seems to block the fall, although it
might not prevent a sudden last day of panic (see "capitulation").

It might then be followed by the recovery and a bullish trend.


Those phenomena can be classified as forms of percolation (see
that word) that lead to tipping points / breakthroughs in a new
direction.

Can such a market phase be detected?

Big hands weighing on the charts.


Accumulation creates statistical clusters (see that word) in
time-distribution graphs (*) of prices and volatilities.

The market prices are stuck (clustered) in some horizontal zone
of the graph that looks like a ceiling or a floor.

It might signal (but not always), as seen above, that a change of trend
is near.
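One naive way to look for such a horizontal cluster is to flag windows of days in which prices stay inside a narrow band. This is only a sketch under arbitrary assumptions: the function `congestion_windows`, the window length and the 2% band are all hypothetical choices, and the prices are invented.

```python
def congestion_windows(prices, window=5, max_band=0.02):
    """Return start indexes of windows whose price range stays within max_band.

    The band is measured as (high - low) / average price over the window.
    Both `window` and `max_band` are arbitrary illustration values.
    """
    starts = []
    for i in range(len(prices) - window + 1):
        chunk = prices[i:i + window]
        band = (max(chunk) - min(chunk)) / (sum(chunk) / window)
        if band <= max_band:
            starts.append(i)
    return starts

# Hypothetical closing prices: a rise followed by a flat "ceiling".
prices = [100, 104, 108, 112, 116, 120, 120.5, 119.8, 120.2, 120.4, 119.9]
print(congestion_windows(prices))
```

Such a filter only detects that prices are clustered. It says nothing about who is buying or selling, nor about the direction prices will take afterwards.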

However interesting those phenomena are, they can be misleading, or at
least their effects might be hard to spot visually or mathematically on the
market.

Prices can either rebound on a ceiling or a floor, or percolate through
them (see percolation).

Thus the behavior - as well as the relative market power
(financial strength) - of each type of investor (see
"investor types" and "agent based models") is not so easily
measurable (nor, of course, predictable).

The effects might be more complex and diluted in time than an immediate
trend reversal.

(*) Distribution in the statistical sense (see distribution curve), and
     distribution phases as the occasional market phenomenon described
     here, are unrelated homonyms, even if time distribution series
     might help to spot such a market phase.

(*) To find those messages: reach that BF group  and, once there,
      1) click "messages", 2) enter your query in "search archives".


This page last update: 20/07/15           
