My CFA Journal: Quantitative Methods - Common Probability Distributions

Friday, September 14, 2012

Start - 1:45 pm

Common Probability Distributions

I'm going to be using Schweser for this section as well.

Introduction

Probability distribution - specifies the probabilities of possible outcomes of a random variable
We will explore 4 distributions and their uses
Definitions

Random variable - we know this
"Discrete" - countable
Continuous - uncountable (e.g. rate of return)

Here you usually talk about the probability of falling in a range, because any single exact value has 0 probability

Must understand whether a variable is continuous or discrete, and sometimes can choose

Usually guided by which distribution is most efficient for the task

Every random variable is associated with a prob distribution that describes it completely

Probability function

p(x) is the probability that a random variable is equal to a specific value
That is, p(x) is the probability that X = x
p(x) must be from 0 to 1 and the sum of all probabilities must = 1

Probability density function - used to calculate probability of an outcome between two values
Cumulative distribution function - defines probability that X takes a value less than or equal to x - represents the sum or cumulative value of the probabilities

F(x) = P(X<=x)

Discrete uniform random variable

Probabilities of all possible outcomes are equal
Assume X = [2, 4, 6, 8, 10]
p(6) = 0.2
F(6) = 3 * p(x) = 3(0.2) = 0.6, the probability that X <= 6. Note that 6 is the third of the five outcomes, so three outcomes are <= 6.
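
A quick sketch to check the p(6) and F(6) numbers above. Python is my choice here, not something from the reading, and the function names pmf and cdf are just labels I made up.

```python
from fractions import Fraction

# The example outcome set from the notes.
outcomes = [2, 4, 6, 8, 10]

def pmf(x, outcomes):
    """p(x): every listed outcome is equally likely, anything else has probability 0."""
    return Fraction(1, len(outcomes)) if x in outcomes else Fraction(0)

def cdf(x, outcomes):
    """F(x) = P(X <= x): add up p(v) for every outcome v that is <= x."""
    return sum((pmf(v, outcomes) for v in outcomes if v <= x), Fraction(0))

print(float(pmf(6, outcomes)))  # 0.2
print(float(cdf(6, outcomes)))  # 0.6
```

Fractions are used only so the sums come out exact; plain floats would work too.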

Binomial distribution

Binomial random variable is the number of successes in a given number of trials

Outcome is either success or failure

Probability of success p is constant for each trial, trials are independent
"Bernoulli" random variable is a variable for which there is only 1 trial

Trial is a mini experiment

Final outcome: number of successes in a series of n trials

Binomial probability function defines probability of x successes in n trials

probability of exactly x successes in n trials = (number of ways to choose x from n) * p^x * (1-p)^(n-x)

Number of ways = n! / ((n-x)! * x!)
p = probability of success on each trial

Combined formula: p(x) = [n! / ((n-x)! * x!)] * p^x * (1-p)^(n-x)

Sidenote - expected value

Expected value simply equals np, the number of trials times p(success)
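
To convince myself the binomial pieces fit together, here is a small sketch. The values n = 5 and p = 0.5 are made-up inputs, and binomial_pmf / binomial_mean are my own names, not anything from Schweser.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    # comb(n, x) is the number of ways to choose x from n: n! / ((n-x)! * x!)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_mean(n, p):
    """Expected number of successes: number of trials times p(success)."""
    return n * p

n, p = 5, 0.5  # hypothetical inputs, not from the reading

print(binomial_pmf(3, n, p))  # 0.3125
print(binomial_mean(n, p))    # 2.5
```

Summing x * binomial_pmf(x, n, p) over x = 0..n gives the same 2.5, which is a handy check that the expected value really is np.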

Binomial trees

Shows all possible combinations of up and down moves for a number of periods
Each possible value along a tree is a node
Using the percentage price change at each step and the probabilities of up and down moves, you can assign a probability to each node and find the expected value
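
A rough sketch of how the node probabilities and expected value could be worked out. Everything numeric here is assumed by me for illustration (starting price 100, up move +10%, down move -10%, 60% chance of an up move, 2 periods), and I am assuming the same up/down factors apply every period so the tree recombines. tree_nodes is a made-up name.

```python
from math import comb

def tree_nodes(s0, up, down, p_up, periods):
    """Return (price, probability) for each end node after `periods` moves.

    Assumes the same up/down factors and the same p_up in every period,
    so any path with the same number of up moves ends at the same node.
    """
    nodes = []
    for ups in range(periods + 1):
        downs = periods - ups
        price = s0 * up**ups * down**downs
        # Number of paths to this node times the probability of each path.
        prob = comb(periods, ups) * p_up**ups * (1 - p_up)**downs
        nodes.append((price, prob))
    return nodes

# Hypothetical inputs, not from the reading.
nodes = tree_nodes(s0=100.0, up=1.10, down=0.90, p_up=0.6, periods=2)
expected_price = sum(price * prob for price, prob in nodes)

for price, prob in nodes:
    print(round(price, 2), round(prob, 2))
print(round(expected_price, 2))
```

With these assumed inputs the end nodes are about 81, 99 and 121 with probabilities 0.16, 0.48 and 0.36, and the expected price works out to about 104.04.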

Continuous Uniform Distribution

A range from a to b - outcomes can only occur from a to b
P(X is between x1 and x2) = (x2 - x1) / (b - a)

This is just the selected area divided by the total area

Note that the distribution is a rectangle

CDF is therefore a straight line rising from 0 at a to 1 at b, and horizontal at 1 after b
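
A small sketch of the area-over-total-area idea and the shape of the CDF. The range a = 2, b = 12 and the function names are my own assumptions for illustration.

```python
def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for X uniform on [a, b].

    Any part of [x1, x2] that lies outside [a, b] contributes nothing.
    """
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0.0) / (b - a)

def uniform_cdf(x, a, b):
    """F(x): 0 up to a, a straight line from a to b, 1 from b onward."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Hypothetical range, not from the reading.
a, b = 2, 12
print(uniform_prob(4, 8, a, b))  # 0.4
print(uniform_cdf(7, a, b))      # 0.5
```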

The Normal Distribution

Completely described by mean and variance
Skewness = 0, so mean = median = mode
Kurtosis = 3, therefore excess kurtosis = 0 (all kurtosis is measured relative to normal)
A linear combination of normally distributed random variables is also normally distributed
Probabilities get smaller at tails but never go to 0

Univariate and Multivariate Distributions

Univariate - distribution of a single variable

But sometimes the relationship between two or more random variables is relevant

Multivariate Distributions

Specifies probabilities associated with a group of random variables
Only meaningful when behavior of a random variable somehow depends on that of the others
Can apply both to discrete and continuous
For two discrete variables, described by joint probability tables
For two continuous variables, can make a multivariate normal distribution if the individual variables follow a normal distribution

Correlation and Multivariate Normal Distributions

As with the normal, a multivariate normal can be described by the means and variances of the individual random variables
One must also specify the correlation between the pair of variables

Correlation is the strength of the linear relation between a pair of variables

To describe a Multivariate in a portfolio of n assets, you need n means, n variances, and 0.5*n*(n-1) correlations. Ex, for 4 assets, you need 4 means, 4 variances, and 0.5*4*3=6 correlations

When building a portfolio, all else equal you want lower correlation because this means lower variance
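
A sketch to check the parameter count and to see the correlation effect on variance. The two-asset variance formula below is the standard one (it is not written out in this reading), and the weights and standard deviations are made up by me.

```python
def parameters_needed(n):
    """Count of inputs needed to describe a multivariate normal for n assets."""
    return {"means": n, "variances": n, "correlations": n * (n - 1) // 2}

def two_asset_variance(w1, sd1, sd2, rho):
    """Standard two-asset portfolio variance; w1 is the weight in asset 1."""
    w2 = 1 - w1
    return w1**2 * sd1**2 + w2**2 * sd2**2 + 2 * w1 * w2 * rho * sd1 * sd2

print(parameters_needed(4))  # {'means': 4, 'variances': 4, 'correlations': 6}

# Hypothetical 50/50 portfolio with 20% and 30% standard deviations.
for rho in (0.8, 0.2, -0.5):
    print(rho, round(two_asset_variance(0.5, 0.20, 0.30, rho), 4))
```

Holding everything else fixed, the printed variance falls as the correlation falls, which is the "all else equal you want lower correlation" point.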

Confidence Intervals

A 95% confidence interval is the range we expect the variable to be in 95% of the time

Based on expected value and standard deviation(s)
About 68% fall within 1 std dev and about 95% fall within 2 std devs

90% CI = Xbar +/- 1.65 std devs
95% CI = Xbar +/- 1.96 std devs
99% CI = Xbar +/- 2.58 std devs
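
A sketch of the three intervals, plus a check of where the 1.65 / 1.96 / 2.58 multipliers come from using Python's statistics.NormalDist. The mean of 10% and standard deviation of 20% are made-up inputs, and the function names are mine.

```python
from statistics import NormalDist

# Multipliers as given in the notes.
Z = {90: 1.65, 95: 1.96, 99: 2.58}

def confidence_interval(mean, sd, level):
    """Return (low, high) = mean -/+ z * sd for a 90, 95 or 99 percent interval."""
    z = Z[level]
    return mean - z * sd, mean + z * sd

def z_for_level(level):
    """Two-sided multiplier from the standard normal, e.g. 95 -> about 1.96."""
    tail = (1 - level / 100) / 2
    return NormalDist().inv_cdf(1 - tail)

# Hypothetical return distribution: mean 10%, standard deviation 20%.
for level in (90, 95, 99):
    low, high = confidence_interval(0.10, 0.20, level)
    print(level, round(low, 3), round(high, 3), round(z_for_level(level), 2))
```

For the 95% case this gives roughly -29.2% to +49.2% with the assumed inputs.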

Standard Normal Distribution

Has been standardized so it has a mean of 0 and sdev of 1
To standardize any random variable:

z = (observation - population mean) / std dev

Can use this z-score and the table to calculate probabilities of falling above/below a certain value, or within a range
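
A sketch of the standardize-then-look-up step, with statistics.NormalDist standing in for the printed z-table. The mean of 10 and std dev of 4 are made-up numbers and the function names are mine.

```python
from statistics import NormalDist

def z_score(observation, mean, sd):
    """z = (observation - population mean) / std dev."""
    return (observation - mean) / sd

def prob_below(observation, mean, sd):
    """P(X <= observation), read off the standard normal CDF at z."""
    return NormalDist().cdf(z_score(observation, mean, sd))

def prob_above(observation, mean, sd):
    """P(X > observation) is just the complement."""
    return 1 - prob_below(observation, mean, sd)

# Hypothetical population: mean 10, std dev 4.
print(z_score(16, 10, 4))  # 1.5
print(round(prob_below(16, 10, 4), 4))
print(round(prob_above(16, 10, 4), 4))
```

A probability for a range is the difference of two prob_below values.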

Shortfall Risk/Safety First Ratio/Roy's Safety First Criterion

Shortfall risk - probability that a portfolio value (or return) will fall below a particular (target) value or return over a given period of time
Roy's Safety First Criterion (similar to Sharpe Ratio but uses benchmark instead of riskfree rate)

Minimizes the shortfall risk by maximizing the Safety First Ratio
Safety First Ratio:

SF = [E(Rp) - Rl] / sdev(portfolio), where Rl is the threshold (minimum acceptable) return

This ratio gives the number of sdevs the threshold Rl lies below the expected return
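
A sketch of picking a portfolio by safety-first ratio. The three portfolios and the 3% threshold are invented for illustration, and the shortfall probability assumes returns are normally distributed, so it is just the standard normal CDF evaluated at -SF.

```python
from statistics import NormalDist

def safety_first_ratio(expected_return, threshold, sd):
    """SF = [E(Rp) - Rl] / std dev of the portfolio."""
    return (expected_return - threshold) / sd

def shortfall_risk(expected_return, threshold, sd):
    """P(return < threshold), assuming normally distributed returns."""
    return NormalDist().cdf(-safety_first_ratio(expected_return, threshold, sd))

# Hypothetical portfolios: name -> (expected return, std dev).
portfolios = {"A": (0.09, 0.12), "B": (0.11, 0.20), "C": (0.066, 0.082)}
threshold = 0.03  # hypothetical minimum acceptable return

best = max(
    portfolios,
    key=lambda name: safety_first_ratio(portfolios[name][0], threshold, portfolios[name][1]),
)

for name, (er, sd) in portfolios.items():
    print(name, round(safety_first_ratio(er, threshold, sd), 3),
          round(shortfall_risk(er, threshold, sd), 4))
print(best)  # A
```

The portfolio with the highest SF ratio is also the one with the lowest shortfall probability, which is why maximizing one minimizes the other.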

Normal and Lognormal Distributions

Lognormal is generated by function e^x where x is normally distributed
Lognormal is skewed right and bounded by 0 on the left side - useful for modeling asset prices which never take negative values
If we used normal, we would admit possibility of returns less than -100%
This allows us to model 'price relatives' (i.e. 1+HPR)
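
A sketch of why e^x is convenient here. I draw normally distributed x values and exponentiate them; the mean of 0.05 and std dev of 0.60 are made up (the std dev is deliberately large so the effect shows), and the seed is only there to make the run repeatable.

```python
import random
from math import exp
from statistics import mean, median

rng = random.Random(0)  # fixed seed, my choice, for repeatability

# x is normally distributed; hypothetical mean and std dev.
x_values = [rng.gauss(0.05, 0.60) for _ in range(10_000)]

# e^x is then lognormally distributed and can be read as a price relative.
price_relatives = [exp(x) for x in x_values]

print(min(x_values) < -1)                # some normal draws are below -100%
print(min(price_relatives) > 0)          # but e^x never goes to zero or below
print(mean(price_relatives) > median(price_relatives))  # right skew: mean above median
```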

Discrete versus Continuous Compounding

Discrete goes period by period
For a stated annual rate R compounded continuously, EAR = e^R - 1
Can reverse this calculation by taking the ln of 1+EAR
Continuously compounded rates are additive for multiple periods
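
A sketch of the conversion in both directions and of the additivity point. The 10% rate and the two holding period returns are made-up numbers, and the function names are mine.

```python
from math import exp, log

def ear_from_continuous(rate):
    """EAR = e^R - 1 for a stated rate R compounded continuously."""
    return exp(rate) - 1

def continuous_from_ear(ear):
    """Reverse the calculation: R = ln(1 + EAR)."""
    return log(1 + ear)

r = 0.10  # hypothetical stated annual rate
ear = ear_from_continuous(r)
print(round(ear, 4))
print(round(continuous_from_ear(ear), 4))

# Additivity: hypothetical holding period returns of +10% then -5%.
hpr_1, hpr_2 = 0.10, -0.05
two_period_hpr = (1 + hpr_1) * (1 + hpr_2) - 1
sum_of_cc = continuous_from_ear(hpr_1) + continuous_from_ear(hpr_2)
print(abs(sum_of_cc - continuous_from_ear(two_period_hpr)) < 1e-12)
```

With R = 10% the EAR comes out to about 10.52%, and the last line shows that the two single-period continuously compounded rates add up to the two-period one.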

Monte Carlo Simulation

Technique based on repeated generation of one or more risk factors that affect security values
Using this, it generates a distribution of security values
Must first make probability distributions for each risk factor

Computer then generates random variables based on assumed probabilities
These then spit out security values

Repeated thousands of times to generate an idea of mean and perhaps variance of security values
Applications:

Value complex securities
Simulate a trading strategy
Determine risk of a portfolio
Simulate pension fund assets/liabilities
Value portfolios of assets that have non-normal returns distributions

Limitations

Complex and output only as good as input
Not analytic but statistical, cannot provide insights that analytic methods can
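
A bare-bones sketch of the Monte Carlo steps described above, with a single risk factor. All of the modeling choices are my assumptions, not from the reading: the risk factor is a continuously compounded one-period return, it is normally distributed with mean 5% and std dev 20%, and the security value is just the starting price times e^return.

```python
import random
from math import exp
from statistics import mean, pvariance

def simulate_values(s0, mu, sigma, trials, seed=0):
    """Draw the risk factor `trials` times and turn each draw into a security value.

    Assumed model: value = s0 * e^r with r ~ Normal(mu, sigma).
    The seed is only there to make the run repeatable.
    """
    rng = random.Random(seed)
    return [s0 * exp(rng.gauss(mu, sigma)) for _ in range(trials)]

# Hypothetical inputs.
values = simulate_values(s0=100.0, mu=0.05, sigma=0.20, trials=10_000)

print(round(mean(values), 2))       # simulated mean of the security value
print(round(pvariance(values), 2))  # simulated variance
```

Changing sigma and rerunning is the kind of "what if" that is easy here, and the output is only as good as the assumed distribution.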

Historical simulation - based on actual changes in value or risk factors over some prior period

Rather than model distribution of risk factors, a prior period is used
Each iteration randomly selects one of these past changes and calculates value of portfolio in question based on the changes in risk factors
Advantage is that it uses actual distributions so you don't have to estimate

But past changes in risk factors don't necessarily predict future changes
Infrequent events might not be reflected unless captured in the timeframe
Another disadvantage is it is not as good at 'what if' analysis as Monte Carlo - i.e. in Monte Carlo you can increase the variance of one of the risk factors by 20%, in historical you cannot do this
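
For contrast, a sketch of the historical approach. The list of past changes is invented by me (a real run would use actual observed changes from the prior period), and treating each change as a simple percentage change in portfolio value is my simplification.

```python
import random

def historical_simulation(current_value, past_changes, iterations, seed=0):
    """Each iteration picks one past change at random and applies it to the portfolio.

    No distribution is modeled: the only possible outcomes are the ones
    that actually occurred in the prior period.
    """
    rng = random.Random(seed)
    return [current_value * (1 + rng.choice(past_changes)) for _ in range(iterations)]

# Hypothetical past period changes, not real data.
past_changes = [0.012, -0.004, 0.007, -0.021, 0.003, 0.015, -0.009]

simulated = historical_simulation(1_000_000.0, past_changes, iterations=5_000)
print(min(simulated), max(simulated))
```

The worst simulated value can never be worse than the worst change in the list, which is the "infrequent events might not be reflected" problem in miniature.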

End of reading.

3:45 pm

2 hours
