Common Probability Distributions
I'm going to be using Schweser for this section as well.
Introduction
- Probability distribution - specifies the probabilities of possible outcomes of a random variable
- We will explore 4 distributions and their uses
- Definitions
- Random variable - we know this
- "Discrete" - countable
- Continuous - noncountable (e.g. rate of return)
- Here you usually talk about the probability of falling in a range, because any single outcome of a continuous variable has 0 probability
- Must understand whether a variable is continuous or discrete, and sometimes can choose
- Usually guided by which distribution is most efficient for the task
- Every random variable is associated with a prob distribution that describes it completely
Probability function
- p(x) is the probability that a random variable is equal to a specific value
- That is, p(x) is the probability that X = x
- p(x) must be from 0 to 1 and the sum of all probabilities must = 1
- Probability density function - used to calculate probability of an outcome between two values
- Cumulative distribution function - defines probability that X takes a value less than or equal to x - represents the sum or cumulative value of the probabilities
- F(x) = P(X<=x)
Discrete uniform random variable
- Probabilities of all possible outcomes are equal
- Assume X = [2, 4, 6, 8, 10]
- p(6) = 0.2
- F(6) = n*p(x) = 3(0.2) = 0.6, the probability that X <= 6. Here n = 3 counts the outcomes at or below 6 (6 is the third number).
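The p(6) and F(6) values above can be checked with a minimal sketch. The function names `pmf` and `cdf` are illustrative choices, not from the reading, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

outcomes = [2, 4, 6, 8, 10]  # the example set from the notes

def pmf(x):
    """p(x): each of the n outcomes has probability 1/n, anything else has 0."""
    return Fraction(1, len(outcomes)) if x in outcomes else Fraction(0)

def cdf(x):
    """F(x) = P(X <= x): add up p(k) for every outcome k at or below x."""
    return sum((pmf(k) for k in outcomes if k <= x), Fraction(0))

print(float(pmf(6)))  # 0.2
print(float(cdf(6)))  # 0.6
```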
Binomial distribution
- Binomial random variable is the number of successes in a given number of trials
- Outcome is either success or failure
- Probability of success p is constant for each trial, trials are independent
- "Bernoulli" random variable is a variable for which there is only 1 trial
- Trial is a mini experiment
- Final outcome: number of successes in a series of n trials
- Binomial probability function defines probability of x successes in n trials
- probability of exactly x successes in n trials = (number of ways to choose x from n) * p^x * (1-p)^(n-x)
- Number of ways = n! / ((n-x)! * x!)
- p = probability of success on each trial
- Combined formula: p(x) = [n! / ((n-x)! * x!)] * p^x * (1-p)^(n-x)
- Sidenote - expected value
- Expected value simply equals np, the number of trials times p(success)
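As a rough illustration of the binomial probability function and the np expected value, here is a short sketch. The function names and the example inputs (5 trials, p = 0.5) are assumptions made up for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    # comb(n, x) is n! / ((n-x)! * x!), the number of ways to choose x from n
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_mean(n, p):
    """Expected number of successes: number of trials times p(success)."""
    return n * p

n, p = 5, 0.5  # hypothetical inputs
prob_three = binomial_pmf(3, n, p)   # 10 * 0.5**3 * 0.5**2
expected = binomial_mean(n, p)
print(prob_three, expected)
```

Summing `binomial_pmf(k, n, p)` over k = 0..n should give 1, which is a quick sanity check on any inputs.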
Binomial trees
- Shows all possible combinations of up and down moves for a number of periods
- Each possible value along a tree is a node
- Using the percentage price change at each step and the probabilities of up and down moves, you can assign a probability to each node and find the expected value
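The tree idea can be sketched in a few lines. This assumes a recombining tree (an up move followed by a down move lands on the same node as down then up), and the starting price, move sizes, and up probability are hypothetical numbers, not from the reading.

```python
from math import comb

def terminal_nodes(s0, up, down, p_up, periods):
    """Return (price, probability) for each end node of a recombining tree."""
    nodes = []
    for ups in range(periods, -1, -1):
        downs = periods - ups
        price = s0 * up**ups * down**downs
        # binomial probability of reaching this node: `ups` up moves out of `periods`
        prob = comb(periods, ups) * p_up**ups * (1 - p_up)**downs
        nodes.append((price, prob))
    return nodes

# Hypothetical inputs: price 100, +10% / -10% moves, 60% chance of an up move
nodes = terminal_nodes(s0=100.0, up=1.10, down=0.90, p_up=0.6, periods=2)
expected_price = sum(price * prob for price, prob in nodes)

for price, prob in nodes:
    print(round(price, 2), round(prob, 2))
print(round(expected_price, 2))
```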
Continuous Uniform Distribution
- A range from a to b - outcomes can only occur from a to b
- P(X is between x1 and x2) = (x2 - x1) / (b - a)
- This is just the selected area divided by the total area
- Note that the distribution is a rectangle
- CDF is therefore a straight line going up from a and then horizontal after b
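A small sketch of the continuous uniform probability and its CDF follows. The function names and the range [2, 12] are assumptions chosen for illustration.

```python
def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for X uniform on [a, b]: selected width over total width."""
    lo, hi = max(x1, a), min(x2, b)   # outcomes can only occur from a to b
    return max(hi - lo, 0.0) / (b - a)

def uniform_cdf(x, a, b):
    """F(x): 0 up to a, a straight line rising from a to b, then flat at 1."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Hypothetical range a = 2, b = 12
print(uniform_prob(4, 8, 2, 12))   # 0.4
print(uniform_cdf(7, 2, 12))       # 0.5
```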
Normal Distribution
- Completely described by mean and variance
- Skewness = 0, so mean = median = mode
- Kurtosis = 3, therefore excess kurtosis = 0 (excess kurtosis is measured relative to the normal)
- A linear combination of normally distributed random variables is also normally distributed
- Probabilities get smaller at tails but never go to 0
Univariate and Multivariate Distributions
- Univariate - distribution of a single variable
- But sometimes the relationship between two or more random variables is relevant
- Multivariate Distributions
- Specifies probabilities associated with a group of random variables
- Only meaningful when behavior of a random variable somehow depends on that of the others
- Can apply both to discrete and continuous
- For two discrete variables, described by joint probability tables
- For two continuous variables, can make a multivariate normal distribution if the individual variables follow a normal distribution
- Correlation and Multivariate Normal Distributions
- Similar to normal, a Multivariate can be described by the means and variances of the individual random variables
- One must also specify the correlation between the pair of variables
- Correlation is the strength of the linear relation between a pair of variables
- To describe a multivariate normal distribution for a portfolio of n assets, you need n means, n variances, and 0.5*n*(n-1) correlations. E.g., for 4 assets, you need 4 means, 4 variances, and 0.5*4*3 = 6 correlations
- When building a portfolio, all else equal you want lower correlation because this means lower variance
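The parameter count and the effect of correlation on variance can be illustrated with a short sketch. It uses the standard two-asset portfolio variance formula, which is not spelled out in these notes, and the weights and standard deviations are hypothetical.

```python
def parameter_counts(n):
    """Inputs needed to describe a multivariate normal for n assets."""
    return {"means": n, "variances": n, "correlations": n * (n - 1) // 2}

def two_asset_variance(w1, sd1, sd2, rho):
    """Variance of a two-asset portfolio with weight w1 in asset 1 (rest in asset 2)."""
    w2 = 1 - w1
    return (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * sd1 * sd2 * rho

print(parameter_counts(4))  # 4 means, 4 variances, 6 correlations

# Hypothetical 50/50 portfolio, both assets with 20% std dev
for rho in (0.8, 0.2, -0.5):
    print(rho, two_asset_variance(0.5, 0.20, 0.20, rho))
```

Holding everything else fixed, the printed variance falls as the correlation falls.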
Confidence Intervals
- A 95% confidence interval is the range we expect the variable to be in 95% of the time
- Based on expected value and standard deviation(s)
- For a normal distribution, about 68% of outcomes fall within 1 std dev of the mean and about 95% fall within 2 std devs
- 90% CI = Xbar +/- 1.65 std devs
- 95% CI = Xbar +/- 1.96 std devs
- 99% CI = Xbar +/- 2.58 std devs
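The multipliers above are rounded quantiles of the standard normal distribution. A minimal sketch, assuming normally distributed outcomes and a hypothetical mean of 10% with a 20% standard deviation:

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided standard normal multiplier, e.g. about 1.96 for 95%."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def normal_interval(mean, sd, confidence):
    """Range expected to contain the variable `confidence` of the time."""
    z = z_multiplier(confidence)
    return mean - z * sd, mean + z * sd

for level in (0.90, 0.95, 0.99):
    print(level, round(z_multiplier(level), 3))

low, high = normal_interval(0.10, 0.20, 0.95)  # hypothetical mean and std dev
print(round(low, 3), round(high, 3))
```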
Standard Normal Distribution
- Has been standardized so it has a mean of 0 and sdev of 1
- To standardize any random variable:
- z = (observation - population mean) / std dev
- Can use this z-score and the table to calculate probabilities of falling above/below a certain value
- Shortfall risk - probability that a portfolio value (or return) will fall below a particular (target) value or return over a given period of time
- Roy's Safety First Criterion (similar to Sharpe Ratio but uses benchmark instead of riskfree rate)
- Minimizes the shortfall risk by maximizing the Safety First Ratio
- Safety First Ratio:
- SF = [E(Rp) - Rl] / sdevportfolio
- This ratio gives the number of sdevs that the threshold Rl lies below the expected return
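The link between the z-score, shortfall risk, and the safety-first ratio can be sketched as below. It assumes normally distributed returns, and the two portfolios and the 3% threshold are made-up numbers for illustration.

```python
from statistics import NormalDist

def safety_first_ratio(expected_return, threshold, sd):
    """SF = [E(Rp) - Rl] / sdev of the portfolio."""
    return (expected_return - threshold) / sd

def shortfall_probability(expected_return, threshold, sd):
    """P(return < threshold), assuming normally distributed returns."""
    z = (threshold - expected_return) / sd   # equals -SF
    return NormalDist().cdf(z)

# Hypothetical portfolios: name -> (expected return, std dev)
portfolios = {"A": (0.09, 0.12), "B": (0.11, 0.20)}
threshold = 0.03

best = max(portfolios, key=lambda name: safety_first_ratio(
    portfolios[name][0], threshold, portfolios[name][1]))

for name, (er, sd) in portfolios.items():
    print(name,
          round(safety_first_ratio(er, threshold, sd), 2),
          round(shortfall_probability(er, threshold, sd), 4))
print("highest SF ratio:", best)
```

The portfolio with the highest SF ratio is also the one with the lowest shortfall probability, since the shortfall probability is the standard normal CDF evaluated at -SF.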
Normal and Lognormal Distributions
- Lognormal is generated by function e^x where x is normally distributed
- Lognormal is skewed right and bounded by 0 on the left side - useful for modeling asset prices which never take negative values
- If we used normal, we would admit possibility of returns less than -100%
- This allows us to model 'price relatives' (i.e. 1+HPR)
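A quick sketch of why e^x keeps price relatives positive: the normal draws below include negative values, but their exponentials never do. The mean, standard deviation, and seed are hypothetical choices for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical normally distributed x values (continuously compounded returns)
log_returns = [rng.gauss(0.05, 0.60) for _ in range(10_000)]

# e^x is lognormal: these are the price relatives (1 + HPR)
price_relatives = [math.exp(x) for x in log_returns]

print(min(log_returns) < 0)        # normal draws go negative
print(min(price_relatives) > 0)    # price relatives stay above 0
# Right skew pulls the mean above the median
print(statistics.mean(price_relatives) > statistics.median(price_relatives))
```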
Discrete versus Continuous Compounding
- Discrete goes period by period
- For an annual rate R, the continuously compounded EAR = e^R - 1
- Can reverse this calculation by taking the ln of 1+EAR
- Continuously compounded rates are additive for multiple periods
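The conversions above can be sketched as follows. The function names and the example rates are assumptions for illustration.

```python
import math

def ear_from_continuous(rate):
    """Effective annual rate for a continuously compounded rate: e^R - 1."""
    return math.exp(rate) - 1

def continuous_from_ear(ear):
    """Reverse the calculation: ln(1 + EAR)."""
    return math.log(1 + ear)

print(round(ear_from_continuous(0.10), 6))

# Additivity: hypothetical continuously compounded rates for two periods
r1, r2 = 0.05, 0.07
two_period_hpr = math.exp(r1 + r2) - 1
# Same result as chaining the two effective rates
chained = (1 + ear_from_continuous(r1)) * (1 + ear_from_continuous(r2)) - 1
print(abs(two_period_hpr - chained) < 1e-12)
```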
Monte Carlo Simulation
- Technique based on repeated generation of one or more risk factors that affect security values
- Using these, it generates a distribution of security values
- Must first make probability distributions for each risk factor
- Computer then generates random variables based on assumed probabilities
- These then spit out security values
- Repeated thousands of times to generate an idea of mean and perhaps variance of security values
- Applications:
- Value complex securities
- Simulate a trading strategy
- Determine risk of a portfolio
- Simulate pension fund assets/liabilities
- Value portfolios of assets that have non-normal returns distributions
- Limitations
- Complex and output only as good as input
- Not analytic but statistical, cannot provide insights that analytic methods can
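The Monte Carlo steps can be sketched with a single risk factor. This is a toy example under stated assumptions: the only risk factor is a normally distributed continuously compounded return, and the starting value, mean, standard deviation, and seed are hypothetical.

```python
import math
import random
import statistics

def simulate_values(start_value, mu, sigma, trials, seed=0):
    """Draw the risk factor `trials` times and return the resulting values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = []
    for _ in range(trials):
        log_return = rng.gauss(mu, sigma)                   # assumed distribution of the risk factor
        values.append(start_value * math.exp(log_return))   # value implied by this draw
    return values

values = simulate_values(start_value=100.0, mu=0.05, sigma=0.20, trials=10_000)
mean_value = statistics.mean(values)
sd_value = statistics.stdev(values)
print(round(mean_value, 2), round(sd_value, 2))
```

Changing `sigma` and rerunning is the kind of "what if" analysis the method makes easy.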
Historical Simulation
- Rather than modeling the distribution of risk factors, actual changes in the risk factors over a prior period are used
- Each iteration randomly selects one of these past changes and calculates value of portfolio in question based on the changes in risk factors
- Advantage is that it uses actual distributions so you don't have to estimate
- But past changes in risk factors don't necessarily predict future changes
- Infrequent events might not be reflected unless captured in the timeframe
- Another disadvantage is it is not as good at 'what if' analysis as Monte Carlo - i.e. in Monte Carlo you can increase the variance of one of the risk factors by 20%, in historical you cannot do this
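The resampling idea can be sketched as below. The list of past changes is invented for illustration, and a single risk factor (the period return) stands in for the full set of risk factors.

```python
import random
import statistics

# Hypothetical past period returns observed over the chosen prior period
past_changes = [-0.04, -0.01, 0.00, 0.01, 0.02, 0.03]

def historical_simulation(start_value, changes, trials, seed=0):
    """Each iteration picks one past change at random and revalues the portfolio."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [start_value * (1 + rng.choice(changes)) for _ in range(trials)]

values = historical_simulation(100.0, past_changes, trials=5_000)
print(round(statistics.mean(values), 2))
print(round(min(values), 2), round(max(values), 2))
```

The simulated values never go outside the range implied by the worst and best past changes, which illustrates the point about infrequent events not showing up unless they fall in the sample period.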
End of reading.
3:45 pm
2 hours