Hypothesis testing - a statistical assessment of a statement or idea regarding a population
If the evidence is consistent with the statement, fail to reject it; if the evidence contradicts it, reject it
Null hypothesis - Ho - the hypothesis you want to reject. Null hypothesis can be mean = x, or greater than/less than (or equal to) x. Note: Nulls always include 'or equal to' condition.
One-tailed test - is the parameter greater than (or less than) some value. Two-tailed test - does the parameter deviate from some value in either direction. Most hypothesis tests are constructed as two-tailed.
Typical two tail: Ho: u = x, Ha: u =/= x
General rule is reject if test statistic > upper critical value, or test statistic < lower critical value
Example: 250 days, mean return is 0.1%. Sample sdev is 0.25%. Do a test at 5% level.
First, need the standard error of the sample mean = sample sdev/sqrt(n) = 0.25%/sqrt(250) = 0.0158%.
Then, take the test statistic, which is (0.1% - 0) / (0.25%/sqrt(250)) = 6.32. 6.32 > 1.96, so we reject the null of Ho: mean = 0.
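The arithmetic in this example can be checked with a short script. This is only a sketch: the variable names are made up for illustration, and it uses the standard normal critical value (about 1.96), which is a reasonable stand-in for the t critical value when n = 250.

```python
import math
from statistics import NormalDist

n = 250                 # number of daily observations
sample_mean = 0.001     # 0.1% mean daily return
sample_sdev = 0.0025    # 0.25% sample standard deviation
null_mean = 0.0         # Ho: mean = 0

# Standard error of the sample mean, then the test statistic
standard_error = sample_sdev / math.sqrt(n)
test_stat = (sample_mean - null_mean) / standard_error

# Two-tailed 5% critical value from the standard normal distribution
critical = NormalDist().inv_cdf(1 - 0.05 / 2)

reject_null = abs(test_stat) > critical
print(round(test_stat, 2), round(critical, 2), reject_null)  # 6.32 1.96 True
```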
Always set up the null so that rejecting will lead to acceptance of the alternative, the goal in performing the test.
Test statistic is a random variable and may follow one of several distributions - more on this later. The critical value depends on which distribution applies (t, z, chi-square, or F), on the significance level and, where relevant, on the degrees of freedom.
Type I and Type II errors
- Type I - rejecting null when it is actually true
- At 5% significance, there is a 5% chance of a Type I
- Type II - failing to reject null when it is actually false
Note: it is incorrect to 'accept' a null hypothesis - you can only reject it or fail to reject it
Power of a test
- 1 - p(type II error)
- This represents the probability of correctly rejecting the null hypothesis
- If you have more than one test statistic you can use the highest power to decide which is best to use
- Calculating p(type II error) is quite difficult in practice
- Putting in a more stringent test increases prob of failing to reject a null, so power decreases
- Conversely, increasing power for a given sample size also increases chance of a type I error
- For a given significance level, can increase power only by increasing sample size
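The trade-offs in the bullets above can be seen numerically. The sketch below assumes a two-tailed z-test with a known standard deviation and a hypothetical true mean; the function name and the input numbers are illustrative and not from the notes.

```python
import math
from statistics import NormalDist

def z_test_power(true_shift, sdev, n, alpha):
    """Power of a two-tailed z-test when the true mean differs from the
    null value by true_shift (assumes sdev is known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the test statistic is centred on this value
    shift = true_shift / (sdev / math.sqrt(n))
    # Type II error: the statistic lands between the two critical values
    p_type_2 = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - p_type_2

# Hypothetical inputs: true mean 0.03% away from the null, sdev 0.25%
base = z_test_power(0.0003, 0.0025, n=250, alpha=0.05)
stricter = z_test_power(0.0003, 0.0025, n=250, alpha=0.01)   # more stringent test
bigger = z_test_power(0.0003, 0.0025, n=1000, alpha=0.05)    # larger sample

print(stricter < base < bigger)  # True
```

A lower significance level reduces power, and a larger sample raises it at the same significance level.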
Confidence intervals
- Range of values within which the researcher believes the true population parameter may lie
- CI = sample statistic +/- critical value * standard error
- If null value (likely 0) is outside this range, you reject the null
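As a rough illustration of the CI formula, here it is applied to the 250-day return example from earlier in these notes. The function name is hypothetical, and the normal critical value is assumed to be a good enough approximation at this sample size.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_stat, standard_error, confidence=0.95):
    """CI = sample statistic +/- critical value * standard error
    (normal critical value assumed)."""
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = critical * standard_error
    return sample_stat - margin, sample_stat + margin

# 0.1% mean return, 0.25% sample sdev, 250 observations
low, high = confidence_interval(0.001, 0.0025 / math.sqrt(250))

null_value = 0.0
reject_null = not (low <= null_value <= high)
print(round(low, 5), round(high, 5), reject_null)
```

Zero falls outside the interval here, which matches the rejection reached with the test statistic.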
Statistical significance does not mean economic significance
- Transaction costs may outweigh the benefits, as may taxes and risk
- In short term, there could be significant variations from year to year even if the mean is profit
- Statistical tests with large samples can result in highly (statistically) significant results that are quite small in absolute terms
p-value
- Probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true
Tests of a single population mean
- Use t-test if population variance is unknown and EITHER sample is large or sample is small but distribution is normal or approximately normal
- Note: if sample is small and non-normal, there is no good test
- t(for n-1 degrees of freedom) = (sample mean - null mean) / (sample sdev / sqrt(n))
- Use z-test if population variance is known and population is normally distributed
- z = (sample mean - null mean) / (population sdev / sqrt(n))
- Can use the same formula if sample is large, using sample sdev instead of population
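The test-statistic formula and the probability described above (usually called the p-value) can be sketched together. The function names are illustrative, and the p-value here assumes the statistic is approximately standard normal, i.e. the z-test or large-sample case.

```python
import math
from statistics import NormalDist

def mean_test_statistic(sample_mean, null_mean, sdev, n):
    """(sample mean - null mean) / (sdev / sqrt(n)).
    Pass the population sdev for a z-test, or the sample sdev for a
    t-test / large-sample z-test."""
    return (sample_mean - null_mean) / (sdev / math.sqrt(n))

def two_sided_p_value(z_stat):
    """Probability of a statistic at least this extreme in either tail,
    assuming the null is true and the statistic is standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

z = mean_test_statistic(0.001, 0.0, 0.0025, 250)
print(round(z, 2), two_sided_p_value(z) < 0.05)  # 6.32 True
```

A p-value below the chosen significance level leads to the same decision as comparing the statistic with the critical value.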
Comparing the means of two (at least approximately) normally distributed populations, based on independent samples with variances assumed either 1) equal or 2) unequal
- Must be sure samples are independent and pops are at least approximately normally dist.
- In both cases, variance is unknown
- In one case, variance assumed equal between pops, and samples are pooled
- Other case, no assumption of equality b/w variance is made, and t-test uses approximated value for degrees of freedom
- Ho: mean 1 - mean 2 = 0 (for two tail test). Can also set 0 to any other number.
- One sided: Ho: mean 1 - mean 2 >= 0 and vice versa
- There is a big fat formula for the actual test - will not be tested
Second case: Two populations, unknown variances (assumed unequal), normally distributed
- Same as above but with a different denominator: it uses the individual sample variances rather than a pooled variance (pooling assumed the variances were equal)
- Remember these are both t statistics
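For reference, the two t statistics can be written out as below. This sketch uses the textbook pooled-variance formula and the usual approximation for degrees of freedom in the unequal-variance case (often called Welch-Satterthwaite). The function names and sample numbers are hypothetical.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Equal variances assumed: sample variances are pooled."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean1 - mean2 - hypothesized_diff) / se, df

def unequal_var_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """No equality assumption: individual sample variances are used and
    the degrees of freedom are approximated."""
    a, b = var1 / n1, var2 / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (mean1 - mean2 - hypothesized_diff) / se, df

# Hypothetical samples: means 1.0 and 0.0, variances 4.0 and 9.0
print(pooled_t(1.0, 4.0, 10, 0.0, 9.0, 15))
print(unequal_var_t(1.0, 4.0, 10, 0.0, 9.0, 15))
```

When the sample sizes and sample variances happen to match, the two versions give the same statistic and the same degrees of freedom.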
Comparing two normally distributed populations (paired comparisons test)
- Sometimes samples may be dependent (unlike before where they were independent)
- Example, observations for two firms are both influenced by economic conditions/market returns/industry conditions etc
- Paired comparisons test is a test of whether average difference between two companies' monthly returns is different from 0, based on standard error of average difference est. in the sample
- Always requires sample data be normally distributed
- Ho: meandifference = hypothesized meandifference (often 0)
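- t(for n-1 degrees of freedom) = (sample mean difference - hypothesized mean difference) / standard error of the mean difference, where standard error = sdev of the differences / sqrt(n)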
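A minimal sketch of the paired comparisons statistic follows, taking the denominator to be the standard error of the mean difference (sample sdev of the differences divided by sqrt(n)). The return series are made-up numbers and the function name is illustrative.

```python
import math
from statistics import mean, stdev

def paired_t(returns_a, returns_b, hypothesized_mean_diff=0.0):
    """t statistic on the period-by-period differences, with n - 1
    degrees of freedom."""
    diffs = [a - b for a, b in zip(returns_a, returns_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    t_stat = (mean(diffs) - hypothesized_mean_diff) / standard_error
    return t_stat, n - 1

# Hypothetical monthly returns for two firms over the same five months
firm_a = [0.02, 0.01, 0.03, -0.01, 0.02]
firm_b = [0.01, 0.00, 0.01, -0.02, 0.02]

t_stat, df = paired_t(firm_a, firm_b)
print(round(t_stat, 2), df)  # 3.16 4
```

Working on the differences removes the shared market or industry influence, which is why the samples do not need to be independent here.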
Hypothesis test concerning variance of a normal population
- Chi square test - used to test hypotheses concerning variance of a normal population
- Uses chi squared distribution - it is asymmetrical, bounded below by zero, and approaches normal as degrees of freedom increase
- chisquare(n-1) = (n-1)s^2 / hypothesized variance, where s^2 is the sample variance
- Note: chi square value cannot be negative
- Generally, if significant, the variance is different than the hypothesized value (example: is the variance of a portfolio still about 4%?)
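The chi-square statistic is simple enough to compute directly. In the sketch below the sample numbers are hypothetical (4% is treated as a hypothesized standard deviation, so the hypothesized variance is 0.0016), and the critical values are the ones a standard chi-square table gives for 24 degrees of freedom at 2.5% in each tail.

```python
def chi_square_stat(sample_var, hypothesized_var, n):
    """(n - 1) * s^2 / hypothesized variance, with n - 1 degrees of freedom."""
    return (n - 1) * sample_var / hypothesized_var, n - 1

# Hypothetical: 25 observations, sample sdev 5%, hypothesized sdev 4%
stat, df = chi_square_stat(0.05 ** 2, 0.04 ** 2, 25)

# Two-tailed 5% critical values for 24 df, read from a chi-square table
lower_crit, upper_crit = 12.401, 39.364

reject_null = stat < lower_crit or stat > upper_crit
print(round(stat, 2), df, reject_null)  # 37.5 24 False
```

The statistic is a ratio of non-negative quantities, so it can never be negative.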
Test equality of variances of two normal populations, based on two independent random samples
- F-test
- Ho: variance 1 = variance 2
- F = sample var 1 / sample var 2
- Always put the larger variance on top - then we only have to consider critical value for the right hand tail
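The "larger variance on top" convention can be sketched as a small helper. The name and inputs are illustrative. It also returns the numerator and denominator degrees of freedom, since those are needed to look up the right-tail critical value.

```python
def f_stat(sample_var_1, n1, sample_var_2, n2):
    """F = larger sample variance / smaller sample variance.
    Returns (F, numerator df, denominator df)."""
    if sample_var_1 >= sample_var_2:
        return sample_var_1 / sample_var_2, n1 - 1, n2 - 1
    return sample_var_2 / sample_var_1, n2 - 1, n1 - 1

# Hypothetical samples: variance 0.0016 from 21 obs, 0.0025 from 31 obs
f_value, df_num, df_den = f_stat(0.0016, 21, 0.0025, 31)
print(round(f_value, 4), df_num, df_den)  # 1.5625 30 20
```

With this ordering F is always at least 1, so only the upper critical value matters.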
Parametric and non-parametric tests
- Parametric - rely on assumptions regarding distribution of the pop and are specific to population parameters
- z test, for example, relies on mean and sdev. Requires sample is large, or normal dist, or both.
- Non parametric - either do not consider a particular population parameter or have few assumptions about population being tested
- These are used when assumptions can't be supported, or there is concern about quantities other than the parameter of the distribution
- Also used for ranked observations
- Often used alongside parametric tests
- Often used as a backup in case the parametric assumptions do not hold
- Example - comparing ranks between two datasets
- Further example - runs test, e.g. does price tick up or down
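To make the runs test idea concrete, here is a sketch of one common version (the Wald-Wolfowitz runs test with a normal approximation). The notes do not say which version is meant, so treat this choice, the function name, and the sample data as assumptions.

```python
import math

def runs_test_z(ticks):
    """ticks: sequence of booleans (True = up tick, False = down tick).
    Returns (number of runs, z statistic under the null of randomness)."""
    n_up = sum(1 for t in ticks if t)
    n_down = len(ticks) - n_up
    if n_up == 0 or n_down == 0:
        raise ValueError("need at least one up tick and one down tick")
    n = n_up + n_down
    # A new run starts every time the direction changes
    runs = 1 + sum(1 for prev, cur in zip(ticks, ticks[1:]) if prev != cur)
    expected = 2 * n_up * n_down / n + 1
    variance = (2 * n_up * n_down * (2 * n_up * n_down - n)) / (n ** 2 * (n - 1))
    return runs, (runs - expected) / math.sqrt(variance)

# Hypothetical price path that alternates up/down on every tick
runs, z = runs_test_z([True, False] * 10)
print(runs, round(z, 2))  # 20 4.14
```

Far more runs than expected (as in this case) suggests the ticks alternate too regularly to be random, and far fewer suggests trending.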
10:38 pm
About 1.5 hours