# survival analysis

p value: the probability, computed assuming the null hypothesis is true, of observing data at least as extreme as the data actually observed. A small p value is statistical evidence against the null hypothesis (the assumed wisdom). (1) It is a probability value. (2) It is a p value for the data: it is computed under the null hypothesis, and once the data are collected they cannot be changed. (3) It is used to decide whether to reject the null hypothesis.
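
As a small illustration of the definition, the sketch below computes a two-sided p value for a z statistic. It assumes the test statistic is standard normal under the null hypothesis; the function name `two_sided_p_value` and the observed value 1.96 are made up for the example.

```python
from statistics import NormalDist

def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) when Z is standard normal under the null hypothesis."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical observed test statistic (not from any real study).
observed_z = 1.96
p_value = two_sided_p_value(observed_z)
print(f"z = {observed_z}, two-sided p = {p_value:.4f}")
```

The further the observed statistic is from zero, the smaller the p value and the stronger the evidence against the null hypothesis.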

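parametric & non-parametric test? A parametric test assumes the data follow a known distribution described by parameters. If we don't know which kind of distribution the data follow, there is no parameter to work with, so use a non-parametric test such as the Wilcoxon rank tests directly.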
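
To make the rank-based idea concrete, here is a minimal sketch of the Wilcoxon rank-sum test using the large-sample normal approximation. It is an illustration under stated assumptions, not a replacement for a statistical package: it applies no tie or continuity correction, the helper names `midranks` and `rank_sum_test` are invented here, and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def midranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Return (W, z, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    expected = n1 * (n1 + n2 + 1) / 2.0
    variance = n1 * n2 * (n1 + n2 + 1) / 12.0
    z = (w - expected) / sqrt(variance)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return w, z, p

# Hypothetical outcome values for two small groups.
placebo = [1.1, 2.3, 3.0, 4.8]
active = [5.2, 6.1, 7.4, 8.9]
w, z, p = rank_sum_test(placebo, active)
print(f"W = {w}, z = {z:.3f}, approximate p = {p:.4f}")
```

Only the ordering of the observations enters the calculation, which is why no distributional assumption about the raw values is needed. For samples this small, an exact p value would be preferable to the approximation.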

Type I & Type II error, Power

ANOVA, ANCOVA

mixed effect model: contains both fixed effects and random effects; usually used in non-survival studies with continuous outcome variables, i.e. outcomes that are not time-to-event.

Model selection/building with AIC: among several candidate models, the one with the smaller AIC is preferred. AIC penalizes the number of parameters, so it acts as a guard against overfitting.
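
The comparison can be sketched with the usual definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The model names and log-likelihood values below are hypothetical, chosen only to show the penalty at work.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits: model name -> (maximized log-likelihood, number of parameters).
candidates = {
    "treatment": (-120.5, 2),
    "treatment + gender": (-118.9, 3),
    "treatment + gender + age": (-118.7, 4),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("lowest AIC:", best)
```

In this made-up example the largest model has the highest likelihood, but the extra parameter does not improve the fit enough to offset the penalty, so the middle model has the lowest AIC.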

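CMH (Cochran-Mantel-Haenszel): for a three-way table (a set of 2x2 tables, one per stratum), tests independence of row and column within strata and gives a common odds ratio across strata.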
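
A minimal sketch of the arithmetic behind the stratified analysis, assuming each stratum is a 2x2 table `(a, b, c, d)`: it computes the Mantel-Haenszel common odds ratio and the CMH chi-square statistic with 1 degree of freedom, without continuity correction. The function name `mantel_haenszel` and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mantel_haenszel(strata):
    """strata: list of 2x2 tables (a, b, c, d), one per stratum, where
    a, b = events / non-events in group 1 and
    c, d = events / non-events in group 2.
    Returns (common odds ratio, CMH chi-square, p value)."""
    num = den = 0.0
    sum_a = sum_expected = sum_var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
        sum_a += a
        # Mean and variance of cell a under independence, given the margins.
        sum_expected += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    odds_ratio = num / den
    chi_square = (sum_a - sum_expected) ** 2 / sum_var
    # A chi-square variable with 1 df is a squared standard normal.
    p_value = 2.0 * (1.0 - NormalDist().cdf(sqrt(chi_square)))
    return odds_ratio, chi_square, p_value

# Hypothetical counts: (active yes, active no, placebo yes, placebo no) per stratum.
tables = [(10, 20, 5, 25), (15, 15, 8, 22)]
odds_ratio, chi_square, p_value = mantel_haenszel(tables)
print(f"common OR = {odds_ratio:.3f}, CMH = {chi_square:.3f}, p = {p_value:.4f}")
```

The strata could be, for instance, gender or study site; the test asks whether treatment and outcome are associated after accounting for that stratification.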

Categorical: Proc Genmod

Study Design:

• Sample Size Calculation: Proc power
• Non-survival analysis sample size calculation
• Randomization

##### effect size

Effect size is the difference in outcome between treatments. It signifies the magnitude of the treatment effect.

The difference can be measured on categorical or on continuous variables. A difference in proportions is one type of effect size; a difference in means is another.
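
Both kinds of effect size can be computed in a few lines. The sketch below also shows a standardized mean difference (often called Cohen's d), which the notes above do not mention and which is included here only as an assumption about what is commonly reported. All data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def mean_difference(x, y):
    """Effect size for a continuous outcome: difference in group means."""
    return mean(x) - mean(y)

def standardized_difference(x, y):
    """Mean difference divided by the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return mean_difference(x, y) / sqrt(pooled_var)

def proportion_difference(events_x, n_x, events_y, n_y):
    """Effect size for a binary outcome: difference in response proportions."""
    return events_x / n_x - events_y / n_y

# Hypothetical continuous outcomes in two arms.
active = [12, 15, 14, 17, 12]
placebo = [10, 11, 13, 12, 9]
print("difference in means:", mean_difference(active, placebo))
print("standardized difference:", round(standardized_difference(active, placebo), 3))

# Hypothetical responder counts: 30 of 50 on active, 20 of 50 on placebo.
print("difference in proportions:", round(proportion_difference(30, 50, 20, 50), 3))
```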

##### how to do randomization?

Randomization can be done with proc plan.

It can also be done using SAS random-number functions such as RANUNI or RAND in a DATA step.
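
To show what a randomization schedule looks like, here is a language-neutral sketch of permuted-block randomization written in Python. It is one common scheme, not a description of what proc plan does internally; the function name, block size, and seed are assumptions chosen for the example.

```python
import random

def permuted_block_randomization(n_subjects, treatments=("active", "placebo"),
                                 block_size=4, seed=2024):
    """Assign treatments in shuffled blocks so the arms stay balanced."""
    if block_size % len(treatments) != 0:
        raise ValueError("block_size must be a multiple of the number of treatments")
    rng = random.Random(seed)  # fixed seed makes the schedule reproducible
    per_block = block_size // len(treatments)
    schedule = []
    while len(schedule) < n_subjects:
        block = list(treatments) * per_block
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_subjects]

schedule = permuted_block_randomization(12)
for subject_id, arm in enumerate(schedule, start=1):
    print(subject_id, arm)
```

Within every complete block each arm appears equally often, so the group sizes never drift far apart during enrollment. Fixing the seed lets the schedule be regenerated and audited.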

##### sample size

Sample size can be calculated using proc power in SAS; or using special software like nQuery, PASS, East, etc.

The key is to find the number of subjects that gives the clinical trial the desired statistical power.
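
The relationship between effect size, variability, significance level, and power can be sketched with the normal-approximation formula for comparing two means, n per group = 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2. This is an approximation under assumed equal variances and equal allocation; t-based calculations, as in dedicated software, can give a slightly larger n. The function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(mean_diff, std_dev, alpha=0.05, power=0.80):
    """Approximate subjects per arm for a two-sided, two-sample comparison of means."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * std_dev / mean_diff) ** 2)

# Hypothetical design inputs: detect a mean difference of 5 with SD 10.
print("n per group at 80% power:", n_per_group(5, 10))
print("n per group at 90% power:", n_per_group(5, 10, power=0.90))
```

Halving the detectable difference roughly quadruples the required sample size, which is why the assumed effect size drives the calculation.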

*Interview Questions*

A randomized trial has two arms, active and placebo. The outcome variable y is measured twice: once at baseline and once post-treatment. Does getting the active treatment improve y?

You have two treatment groups and you have two measurements on each group.

Best single answer: With two measurements on placebo and active, what are you comparing in the t test? 1. Get the change from baseline for the placebo group. 2. Get the change from baseline for the active group. Then compare the means of these two changes with a two-sample t test.
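
The two steps and the comparison can be sketched as follows. The example computes the pooled-variance two-sample t statistic and its degrees of freedom only; a p value needs the t distribution, which the Python standard library does not provide. All measurements and names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, nx + ny - 2

# Hypothetical baseline and post-treatment measurements of y.
active_baseline, active_post = [10, 12, 11, 13], [15, 19, 17, 21]
placebo_baseline, placebo_post = [11, 12, 10, 13], [14, 16, 15, 17]

# Steps 1 and 2: change from baseline within each arm.
active_change = [post - base for base, post in zip(active_baseline, active_post)]
placebo_change = [post - base for base, post in zip(placebo_baseline, placebo_post)]

# Step 3: compare the mean changes between arms.
t_stat, df = pooled_t(active_change, placebo_change)
print(f"mean change: active {mean(active_change)}, placebo {mean(placebo_change)}")
print(f"t = {t_stat:.3f} on {df} df")
```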

Second method, comparing post-treatment values only: compare the mean post-treatment values of the two treatment groups. Also do a t test on the baseline values to check that the groups are comparable.

Besides the t test, we can use ANOVA, where the model is y (response, dependent variable) = treatment:

```
PROC ANOVA DATA=datasetname;
   CLASS factorvars;                 /* factor, such as treatment */
   MODEL responsevar = factorvars;   /* response, such as change */
RUN;
```

If the outcome is binary (0, 1), how do you determine whether the active treatment is helping? What if it is a continuous variable but not normally distributed: what would you do?

If you are doing ANOVA, what would you do? Does getting the active treatment improve the outcome y? What would the model be?

What is the dependent variable in the ANOVA model? The left-hand side is the dependent variable.

In the t test, the dependent variable could be the change from baseline. The explanatory variable is the treatment.

The dependent variable is either the post-treatment value or the change from baseline, and the explanatory variable on the right-hand side is the treatment.

With the model change from baseline = treatment, the result is exactly the pooled-variance two-sample t test.
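
This equivalence can be checked numerically: with two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic. The sketch below uses hypothetical change-from-baseline values and invented function names.

```python
from math import sqrt
from statistics import mean, variance

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def pooled_t_statistic(x, y):
    """Two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))

# Hypothetical change-from-baseline values per arm.
active_change = [5, 7, 6, 8]
placebo_change = [3, 4, 5, 4]

f_stat = one_way_anova_f([active_change, placebo_change])
t_stat = pooled_t_statistic(active_change, placebo_change)
print(f"F = {f_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
```

ANOVA becomes the more general tool once there are more than two treatment groups or additional factors in the model.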

Question: Suppose the outcome variable y is binary (0/1, or yes/no). We have baseline and treatment, plus some other variables, and we want to take gender into account: does treatment have an effect on y after adjusting for gender? How would you analyze that?

Think more on the statistical side, not the programming side.

# Knowledge Structure of Biostatisticians

• theory: probability, inference, linear models, others
• applications: ANOVA, ANCOVA, t-test, F test, survival analysis, categorical analysis
• experimental design
• Environments / FDA regulations
• Software: SAS, R