Effect Sizes

University of San Francisco, MSMI-603

Matt Meister

2022-12-31

Thus far

So far, we have learned about:

  • Means
  • Variance
  • Statistical significance

(among other things)

Thus far

We’ve learned to say things like:

  • The difference in clicking between group A and B is 2%
    • And this is significant because p < .001
  • With every $10,000 increase in income, customers spend $25 more in our stores
    • And this slope is significantly different from 0 because p = .02
  • Customers who are 25-34 are more interested in our product than those who are 35-44
    • \(M_{25-34}\) = 4.85/6
    • \(M_{35-44}\) = 4.32/6
    • This might be due to chance, as p = .09

Thus far

Have we learned to say things like:

  • The difference in clicking between group A and B is 2%
    • This is a big difference?
  • With every $10,000 increase in income, customers spend $25 more in our stores
    • This is a big difference?
  • Customers who are 25-34 are more interested in our product than those who are 35-44
    • \(M_{25-34}\) = 4.85/6
    • \(M_{35-44}\) = 4.32/6
    • This is a big difference?

Thus far

No!

For the clearest example, let’s focus on the third:

  • The one that uses a 0-6 scale
  • What is a difference of .53 on a 0-6 scale?
    • Is that big?
    • Does it matter in this context?
    • To answer this, we are going to learn about effect sizes

Effect sizes

Effect sizes put our results into a standard format.

  • They do not tell us if our result is statistically significant or not.
    • We use them after that
  • They tell us about how big our results are
    • Again, in a standardized format

Effect sizes

Effect sizes put our results into a standard format.

There are two kinds of effect sizes, broadly:

  • Standardized differences
    • These give us a standardized way to say whether the difference between groups is big
  • Variance explained
    • These tell us whether some variable explains a lot or a little of our DV

Effect sizes

Effect sizes put our results into a standard format.

We will learn two today

  • Standardized differences
    • Cohen’s d
      • \(\frac{(M_A - M_B)}{SD_{AB}}\)
  • Variance explained
    • \(R^2\)
      • \(1 - \frac{SSR}{n - p - 1} \div \frac{SST}{n - 1}\)
    • These tell us whether some variable explains a lot or a little of our DV

Cohen’s d

\(\frac{(M_A - M_B)}{SD_{AB}}\)

  • \(M_A\): Mean of group A
  • \(M_B\): Mean of group B
  • \(SD_{AB}\): Pooled standard deviation
    • Simply averaging the two groups’ standard deviations is fine

This tells us how large the difference between the groups is, measured in units of the data’s standard deviation.
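
To make the formula concrete, here is a minimal R sketch that computes d by hand. groupA and groupB are hypothetical vectors of scores, not data from this course:

# Two hypothetical groups of scores (made up for illustration)
groupA <- c(5, 6, 4, 5, 6, 5)
groupB <- c(4, 4, 5, 3, 4, 4)

# Pooled SD: simply average the two groups' standard deviations
sd_pooled <- mean(c(sd(groupA), sd(groupB)))

# Cohen's d: the mean difference in units of the pooled SD
(mean(groupA) - mean(groupB)) / sd_pooled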

Cohen’s d - Examples

Heights of men and women in the US:

Are men and women different heights on average?

  • \(M_{Male}\) = 69 inches
  • \(M_{Female}\) = 64 inches
  • \(SD_{Height}\) = 2.75 inches
  • Cohen’s d?
    • 1.81

Cohen’s d - Examples

Heights of men and women in the US:

  • Cohen’s d = 1.82

Cohen’s d - Examples

Are people more aggressive toward individuals who have provoked them?

  • \(M_{Provoked}\) = 8.232/10
  • \(M_{Unprovoked}\) = 4.4/10
  • \(SD_{Aggression}\) = 3.22
  • Cohen’s d?
    • 1.19

Cohen’s d - Examples

Are people more aggressive toward individuals who have provoked them?

  • Cohen’s d = 1.19

Cohen’s d - Examples

Are people who are seen as more credible also more persuasive?

  • \(M_{Credible}\) = 5.42/10
  • \(M_{Not}\) = 4.76/10
  • \(SD_{Persuasion}\) = 3.29
  • Cohen’s d?
    • .20

Cohen’s d - Examples

Are people who are seen as more credible also more persuasive?

  • Cohen’s d = .20

Contextualize your effect sizes

Sometimes you can look to other research

  • Or benchmarks like the examples above

Sometimes you cannot

  • One good comparison is covariates: effects whose size you already have some intuition about

Contextualizing with covariates

Hypothesis: Mode of ordering (smartphone vs. desktop) will influence people’s portion choices

\(Portion\ Size = \beta_{Device} Device + \beta_{Hunger} Hunger + \beta_{Dieting} Dieting\)

…And use common sense…
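
As a minimal sketch of that comparison in R (the orders data frame and every column in it are hypothetical, simulated here only so the code runs):

# Hypothetical, simulated ordering data (not course data)
set.seed(1)
orders <- data.frame(
  device  = sample(c("smartphone", "desktop"), 200, replace = TRUE),
  hunger  = runif(200, 0, 10),
  dieting = runif(200, 0, 10)
)
orders$portion.size <- 2 + 0.4 * (orders$device == "smartphone") +
  0.3 * orders$hunger - 0.2 * orders$dieting + rnorm(200)

# Fit the model from the slide
m_portion <- lm(portion.size ~ device + hunger + dieting, data = orders)

# Contextualize: compare the size of the device coefficient to the hunger
# and dieting coefficients, effects we already have some feel for
coef(m_portion)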

R-Squared

\(1 - \frac{SSR}{n - p - 1} \div \frac{SST}{n - 1}\)

This tells you:

  • For an entire model, how much of all of the variance you are explaining
    • We get this result from lm()
  • For each individual effect, how much of all of the variance it explains
    • We can get this result from anova()

R-Squared

From anova()

customerData <- read.csv('customerData.csv')

m_1 <- lm( data = customerData, sat.service ~ 1) # Just the mean
m_2 <- lm( data = customerData, sat.service ~ email) # Effect of email
m_3 <- lm( data = customerData, sat.service ~ email + income) # Effect of email and income

anova(m_1, m_2, m_3)
Analysis of Variance Table

Model 1: sat.service ~ 1
Model 2: sat.service ~ email
Model 3: sat.service ~ email + income
  Res.Df     RSS Df Sum of Sq        F  Pr(>F)    
1    590 1187.70                                  
2    589 1179.40  1      8.30   5.9544 0.01497 *  
3    588  819.51  1    359.89 258.2261 < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-Squared

From anova()

Analysis of Variance Table

Model 1: sat.service ~ 1
Model 2: sat.service ~ email
Model 3: sat.service ~ email + income
  Res.Df     RSS Df Sum of Sq        F  Pr(>F)    
1    590 1187.70                                  
2    589 1179.40  1      8.30   5.9544 0.01497 *  
3    588  819.51  1    359.89 258.2261 < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • \(R^2_{email}\)?
    • \(1 - \frac{1179.40}{591 - 1 - 1} \div \frac{1187.70}{591 - 1}\)
    • .005
  • \(R^2_{income}\)?
    • \(1 - \frac{819.51}{591 - 2 - 1} \div \frac{1187.70}{591 - 1}\)
    • .308

R-Squared

From lm()

summary(m_2)

Call:
lm(formula = sat.service ~ email, data = customerData)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.9813 -0.7347  0.0187  1.0187  4.0187 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.98131    0.09673  41.159   <2e-16 ***
emailyes    -0.24656    0.12111  -2.036   0.0422 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.415 on 589 degrees of freedom
  (409 observations deleted due to missingness)
Multiple R-squared:  0.006987,  Adjusted R-squared:  0.005301 
F-statistic: 4.144 on 1 and 589 DF,  p-value: 0.04222

Effect Size Conclusion

  • There are lots of effect size measures out there
  • They are useful because they let us put our effects in context
  • They come in two forms:
    • Standardized differences
      • These give us a standardized way to say whether the difference between groups is big
    • Variance explained
      • These tell us whether some variable explains a lot or a little of our DV