Charles Angelson writes:
So say you have some variable A, which correlates with B at x. Say you find a variable C where the correlation between C & B is equal to the correlation between C & A. How much will variable C correlate with variable B?
Let the Pearson correlation between C & B be y, and the Pearson correlation between A & B be x.
y = (1 + x) / sqrt(2x + 2)
EDIT:
I guess I should've specified. The purpose of this is for *generating* C from A & B, where A & B explain 100% of the variance in C.
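A sketch of where this comes from, assuming A and B are standardized (mean 0, variance 1; correlations don't depend on scale, so this loses nothing) and C is a linear combination of them. Equal correlation with A and B then forces equal weights, so C can be taken as A + B:

```latex
\begin{aligned}
\operatorname{Cov}(A,\,A+B) &= 1 + x, \qquad \operatorname{Var}(A+B) = 2 + 2x,\\
y = \operatorname{Cor}(A,\,A+B) &= \frac{1+x}{\sqrt{2x+2}} = \sqrt{\frac{1+x}{2}}.
\end{aligned}
```

By symmetry the same value holds for B. For example, x = 0 gives y ≈ 0.707 and x = 0.5 gives y ≈ 0.866.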
I have an update to this.
Given:
- A and B correlate at X
- A and B together explain a fraction R of the variance in C
- A and B correlate equally with C
The formula for the correlation between C and either A or B (written A|B) is:
(1 + X)/sqrt(2X + 2) * sqrt(R)
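The sqrt(R) factor can be motivated as follows. This is only a sketch, assuming C = A + B + E with A and B standardized and the noise term E uncorrelated with both (E is a name introduced here for illustration):

```latex
\begin{aligned}
R &= \frac{\operatorname{Var}(A+B)}{\operatorname{Var}(C)} = \frac{2+2X}{\operatorname{Var}(C)},\\
\operatorname{Cor}(A,\,C) &= \frac{\operatorname{Cov}(A,\,A+B)}{\sqrt{\operatorname{Var}(C)}}
  = \frac{1+X}{\sqrt{(2+2X)/R}}
  = \frac{1+X}{\sqrt{2X+2}}\,\sqrt{R}
  = \sqrt{\frac{R\,(1+X)}{2}}.
\end{aligned}
```

With R = 1 this reduces to the noise-free case.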
proof: run this code
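# Noise multipliers (nm) and A-to-B coupling multipliers (bm) to sweep over
normult <- c(0, 0.1, 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 3.5, 4, 5, 6, 7, 8, 9, 10)
bmult <- c(0, 0.1, 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 3.5, 4, 5, 6, 7, 8, 9, 10)
n <- 700000
# One row per (nm, bm) combination: 18 * 18 = 324 rows
dftries <- data.frame(corab = rep(NA_real_, length(normult) * length(bmult)),
                      corcba = NA_real_, r2 = NA_real_)
i <- 1
for (nm in normult) {
  for (bm in bmult) {
    A <- rnorm(n)
    B <- rnorm(n) + A * bm       # larger bm -> stronger A-B correlation
    C <- rnorm(n) * nm + A + B   # larger nm -> lower R^2
    lsum <- summary(lm(C ~ A + B))
    dftries$corab[i] <- cor(A, B)
    # B has variance 1 + bm^2, so cor(C, B) and cor(C, A) differ
    # once bm > 0; their average is used here
    dftries$corcba[i] <- (cor(C, B) + cor(C, A)) / 2
    dftries$r2[i] <- lsum$r.squared
    i <- i + 1
  }
}
# Correlation predicted by the formula
dftries$y <- (1 + dftries$corab) / sqrt(2 * dftries$corab + 2) * sqrt(dftries$r2)
# If the formula holds: slope near 1, intercept near 0, R^2 near 1
ls <- lm(corcba ~ y, data = dftries)
summary(ls)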
To calculate the correlation between the residuals of A given B (the part of A not explained by B) and C, you can use the semipartial (part) correlation formula:
Semipartial correlation (A given B, and C) = (ρ_AC - ρ_AB * ρ_BC) / √(1 - ρ_AB^2)
where:
ρ_AC is the correlation between A and C,
ρ_AB is the correlation between A and B (given as "X"),
ρ_BC is the correlation between B and C.
(The full partial correlation, which also removes B from C, divides by a further √(1 - ρ_BC^2).)
Given that A and B equally correlate with C, ρ_AC = ρ_BC, and by the formula above both equal (1 + X)/sqrt(2X + 2)*sqrt(R) = √(R(1 + X)/2). Therefore, the formula simplifies to:
Semipartial correlation (A given B, and C) = ρ_AC * (1 - X) / √(1 - X^2)
Here is the further simplified formula:
Semipartial correlation (A given B, and C) = √(R(1 - X) / 2)
For the full partial correlation, the same substitution gives √(R(1 - X) / (2 - R(1 + X))).
Proof: It Just Works^tm.
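For something more checkable than "it just works", here is a small R sketch. It builds A, B and C with exact sample correlations, then compares the correlation between the residual of A (after regressing on B) and C against the closed form sqrt(R(1 - X)/2). The function name check_residual_cor and the QR-based construction are illustrative choices, not something from the post, and it assumes 0 < R <= 1 and |X| < 1.

```r
# Hypothetical helper: compare the simulated residual correlation with
# the closed form sqrt(R * (1 - X) / 2).
check_residual_cor <- function(X, R, n = 10000) {
  # Three mean-zero, mutually orthogonal columns with sample variance 1,
  # so every sample correlation below is exact rather than approximate.
  Z <- qr.Q(qr(scale(matrix(rnorm(n * 3), n, 3), scale = FALSE))) * sqrt(n - 1)
  A <- Z[, 1]
  B <- X * Z[, 1] + sqrt(1 - X^2) * Z[, 2]   # cor(A, B) equals X
  # Noise scale chosen so that A + B explains a fraction R of var(C)
  k <- sqrt((2 + 2 * X) * (1 - R) / R)
  C <- A + B + k * Z[, 3]
  A_res <- resid(lm(A ~ B))                  # part of A not explained by B
  c(simulated = cor(A_res, C), closed_form = sqrt(R * (1 - X) / 2))
}

set.seed(1)
print(check_residual_cor(X = 0.5, R = 0.8))
```

Both entries should agree to numerical precision (about 0.447 for X = 0.5 and R = 0.8). Other values of X and R can be swapped in to probe the formula.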