Polynomials of Order 8? (\(R^2\) = 0.9864, \(R^2_{adj}\) = 0.9708)
My Solution
Let CylVolume be the volume of the cylinder defined by Height and Diam, \(V = \pi (d/2)^2h\).
The model is \(Volume_i = \beta_1CylVolume_i + \epsilon_i\), where \(\beta_1\) is the proportion of an ideal cylinder that is actual usable wood.
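For reference, least squares has a simple closed form for a one-coefficient model with no intercept. This is the standard regression-through-the-origin result, written here in the slide's notation:

\[
\hat\beta_1 = \frac{\sum_i CylVolume_i \, Volume_i}{\sum_i CylVolume_i^2}
\]

So \(\hat\beta_1\) is a weighted average of the per-tree ratios \(Volume_i / CylVolume_i\), with weights proportional to \(CylVolume_i^2\), which fits the "proportion of an ideal cylinder" reading.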
trees$CylVolume <- pi * (trees$Diam / 2)^2 * trees$Height
cyl_lm <- lm(Volume ~ -1 + CylVolume, data = trees)
summary(cyl_lm)$coef |> round(4)
          Estimate Std. Error t value Pr(>|t|)
CylVolume   0.3865      0.005 77.4365        0
summary(cyl_lm)$adj.r.squared
[1] 0.994856
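A detail worth keeping in mind when comparing this value with the earlier models: for a model without an intercept, `summary.lm()` measures \(R^2\) against zero rather than against the mean of the response. The sketch below computes both versions. It assumes that `Diam` is the built-in `Girth` column converted from inches to feet, and the names `trees2`, `r2_uncentered`, and `r2_centered` are made up for this illustration.

```r
# Assumption: Diam is the built-in Girth column converted from inches to feet.
trees2 <- datasets::trees
trees2$Diam <- trees2$Girth / 12
trees2$CylVolume <- pi * (trees2$Diam / 2)^2 * trees2$Height

cyl_lm <- lm(Volume ~ -1 + CylVolume, data = trees2)

# Residual sum of squares, and two choices of total sum of squares.
rss <- sum(resid(cyl_lm)^2)
tss_uncentered <- sum(trees2$Volume^2)                        # measured from zero
tss_centered <- sum((trees2$Volume - mean(trees2$Volume))^2)  # measured from the mean

r2_uncentered <- 1 - rss / tss_uncentered  # what summary() reports with no intercept
r2_centered <- 1 - rss / tss_centered      # same baseline as models with an intercept

c(uncentered = r2_uncentered, centered = r2_centered)
```

The centered version uses the same baseline as the models that include an intercept, so it is the fairer number to set beside their \(R^2\) values.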
CylVolume Plots
par(mfrow = c(2, 3))
plot(Volume ~ CylVolume, data = trees,
     main = "Volume versus CylVolume")
plot(1, main = "Blank Space", bty = "n", xaxt = "n", yaxt = "n",
     xlab = "", ylab = "", pch = "")
plot(cyl_lm)
Some Closing Thoughts
Using ln(y) ~ ln(x) made the \(R^2\) slightly better, even though plots looked similar.
Plots looked slightly better for full log, though.
We were using log just to make the relationship linear.
Box-Cox suggested a square-root or quarter-power transformation instead, and the result was better!
Very interestingly, models that chose powers of \(x\) chose 1 for Height, 2 for Diam…
Trees are not perfect cylinders.
However, the cylinder model fits best, and with fewer parameters!
In conclusion, always think through the problem before blindly modelling.
Note that the best model in this case used our knowledge of the physical problem, and this happened to correspond to the best fitting model. This isn’t always the case! It is very common that we’ll have to choose between a model that fits well and a model that matches our understanding of the world, and this is not an easy choice!
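One way to see the "1 for Height, 2 for Diam" pattern is a log-log regression. Taking logs of the cylinder formula gives \(\ln V = \ln(\beta_1\pi/4) + 2\ln d + 1\ln h\), so the fitted slopes estimate the exponents. This is a minimal sketch, not necessarily how the earlier models were fit. It assumes `Diam` is the built-in `Girth` column converted from inches to feet, and `trees2` and `loglog_lm` are illustrative names.

```r
# Assumption: Diam is the built-in Girth column converted from inches to feet.
trees2 <- datasets::trees
trees2$Diam <- trees2$Girth / 12

# Slopes on log(Diam) and log(Height) estimate the exponents of d and h.
loglog_lm <- lm(log(Volume) ~ log(Diam) + log(Height), data = trees2)

round(coef(loglog_lm), 3)  # compare the slopes with 2 and 1
confint(loglog_lm)         # do the intervals cover 2 and 1?
```

Changing the units of `Diam` only shifts the intercept, so the slope estimates do not depend on the inches-to-feet assumption.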