library(tidyverse)
## ── Attaching packages ──────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   0.8.1     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ─────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

Loading Data

Remember to set stringsAsFactors = FALSE when you load data. You can also load data from Files -> Import Dataset -> From Text (base).

barium <- read.delim("./barium.txt", comment.char="#", stringsAsFactors=FALSE)
names(barium) <- c("temp", "pressure")

Specify the response variable as \(\log(\text{pressure})\), and the predictor variable as \(\frac{1}{\text{temperature}}\). You can use either baseR or dplyr.

# dplyr 
barium <- barium %>%
  mutate(lnpressure = log(pressure)) %>%
  mutate(invtemp = temp^(-1))

# baseR
# barium$lnpressure <- log(barium$pressure)
# barium$invtemp <- (barium$temp)^(-1)

Now plot the second law of thermodynamics, and see if there is indeed any linear relationship.

ggplot(barium, aes(x = invtemp, y = lnpressure)) +
  geom_point() +
  geom_smooth(method = "lm") +
  xlab("inverted temperature") +
  ylab("ln(pressure)") +
  ggtitle("second law of thermodynamics") +
  theme_classic()

Estimation

You can either use lm() function, or manually compute the estimated values.

Solution 1

Use the covariance and variance to estimate the regression coefficients. This solution only works in the simple linear regression scenario.

\(\hat{\beta}_1\) is

B_hat <- cov(barium$lnpressure, barium$invtemp)/var(barium$invtemp)
B_hat
## [1] -21259.69

\(\hat{\beta}_0\) is

A_hat <- mean(barium$lnpressure) - B_hat * mean(barium$invtemp)
A_hat
## [1] 18.1847

Solution 2

Use the analytical solution in the matrix form.

design_mat <- cbind(rep(1, nrow(barium)), 
                    barium$invtemp)
solve(t(design_mat) %*% design_mat) %*% t(design_mat) %*% barium$lnpressure
##             [,1]
## [1,]     18.1847
## [2,] -21259.6919

Solution 3

Use built-in lm() function to derive the estimated coefficients.

linear_model <- lm(lnpressure ~ invtemp, data = barium)
result <- summary(linear_model)
coef(result)[, 1]
## (Intercept)     invtemp 
##     18.1847 -21259.6919
result$coefficients[, 1]
## (Intercept)     invtemp 
##     18.1847 -21259.6919

Statistical Properties

Solution 1

The summary of the linear model object also contains the standard error of our estimates of the regression coefficients.

coef(result)[, 2]
## (Intercept)     invtemp 
##   0.1418549 133.4539677
result$coefficients[, 2]
## (Intercept)     invtemp 
##   0.1418549 133.4539677

Solution 2

We can also manually compute the standard error. First let us estimate \(\sigma^2\).

sigma_sq_hat <- sum(result$residuals**2)/(nrow(barium) - 2)
sigma_sq_hat 
## [1] 0.01354334

Then, you can plug in into the formulas.