Still presenting regression results in tables? Why not forest plots?

In this post I reproduce an elegant JAMA forest plot—originally created in SAS—using R.

Mihiretu Kebede (PhD)
2025-09-27

Why Are We Still Using Endless Tables?

Every week I open papers that bury their main results in wall-to-wall regression tables: dozens of rows, tiny fonts, and confidence intervals you need a ruler and some patience to compare.

Consider the forest plot presented in this article: Associations Between Diabetes and Incident Colorectal Cancer. It is clean and easy to read. So what stops so many of us from presenting regression results this way: clear, visual, and instantly interpretable?

A single figure tells the story: which factors matter, the direction of effect, and the uncertainty—no squinting at page-long tables.

Getting the Data

This figure contains only a handful of rows, so I simply typed the numbers by hand from the article and supplement.
Typing them directly avoids OCR misreads, though the values are still worth proofreading against the source.

For larger figures you could automate extraction.
The tesseract package in R can OCR an image of the figure (here a JPEG):

library(tesseract)
txt <- ocr("jama.jpg")
cat(txt)
HR Favors no | Favors Pfor
Characteristic (95% Cl) colorectal cancer colorectal cancer interaction
Income, $ 3B
>50000 1.51 (0.84-2.72) —-—.
<15000 1,72 (1.32-2.24) —
‘Smoking status 04
Never 1.10 (0.81-1.48) —
Former 2.07 (1.41-3.06) —.—
Current 1.62 (1.14-2.31) —
Obesity (BMI 230) 83
No 1.46 (1.11-1.92) —
Yes 1.48 (1.12 -1.95) —
Sex 33
Male 1,27 (0.93-1.75) —
Female 1,59 (1.24-2.04) aed
Race and ethnicity 33
Non-Hispanic Black 1.43 (1.13-1.81) Se
Non-Hispanic White 1.69 (1.15-2.47) ——
rr nn)
HR (95% Cl)

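…but for eleven lines of data, manual entry wins on speed and accuracy: the OCR output above reads some decimal points as commas (1,72) and drops the leading decimal from the P values (04, 83).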
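For a longer figure, where typing is not practical, the OCR text would still need to be parsed into columns. The following is a minimal base R sketch, not what I did for this post: the sample lines are copied from the OCR output above, and the names ocr_lines, pattern, to_num, and parsed are mine. It assumes every data row has the shape "label estimate (lower-upper)".

```r
# A few lines shaped like the OCR output above (trailing plot debris kept).
ocr_lines <- c(
  "Female 1,59 (1.24-2.04) aed",
  "Non-Hispanic Black 1.43 (1.13-1.81) Se",
  "Sex 33",
  "Yes 1.48 (1.12 -1.95)"
)

# label, estimate, "(", lower, "-", upper, ")".
# [.,] tolerates the comma that OCR sometimes substitutes for a decimal point.
pattern <- "^(.*?)\\s+(\\d+[.,]\\d+)\\s*\\(\\s*(\\d+[.,]\\d+)\\s*-\\s*(\\d+[.,]\\d+)\\s*\\)"

hits <- regmatches(ocr_lines, regexec(pattern, ocr_lines, perl = TRUE))
hits <- hits[lengths(hits) > 0]   # drop header lines that did not match

# Convert "1,59" or "1.59" to a number.
to_num <- function(x) as.numeric(sub(",", ".", x, fixed = TRUE))

parsed <- data.frame(
  characteristic = vapply(hits, `[`, character(1), 2),
  HR    = to_num(vapply(hits, `[`, character(1), 3)),
  lower = to_num(vapply(hits, `[`, character(1), 4)),
  upper = to_num(vapply(hits, `[`, character(1), 5))
)
parsed
```

Group headers such as "Sex 33" do not match the pattern and are dropped, so the grouping variable and the P values would need separate handling. For this figure, the hand-typed data frame is simpler: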

df <- data.frame(
  variable = c("Income, $", "Income, $", "Smoking", "Smoking", "Smoking",
               "Obesity", "Obesity", "Sex", "Sex",
               "Race and ethnicity", "Race and ethnicity"),
  characteristic = c(">50000", "<15000", "Never", "Former", "Current",
                     "No", "Yes", "Male", "Female",
                     "Non-Hispanic Black", "Non-Hispanic White"),
  HR    = c(1.51, 1.72, 1.10, 2.07, 1.62, 1.46, 1.48, 1.27, 1.59, 1.43, 1.69),
  lower = c(0.84, 1.32, 0.81, 1.41, 1.14, 1.11, 1.12, 0.93, 1.24, 1.13, 1.15),
  upper = c(2.72, 2.24, 1.48, 3.06, 2.31, 1.92, 1.95, 1.75, 2.04, 1.81, 2.47),
  p_value = c(0.93, NA, 0.04, NA, NA, 0.83, NA, 0.33, NA, 0.33, NA)
)

df$HR_CI <- sprintf("%.2f (%.2f–%.2f)", df$HR, df$lower, df$upper)
df
             variable     characteristic   HR lower upper p_value
1           Income, $             >50000 1.51  0.84  2.72    0.93
2           Income, $             <15000 1.72  1.32  2.24      NA
3             Smoking              Never 1.10  0.81  1.48    0.04
4             Smoking             Former 2.07  1.41  3.06      NA
5             Smoking            Current 1.62  1.14  2.31      NA
6             Obesity                 No 1.46  1.11  1.92    0.83
7             Obesity                Yes 1.48  1.12  1.95      NA
8                 Sex               Male 1.27  0.93  1.75    0.33
9                 Sex             Female 1.59  1.24  2.04      NA
10 Race and ethnicity Non-Hispanic Black 1.43  1.13  1.81    0.33
11 Race and ethnicity Non-Hispanic White 1.69  1.15  2.47      NA
              HR_CI
1  1.51 (0.84–2.72)
2  1.72 (1.32–2.24)
3  1.10 (0.81–1.48)
4  2.07 (1.41–3.06)
5  1.62 (1.14–2.31)
6  1.46 (1.11–1.92)
7  1.48 (1.12–1.95)
8  1.27 (0.93–1.75)
9  1.59 (1.24–2.04)
10 1.43 (1.13–1.81)
11 1.69 (1.15–2.47)
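Hand-typed numbers can still contain typos, so a quick consistency check is cheap insurance. The sketch below is one possible check, under the assumption that the published intervals are Wald-type and therefore roughly symmetric on the log scale, which means the geometric mean of the limits should sit close to the estimate. The function name check_ci and the 2% tolerance are my own choices, not anything from the article.

```r
# Flag rows where the estimate is outside its interval, or where the
# geometric mean of the limits is far from the estimate.
check_ci <- function(hr, lower, upper, tol = 0.02) {
  ordered  <- lower < hr & hr < upper
  midpoint <- sqrt(lower * upper)          # midpoint on the log scale
  rel_diff <- abs(midpoint - hr) / hr
  data.frame(hr, lower, upper,
             midpoint = round(midpoint, 2),
             suspect  = !ordered | rel_diff > tol)
}

# The first two rows are typed as in the data frame above; the third repeats
# row one with a deliberate typo (2.27 instead of 2.72) to show a flagged row.
checked <- check_ci(hr    = c(1.51, 2.07, 1.51),
                    lower = c(0.84, 1.41, 0.84),
                    upper = c(2.72, 3.06, 2.27))
checked
```

Only the mistyped row comes back with suspect = TRUE. A flag is a prompt to re-read the source figure, not proof of an error, since rounding or a non-Wald interval can also trip it.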

Reproducing the Plot in R

The meta package is designed for meta-analysis, but its forest() function works well for any list of estimates once pooling is turned off (common = FALSE, random = FALSE, overall = FALSE).

library(meta)

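# Treat each row as a stand-alone estimate; nothing is pooled.
all <- metagen(
  TE = log(df$HR),        # with sm = "HR", estimates go in on the log scale
  lower = log(df$lower),  # so do the confidence limits
  upper = log(df$upper),
  subgroup = df$variable, # one block per characteristic
  data = df,              # makes the extra columns available to forest()
  sm = "HR",
  common = FALSE,         # no common-effect summary
  overall = FALSE,        # no overall result
  random = FALSE,         # no random-effects summary
  backtransf = TRUE       # display hazard ratios, not log hazard ratios
)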
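Why the log() calls? Hazard ratios are analysed on the log scale, and as far as I understand the meta documentation, metagen() derives the standard error from the limits when only lower and upper are supplied. The arithmetic can be sketched by hand for a single row; the variable names below are mine and the snippet is only an illustration, not part of the plotting code.

```r
# The "<15000" income row from the data frame above.
hr <- 1.72; lower <- 1.32; upper <- 2.24

z <- qnorm(0.975)                                  # about 1.96 for a 95% interval
log_hr <- log(hr)
se_log_hr <- (log(upper) - log(lower)) / (2 * z)   # CI width on the log scale / (2 * z)

# Rebuild the interval from the log-scale pieces and back-transform.
rebuilt <- exp(log_hr + c(-1, 1) * z * se_log_hr)
round(c(se = se_log_hr, lower = rebuilt[1], upper = rebuilt[2]), 2)
```

The rebuilt limits agree with the published ones to two decimals, which is what we would expect if the original interval was computed on the log scale.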

forest(all,
       leftcols  = c("characteristic", "HR_CI"),   # text columns left of the plot
       leftlabs  = c("Characteristic", "HR (95% CI)"),
       rightcols = "p_value",                      # text column right of the plot
       rightlabs = "P for interaction",
       col.square = "darkgreen",
       weight.study = "same",                      # equal-sized squares for every row
       label.left  = "Favours no colorectal cancer",
       label.right = "Favours colorectal cancer",
       fontsize = 7, squaresize = 0.6,
       spacing = 0.7,
       subgroup.name = ""
)
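If you would rather not depend on meta, the core of a forest plot is just points and horizontal segments on a log axis. Here is a rough base graphics sketch of that idea, using three rows of the data; it is a swapped-in alternative, not how the figure above was made, and the names est and out_file (including the output path) are placeholders of mine.

```r
# Three rows from the data frame above; extend with the rest as needed.
est <- data.frame(
  label = c("Never", "Former", "Current"),
  HR    = c(1.10, 2.07, 1.62),
  lower = c(0.81, 1.41, 1.14),
  upper = c(1.48, 3.06, 2.31)
)

out_file <- file.path(tempdir(), "forest_sketch.pdf")  # hypothetical output path
pdf(out_file, width = 7, height = 3)
par(mar = c(4, 8, 1, 1))                     # wide left margin for the labels

y <- rev(seq_len(nrow(est)))                 # first row at the top
plot(est$HR, y, log = "x",
     xlim = range(est$lower, est$upper, 1),  # make sure HR = 1 is on the axis
     ylim = c(0.5, nrow(est) + 0.5),
     pch = 15, col = "darkgreen", yaxt = "n",
     xlab = "HR (95% CI)", ylab = "")
segments(est$lower, y, est$upper, y, col = "darkgreen")  # confidence intervals
abline(v = 1, lty = 2)                       # line of no effect
axis(2, at = y, labels = est$label, las = 1, tick = FALSE)
invisible(dev.off())
```

This gives the bare plot only. The text columns, subgroup headings, and "Favours" labels are what forest() from meta adds for free, which is why I used it here.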

Take-Home Message

Visuals persuade. A forest plot conveys magnitude, direction, and uncertainty at a glance.

Readers remember pictures, not tables.

If your model fits in a table, it fits in a figure—and your audience will thank you.

So the next time you’re tempted to drop a 12-column regression table into your manuscript, ask yourself: why not a forest plot?

Stop drowning readers in tables—start communicating with figures.

Image credit: Figure 1 from the cited JAMA article.
