Meta-Analytic SEM Tutorial

Background

Meta-analytic structural equation modeling (SEM) is a technique that compiles cumulative empirical evidence from meta-analyses into a correlation matrix, which can in turn be analyzed using SEM. I developed a workflow for meta-analytic SEM for a forthcoming project and adapted it here to replicate Rains’ (2013) analysis of the nature of psychological reactance as a demonstration. Another, more recent example of this technique is Goodboy et al.’s (2020) meta-analysis of the relational turbulence model.

Goodboy, A. K., Bolkan, S., Sharabi, L. L., Myers, S. A., & Baker, J. P. (2020). The relational turbulence model: A meta-analytic review. Human Communication Research, 46(2-3), 222-249.

Rains, S. A. (2013). The nature of psychological reactance revisited: A meta-analytic review. Human Communication Research, 39(1), 47-73.

Prep work

Before you start, make sure you have the data file “meta-sem-data.xlsx” stored in the same directory as this .rmd file, so R can find it and save any output files to the same place.

Loading R packages

RStudio may prompt you at the top of this panel to install packages that you have not used before. You can also install packages manually by typing in the console, for example: install.packages("dplyr").

library(devtools)    # For session_info() at the end of the tutorial
library(dplyr)       # For data management 
library(readxl)      # For reading in excel spreadsheet
library(lavaan)      # For SEM
library(lavaanPlot)  # For plotting SEM
library(psych)       # For computing harmonic mean
library(psychmeta)   # For meta-analysis
library(reshape2)    # For data manipulation
library(reshape)     # For data manipulation

Reading the data into the environment

You’ll notice that the effect sizes dataset has 4 columns:

  • study: study identifier
  • var1 & var2: variable pair corresponding with the correlation effect size
  • r: correlation effect size

The study sample size dataset has the same study identifier column, in addition to sample size n.

# The two worksheets contain effect sizes and study sample size information respectively

ES <- read_xlsx("meta-sem-data.xlsx",1)
StudyN <- read_xlsx("meta-sem-data.xlsx",2)

# View the first 10 rows of these data frames
head(ES,10)
## # A tibble: 10 × 4
##    study                var1         var2             r
##    <chr>                <chr>        <chr>        <dbl>
##  1 Dillard & Shen 2005a threat       anger         0.24
##  2 Dillard & Shen 2005a threat       counterargue  0.14
##  3 Dillard & Shen 2005a threat       attitude      0.11
##  4 Dillard & Shen 2005a anger        counterargue  0.43
##  5 Dillard & Shen 2005a anger        attitude      0.22
##  6 Dillard & Shen 2005a counterargue attitude      0.27
##  7 Dillard & Shen 2005b threat       anger         0.32
##  8 Dillard & Shen 2005b threat       counterargue  0.37
##  9 Dillard & Shen 2005b threat       attitude     -0.03
## 10 Dillard & Shen 2005b anger        counterargue  0.45
head(StudyN,10)
## # A tibble: 10 × 2
##    study                      n
##    <chr>                  <dbl>
##  1 Dillard & Shen 2005a     196
##  2 Dillard & Shen 2005b     200
##  3 Ivanov 2011              420
##  4 Kim & Levine 2008a       392
##  5 Kim & Levine 2008b       274
##  6 Magid 2011               226
##  7 Martinez 2009            160
##  8 Miller 2007              383
##  9 Quick & Considine 2008   247
## 10 Quick & Kim 2009         344
# Let's add the sample sizes to the effect sizes
ES <- merge(ES, StudyN, by = "study")
head(ES,10)
##                   study         var1         var2     r   n
## 1  Dillard & Shen 2005a       threat        anger  0.24 196
## 2  Dillard & Shen 2005a       threat counterargue  0.14 196
## 3  Dillard & Shen 2005a       threat     attitude  0.11 196
## 4  Dillard & Shen 2005a        anger counterargue  0.43 196
## 5  Dillard & Shen 2005a        anger     attitude  0.22 196
## 6  Dillard & Shen 2005a counterargue     attitude  0.27 196
## 7  Dillard & Shen 2005b       threat        anger  0.32 200
## 8  Dillard & Shen 2005b       threat counterargue  0.37 200
## 9  Dillard & Shen 2005b       threat     attitude -0.03 200
## 10 Dillard & Shen 2005b        anger counterargue  0.45 200
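One thing worth checking at this step: merge() keeps only rows that match in both data frames by default, so a study that is missing from the sample-size sheet would silently drop out. Below is a minimal sketch of such a check. The toy data frames (ES.toy, StudyN.toy) and their values are made up for illustration and are not part of the tutorial data.

```r
# Toy stand-ins for the two worksheets (illustrative values only)
ES.toy <- data.frame(study = c("A", "A", "B", "C"),
                     var1  = c("threat", "threat", "threat", "anger"),
                     var2  = c("anger", "attitude", "anger", "attitude"),
                     r     = c(0.24, 0.11, 0.32, 0.22),
                     stringsAsFactors = FALSE)
StudyN.toy <- data.frame(study = c("A", "B"),
                         n     = c(196, 200),
                         stringsAsFactors = FALSE)

# Studies that have effect sizes but no sample size on record
missing.n <- setdiff(ES.toy$study, StudyN.toy$study)
missing.n

# merge() performs an inner join by default, so study "C" disappears
merged.toy <- merge(ES.toy, StudyN.toy, by = "study")

# Number of effect sizes lost in the merge
nrow(ES.toy) - nrow(merged.toy)
```

If missing.n is empty and no rows are lost, the merged data frame is safe to use.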

Before we jump into the meta-analysis, let’s take a look at the number of studies (effect sizes) and the total sample size available for each correlation.

ES %>%
  group_by(var1,var2) %>%
  summarize(no.studies = n(), total.n = sum(n))
## `summarise()` has grouped output by 'var1'. You can override using the
## `.groups` argument.
## # A tibble: 6 × 4
## # Groups:   var1 [3]
##   var1         var2         no.studies total.n
##   <chr>        <chr>             <int>   <dbl>
## 1 anger        attitude             11    2991
## 2 anger        counterargue         14    3692
## 3 counterargue attitude              9    2334
## 4 threat       anger                19    4821
## 5 threat       attitude             11    2991
## 6 threat       counterargue         17    4062

With 4 variables, the correlation matrix has 4 × 3 / 2 = 6 unique off-diagonal cells (the lower half, excluding the diagonal). Each cell is its own meta-analysis, based on 9 to 19 effect sizes and a total sample size between 2,334 and 4,821.

Create a matrix with meta-analysis results

First, let’s review how to use the psychmeta package to conduct a meta-analysis. We will use the correlation between threat and anger as an example.

# Subset the data for the correlation of interest
ES.subset <- filter(ES, var1 == "threat" & var2 == "anger")

cor.threat.anger <- ma_r(# specifying the column of effect sizes
                         rxyi = r,
                         # the column of sample sizes
                         n = n,
                         # we will use the barebones method
                         ma_method = "bb",
                         # giving names to focal variables
                         construct_x = ES.subset$var1[1],
                         construct_y = ES.subset$var2[1],
                         # the input dataframe 
                         data = ES.subset)
##  **** Running ma_r: Meta-analysis of correlations ****
# Let's view the results of the first meta 
first.meta<-as.data.frame(cor.threat.anger$meta_tables$`analysis_id: 1`$barebones)
first.meta
##    k    N    mean_r    var_r       var_e    var_res      sd_r       se_r
## 1 19 4821 0.2167969 0.016053 0.003597538 0.01245546 0.1267004 0.02906707
##         sd_e   sd_res  CI_LL_95  CI_UL_95   CR_LL_80  CR_UL_80
## 1 0.05997948 0.111604 0.1557293 0.2778646 0.06831991 0.3652739

The barebones results (barebones because we didn’t correct for any statistical artifacts) indicate that, on average, threat and anger are correlated at r = .217, with a 95% confidence interval of .156 to .278.
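The columns of the barebones table are related to one another in a simple way. The sketch below is a reading of the printed output, not a description of psychmeta's internals. In particular, the t-based intervals with k - 1 degrees of freedom are an assumption, one that reproduces the printed limits. The last line shows plain sample-size weighting on the first two threat-anger correlations in the data. psychmeta may apply further adjustments, so treat that value as illustrative.

```r
# Values copied from the threat-anger output above
k      <- 19
mean_r <- 0.2167969
var_r  <- 0.016053
var_e  <- 0.003597538
sd_r   <- 0.1267004
sd_res <- 0.111604

# Residual variance: observed variance minus expected sampling-error variance
var_res <- var_r - var_e

# Standard error of the mean correlation
se_r <- sd_r / sqrt(k)

# 95% confidence interval around the mean (assumed: t distribution, k - 1 df)
ci_95 <- mean_r + c(-1, 1) * qt(0.975, df = k - 1) * se_r

# 80% credibility interval, built from the residual SD (same assumption)
cr_80 <- mean_r + c(-1, 1) * qt(0.90, df = k - 1) * sd_res

round(c(var_res = var_res, se_r = se_r), 5)
round(rbind(ci_95, cr_80), 4)

# Sample-size weighting, using the first two threat-anger correlations
# in the data (r = .24 with n = 196; r = .32 with n = 200)
w.mean <- weighted.mean(c(0.24, 0.32), w = c(196, 200))
w.mean
```

The confidence interval describes uncertainty about the mean correlation, while the credibility interval describes how much the underlying correlations appear to vary across studies.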

Next, we write a loop to run a meta-analysis for each cell in the matrix.

# Let's add an identifier for each effect
ES <- ES %>%
  mutate(ESid = paste(var1, "~~", var2))

# Start with an empty placeholder; one row of results is appended per cell
results <- NULL

# Create a loop to run the same procedure for each cell
# Specify the elements used for the loop
for (cell in unique(ES$ESid)) {
  # Subset the effect size table
  ES.subset <- ES %>%
    filter(ESid == cell)
  # Run the meta-analysis
  meta.subset <- ma_r(# specify the column of effect sizes
                      rxyi = r,
                      # the column of sample sizes
                      n = n,
                      # we will use the barebones method
                      ma_method = "bb",
                      # give names to focal variables
                      construct_x = ES.subset$var1[1],
                      construct_y = ES.subset$var2[1],
                      # the input dataframe 
                      data = ES.subset)
  # Combine the two variable names with the barebones results
  # into a one-row data frame for this cell
  subset.results <- as.data.frame(c(var1 = as.character(meta.subset$construct_x),
                  var2 = as.character(meta.subset$construct_y),
                  meta.subset$meta_tables$`analysis_id: 1`$barebones))
  
  results <- rbind(results,subset.results)
}
##  **** Running ma_r: Meta-analysis of correlations **** 
##  **** Running ma_r: Meta-analysis of correlations **** 
##  **** Running ma_r: Meta-analysis of correlations **** 
##  **** Running ma_r: Meta-analysis of correlations **** 
##  **** Running ma_r: Meta-analysis of correlations **** 
##  **** Running ma_r: Meta-analysis of correlations ****
# View the results
results
##           var1         var2  k    N     mean_r       var_r       var_e
## 1       threat        anger 19 4821 0.21679691 0.016052999 0.003597538
## 2       threat counterargue 17 4062 0.19858981 0.012159470 0.003882935
## 3       threat     attitude 11 2991 0.06430502 0.018421172 0.003662463
## 4        anger counterargue 14 3692 0.30402790 0.018214138 0.003138594
## 5        anger     attitude 11 2991 0.19961869 0.007314739 0.003404495
## 6 counterargue     attitude  9 2334 0.16089457 0.012222488 0.003674962
##       var_res       sd_r       se_r       sd_e     sd_res    CI_LL_95  CI_UL_95
## 1 0.012455461 0.12670043 0.02906707 0.05997948 0.11160404  0.15572925 0.2778646
## 2 0.008276535 0.11026999 0.02674440 0.06231320 0.09097546  0.14189422 0.2552854
## 3 0.014758709 0.13572462 0.04092251 0.06051829 0.12148543 -0.02687602 0.1554861
## 4 0.015075544 0.13495976 0.03606951 0.05602316 0.12278251  0.22610445 0.3819513
## 5 0.003910244 0.08552625 0.02578713 0.05834805 0.06253194  0.14216137 0.2570760
## 6 0.008547526 0.11055536 0.03685179 0.06062147 0.09245283  0.07591419 0.2458749
##      CR_LL_80  CR_UL_80
## 1  0.06831991 0.3652739
## 2  0.07697771 0.3202019
## 3 -0.10239530 0.2310053
## 4  0.13825048 0.4698053
## 5  0.11381338 0.2854240
## 6  0.03175504 0.2900341
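The loop follows a split-apply-combine pattern: split the effect sizes by cell, run the same procedure on each piece, and stack the results. For readers who prefer a functional style, here is a minimal base R sketch of that pattern. The toy data and the helper summarise.cell() are hypothetical, and weighted.mean() merely stands in for the call to ma_r().

```r
# Toy effect sizes for two cells (illustrative values, not the tutorial data)
ES.toy <- data.frame(
  ESid = c("threat ~~ anger", "threat ~~ anger",
           "anger ~~ attitude", "anger ~~ attitude"),
  r    = c(0.24, 0.32, 0.22, 0.30),
  n    = c(196, 200, 196, 200),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise one cell
# (weighted.mean() is a stand-in for the meta-analysis itself)
summarise.cell <- function(d) {
  data.frame(ESid   = d$ESid[1],
             k      = nrow(d),
             N      = sum(d$n),
             mean_r = weighted.mean(d$r, w = d$n),
             stringsAsFactors = FALSE)
}

# Split by cell, summarise each piece, and stack the one-row results
results.toy <- do.call(rbind, lapply(split(ES.toy, ES.toy$ESid), summarise.cell))
rownames(results.toy) <- NULL
results.toy
```

With only six cells, the for loop and this version are equally practical; the choice is a matter of taste.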

Alright! We have all the effect sizes we need to build a matrix for SEM/path analysis.

# Select the mean effect size estimates
mean.cors <- select(results, var1, var2, mean_r)

# Manipulate the data into matrix form
cor.matrix <- cast(mean.cors, var2~var1)
## Using mean_r as value column.  Use the value argument to cast to override this choice
# Complete the matrix
cor.matrix$attitude <- c(NA, NA, NA)
cor.matrix <- rbind(c(var2 = "threat", rep(NA,4)),
                    cor.matrix)

# Rearrange the matrix
cor.matrix <- cor.matrix %>% 
  arrange(factor(var2, levels = c("threat","anger","counterargue","attitude"))) %>%
  select(var2, threat, anger, counterargue, attitude)

# Fill in 1's on the diagonal 
cor.matrix[1,2]<-1
cor.matrix[2,3]<-1
cor.matrix[3,4]<-1
cor.matrix[4,5]<-1

# Paste each row into one string of lower-triangle values;
# the NA cells are stripped out before getCov() reads them
cor.matrix <- cor.matrix %>%
  mutate(cor.input = paste(threat,anger,counterargue,attitude))

input.matrix <- gsub("NA","",as.character(cor.matrix$cor.input))

input.matrix <- getCov(input.matrix, names = colnames(cor.matrix[2:5]))

# View the matrix 
input.matrix
##                  threat     anger counterargue   attitude
## threat       1.00000000 0.2167969    0.1985898 0.06430502
## anger        0.21679691 1.0000000    0.3040279 0.19961869
## counterargue 0.19858981 0.3040279    1.0000000 0.16089457
## attitude     0.06430502 0.1996187    0.1608946 1.00000000
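As a cross-check, the same matrix can be assembled directly in base R. The sketch below is an alternative route, not the method used above: it starts from an identity matrix and writes each pooled correlation into both triangles. It also checks that the result is positive definite, because a matrix pooled from different sets of studies is not guaranteed to be. The object names (pooled, vars, R) are chosen here for illustration.

```r
# Pooled correlations copied from the results table above
pooled <- data.frame(
  var1   = c("threat", "threat", "threat", "anger", "anger", "counterargue"),
  var2   = c("anger", "counterargue", "attitude",
             "counterargue", "attitude", "attitude"),
  mean_r = c(0.21679691, 0.19858981, 0.06430502,
             0.30402790, 0.19961869, 0.16089457),
  stringsAsFactors = FALSE
)

vars <- c("threat", "anger", "counterargue", "attitude")

# Start from an identity matrix so the diagonal is already 1
R <- diag(length(vars))
dimnames(R) <- list(vars, vars)

# Write each pooled correlation into both triangles
for (i in seq_len(nrow(pooled))) {
  R[pooled$var1[i], pooled$var2[i]] <- pooled$mean_r[i]
  R[pooled$var2[i], pooled$var1[i]] <- pooled$mean_r[i]
}

R

# Sanity checks: symmetric, and positive definite (all eigenvalues > 0)
is.sym <- isSymmetric(R)
is.pd  <- all(eigen(R, symmetric = TRUE, only.values = TRUE)$values > 0)
c(symmetric = is.sym, positive.definite = is.pd)
```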

Testing Models

Next, we will build alternative SEM models to determine which configuration of reactance best fits the data.

Rains (2013) tested 3 such models.

  • A dual-process model, where anger and counterarguing function in parallel to convey the influence of threat to freedom on attitude.
  • An intertwined model, where anger and counterarguing are two components of a latent construct, reactance.
  • A linear affective-cognitive model, where anger precedes counterarguing.

Let’s build these models:

As a reminder, in lavaan language –

  • =~ represents “measured by”
  • ~ represents “regressed on”
  • ~~ represents “correlated with”

dual.process.model <- "anger ~ threat
                       counterargue ~ threat
                       attitude ~ anger + counterargue"

intertwined.model <- "reactance =~ anger + counterargue
                      reactance ~ threat
                      attitude ~ reactance"

linear.model <- "anger ~ threat
                 counterargue ~ anger
                 attitude ~ counterargue"
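It can help to count each model's degrees of freedom by hand before fitting. The sketch below assumes lavaan's sem() defaults, under which the variance of the exogenous predictor threat is treated as fixed and is therefore left out of the count. The parameter tallies are my own reading of the three model strings.

```r
# Observed variables: threat, anger, counterargue, attitude
p <- 4

# Unique variances/covariances, minus the fixed variance of threat
moments <- p * (p + 1) / 2 - 1

# Free parameters in each model
free.par <- c(
  dual.process = 4 + 3,      # 4 regression paths + 3 residual variances
  intertwined  = 1 + 2 + 4,  # 1 free loading (the other is fixed to 1),
                             # 2 paths, 3 residual variances + 1 disturbance
  linear       = 3 + 3       # 3 regression paths + 3 residual variances
)

# Degrees of freedom = moments - free parameters
model.df <- moments - free.par
model.df   # dual.process 2, intertwined 2, linear 3
```

All three models are over-identified, so their fit can be tested and compared.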

In meta-analytic SEM, some scholars use a single sample size that represents the collection of studies, such as the mean, median, or minimum n of the included studies (e.g., Rains, 2013), while others average the total n’s across the cells of the matrix (Landis, 2013). Here, we take the latter approach and compute the harmonic mean of the six cell totals.

A harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the given set of observations. For example, the harmonic mean of 1, 10, and 100 is 2.70.

\[\begin{aligned} H &= (\frac{\frac{1}{1} + \frac{1}{10} + \frac{1}{100}}{3})^{-1} \\ &= \frac{3}{1+.1+.01} = \frac{3}{1.11}\\ &= 2.70 \end{aligned}\]

Landis, R. S. (2013). Successfully combining meta-analysis and structural equation modeling: Recommendations and strategies. Journal of Business and Psychology, 28, 251–261. doi: 10.1007/s10869-013-9285-x
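Applied to the six cell totals from the results table, the same formula can be written in a line of base R. This is a sketch for intuition only; the variable names cell.N and h.mean are chosen here for illustration.

```r
# Total N for each of the six cells, copied from the results table
cell.N <- c(4821, 4062, 2991, 3692, 2991, 2334)

# Harmonic mean: number of values divided by the sum of their reciprocals
h.mean <- length(cell.N) / sum(1 / cell.N)
h.mean   # roughly 3294

# For comparison, the arithmetic mean is larger
mean(cell.N)
```

Because the harmonic mean is pulled toward the smaller values, it is a more conservative choice of sample size than the arithmetic mean when cell sizes are unequal.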

harmonic.mean(c(1,10,100))
## [1] 2.702703
# Fit all three models
fit.dual.process <- sem(model = dual.process.model,
                        # Using the cor matrix as input
                        sample.cov = input.matrix,
                        # Computing the harmonic mean of n's corresponding 
                        # to each cell in the matrix
                        sample.nobs = harmonic.mean(results$N))

fit.intertwined <- sem(model = intertwined.model,
                       sample.cov = input.matrix,
                       sample.nobs = harmonic.mean(results$N))

fit.linear <- sem(model = linear.model,
                  sample.cov = input.matrix,
                  sample.nobs = harmonic.mean(results$N))

# Creating a subset of fit indices we will examine across all models
fit.subset<-c("chisq","df","pvalue",
              "rmsea","rmsea.pvalue",
              "rmsea.ci.lower","rmsea.ci.upper",
              "cfi","tli","srmr","aic","bic")

# Extracting fit indices from all three models
round(rbind(fitmeasures(fit.dual.process, fit.subset),
      fitmeasures(fit.intertwined, fit.subset),
      fitmeasures(fit.linear, fit.subset)),3)
##        chisq df pvalue rmsea rmsea.pvalue rmsea.ci.lower rmsea.ci.upper   cfi
## [1,] 254.746  2  0.000 0.196        0.000          0.176          0.217 0.645
## [2,]  11.161  2  0.004 0.037        0.804          0.018          0.060 0.987
## [3,] 153.565  3  0.000 0.123        0.000          0.107          0.140 0.788
##         tli  srmr      aic      bic
## [1,] -0.065 0.084 27583.29 27625.99
## [2,]  0.961 0.016 27339.71 27382.40
## [3,]  0.577 0.066 27480.11 27516.71
# Let's take a look at the model in graphic form and the parameters
lavaanPlot(model = fit.intertwined,
           coefs = T,
           stand = T,
           sig = .05,
           digits = 3)
parameterestimates(fit.intertwined, standardized = T)
##            lhs op          rhs   est    se      z pvalue ci.lower ci.upper
## 1    reactance =~        anger 1.000 0.000     NA     NA    1.000    1.000
## 2    reactance =~ counterargue 0.848 0.070 12.056      0    0.710    0.986
## 3    reactance  ~       threat 0.214 0.016 13.337      0    0.183    0.246
## 4     attitude  ~    reactance 0.501 0.048 10.435      0    0.407    0.595
## 5        anger ~~        anger 0.631 0.034 18.532      0    0.564    0.698
## 6 counterargue ~~ counterargue 0.735 0.029 25.647      0    0.679    0.791
## 7     attitude ~~     attitude 0.907 0.025 36.944      0    0.859    0.955
## 8    reactance ~~    reactance 0.323 0.033  9.884      0    0.259    0.387
## 9       threat ~~       threat 1.000 0.000     NA     NA    1.000    1.000
##   std.lv std.all std.nox
## 1  0.607   0.607   0.607
## 2  0.515   0.515   0.515
## 3  0.353   0.353   0.353
## 4  0.304   0.304   0.304
## 5  0.631   0.631   0.631
## 6  0.735   0.735   0.735
## 7  0.907   0.907   0.907
## 8  0.876   0.876   0.876
## 9  1.000   1.000   1.000

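The fit statistics favor Dillard & Shen’s (2005) intertwined model: it has the highest CFI (.987) and TLI (.961), the lowest RMSEA (.037) and SRMR (.016), and the lowest AIC and BIC of the three models. Its chi-square test is still significant (p = .004), which is not unusual with a sample size this large.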
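Some of the fit indices can be recovered by hand from the chi-square values, which makes it easier to see what they measure. The sketch below assumes the common form RMSEA = sqrt(max(chisq - df, 0) / (df × N)), with N set to the harmonic mean of the cell totals. Some software uses N - 1 instead, so small discrepancies are possible.

```r
# Chi-square statistics and df copied from the fit table above
chisq    <- c(dual.process = 254.746, intertwined = 11.161, linear = 153.565)
model.df <- c(dual.process = 2,       intertwined = 2,      linear = 3)

# Sample size: harmonic mean of the six cell totals
cell.N <- c(4821, 4062, 2991, 3692, 2991, 2334)
N <- length(cell.N) / sum(1 / cell.N)

# p-value of the chi-square test of exact fit
p.value <- pchisq(chisq, df = model.df, lower.tail = FALSE)

# RMSEA under the assumed formula
rmsea <- sqrt(pmax(chisq - model.df, 0) / (model.df * N))

round(cbind(p.value, rmsea), 3)

# AIC differences relative to the best-fitting model
aic <- c(dual.process = 27583.29, intertwined = 27339.71, linear = 27480.11)
aic - min(aic)
```

The AIC gaps of well over 100 points between the intertwined model and the other two point in the same direction as the RMSEA and CFI values.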

You can also extend the analysis by supplying reliability information to correct for the measurement error that a barebones meta-analysis ignores. Doing so yields a matrix of estimated true-score correlations, which removes the attenuation in effects caused by imperfect measurement of the constructs.
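The logic of such a correction can be illustrated with the classic disattenuation formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. In the sketch below, the reliabilities are hypothetical values chosen only for illustration; they do not come from the studies in this dataset.

```r
# Observed (attenuated) pooled correlation between anger and counterarguing
r.xy <- 0.304

# HYPOTHETICAL reliabilities, chosen only for illustration
rel.x <- 0.85   # anger measure
rel.y <- 0.80   # counterarguing measure

# Classic correction for attenuation
r.true <- r.xy / sqrt(rel.x * rel.y)
round(r.true, 3)
```

The corrected value is always at least as large in magnitude as the observed one, and the gap widens as the reliabilities fall.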


Additional Information

I created this tutorial with a system environment and versions of R and packages that may differ from yours. If R reports errors when you run this tutorial, running the code chunk below and comparing your output with the output in the tutorial posted on the website may help.

session_info(pkgs = c("attached"))
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.2.0 (2022-04-22)
##  os       macOS Big Sur/Monterey 10.16
##  system   x86_64, darwin17.0
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       America/New_York
##  date     2022-08-29
##  pandoc   2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package    * version date (UTC) lib source
##  devtools   * 2.4.3   2021-11-30 [1] CRAN (R 4.2.0)
##  dplyr      * 1.0.9   2022-04-28 [1] CRAN (R 4.2.0)
##  lavaan     * 0.6-12  2022-07-04 [1] CRAN (R 4.2.0)
##  lavaanPlot * 0.6.2   2021-08-13 [1] CRAN (R 4.2.0)
##  psych      * 2.2.5   2022-05-10 [1] CRAN (R 4.2.0)
##  psychmeta  * 2.6.4   2022-07-11 [1] CRAN (R 4.2.0)
##  readxl     * 1.4.0   2022-03-28 [1] CRAN (R 4.2.0)
##  reshape    * 0.8.9   2022-04-12 [1] CRAN (R 4.2.0)
##  reshape2   * 1.4.4   2020-04-09 [1] CRAN (R 4.2.0)
##  usethis    * 2.1.6   2022-05-25 [1] CRAN (R 4.2.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────