Package 'EGAnet' reference manual

Title:	Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics
Description:	Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. Bootstrap and permutation approaches are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.
Authors:	Hudson Golino [aut, cre] , Alexander Christensen [aut] , Robert Moulder [ctb] , Luis E. Garrido [ctb] , Laura Jamison [ctb] , Dingjing Shi [ctb]
Maintainer:	Hudson Golino <[email protected]>
License:	AGPL (>= 3.0)
Version:	2.2.1
Built:	2025-03-17 18:32:07 UTC
Source:	https://github.com/hfgolino/eganet

EGAnet-package

Description

Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. Bootstrap and permutation approaches are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.

Author(s)

Hudson Golino <[email protected]> and Alexander P. Christensen <[email protected]>

References

Christensen, A. P. (2023). Unidimensional community detection: A Monte Carlo simulation, grid search, and comparison. PsyArXiv.
# Related functions: community.unidimensional

Christensen, A. P., Garrido, L. E., & Golino, H. (2023). Unique variable analysis: A network psychometrics method to detect local dependence. Multivariate Behavioral Research.
# Related functions: UVA

Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023). Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation. Behavior Research Methods.
# Related functions: EGA

Christensen, A. P., & Golino, H. (2021a). Estimating the stability of the number of factors via Bootstrap Exploratory Graph Analysis: A tutorial. Psych, 3(3), 479-500.
# Related functions: bootEGA, dimensionStability, and itemStability

Christensen, A. P., & Golino, H. (2021b). Factor or network model? Predictions from neural networks. Journal of Behavioral Data Science, 1(1), 85-126.
# Related functions: LCT

Christensen, A. P., & Golino, H. (2021c). On the equivalency of factor and network loadings. Behavior Research Methods, 53, 1563-1580.
# Related functions: LCT and net.loads

Christensen, A. P., Golino, H., & Silvia, P. J. (2020). A psychometric network perspective on the validity and validation of personality trait questionnaires. European Journal of Personality, 34, 1095-1108.
# Related functions: bootEGA, dimensionStability, EGA, itemStability, and UVA

Christensen, A. P., Gross, G. M., Golino, H., Silvia, P. J., & Kwapil, T. R. (2019). Exploratory graph analysis of the Multidimensional Schizotypy Scale. Schizophrenia Research, 206, 43-51.
# Related functions: CFA and EGA

Golino, H., Christensen, A. P., Moulder, R., Kim, S., & Boker, S. M. (2021). Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections. Psychometrika.
# Related functions: dynEGA and simDFM

Golino, H., & Demetriou, A. (2017). Estimating the dimensionality of intelligence like data using Exploratory Graph Analysis. Intelligence, 62, 54-70.
# Related functions: EGA

Golino, H., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLoS ONE, 12, e0174035.
# Related functions: CFA, EGA, and bootEGA

Golino, H., Moulder, R., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research.
# Related functions: entropyFit, tefi, and vn.entropy

Golino, H., Nesselroade, J. R., & Christensen, A. P. (2022). Towards a psychology of individuals: The ergodicity information index and a bottom-up approach for finding generalizations. PsyArXiv.
# Related functions: boot.ergoInfo, ergoInfo, jsd, and infoCluster

Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., Thiyagarajan, J. A., & Martinez-Molina, A. (2020). Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25, 292-320.
# Related functions: EGA

Golino, H., Thiyagarajan, J. A., Sadana, R., Teles, M., Christensen, A. P., & Boker, S. M. (2020). Investigating the broad domains of intrinsic capacity, functional ability, and environment: An exploratory graph analysis approach for improving analytical methodologies for measuring healthy aging. PsyArXiv.
# Related functions: EGA.fit and tefi

Jamison, L., Christensen, A. P., & Golino, H. (2021). Optimizing Walktrap's community detection in networks using the Total Entropy Fit Index. PsyArXiv.
# Related functions: EGA.fit and tefi

Jamison, L., Golino, H., & Christensen, A. P. (2023). Metric invariance in exploratory graph analysis via permutation testing. PsyArXiv.
# Related functions: invariance

Shi, D., Christensen, A. P., Day, E., Golino, H., & Garrido, L. E. (2023). A Bayesian approach for dimensionality assessment in psychological networks. PsyArXiv.
# Related functions: EGA

Automatic correlations

Description

This wrapper is similar to cor_auto, with some minor adjustments that simplify the function and make it work within EGAnet. NA values are not treated as categories (this behavior differs from cor_auto)

Usage

auto.correlate(
  data,
  corr = c("cosine", "kendall", "pearson", "spearman"),
  ordinal.categories = 7,
  forcePD = TRUE,
  na.data = c("pairwise", "listwise"),
  empty.method = c("none", "zero", "all"),
  empty.value = c("none", "point_five", "one_over"),
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`corr`	Character (length = 1). The standard correlation method to be used. Defaults to `"pearson"`. Using `"pearson"` will compute polychoric, tetrachoric, polyserial, and biserial correlations for categorical and categorical/continuous correlations by default. To obtain Pearson's correlations regardless of variable type, use `cor` directly. Other options of `"kendall"` and `"spearman"` are provided for completeness and use `cor`. `"cosine"` is also available
`ordinal.categories`	Numeric (length = 1). Maximum number of categories a variable can have and still be treated as ordinal. Defaults to `7`, so variables with `8` or more categories are treated as continuous
`forcePD`	Boolean (length = 1). Whether positive definite matrix should be enforced. Defaults to `TRUE`
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`empty.method`	Character (length = 1). Method for empty cell correction in `polychoric.matrix`. Defaults to `"none"`. Available options: `"none"` — Adds no value (`empty.value = "none"`) to the empirical joint frequency table between two variables `"zero"` — Adds `empty.value` to the cells with zero in the joint frequency table between two variables `"all"` — Adds `empty.value` to all cells in the joint frequency table between two variables
`empty.value`	Character (length = 1). Value to add to the joint frequency table cells in `polychoric.matrix`. Defaults to `"none"`. Accepts numeric values between 0 and 1 or specific methods: `"none"` — Adds no value (`0`) to the empirical joint frequency table between two variables `"point_five"` — Adds `0.5` to the cells defined by `empty.method` `"one_over"` — Adds `1 / n` where `n` equals the number of cells based on `empty.method`. For `empty.method = "zero"`, `n` equals the number of zero cells
`verbose`	Boolean (length = 1). Whether messages should be printed. Defaults to `FALSE`
`...`	Not actually used but makes it easier for general functionality in the package
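
The interaction between `empty.method` and `empty.value` can be illustrated with a small base R sketch. This is not the code used by `polychoric.matrix`; `correct_empty` is a hypothetical helper written only to mirror the argument descriptions above, and the frequency table is made up.

```r
# Hypothetical helper (not part of EGAnet): applies an empty-cell correction
# to a joint frequency table following the argument descriptions
correct_empty <- function(freq, empty.method = "none", empty.value = "none") {
  # Cells that receive the correction
  target <- switch(
    empty.method,
    none = matrix(FALSE, nrow(freq), ncol(freq)),
    zero = freq == 0,
    all  = matrix(TRUE, nrow(freq), ncol(freq))
  )
  # Value added to each targeted cell (numeric values are used as given)
  add <- if (is.numeric(empty.value)) {
    empty.value
  } else {
    switch(
      empty.value,
      none       = 0,
      point_five = 0.5,
      one_over   = 1 / sum(target) # n = number of targeted cells
    )
  }
  freq[target] <- freq[target] + add
  freq
}

# Made-up 2 x 3 joint frequency table with two empty cells
freq <- matrix(c(10, 0, 3, 5, 0, 7), nrow = 2)

# Add 0.5 to the zero cells only
corrected_zero <- correct_empty(freq, "zero", "point_five")

# Add 1 / 6 to every cell (six cells in total)
corrected_all <- correct_empty(freq, "all", "one_over")
```

With `"one_over"`, the total mass added to the table is 1 regardless of how many cells are targeted, which keeps the correction small relative to the sample size.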

Author(s)

Alexander P. Christensen <[email protected]>

Examples

# Load data
wmt <- wmt2[,7:24]

# Obtain correlations
wmt_corr <- auto.correlate(wmt)

Bootstrap Test for the Ergodicity Information Index

Description

Tests the Ergodicity Information Index (EII) obtained in the empirical sample against a distribution of EII values obtained by a variant of bootstrap sampling (see Details for the procedure)

Usage

boot.ergoInfo(
  dynEGA.object,
  EII,
  use = c("edge.list", "unweighted", "weighted"),
  shuffles = 5000,
  iter = 100,
  ncores,
  verbose = TRUE
)

Arguments

`dynEGA.object`	A `dynEGA` or a `dynEGA.ind.pop` object. If a `dynEGA` object, then `level = c("individual", "population")` is required
`EII`	An `ergoInfo` object containing the empirical Ergodicity Information Index, or the EII value estimated using the `ergoInfo` function. Inherits `use` from `ergoInfo`. If no `ergoInfo` object is provided, then it is estimated
`use`	Character (length = 1). A string indicating what network element will be used to compute the algorithm complexity, the list of edges or the weights of the network. Defaults to `use = "unweighted"`. Current options are: `"edge.list"` — Calculates the algorithm complexity using the list of edges `"unweighted"` — Calculates the algorithm complexity using the binary weights of the encoded prime transformed network. 0 = edge absent and 1 = edge present `"weighted"` — Calculates the algorithm complexity using the weights of encoded prime-weight transformed network
`shuffles`	Numeric. Number of shuffles used to compute the Kolmogorov complexity. Defaults to `5000`
`iter`	Numeric (length = 1). Number of replica samples to generate from the bootstrap analysis. Defaults to `100` (`1000` for robustness)
`ncores`	Numeric (length = 1). Number of cores to use in computing results. Defaults to `ceiling(parallel::detectCores() / 2)` or half of your computer's processing power. Set to `1` to not use parallel computing. If you're unsure how many cores your computer has, then type: `parallel::detectCores()`
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress

Details

In traditional bootstrap sampling, individual participants are resampled with replacement from the empirical sample. This process is time consuming when carried out across v variables, n participants, t time points, and i iterations. Instead, boot.ergoInfo uses the premise of an ergodic process to establish a more efficient test that works directly on the sample's networks.

With an ergodic process, the expectation is that all individuals will have a systematic relationship with the population. Destroying this relationship should result in a significant loss of information. Following this conjecture, boot.ergoInfo shuffles a random subset of the edges that exist in the population, equal in number to the edges the population shares with an individual. An individual's unique edges remain the same, controlling for their unique information. The result is a replicate individual that contains the same total number of edges as the actual individual but whose shared information with the population has been scrambled.

This process is repeated over each individual to create a replicate sample and is repeated for iter iterations (e.g., 100). This approach creates a sampling distribution that represents the expected information between the population and individuals when a random process generates the shared information between them. If the shared information between the population and individuals in the empirical sample is sufficiently meaningful, then this process should result in significant information loss.
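
The scrambling step can be sketched in base R under one possible reading of the description: the individual's unique edges are kept, and as many edges as the individual shares with the population are re-placed at random positions. This is an illustration only and not the package's implementation; `scramble_shared`, the 0/1 edge vectors, and the choice of candidate positions are all assumptions.

```r
# Hypothetical helper (not EGAnet code): edges are stored as 0/1 vectors over
# the same ordered set of node pairs
scramble_shared <- function(population, individual) {
  shared <- population == 1 & individual == 1
  unique_edges <- individual == 1 & population == 0
  # Start from the individual's unique edges, which are left untouched
  replicate_ind <- as.integer(unique_edges)
  # Positions not occupied by the individual's unique edges (assumption)
  candidates <- which(!unique_edges)
  # Re-place as many edges as were shared, at random positions
  new_positions <- candidates[sample.int(length(candidates), sum(shared))]
  replicate_ind[new_positions] <- 1L
  replicate_ind
}

set.seed(123)
# Made-up networks with 15 possible edges (6 nodes)
population <- c(1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0)
individual <- c(1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0)

# One replicate individual per iteration (columns)
replicates <- replicate(100, scramble_shared(population, individual))
```

Every replicate has the same number of edges as the actual individual, and the unique edges (positions 4 and 9 in this toy example) are present in all of them.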

How to interpret the results: the result of boot.ergoInfo is a sampling distribution of EII values that would be expected if the process were random (null distribution). If the empirical EII value is greater than or not significantly different from the null distribution, then the empirical data can be expected to have been generated by a nonergodic process, and the population structure is not sufficient to describe all individuals. If the empirical EII value is significantly lower than the null distribution, then the empirical data can be described by the population structure; that is, the population structure is sufficient to describe all individuals.
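
As a rough illustration of this comparison, the sketch below places a made-up empirical EII against a made-up null distribution and reports the tail proportions and direction. The numbers are invented, and the way boot.ergoInfo computes its own p-value may differ from these simple proportions.

```r
set.seed(2024)

# Made-up values for illustration only
null_eii <- rnorm(100, mean = 1.10, sd = 0.02) # null (scrambled) EII values
empirical_eii <- 1.02                          # empirical EII

# Proportion of null values at or below / at or above the empirical value
lower_tail <- mean(null_eii <= empirical_eii)
upper_tail <- mean(null_eii >= empirical_eii)

# Direction of the empirical value relative to the null distribution
direction <- if (empirical_eii < mean(null_eii)) "less" else "greater"
```

Here the empirical value sits well below the null distribution, which corresponds to the case where the population structure is sufficient to describe all individuals.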

Value

Returns a list containing:

`empirical.ergoInfo`	Empirical Ergodicity Information Index
`boot.ergoInfo`	The values of the Ergodicity Information Index obtained in the bootstrap
`p.value`	The two-sided p-value of the bootstrap test for the Ergodicity Information Index. The null hypothesis is that the empirical Ergodicity Information Index is equal to or greater than the expected value of the EII with small variation in the population structure
`effect`	Indicates whether the empirical EII is greater or less than the bootstrap distribution of EII
`interpretation`	How you can interpret the result of the test in plain English

Author(s)

Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>

References

Original Implementation
Golino, H., Nesselroade, J. R., & Christensen, A. P. (2022). Toward a psychology of individuals: The ergodicity information index and a bottom-up approach for finding generalizations. PsyArXiv.

Examples

# Obtain simulated data
sim.data <- sim.dynEGA

## Not run: 
# Dynamic EGA individual and population structures
dyn1 <- dynEGA.ind.pop(
  data = sim.dynEGA[,-26], n.embed = 5, tau = 1,
  delta = 1, id = 25, use.derivatives = 1,
  model = "glasso", ncores = 2, corr = "pearson"
)

# Empirical Ergodicity Information Index
eii1 <- ergoInfo(dynEGA.object = dyn1, use = "unweighted")

# Bootstrap Test for Ergodicity Information Index
testing.ergoinfo <- boot.ergoInfo(
  dynEGA.object = dyn1, EII = eii1,
  ncores = 2, use = "unweighted"
)

# Plot result
plot(testing.ergoinfo)

# Example using `dynEGA`
dyn2 <- dynEGA(
  data = sim.dynEGA, n.embed = 5, tau = 1,
  delta = 1, use.derivatives = 1, ncores = 2,
  level = c("individual", "population")
)

# Empirical Ergodicity Information Index
eii2 <- ergoInfo(dynEGA.object = dyn2, use = "unweighted")

# Bootstrap Test for Ergodicity Information Index
testing.ergoinfo2 <- boot.ergoInfo(
  dynEGA.object = dyn2, EII = eii2,
  ncores = 2
)

# Plot result
plot(testing.ergoinfo2)
## End(Not run)

`bootEGA` Results of `wmt2` Data

Description

bootEGA results from boot.wmt <- bootEGA(wmt2[,7:24], seed = 1234)

Usage

data(boot.wmt)

Format

A list with 12 objects (see Value in bootEGA)

Examples

data("boot.wmt")

Bootstrap Exploratory Graph Analysis

Description

bootEGA estimates the number of dimensions across iter bootstrap samples generated either from the empirical zero-order correlation matrix ("parametric") or by "resampling" from the empirical dataset (non-parametric). bootEGA also estimates a typical network structure, which is formed by the median or mean pairwise (partial) correlations over the iter bootstraps (see Details for information about the typical network structure).
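
The two ways of generating a bootstrap sample can be sketched in base R plus `MASS::mvrnorm`. This is a simplified illustration with toy continuous data, not the package's internal code; in particular, drawing rows with replacement for the resampling case is an assumption.

```r
set.seed(1)

# Toy continuous data standing in for an empirical dataset
data <- matrix(rnorm(200 * 4), nrow = 200, ncol = 4)
data[, 2] <- data[, 1] + rnorm(200, sd = 0.5)
n <- nrow(data)

# "parametric": simulate from a multivariate normal distribution
# defined by the empirical correlation matrix
parametric_sample <- MASS::mvrnorm(
  n = n, mu = rep(0, ncol(data)), Sigma = cor(data)
)

# "resampling": draw rows of the empirical data
# (sampling with replacement is assumed here)
resampled_sample <- data[sample.int(n, n, replace = TRUE), ]
```

Each bootstrap iteration would then estimate a network and its communities on one such sample.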

Usage

bootEGA(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  iter = 500,
  type = c("parametric", "resampling"),
  ncores,
  EGA.type = c("EGA", "EGA.fit", "hierEGA", "riEGA"),
  plot.itemStability = TRUE,
  typicalStructure = FALSE,
  plot.typicalStructure = FALSE,
  seed = NULL,
  verbose = TRUE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details `"louvain"` — By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise `"walktrap"` — See `cluster_walktrap` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"` — Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned by this check is 2 or fewer, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation `"LE"` — Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation `"louvain"` — Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`iter`	Numeric (length = 1). Number of replica samples to generate from the bootstrap analysis. Defaults to `500` (recommended)
`type`	Character (length = 1). What type of bootstrap should be performed? Defaults to `"parametric"`. Available options: `"parametric"` — Generates `iter` new datasets from multivariate normal random distributions based on the original dataset using `mvrnorm` `"resampling"` — Generates `iter` new datasets from random subsamples of the original data
`ncores`	Numeric (length = 1). Number of cores to use in computing results. Defaults to `ceiling(parallel::detectCores() / 2)` or half of your computer's processing power. Set to `1` to not use parallel computing. If you're unsure how many cores your computer has, then type: `parallel::detectCores()`
`EGA.type`	Character (length = 1). Type of EGA model to use. Defaults to `"EGA"` Available options: `"EGA"` — Uses standard exploratory graph analysis `"EGA.fit"` — Uses `tefi` to determine best fit of `EGA` `"hierEGA"` — Uses hierarchical exploratory graph analysis `"riEGA"` — Uses random-intercept exploratory graph analysis Arguments for `EGA.type` can be added (see links for details on specific function arguments)
`plot.itemStability`	Boolean (length = 1). Should the plot be produced for `item.replication`? Defaults to `TRUE`
`typicalStructure`	Boolean (length = 1). If `TRUE`, returns the median (`"glasso"` or `"BGGM"`) or mean (`"TMFG"`) network structure and estimates its dimensions (see Details for more information). Defaults to `FALSE`
`plot.typicalStructure`	Boolean (length = 1). If `TRUE`, returns a plot of the typical network structure. Defaults to `FALSE`
`seed`	Numeric (length = 1). Defaults to `NULL` or random results. Set for reproducible results. See Reproducibility and PRNG for more details on random number generation in `EGAnet`
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`...`	Additional arguments that can be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, `EGA`, `EGA.fit`, `hierEGA`, and `riEGA`
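
The matrix expansion behind `uni.method = "expand"` can be pictured with the base R sketch below. It is not the package's code: `expand_correlation` is a hypothetical helper, and setting the correlations between the original and the added variables to zero is an assumption made for this illustration.

```r
# Hypothetical helper (not EGAnet code): append `n_new` variables that
# correlate `r_new` with each other to a correlation matrix
expand_correlation <- function(R, n_new = 4, r_new = 0.50) {
  p <- ncol(R)
  expanded <- matrix(0, p + n_new, p + n_new)
  # Original correlations stay in the upper-left block
  expanded[seq_len(p), seq_len(p)] <- R
  # Added variables correlate r_new with each other; their correlations
  # with the original variables are set to zero (assumption)
  new_block <- matrix(r_new, n_new, n_new)
  diag(new_block) <- 1
  idx <- p + seq_len(n_new)
  expanded[idx, idx] <- new_block
  expanded
}

# Made-up 3 x 3 correlation matrix
R <- matrix(c(1, 0.6, 0.5, 0.6, 1, 0.4, 0.5, 0.4, 1), nrow = 3)
expanded <- expand_correlation(R)
dim(expanded)
```

The added block forms a dimension of its own, so a check that returns 2 or fewer dimensions on the expanded matrix leaves at most one dimension for the original variables.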

Details

The typical network structure is derived from the median (or mean) value of each pairwise relationship. These values tend to reflect the "typical" value taken by an edge across the bootstrap networks. Afterward, the same community detection algorithm is applied to the typical network as the bootstrap networks.

Because the community detection algorithm is applied to the typical network structure, there is a possibility that the community algorithm determines a different number of dimensions than the median number derived from the bootstraps. The typical network structure (and number of dimensions) may not match the empirical EGA number of dimensions or the median number of dimensions from the bootstrap. This result is known and not a bug.
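
A minimal base R sketch of the edge-wise median (or mean) described above, using three made-up bootstrap networks. The helper `make_network` and all edge weights are invented for illustration.

```r
# Hypothetical helper: build a symmetric 3 x 3 network from three edge weights
make_network <- function(weights) {
  net <- matrix(0, 3, 3)
  net[upper.tri(net)] <- weights # order: (1,2), (1,3), (2,3)
  net + t(net)
}

# Three made-up bootstrap networks
boot_networks <- list(
  make_network(c(0.30, 0.00, 0.25)),
  make_network(c(0.20, 0.10, 0.35)),
  make_network(c(0.40, 0.00, 0.30))
)

# Stack into a 3 x 3 x 3 array and summarize every edge across bootstraps
stacked <- simplify2array(boot_networks)
typical_median <- apply(stacked, c(1, 2), median)
typical_mean <- apply(stacked, c(1, 2), mean)
```

The edge between nodes 1 and 3 is zero in two of the three networks, so its median is zero while its mean is not. Differences like this are one reason the typical structure can yield a different number of dimensions than the individual bootstrap networks.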

Value

Returns a list containing:

`iter`	Number of replica samples in bootstrap
`bootGraphs`	A list containing the networks of each replica sample
`bootCorrs`	A list containing the zero-order correlations of each replica sample
`boot.wc`	A matrix of membership assignments for each replica network with variables down the columns and replicas across the rows
`boot.ndim`	Number of dimensions identified in each replica sample
`summary.table`	A data frame containing number of replica samples, median, standard deviation, standard error, 95% confidence intervals, and quantiles (lower = 2.5% and upper = 97.5%)
`frequency`	A data frame containing the proportion of times the number of dimensions was identified (e.g., .85 of 1,000 = 850 times that specific number of dimensions was found)
`TEFI`	`tefi` value for each replica sample
`type`	Type of bootstrap used
`EGA`	Output of the empirical EGA results (output will vary based on `EGA.type`)
`EGA.type`	Type of `*EGA` function used
`typicalGraph`	A list containing: `graph` — Network matrix of the median network structure `typical.dim.variables` — An ordered matrix of item allocation `wc` — Membership assignments of the median network
`plot.typical.ega`	Plot output if `plot.typicalStructure = TRUE`
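
To make the relationship between `boot.ndim`, `summary.table`, and `frequency` concrete, the sketch below computes comparable descriptives from a made-up vector of dimension counts. The column names are assumptions, and the standard error and confidence interval reported by bootEGA are not reproduced here.

```r
# Made-up number of dimensions from 10 replica samples
boot_ndim <- c(3, 3, 4, 3, 2, 3, 3, 4, 3, 3)

# Descriptives comparable to those listed for `summary.table`
summary_values <- c(
  n.Boots = length(boot_ndim),
  median = median(boot_ndim),
  SD = sd(boot_ndim),
  quantile(boot_ndim, probs = c(0.025, 0.975))
)

# Proportion of replica samples in which each number of dimensions was found
frequency <- as.data.frame(prop.table(table(boot_ndim)))
names(frequency) <- c("n.dimensions", "proportion") # hypothetical column names
```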

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Original implementation of bootEGA
Christensen, A. P., & Golino, H. (2021). Estimating the stability of the number of factors via Bootstrap Exploratory Graph Analysis: A tutorial. Psych, 3(3), 479-500.

Examples

# Load data
wmt <- wmt2[,7:24]

## Not run: 
# Standard EGA parametric example
boot.wmt <- bootEGA(
  data = wmt, iter = 500,
  type = "parametric", ncores = 2
)

# Standard resampling example
boot.wmt <- bootEGA(
  data = wmt, iter = 500,
  type = "resampling", ncores = 2
)

# Example using {igraph} `cluster_*` function
boot.wmt.spinglass <- bootEGA(
  data = wmt, iter = 500,
  algorithm = igraph::cluster_spinglass,
  # use any function from {igraph}
  type = "parametric", ncores = 2
)

# EGA fit example
boot.wmt.fit <- bootEGA(
  data = wmt, iter = 500,
  EGA.type = "EGA.fit",
  type = "parametric", ncores = 2
)

# Hierarchical EGA example
boot.wmt.hier <- bootEGA(
  data = wmt, iter = 500,
  EGA.type = "hierEGA",
  type = "parametric", ncores = 2
)

# Random-intercept EGA example
boot.wmt.ri <- bootEGA(
  data = wmt, iter = 500,
  EGA.type = "riEGA",
  type = "parametric", ncores = 2
)
## End(Not run)

CFA Fit of `EGA` or `hierEGA` Structure

Description

Verifies the fit of the structure suggested by EGA or by hierEGA using confirmatory factor analysis

Usage

CFA(ega.obj, data, estimator, plot.CFA = TRUE, layout = "spring", ...)

Arguments

`ega.obj`	An `EGA` or `hierEGA` object
`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`estimator`	The estimator used in the confirmatory factor analysis. 'WLSMV' is the estimator of choice for ordinal variables. 'ML' or 'WLS' for interval variables. See `lavOptions` for more details
`plot.CFA`	Logical. Should the CFA structure with its standardized loadings be plotted? Defaults to TRUE
`layout`	Layout of plot (see `semPaths`). Defaults to "spring"
`...`	Arguments passed to `cfa`
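
Conceptually, a community membership vector defines a measurement model in which each community becomes a factor. The base R sketch below builds lavaan-style model syntax from a made-up membership vector; `membership_to_syntax` and the `Dim` factor names are hypothetical and may not match what CFA generates internally.

```r
# Hypothetical helper (not EGAnet code): turn a named membership vector
# into lavaan-style measurement model syntax
membership_to_syntax <- function(wc) {
  wc <- wc[!is.na(wc)] # drop variables without a community
  items_by_dim <- split(names(wc), wc)
  rows <- vapply(
    names(items_by_dim),
    function(d) {
      paste0("Dim", d, " =~ ", paste(items_by_dim[[d]], collapse = " + "))
    },
    character(1)
  )
  paste(rows, collapse = "\n")
}

# Made-up memberships: two communities and one disconnected item
wc <- c(item1 = 1, item2 = 1, item3 = 2, item4 = 2, item5 = NA)
model_syntax <- membership_to_syntax(wc)
cat(model_syntax, "\n")
```

A string of this form is what `lavaan::cfa` accepts as its `model` argument.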

Value

Returns a list containing:

`fit`	Output from `cfa`
`summary`	Summary output from `lavaan-class`
`fit.measures`	Fit measures: chi-squared, degrees of freedom, p-value, CFI, RMSEA, GFI, and NFI. Additional fit measures can be applied using the `fitMeasures` function (see examples)

Author(s)

Hudson F. Golino <hfg9s at virginia.edu>

References

Demonstrative use
Christensen, A. P., Gross, G. M., Golino, H., Silvia, P. J., & Kwapil, T. R. (2019). Exploratory graph analysis of the Multidimensional Schizotypy Scale. Schizophrenia Research, 206, 43-51.

Initial implementation
Golino, H., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLoS ONE, 12, e0174035.

Examples

# Load data
wmt <- wmt2[,7:24]

## Not run: 
# Estimate EGA
ega.wmt <- EGA(
  data = wmt,
  plot.EGA = FALSE # No plot for CRAN checks
)

# Fit CFA model to EGA results
cfa.wmt <- CFA(
  ega.obj = ega.wmt, estimator = "WLSMV",
  plot.CFA = FALSE, # No plot for CRAN checks
  data = wmt
)

# Additional fit measures
lavaan::fitMeasures(cfa.wmt$fit, fit.measures = "all")
## End(Not run)

`EGA` Color Palettes

Description

Color palettes for plotting ggnet2 EGA network plots

Usage

color_palette_EGA(
  name = c("polychrome", "blue.ridge1", "blue.ridge2", "rainbow", "rio", "itacare",
    "grayscale"),
  wc,
  sorted = FALSE
)

Arguments

name

Character. Name of color scheme (see RColorBrewer). Defaults to "polychrome". EGA palettes:

"polychrome" — Default 40 color palette
"grayscale" — "grayscale", "greyscale", or "colorblind" will produce plots suitable for publication purposes
"blue.ridge1" — Palette inspired by the Blue Ridge Mountains
"blue.ridge2" — Second palette inspired by the Blue Ridge Mountains
"rainbow" — Rainbow colors. Default for qgraph
"rio" — Palette inspired by Rio de Janeiro, Brazil
"itacare" — Palette inspired by Itacare, Brazil

For custom colors, enter HEX codes for each dimension in a vector

wc

Numeric vector. A vector representing the community (dimension) membership of each node in the network. NA values mean that the node was disconnected from the network

sorted

Boolean. Should colors be sorted by wc? Defaults to FALSE

Value

Vector of colors for community memberships

Author(s)

Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen at gmail.com>

Examples

# Default
color_palette_EGA(name = "polychrome", wc = ega.wmt$wc)

# Blue Ridge Mountains 1
color_palette_EGA(name = "blue.ridge1", wc = ega.wmt$wc)

# Custom
color_palette_EGA(name = c("#7FD1B9", "#24547e"), wc = ega.wmt$wc)

Compares Community Detection Solutions Using Permutation

Description

A permutation implementation to determine statistical significance of whether the community comparison measure is different from zero

Usage

community.compare(
  base,
  comparison,
  method = c("vi", "nmi", "split.join", "rand", "adjusted.rand"),
  iter = 1000,
  shuffle.base = TRUE,
  verbose = TRUE,
  seed = NULL
)

Arguments

`base`	Character or numeric vector. A vector of characters or numbers that are treated as the baseline communities
`comparison`	Character or numeric vector (length = `length(base)`). A vector of characters or numbers that are treated as the comparison communities
`method`	Character (length = 1). Comparison metrics from `compare`. Defaults to `"adjusted.rand"`. Available options: `"vi"` — Variation of information (Meila, 2003) `"nmi"` — Normalized mutual information (Danon et al., 2005) `"split.join"` — Split-join distance (Dongen, 2000) `"rand"` — Rand index (Rand, 1971) `"adjusted.rand"` — adjusted Rand index (Hubert & Arabie, 1985; Steinley, 2004)
`iter`	Numeric (length = 1). Number of permutations to perform. Defaults to `1000` (recommended)
`shuffle.base`	Boolean (length = 1). Whether the `base` cluster solution should be shuffled. Defaults to `TRUE` to remain consistent with original implementation (Qannari et al., 2014); however, from a theoretical standpoint, it might make sense to only shuffle the `comparison` to determine whether it is specifically different from the recognized `base`
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`seed`	Numeric (length = 1). Defaults to `NULL` or random results. Set for reproducible results. See Reproducibility and PRNG for more details on random number generation in `EGAnet`

Value

Returns data frame containing method used (Method), empirical or observed value (Empirical), and p-value based on the permutation test (p.value)
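
The permutation logic can be sketched in base R. This is an illustrative sketch rather than the package's implementation: `rand_index()` and `perm_test()` are hypothetical helpers, the plain Rand index stands in for the metrics computed by `compare`, and the p-value convention (proportion of permuted values at least as large as the observed value, with a +1 correction) is an assumption.

```r
# Rand index: proportion of node pairs on which two partitions agree
rand_index <- function(a, b) {
  same_a <- outer(a, a, "==")   # TRUE when a pair shares a community in `a`
  same_b <- outer(b, b, "==")   # TRUE when a pair shares a community in `b`
  pairs <- upper.tri(same_a)    # count each unordered pair once
  mean(same_a[pairs] == same_b[pairs])
}

# Hypothetical permutation test: shuffle memberships to build a null distribution
perm_test <- function(base, comparison, iter = 1000, shuffle.base = TRUE) {
  empirical <- rand_index(base, comparison)
  permuted <- replicate(iter, {
    b <- if (shuffle.base) sample(base) else base  # optionally shuffle the base
    rand_index(b, sample(comparison))              # always shuffle the comparison
  })
  # Assumed convention: proportion of permuted values >= the observed value
  p_value <- (sum(permuted >= empirical) + 1) / (iter + 1)
  data.frame(Method = "rand", Empirical = empirical, p.value = p_value)
}

set.seed(1)
base <- rep(1:3, each = 6)
comparison <- base
comparison[c(1, 7)] <- c(2, 3)  # move two nodes to other communities
result <- perm_test(base, comparison, iter = 200)
result
```

Setting `shuffle.base = FALSE` in the sketch shuffles only the comparison memberships, which mirrors the alternative discussed for the `shuffle.base` argument.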

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Implementation of Permutation Test
Qannari, E. M., Courcoux, P., & Faye, P. (2014). Significance test of the adjusted Rand index. Application to the free sorting task. Food Quality and Preference, 32, 93–97.

Variation of Information
Meila, M. (2003, August). Comparing clusterings by the variation of information. In Learning Theory and Kernel Machines: 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, Washington, DC, USA, August 24-27, 2003. Proceedings (pp. 173-187). Berlin, DE: Springer Berlin Heidelberg.

Normalized Mutual Information
Danon, L., Diaz-Guilera, A., Duch, J., & Arenas, A. (2005). Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005(09), P09008.

Split-join Distance
Dongen, S. (2000). Performance criteria for graph clustering and Markov cluster experiments. CWI (Centre for Mathematics and Computer Science).

Rand Index
Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336), 846-850.

Adjusted Rand Index
Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193-218.

Steinley, D. (2004). Properties of the Hubert-Arabie adjusted rand index. Psychological Methods, 9(3), 386.

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate network
network <- EBICglasso.qgraph(data = wmt)

# Compute Edge Betweenness
edge_between <- community.detection(network, algorithm = "edge_betweenness")

# Compute Fast Greedy
fast_greedy <- community.detection(network, algorithm = "fast_greedy")

# Perform permutation test
community.compare(edge_between, fast_greedy)

Applies the Consensus Clustering Method (Louvain only)

Description

Applies the consensus clustering method introduced by Lancichinetti & Fortunato (2012). The original implementation of this method applies a community detection algorithm repeatedly to the same network. With stochastic algorithms such as Louvain, repeated applications are likely to identify different community solutions.

Usage

community.consensus(
  network,
  order = c("lower", "higher"),
  resolution = 1,
  consensus.method = c("highest_modularity", "iterative", "most_common", "lowest_tefi"),
  consensus.iter = 1000,
  correlation.matrix = NULL,
  allow.singleton = FALSE,
  membership.only = TRUE,
  ...
)

Arguments

`network`	Matrix or `igraph` network object
`order`	Character (length = 1). Defaults to `"higher"`. Whether `"lower"` or `"higher"` order memberships from the Louvain algorithm should be obtained for the consensus. The `"lower"` order Louvain memberships are from the first initial pass of the Louvain algorithm whereas the `"higher"` order Louvain memberships are from the last pass of the Louvain algorithm
`resolution`	Numeric (length = 1). A parameter that adjusts modularity to allow the algorithm to prefer smaller (`resolution` > 1) or larger (0 < `resolution` < 1) communities. Defaults to `1` (standard modularity computation)
`consensus.method`	Character (length = 1). Defaults to `"most_common"`. Available options for arriving at a consensus (Note: All methods except `"iterative"` are considered experimental until validated): `"highest_modularity"` — EXPERIMENTAL. Selects the community solution with the highest modularity across the applications. Modularity is a reasonable metric for identifying the number of communities in a network but it comes with limitations (e.g., resolution limit) `"iterative"` — The original approach proposed by Lancichinetti & Fortunato (2012). See "Details" for more information `"most_common"` — Selects the community solution that appears the most frequently across the applications. The idea behind this method is that the solution that appears most often will be the most likely solution for the algorithm as well as most reproducible. Can be less stable as the number of nodes increases, requiring a larger value for `consensus.iter`. This method is the default `"lowest_tefi"` — EXPERIMENTAL. Selects the community solution with the lowest Total Entropy Fit Index (`tefi`) across the applications. TEFI is a reasonable metric to identify the number of communities in a network based on Golino, Moulder et al. (2020)
`consensus.iter`	Numeric (length = 1). Number of algorithm applications to the network. Defaults to `1000`
`correlation.matrix`	Symmetric matrix. Used for computation of `tefi`. Only needed when `consensus.method = "lowest_tefi"`
`allow.singleton`	Boolean (length = 1). Whether singleton or single node communities should be allowed. Defaults to `FALSE`. When `FALSE`, singleton communities will be set to missing (`NA`); otherwise, when `TRUE`, singleton communities will be allowed
`membership.only`	Boolean. Whether the memberships only should be output. Defaults to `TRUE`. Set to `FALSE` to obtain all output for the community detection algorithm
`...`	Not actually used but makes it easier for general functionality in the package

Details

The goal of the consensus clustering method is to identify a stable solution across algorithm applications to derive a "consensus" clustering. The standard or "iterative" approach is to apply the community detection algorithm N times. Then, a co-occurrence matrix is created representing how often each pair of nodes co-occurred across the applications. Based on some cut-off value (e.g., 0.30), co-occurrences below this value are set to zero, forming a "new" sparse network. The procedure proceeds until all nodes co-occur with all other nodes in their community (or a proportion of 1.00).

Variations of this procedure are also available in this package but are experimental. Use these experimental procedures with caution. More work is necessary before these experimental procedures are validated

At this time, seed setting for consensus clustering is not supported
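
One pass of the iterative procedure described above can be sketched in base R. The sketch assumes the memberships from repeated applications are already collected in a matrix (rows are applications, columns are nodes); `co_occurrence()` is a hypothetical helper and the toy memberships are made up for illustration. In the full procedure, the community detection algorithm would be reapplied to the thresholded network until convergence, which is not shown here.

```r
# Rows are algorithm applications, columns are nodes (toy memberships)
memberships <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 2, 2, 2),
  c(2, 2, 2, 1, 1, 1)  # same partition as row 1 with labels swapped
)

# Proportion of applications in which each pair of nodes shares a community
co_occurrence <- function(memberships) {
  n_nodes <- ncol(memberships)
  counts <- matrix(0, n_nodes, n_nodes)
  for (i in seq_len(nrow(memberships))) {
    counts <- counts + outer(memberships[i, ], memberships[i, ], "==")
  }
  counts / nrow(memberships)
}

cooc <- co_occurrence(memberships)

# Set co-occurrences below the cut-off to zero to form the "new" sparse network
cutoff <- 0.30
new_network <- ifelse(cooc < cutoff, 0, cooc)
diag(new_network) <- 0

# Converged once every remaining co-occurrence is either 0 or 1
converged <- all(new_network %in% c(0, 1))
```

Because co-occurrence depends only on whether two nodes share a label, swapped labels (row 4) do not affect the result.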

Value

Returns either a vector with the selected solution or a list when membership.only = FALSE:

`selected_solution`	Resulting solution from the consensus method
`memberships`	Matrix of memberships across the consensus iterations
`proportion_table`	For methods that use frequency, a table that reports those frequencies alongside their corresponding memberships
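
As a rough illustration of a frequency-based selection such as `"most_common"`, the base R sketch below counts identical solutions and picks the most frequent one. It assumes the memberships are already labeled consistently across applications, and the object names only loosely mirror the output names above; it is not the package's code.

```r
# Toy memberships: rows are applications, columns are nodes
memberships <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2, 2)
)

# Collapse each solution to one string so identical solutions can be counted
keys <- apply(memberships, 1, paste, collapse = " ")
proportions <- table(keys) / nrow(memberships)

# Select the solution that appears most often
most_common_key <- names(which.max(proportions))
selected_solution <- memberships[match(most_common_key, keys), ]
```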

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Louvain algorithm
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.

Consensus clustering
Lancichinetti, A., & Fortunato, S. (2012). Consensus clustering in complex networks. Scientific Reports, 2(1), 1–7.

Entropy fit indices
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research.

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate correlation matrix
correlation.matrix <- auto.correlate(wmt)

# Estimate network
network <- EBICglasso.qgraph(data = wmt)

# Compute standard Louvain with highest modularity approach
community.consensus(
  network,
  consensus.method = "highest_modularity"
)

# Compute standard Louvain with iterative (original) approach
community.consensus(
  network,
  consensus.method = "iterative"
)

# Compute standard Louvain with most common approach
community.consensus(
  network,
  consensus.method = "most_common"
)

# Compute standard Louvain with lowest TEFI approach
community.consensus(
  network,
  consensus.method = "lowest_tefi",
  correlation.matrix = correlation.matrix
)

Apply a Community Detection Algorithm

Description

General function to apply community detection algorithms available in igraph. Follows the EGAnet approach of setting singleton and disconnected nodes to missing (NA)

Usage

community.detection(
  network,
  algorithm = c("edge_betweenness", "fast_greedy", "fluid", "infomap", "label_prop",
    "leading_eigen", "leiden", "louvain", "optimal", "spinglass", "walktrap"),
  allow.singleton = FALSE,
  membership.only = TRUE,
  ...
)

Arguments

`network`	Matrix or `igraph` network object
`algorithm`	Character or `igraph` `cluster_` function (length = 1). Available options: `"edge_betweenness"` — See `cluster_edge_betweenness` for more details `"fast_greedy"` — See `cluster_fast_greedy` for more details `"fluid"` — See `cluster_fluid_communities` for more details `"infomap"` — See `cluster_infomap` for more details `"label_prop"` — See `cluster_label_prop` for more details `"leading_eigen"` — See `cluster_leading_eigen` for more details `"leiden"` — See `cluster_leiden` for more details. Note*: The Leiden algorithm will default to the modularity objective function (`objective_function = "modularity"`). Set `objective_function = "CPM"` to use the Constant Potts Model instead (see examples) `"louvain"` — See `cluster_louvain` for more details `"optimal"` — See `cluster_optimal` for more details `"spinglass"` — See `cluster_spinglass` for more details `"walktrap"` — See `cluster_walktrap` for more details
`allow.singleton`	Boolean (length = 1). Whether singleton or single node communities should be allowed. Defaults to `FALSE`. When `FALSE`, singleton communities will be set to missing (`NA`); otherwise, when `TRUE`, singleton communities will be allowed
`membership.only`	Boolean (length = 1). Whether the memberships only should be output. Defaults to `TRUE`. Set to `FALSE` to obtain all output for the community detection algorithm
`...`	Additional arguments to be passed on to `igraph`'s community detection functions (see `algorithm` for link to arguments of each algorithm)

Value

Returns memberships from a community detection algorithm
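
The effect of `allow.singleton = FALSE` on a membership vector can be illustrated in base R. `drop_singletons()` is a hypothetical helper written for this sketch; how the package handles the remaining labels internally is not shown here.

```r
# Hypothetical helper: set single-node communities to missing (NA)
drop_singletons <- function(membership) {
  sizes <- table(membership)                # community sizes (NA is not counted)
  singletons <- names(sizes)[sizes == 1]    # labels that occur exactly once
  membership[as.character(membership) %in% singletons] <- NA
  membership
}

# Community 3 has a single node; the last node is already disconnected (NA)
membership <- c(1, 1, 1, 2, 2, 3, NA)
cleaned <- drop_singletons(membership)
cleaned
```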

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695.

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate network
network <- EBICglasso.qgraph(data = wmt)

# Compute Edge Betweenness
community.detection(network, algorithm = "edge_betweenness")

# Compute Fast Greedy
community.detection(network, algorithm = "fast_greedy")

# Compute Fluid
community.detection(
  network, algorithm = "fluid",
  no.of.communities = 2 # needs to be set
)

# Compute Infomap
community.detection(network, algorithm = "infomap")

# Compute Label Propagation
community.detection(network, algorithm = "label_prop")

# Compute Leading Eigenvector
community.detection(network, algorithm = "leading_eigen")

# Compute Leiden (with modularity)
community.detection(
  network, algorithm = "leiden",
  objective_function = "modularity"
)

# Compute Leiden (with CPM)
community.detection(
  network, algorithm = "leiden",
  objective_function = "CPM",
  resolution_parameter = 0.05 # "edge density"
)

# Compute Louvain
community.detection(network, algorithm = "louvain")

# Compute Optimal (identifies maximum modularity solution)
community.detection(network, algorithm = "optimal")

# Compute Spinglass
community.detection(network, algorithm = "spinglass")

# Compute Walktrap
community.detection(network, algorithm = "walktrap")

# Example with {igraph} network
community.detection(
  convert2igraph(network), algorithm = "walktrap"
)

Homogenize Community Memberships

Description

Memberships from community detection algorithms do not always align numerically. This function seeks to homogenize community memberships between a target membership (the membership to homogenize toward) and one or more other memberships. This function is the core of the dimensionStability and itemStability functions

Usage

community.homogenize(target.membership, convert.membership)

Arguments

`target.membership`	Vector, matrix, or data frame. The target memberships that all other memberships input into `convert.membership` should be homogenized toward
`convert.membership`	Vector, matrix, or data frame. Either a vector of memberships the same length as `target.membership`, or a matrix or data frame of many membership solutions whose rows or columns are the same length as `target.membership` (the function determines the orientation automatically, with precedence given to solutions across rows)

Value

Returns a vector or matrix the length or size of convert.membership with memberships homogenized toward target.membership
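
The idea of aligning labels can be sketched in base R by relabeling each community in the converted membership with the target label it overlaps most. `homogenize_sketch()` is a hypothetical, simplified stand-in: it does not resolve ties or communities that split or merge, which a full implementation would need to handle.

```r
# Hypothetical helper: relabel `convert` using the best-overlapping target label
homogenize_sketch <- function(target, convert) {
  overlap <- table(convert, target)  # rows: convert labels, columns: target labels
  best <- colnames(overlap)[apply(overlap, 1, which.max)]
  names(best) <- rownames(overlap)   # lookup: convert label -> target label
  as.numeric(best[as.character(convert)])
}

# Same partition, different numeric labels
target  <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
convert <- c(3, 3, 3, 1, 1, 1, 2, 2, 2)
homogenized <- homogenize_sketch(target, convert)
homogenized
```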

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

Examples

# Get network
network <- network.estimation(wmt2[,7:24])

# Apply Walktrap
network_walktrap <- community.detection(
  network, algorithm = "walktrap"
)

# Apply Louvain
network_louvain <- community.detection(
  network, algorithm = "louvain"
)

# Homogenize toward Walktrap
community.homogenize(network_walktrap, network_louvain)

Approaches to Detect Unidimensional Communities

Description

A function to apply several approaches to detect a unidimensional community in networks. Several approaches have been proposed recently, such as expanding the correlation matrix to have orthogonal correlations ("expand"), applying the Leading Eigenvector community detection algorithm cluster_leading_eigen to the correlation matrix ("LE"), and applying the Louvain community detection algorithm cluster_louvain to the correlation matrix ("louvain"). Not necessarily intended for individual use; it is better to use EGA

Usage

community.unidimensional(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  uni.method = c("expand", "LE", "louvain"),
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables that are desired to be in analysis
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"` — Expands the correlation matrix with four variables correlated 0.50. If the check returns 2 or fewer dimensions, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation `"LE"` — Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation `"louvain"` — Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2023) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`verbose`	Boolean. Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.consensus`, and `community.detection`

Value

Returns the memberships of the community detection algorithm. The memberships will be output regardless of whether the network is unidimensional
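
The matrix expansion behind the `"expand"` option can be sketched in base R. `expand_correlations()` is a hypothetical helper and the toy correlation matrix is made up; the sketch only builds the expanded matrix (four added variables correlated 0.50 with each other and 0 with the observed variables) and does not run the dimensionality check itself.

```r
# Toy correlation matrix for three observed variables
R <- matrix(0.40, nrow = 3, ncol = 3)
diag(R) <- 1

# Hypothetical helper: append `n_new` variables that correlate `r` with each
# other and zero (orthogonal) with the observed variables
expand_correlations <- function(R, n_new = 4, r = 0.50) {
  p <- ncol(R)
  expanded <- diag(p + n_new)               # start from an identity matrix
  expanded[seq_len(p), seq_len(p)] <- R     # observed correlations
  new_index <- p + seq_len(n_new)
  expanded[new_index, new_index] <- r       # correlations among added variables
  diag(expanded) <- 1                       # restore unit diagonal
  expanded
}

expanded <- expand_correlations(R)
dim(expanded)
```

Following the argument description, the added variables are expected to form their own community, so a check that returns 2 or fewer dimensions suggests the observed variables are unidimensional.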

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Expand approach
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., Thiyagarajan, J. A., & Martinez-Molina, A. (2020). Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25, 292-320.

Leading Eigenvector approach
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023). Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation. Behavior Research Methods.

Louvain approach
Christensen, A. P. (2023). Unidimensional community detection: A Monte Carlo simulation, grid search, and comparison. PsyArXiv.

Examples

# Load data
wmt <- wmt2[,7:24]

# Louvain with Consensus Clustering (default)
community.unidimensional(wmt)

# Leading Eigenvector
community.unidimensional(wmt, uni.method = "LE")

# Expand
community.unidimensional(wmt, uni.method = "expand")

Visually Compare Two or More `EGAnet` plots

Description

Organizes EGA plots for comparison. Ensures that nodes are placed in the same layout to ease comparison

Usage

compare.EGA.plots(
  ...,
  input.list = NULL,
  base = 1,
  labels = NULL,
  rows = NULL,
  columns = NULL,
  plot.all = TRUE
)

Arguments

`...`	Handles multiple arguments: `*EGA` objects — can be dropped in without any argument designation. The function will search across input to find necessary `EGAnet` objects `ggnet2` arguments — can be passed along to `ggnet2` `gplot.layout` — can be specified using `mode =` or `layout =` using the name of the layout (e.g., `mode = "circle"` will produce the circle layout from gplot.layout). By default, the layout is the same as `qgraph`
`input.list`	List. Bypasses `...` argument in favor of using a list as an input
`base`	Numeric (length = 1). Plot to be used as the base for the configuration of the networks. Uses the number of the order in which the plots are input. Defaults to `1` or the first plot
`labels`	Character (same length as input). Labels for each `EGAnet` object
`rows`	Numeric (length = 1). Number of rows to spread plots across
`columns`	Numeric (length = 1). Number of columns to spread plots down
`plot.all`	Boolean (length = 1). Whether plot should be produced or just output. Defaults to `TRUE`. Set to `FALSE` to avoid plotting (but still obtain plot objects)

Value

Visual comparison of EGAnet objects

Author(s)

Alexander Christensen <[email protected]>

Examples

# Obtain WMT-2 data
wmt <- wmt2[,7:24]

# Draw random samples of 300 cases
sample1 <- wmt[sample(1:nrow(wmt), 300),]
sample2 <- wmt[sample(1:nrow(wmt), 300),]

# Estimate EGAs
ega1 <- EGA(sample1)
ega2 <- EGA(sample2)


# Compare EGAs via plot
compare.EGA.plots(
  ega1, ega2,
  base = 1, # use "ega1" as base for comparison
  labels = c("Sample 1", "Sample 2"),
  rows = 1, columns = 2
)

# Change layout to circle plots
compare.EGA.plots(
  ega1, ega2,
  labels = c("Sample 1", "Sample 2"),
  mode = "circle"
)

Convert networks to `igraph`

Description

Converts networks to igraph format

Usage

convert2igraph(A, diagonal = 0)

Arguments

`A`	Matrix or data frame. N x N matrix where N is the number of nodes
`diagonal`	Numeric. Value to be placed on the diagonal of `A`. Defaults to `0`

Value

Returns a network in the igraph format

Author(s)

Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>

Examples

convert2igraph(ega.wmt$network)

Convert networks to `tidygraph`

Description

Converts networks to tidygraph format

Usage

convert2tidygraph(EGA.object)

Arguments

EGA.object

A single EGAnet object containing the outputs $network and $wc

Value

Returns a network in the tidygraph format

Author(s)

Dominique Makowski, Hudson Golino <hfg9s at virginia.edu>, & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>

Examples

convert2tidygraph(ega.wmt)

Cosine similarity

Description

Computes cosine similarity

Usage

cosine(x, y = NULL, ...)

Arguments

`x`	Numeric vector, matrix, or data frame. If `nrow(x) > 1`, then `x` will be treated as a matrix to compute an n by n similarity matrix (`y` will not be used!)
`y`	Numeric vector, matrix, or data frame. Only used if `x` is a single variable. Used to compute similarity between one variable and n other variables. Defaults to `NULL`
`...`	Not actually used but makes it easier for general functionality in the package

Details

On missing values: 0 will be used to replace missing values. When using (matrix) multiplication, the 0 value cancels out the product rendering the missing value as "not counting" in the sums
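
The treatment of missing values can be illustrated with a base R sketch of cosine similarity between columns. `cosine_sketch()` is a hypothetical helper, not the package's function; it follows the description above by replacing missing values with 0 before any products are taken, so the zero-filled entries also drop out of the column lengths.

```r
# Hypothetical helper: column-by-column cosine similarity with NA set to 0
cosine_sketch <- function(x) {
  x <- as.matrix(x)
  x[is.na(x)] <- 0               # missing values contribute nothing to the sums
  cross <- crossprod(x)          # t(x) %*% x: pairwise sums of products
  norms <- sqrt(diag(cross))     # Euclidean length of each column
  cross / outer(norms, norms)    # divide by the product of the two lengths
}

x <- cbind(
  a = c(1, 2, 3, NA),
  b = c(2, 4, 6, 1),
  c = c(3, 0, -1, 2)
)
similarity <- cosine_sketch(x)
round(similarity, 3)
```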

Author(s)

Alexander P. Christensen <[email protected]>

Examples

# Load data
wmt <- wmt2[,7:24]

# Obtain cosines
wmt_cosine <- cosine(wmt)

Depression Data

Description

A response matrix (n = 574) of the Beck Depression Inventory, Beck Anxiety Inventory, and the Athens Insomnia Scale.

Usage

data(depression)

Format

A 574x78 response matrix

Examples

data("depression")

Dimension Stability Statistics from `bootEGA`

Description

Based on the bootEGA results, this function computes the stability of dimensions. Stability is computed by assessing the proportion of times the original dimension is exactly replicated across bootstrap samples

Usage

dimensionStability(bootega.obj, IS.plot = TRUE, structure = NULL, ...)

Arguments

`bootega.obj`	A `bootEGA` object
`IS.plot`	Boolean (length = 1). Should the plot be produced for `item.replication`? Defaults to `TRUE`
`structure`	Numeric (length = number of variables). A theoretical or pre-defined structure. Defaults to `NULL` or the empirical `EGA` result in the `bootega.obj`
`...`	Additional arguments. Used for deprecated arguments from previous versions of `itemStability`

Value

Returns a list containing:

dimension.stability

A list containing:

structural.consistency — The proportion of times that each empirical EGA dimension exactly replicates across the bootEGA samples
average.item.stability — The average item stability in each empirical EGA dimension

item.stability

Results from itemStability
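
The two dimension-level statistics can be sketched in base R from a matrix of bootstrap memberships. This is an illustration under assumptions, not the package's code: the toy memberships are made up, they are assumed to be already homogenized toward the empirical structure, and item stability is taken here as the proportion of samples in which an item keeps its empirical label.

```r
# Empirical memberships and toy bootstrap memberships (rows = bootstrap samples)
empirical <- c(1, 1, 1, 2, 2, 2)
boot <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2, 1)
)
dimensions <- sort(unique(empirical))

# A dimension replicates exactly when the items carrying its label in a
# bootstrap sample are the same items as in the empirical structure
structural_consistency <- sapply(dimensions, function(d) {
  mean(apply(boot, 1, function(b) identical(which(b == d), which(empirical == d))))
})
names(structural_consistency) <- dimensions

# Proportion of samples in which each item keeps its empirical label
item_stability <- colMeans(sweep(boot, 2, empirical, "=="))

# Average item stability within each empirical dimension
average_item_stability <- tapply(item_stability, empirical, mean)
```

In this toy case a single misplaced item is enough to make a dimension fail exact replication, which is why structural consistency can be much lower than the average item stability.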

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Conceptual introduction
Christensen, A. P., Golino, H., & Silvia, P. J. (2020). A psychometric network perspective on the validity and validation of personality trait questionnaires. European Journal of Personality, 34(6), 1095-1108.

Examples

# Load data
wmt <- wmt2[,7:24]

## Not run: 
# Estimate bootstrap EGA
boot.wmt <- bootEGA(
  data = wmt, iter = 500,
  type = "parametric", ncores = 2
)

# Estimate stability statistics
dimensionStability(boot.wmt)
## End(Not run)

Loadings Comparison Test Deep Learning Neural Network Weights

Description

A list of weights from four different neural network models: random vs. non-random model (r_nr_weights), low correlation factor vs. network model (lf_n_weights), high correlation with variables less than or equal to factors vs. network model (hlf_n_weights), and high correlation with variables greater than factors vs. network model (hgf_n_weights)

Usage

data(dnn.weights)

Format

A list with a length of 4

Examples

data("dnn.weights")

Dynamic Exploratory Graph Analysis

Description

Estimates dynamic communities in multivariate time series (e.g., panel data, longitudinal data, intensive longitudinal data) at multiple time scales and at different levels of analysis: individuals (intraindividual structure), groups, and population (interindividual structure)

Usage

dynEGA(
  data,
  id = NULL,
  group = NULL,
  n.embed = 5,
  tau = 1,
  delta = 1,
  use.derivatives = 1,
  level = c("individual", "group", "population"),
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  ncores,
  verbose = TRUE,
  ...
)

Arguments

`data`	Matrix or data frame. Participants and variables should be in long format such that row t represents observations for all variables at time point t for a participant. The next row, t + 1, represents the next measurement occasion for that same participant. The next participant's data should immediately follow, in the same pattern, after the previous participant's. `data` should have an ID variable labeled `"ID"`; otherwise, it is assumed that the data represent the population. For groups, `data` should have a Group variable labeled `"Group"`; otherwise, it is assumed that there are no groups in `data`. Arguments `id` and `group` can be specified to tell the function which column in `data` it should use as the ID and Group variable, respectively. A measurement occasion variable is not necessary and should be removed from the data before proceeding with the analysis
`id`	Numeric or character (length = 1). Number or name of the column identifying each individual. Defaults to `NULL`
`group`	Numeric or character (length = 1). Number or name of the column identifying group membership. Defaults to `NULL`
`n.embed`	Numeric (length = 1). Defaults to `5`. Number of embedded dimensions (the number of observations to be used in the `Embed` function). For example, `n.embed = 5` will use five consecutive observations to estimate a single derivative
`tau`	Numeric (length = 1). Defaults to `1`. Number of observations to offset successive embeddings in the `Embed` function. Generally recommended to leave "as is"
`delta`	Numeric (length = 1). Defaults to `1`. The time between successive observations in the time series (i.e., lag). Generally recommended to leave "as is"
`use.derivatives`	Numeric (length = 1). Defaults to `1`. The order of the derivative to be used in the analysis. Available options: `0` — No derivatives; consistent with moving average. `1` — First-order derivatives; interpreted as "velocity" or rate of change over time. `2` — Second-order derivatives; interpreted as "acceleration" or rate of the rate of change over time. Generally recommended to leave "as is"
`level`	Character vector (up to length of 3). A character vector indicating which level(s) to estimate: `"individual"` — Estimates `EGA` for each individual in `data` (intraindividual structure; requires an `"ID"` column, see `data`) `"group"` — Estimates `EGA` for each group in `data` (group structure; requires a `"Group"` column, see `data`) `"population"` — Estimates `EGA` across all `data` (interindividual structure)
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details `"louvain"` — By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise `"walktrap"` — See `cluster_walktrap` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"` — Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned by the check is 2 or less, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation. `"LE"` — Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation. `"louvain"` — Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`ncores`	Numeric (length = 1). Number of cores to use in computing results. Defaults to `ceiling(parallel::detectCores() / 2)` or half of your computer's processing power. Set to `1` to not use parallel computing. If you're unsure how many cores your computer has, then type: `parallel::detectCores()`
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, and `EGA`
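
As a rough illustration of the long format described for `data`, the sketch below builds a toy data frame in base R. The object `toy`, its variable names, and its tiny size are hypothetical and chosen only to show the layout; a real analysis needs many more measurement occasions per participant.

```r
# Toy long-format data: 2 participants x 4 occasions x 3 variables
set.seed(1)
n_id <- 2
n_time <- 4
toy <- data.frame(
  V1 = rnorm(n_id * n_time),
  V2 = rnorm(n_id * n_time),
  V3 = rnorm(n_id * n_time),
  ID = rep(seq_len(n_id), each = n_time),  # each participant's rows are contiguous
  Group = rep(c("A", "B"), each = n_time)  # one group label per participant
)
toy

# With EGAnet loaded, the ID and Group columns could then be named explicitly:
# dynEGA(data = toy, id = "ID", group = "Group", level = "population")
```

Rows are ordered by occasion within participant, and there is no separate time column, in line with the note that a measurement occasion variable should be removed.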

Details

Derivatives for each variable's time series for each participant are estimated using generalized local linear approximation (see glla). EGA is then applied to these derivatives to model how variables are changing together over time. Variables that change together over time are detected as communities
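
The derivative step can be sketched in base R as follows. This is a minimal illustration of time-delay embedding followed by generalized local linear approximation as described by Boker et al. (2010), not the package's own `Embed` or `glla` code; the helper names `embed_series` and `glla_weights` are hypothetical.

```r
# Time-delay embedding: each row holds n.embed observations spaced tau apart
embed_series <- function(x, n.embed = 5, tau = 1) {
  n_rows <- length(x) - (n.embed - 1) * tau
  sapply(seq_len(n.embed), function(k) x[seq_len(n_rows) + (k - 1) * tau])
}

# GLLA weights: W = L (L'L)^-1, one column per derivative order (0, 1, 2)
glla_weights <- function(n.embed = 5, tau = 1, delta = 1, order = 2) {
  # Time offsets of each embedded observation from the window centre
  offsets <- tau * delta * (seq_len(n.embed) - mean(seq_len(n.embed)))
  L <- sapply(0:order, function(k) offsets^k / factorial(k))
  L %*% solve(crossprod(L))
}

# A linear series has a constant first derivative (slope) and zero curvature
x <- 2 * (1:10) + 1
derivs <- embed_series(x) %*% glla_weights()
colnames(derivs) <- c("order0", "order1", "order2")
round(derivs, 6)
```

Each row of the result corresponds to one window of `n.embed` observations, so a series of length 10 with `n.embed = 5` and `tau = 1` yields six derivative estimates per order.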

Value

A list containing:

Derivatives

A list containing:

Estimates — A list the length of the unique IDs containing data frames of zero- to second-order derivatives for each ID in data
EstimatesDF — A data frame of derivatives across all IDs containing columns of the zero- to second-order derivatives as well as id and group variables (group is automatically set to 1 for all if no group is provided)

dynEGA

A list containing:

population — If level includes "population", then the EGA results for the entire sample
group — If level includes "group", then a list containing the EGA results for each group
individual — If level includes "individual", then a list containing the EGA results for each id

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Generalized local linear approximation
Boker, S. M., Deboeck, P. R., Edler, C., & Keel, P. K. (2010) Generalized local linear approximation of derivatives from time series. In S.-M. Chow, E. Ferrer, & F. Hsieh (Eds.), The Notre Dame series on quantitative methodology. Statistical methods for modeling human dynamics: An interdisciplinary dialogue, (p. 161-178). Routledge/Taylor & Francis Group.

Deboeck, P. R., Montpetit, M. A., Bergeman, C. S., & Boker, S. M. (2009) Using derivative estimates to describe intraindividual variability at multiple time scales. Psychological Methods, 14(4), 367-386.

Original dynamic EGA implementation
Golino, H., Christensen, A. P., Moulder, R. G., Kim, S., & Boker, S. M. (2021). Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections. Psychometrika.

Time delay embedding procedure
Savitzky, A., & Golay, M. J. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8), 1627-1639.

Examples

# Population structure
simulated_population <- dynEGA(
  data = sim.dynEGA, level = "population"
  # uses simulated data in package
  # useful to understand how data should be structured
)

# Group structure
simulated_group <- dynEGA(
  data = sim.dynEGA, level = "group"
  # uses simulated data in package
  # useful to understand how data should be structured
)

## Not run: 
# Individual structure
simulated_individual <- dynEGA(
  data = sim.dynEGA, level = "individual",
  ncores = 2, # use more for quicker results
  verbose = TRUE # progress bar
)

# Population, group, and individual structure
simulated_all <- dynEGA(
  data = sim.dynEGA,
  level = c("individual", "group", "population"),
  ncores = 2, # use more for quicker results
  verbose = TRUE # progress bar
)

# Plot population
plot(simulated_all$dynEGA$population)

# Plot groups
plot(simulated_all$dynEGA$group)

# Plot individual
plot(simulated_all$dynEGA$individual, id = 1)

# Step through all plots
# Unless `id` is specified, 4 random IDs
# will be drawn from individuals
plot(simulated_all)
## End(Not run)

Intra- and Inter-individual `dynEGA`

Description

A wrapper function to estimate both intraindividual (level = "individual") and interindividual (level = "population") structures using dynEGA

Usage

dynEGA.ind.pop(
  data,
  id = NULL,
  n.embed = 5,
  tau = 1,
  delta = 1,
  use.derivatives = 1,
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  ncores,
  verbose = TRUE,
  ...
)

Arguments

`data`	Matrix or data frame. Participants and variables should be in long format such that row t represents observations for all variables at time point t for a participant. The next row, t + 1, represents the next measurement occasion for that same participant. The next participant's data should immediately follow, in the same pattern, after the previous participant's. `data` should have an ID variable labeled `"ID"`; otherwise, it is assumed that the data represent the population. For groups, `data` should have a Group variable labeled `"Group"`; otherwise, it is assumed that there are no groups in `data`. Arguments `id` and `group` can be specified to tell the function which column in `data` it should use as the ID and Group variable, respectively. A measurement occasion variable is not necessary and should be removed from the data before proceeding with the analysis
`id`	Numeric or character (length = 1). Number or name of the column identifying each individual. Defaults to `NULL`
`n.embed`	Numeric (length = 1). Defaults to `5`. Number of embedded dimensions (the number of observations to be used in the `Embed` function). For example, `n.embed = 5` will use five consecutive observations to estimate a single derivative
`tau`	Numeric (length = 1). Defaults to `1`. Number of observations to offset successive embeddings in the `Embed` function. Generally recommended to leave "as is"
`delta`	Numeric (length = 1). Defaults to `1`. The time between successive observations in the time series (i.e., lag). Generally recommended to leave "as is"
`use.derivatives`	Numeric (length = 1). Defaults to `1`. The order of the derivative to be used in the analysis. Available options: `0` — No derivatives; consistent with moving average. `1` — First-order derivatives; interpreted as "velocity" or rate of change over time. `2` — Second-order derivatives; interpreted as "acceleration" or rate of the rate of change over time. Generally recommended to leave "as is"
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details `"louvain"` — By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise `"walktrap"` — See `cluster_walktrap` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"` — Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned by the check is 2 or less, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation. `"LE"` — Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation. `"louvain"` — Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`ncores`	Numeric (length = 1). Number of cores to use in computing results. Defaults to `ceiling(parallel::detectCores() / 2)` or half of your computer's processing power. Set to `1` to not use parallel computing. If you're unsure how many cores your computer has, then type: `parallel::detectCores()`
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, and `EGA`

Value

Same output as `dynEGA`, returning list objects for level = "individual" and level = "population"

Author(s)

Hudson Golino <hfg9s at virginia.edu>

Examples

# Obtain data
sim.dynEGA <- sim.dynEGA # bypasses CRAN checks

## Not run: 
# Dynamic EGA individual and population structure
dyn.ega1 <- dynEGA.ind.pop(
  data = sim.dynEGA, n.embed = 5, tau = 1,
  delta = 1, id = 25, use.derivatives = 1,
  ncores = 2, corr = "pearson"
)
## End(Not run)

`EBICglasso` from `qgraph` 1.4.4

Description

This function uses the glasso package (Friedman, Hastie and Tibshirani, 2011) to compute a sparse Gaussian graphical model with the graphical lasso (Friedman, Hastie & Tibshirani, 2008). The tuning parameter is chosen using the Extended Bayesian Information Criterion (EBIC) described by Foygel & Drton (2010).

Usage

EBICglasso.qgraph(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  gamma = 0.5,
  penalize.diagonal = FALSE,
  nlambda = 100,
  lambda.min.ratio = 0.1,
  fast = FALSE,
  returnAllResults = FALSE,
  penalizeMatrix = NULL,
  countDiagonal = FALSE,
  refit = FALSE,
  model.selection = c("EBIC", "JSD"),
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`gamma`	Numeric (length = 1). EBIC tuning parameter. Defaults to `0.50`, which is generally a good choice. Setting to `0` will cause regular BIC to be used
`penalize.diagonal`	Boolean (length = 1). Should the diagonal be penalized? Defaults to `FALSE`
`nlambda`	Numeric (length = 1). Number of lambda values to test. Defaults to `100`
`lambda.min.ratio`	Numeric (length = 1). Ratio of lowest lambda value compared to maximal lambda. Defaults to `0.1`. NOTE `qgraph` sets the default to `0.01`
`fast`	Boolean (length = 1). Whether the `glassoFast` version should be used to estimate the GLASSO. Defaults to `FALSE`. Results from the fast version may differ from the original GLASSO implemented by `glasso` by less than floating-point precision, which should have little impact on reproducibility
`returnAllResults`	Boolean (length = 1). Whether all results should be returned. Defaults to `FALSE` (network only). Set to `TRUE` to access `glassopath` output
`penalizeMatrix`	Boolean matrix. Optional logical matrix to indicate which elements are penalized
`countDiagonal`	Boolean (length = 1). Should diagonal be counted in EBIC computation? Defaults to `FALSE`. Set to `TRUE` to mimic `qgraph` < 1.3 behavior (not recommended!)
`refit`	Boolean (length = 1). Should the optimal graph be refitted without LASSO regularization? Defaults to `FALSE`
`model.selection`	Character (length = 1). How lambda should be selected within GLASSO. Defaults to `"EBIC"`. `"JSD"` is experimental and should not be used outside of experimentation
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Arguments sent to `glasso`
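
The difference between the two `na.data` options can be illustrated with base R's `stats::cor`. This sketch only demonstrates the pairwise versus listwise distinction on hypothetical toy data; it is not a description of how the package computes correlations internally.

```r
set.seed(42)
x <- matrix(rnorm(60), ncol = 3, dimnames = list(NULL, c("a", "b", "c")))
x[1:5, "c"] <- NA  # missing values on one variable only

pairwise <- cor(x, use = "pairwise.complete.obs")
listwise <- cor(x, use = "complete.obs")

# a-b correlation: pairwise uses all 20 rows, listwise only the 15 complete rows
c(pairwise = pairwise["a", "b"], listwise = listwise["a", "b"])
```

Pairwise deletion keeps more information per correlation but bases each entry on a potentially different set of cases, whereas listwise deletion uses one common set of complete cases.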

Details

The glasso is run for `nlambda` values of the tuning parameter (100 by default), logarithmically spaced between lambda_max, the maximal value of the tuning parameter (at which all edges are zero), and lambda_max multiplied by `lambda.min.ratio` (lambda_max/10 under the default of `0.1`; `qgraph` uses lambda_max/100). For each of these graphs the EBIC is computed and the graph with the best (lowest) EBIC is selected. The partial correlation matrix is computed using wi2net and returned.
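
The tuning-parameter grid and the selection criterion can be sketched in base R. The code below is an illustration under stated assumptions rather than the package's implementation: the toy data are hypothetical, lambda_max is taken as the largest absolute off-diagonal correlation, and `ebic` is a hypothetical helper using the standard EBIC for Gaussian graphical models with the diagonal not counted as edges.

```r
# Correlation matrix from toy data (hypothetical example data)
set.seed(123)
X <- matrix(rnorm(200 * 4), ncol = 4)
X[, 2] <- X[, 1] + rnorm(200)
S <- cor(X)
n <- nrow(X)

# Lambda grid: log-spaced between lambda_max * lambda.min.ratio and lambda_max
nlambda <- 100
lambda.min.ratio <- 0.1
lambda_max <- max(abs(S[upper.tri(S)]))
lambdas <- exp(seq(log(lambda_max * lambda.min.ratio), log(lambda_max),
                   length.out = nlambda))

# EBIC for a candidate precision matrix K
ebic <- function(S, K, n, gamma = 0.5) {
  loglik <- n / 2 * (log(det(K)) - sum(diag(K %*% S)))
  E <- sum(K[lower.tri(K)] != 0)  # number of non-zero edges
  -2 * loglik + E * log(n) + 4 * E * gamma * log(nrow(K))
}

K <- solve(S)  # unregularized precision matrix, used only to exercise ebic()
c(BIC = ebic(S, K, n, gamma = 0), EBIC = ebic(S, K, n, gamma = 0.5))
```

In an actual run, one precision matrix would be estimated per value in `lambdas` and the one with the lowest criterion kept. With `gamma = 0` the extra penalty term vanishes and the criterion reduces to the ordinary BIC.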

Value

A partial correlation matrix
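
For readers unfamiliar with how a precision (inverse covariance) matrix maps to partial correlations, here is a minimal base R sketch of the standard conversion. The helper `precision_to_pcor` is hypothetical and is meant to mirror what a function such as wi2net does conceptually, not to reproduce its code.

```r
precision_to_pcor <- function(K) {
  P <- -cov2cor(K)  # p_ij = -k_ij / sqrt(k_ii * k_jj)
  diag(P) <- 0      # no self-loops in the network
  P
}

S <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)
round(precision_to_pcor(solve(S)), 3)
```

Zero entries in the precision matrix stay zero after the conversion, so the sparsity pattern selected by the graphical lasso carries over directly to the returned network.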

Author(s)

Sacha Epskamp; for maintenance, Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>

References

Instantiation of GLASSO
Friedman, J., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9, 432-441.

glasso + EBIC
Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. In Advances in neural information processing systems (pp. 604-612).

glasso package
Friedman, J., Hastie, T., & Tibshirani, R. (2011). glasso: Graphical lasso-estimation of Gaussian graphical models. R package version 1.7.

Tutorial on EBICglasso
Epskamp, S., & Fried, E. I. (2018). A tutorial on regularized partial correlation networks. Psychological Methods, 23(4), 617–634.

Examples

# Obtain data
wmt <- wmt2[,7:24]

# Fast (uses the glassoFast implementation)
fast <- EBICglasso.qgraph(wmt, fast = TRUE)

# Regular
regular <- EBICglasso.qgraph(wmt, fast = FALSE)

# Difference between fast and regular
sum(abs(fast - regular))

# Compute graph with tuning = 0 (BIC)
BICgraph <- EBICglasso.qgraph(data = wmt, gamma = 0)

# Compute graph with tuning = 0.5 (EBIC)
EBICgraph <- EBICglasso.qgraph(data = wmt, gamma = 0.5)

Exploratory Graph Analysis

Description

Estimates the number of communities (dimensions) in a dataset or correlation matrix. A network is first estimated (Golino & Epskamp, 2017; Golino et al., 2020), and a community detection algorithm is then applied to it (Christensen et al., 2023) for multidimensional data. A unidimensionality check is also applied, based on findings from Golino et al. (2020) and Christensen (2023)

Usage

EGA(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  plot.EGA = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details `"louvain"` — By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise `"walktrap"` — See `cluster_walktrap` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"` — Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned by the check is 2 or less, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation. `"LE"` — Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation. `"louvain"` — Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`plot.EGA`	Boolean (length = 1). Defaults to `TRUE`. Whether the plot should be returned with the results. Set to `FALSE` for no plot
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, and `community.unidimensional`
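
The `"expand"` unidimensionality check can be pictured with a small base R sketch. The helper `expand_correlation` is hypothetical, and the sketch assumes that the four added variables are uncorrelated with the original variables; the description above only states that they correlate 0.50 with one another.

```r
expand_correlation <- function(R, n_extra = 4, r_extra = 0.50) {
  p <- ncol(R)
  expanded <- matrix(0, nrow = p + n_extra, ncol = p + n_extra)
  extra <- seq_len(n_extra)
  expanded[extra, extra] <- r_extra  # added variables correlate 0.50
  expanded[-extra, -extra] <- R      # original correlations are unchanged
  diag(expanded) <- 1
  expanded
}

R <- matrix(0.3, nrow = 3, ncol = 3)
diag(R) <- 1
expanded <- expand_correlation(R)
dim(expanded)
```

The added block acts as a known second community: if community detection on the expanded matrix finds two dimensions or fewer, the original variables formed at most one community of their own, which is the condition under which the data are treated as unidimensional.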

Value

Returns a list containing:

`network`	A matrix containing a network estimated using `network.estimation`
`wc`	A vector representing the community (dimension) membership of each node in the network. `NA` values mean that the node was disconnected from the network
`n.dim`	A scalar of how many total dimensions were identified in the network
`correlation`	The zero-order correlation matrix
`n`	Number of cases in `data`
`dim.variables`	An ordered matrix of item allocation
`TEFI`	`tefi` for the estimated structure
`plot.EGA`	Plot output if `plot.EGA = TRUE`
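
The relationship between `wc`, `n.dim`, and `dim.variables` can be sketched in base R. The membership vector below is a toy example (not real `EGA` output), and the sketch assumes that `n.dim` counts the unique non-`NA` communities and that `dim.variables` is the item allocation sorted by dimension; the exact layout returned by `EGA` may differ.

```r
# Toy membership vector; NA marks a node disconnected from the network
wc <- c(item1 = 1, item2 = 2, item3 = 1, item4 = NA, item5 = 2)

# Number of dimensions: unique communities, ignoring disconnected nodes
n_dim <- length(unique(wc[!is.na(wc)]))

# Item allocation ordered by dimension (NA values are placed last)
dim_variables <- data.frame(
  items = names(wc),
  dimension = unname(wc)
)
dim_variables <- dim_variables[order(dim_variables$dimension), ]

n_dim
dim_variables
```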

Author(s)

Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen at gmail.com>, Maria Dolores Nieto <acinodam at gmail.com> and Luis E. Garrido <garrido.luiseduardo at gmail.com>

References

Original simulation and implementation of EGA
Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLoS ONE, 12, e0174035.

Current implementation of EGA, introduced unidimensional checks, continuous and dichotomous data
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., & Thiyagarajan, J. A. (2020). Investigating the performance of Exploratory Graph Analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25, 292-320.

Compared all igraph community detection algorithms, introduced Louvain algorithm, simulation with continuous and polytomous data
Also implements the Leading Eigenvector unidimensional method
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023). Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation. Behavior Research Methods.

Comprehensive unidimensionality simulation
Christensen, A. P. (2023). Unidimensional community detection: A Monte Carlo simulation, grid search, and comparison. PsyArXiv.

Examples

# Obtain data
wmt <- wmt2[,7:24]

# Estimate EGA
ega.wmt <- EGA(
  data = wmt,
  plot.EGA = FALSE # No plot for CRAN checks
)

# Print results
print(ega.wmt)

# Estimate EGAtmfg
ega.wmt.tmfg <- EGA(
  data = wmt, model = "TMFG",
  plot.EGA = FALSE # No plot for CRAN checks
)

# Estimate EGA with Louvain algorithm
ega.wmt.louvain <- EGA(
  data = wmt, algorithm = "louvain",
  plot.EGA = FALSE # No plot for CRAN checks
)

# Estimate EGA with an {igraph} function (Fast-greedy)
ega.wmt.greedy <- EGA(
  data = wmt,
  algorithm = igraph::cluster_fast_greedy,
  plot.EGA = FALSE # No plot for CRAN checks
)

Estimates `EGA` for Multidimensional Structures

Description

A basic function to estimate EGA for multidimensional structures. This function does not include the unidimensional check and it does not plot the results. This function can be used as a streamlined approach for quick EGA estimation when unidimensionality or visualization is not a priority

Usage

EGA.estimate(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details `"louvain"` — By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise `"walktrap"` — See `cluster_walktrap` for more details
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, and `community.consensus`
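
The consensus clustering idea behind `consensus.method = "most_common"` can be sketched in base R. The memberships below are toy values standing in for repeated Louvain runs, and the sketch assumes that "most common" means the partition occurring most often across runs; see `community.consensus` for the actual behavior.

```r
# Toy memberships from five hypothetical Louvain runs
# (rows = runs, columns = nodes)
runs <- rbind(
  c(1, 1, 2, 2),
  c(2, 2, 1, 1), # same partition as the first run with labels swapped
  c(1, 2, 2, 2),
  c(1, 1, 2, 2),
  c(1, 2, 2, 2)
)

# Relabel communities by order of first appearance so that
# label swaps are not counted as different solutions
canonical <- t(apply(runs, 1, function(wc) match(wc, unique(wc))))

# Count identical partitions and keep the most frequent one
keys <- apply(canonical, 1, paste, collapse = "-")
counts <- table(keys)
most_common <- canonical[match(names(which.max(counts)), keys), ]

most_common
```

Relabeling before counting matters because community labels are arbitrary: two runs can return the same partition under different numbers.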

Value

Returns a list containing:

`network`	A matrix containing a network estimated using `network.estimation`
`wc`	A vector representing the community (dimension) membership of each node in the network. `NA` values mean that the node was disconnected from the network
`n.dim`	A scalar of how many total dimensions were identified in the network
`cor.data`	The zero-order correlation matrix
`n`	Number of cases in `data`

Author(s)

Alexander P. Christensen <alexpaulchristensen at gmail.com> and Hudson Golino <hfg9s at virginia.edu>

References

Introduced unidimensional checks, simulation with continuous and dichotomous data
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., & Thiyagarajan, J. A. (2020). Investigating the performance of Exploratory Graph Analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25, 292-320.

Examples

# Obtain data
wmt <- wmt2[,7:24]

# Estimate EGA
ega.wmt <- EGA.estimate(data = wmt)

# Estimate EGA with TMFG
ega.wmt.tmfg <- EGA.estimate(data = wmt, model = "TMFG")

# Estimate EGA with an {igraph} function (Fast-greedy)
ega.wmt.greedy <- EGA.estimate(
  data = wmt,
  algorithm = igraph::cluster_fast_greedy
)

`EGA` Optimal Model Fit using the Total Entropy Fit Index (`tefi`)

Description

Estimates the best fitting model using EGA. The search parameter of the community detection algorithm is varied (the number of steps for cluster_walktrap; the resolution parameter for Leiden and Louvain) and the unique community solutions are compared using tefi.

Usage

EGA.fit(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  plot.EGA = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`n`	Numeric (length = 1). Sample size if `data` is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details. Note: The Leiden algorithm will default to the Constant Potts Model objective function (`objective_function = "CPM"`). Set `objective_function = "modularity"` to use modularity instead (see examples). By default, searches along resolutions from 0 to `max(abs(network))` or the maximum absolute edge weight in the network in 0.01 increments (`resolution_parameter = seq.int(0, max(abs(network)), 0.01)`). For modularity, searches along resolutions from 0 to 2 in 0.05 increments (`resolution_parameter = seq.int(0, 2, 0.05)`) by default. Use the argument `resolution_parameter` to change the search parameters (see examples) `"louvain"` — See `community.consensus` for more details. By default, searches along resolutions from 0 to 2 in 0.05 increments (`resolution_parameter = seq.int(0, 2, 0.05)`). Use the argument `resolution_parameter` to change the search parameters (see examples) `"walktrap"` — This algorithm is the default. See `cluster_walktrap` for more details. By default, searches along 3 to 8 steps (`steps = 3:8`). Use the argument `steps` to change the search parameters (see examples)
`plot.EGA`	Boolean. If `TRUE`, returns a plot of the network and its estimated dimensions. Defaults to `TRUE`
`verbose`	Boolean. Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, and `EGA.estimate`
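
The default search spaces described for `algorithm` can be written out directly in base R. The `network` matrix below is a toy example with made-up edge weights, used only to show how the upper bound of the CPM resolution grid depends on the largest absolute edge.

```r
# Default search spaces described for each algorithm
walktrap_steps <- 3:8
louvain_resolution <- seq.int(0, 2, 0.05)
leiden_modularity_resolution <- seq.int(0, 2, 0.05)

# Toy symmetric network (made-up partial correlations)
network <- matrix(
  c( 0.00, 0.32, -0.10,
     0.32, 0.00,  0.21,
    -0.10, 0.21,  0.00),
  nrow = 3, byrow = TRUE
)

# For CPM, the grid stops at the largest absolute edge weight
leiden_cpm_resolution <- seq.int(0, max(abs(network)), 0.01)

# Number of candidate parameter values in each search space
lengths(list(
  walktrap = walktrap_steps,
  louvain = louvain_resolution,
  leiden_modularity = leiden_modularity_resolution,
  leiden_cpm = leiden_cpm_resolution
))
```

Because the CPM grid depends on the estimated network, its length changes from dataset to dataset, whereas the modularity and Walktrap grids are fixed unless overridden.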

Value

Returns a list containing:

`EGA`	`EGA` results of the best fitting solution
`EntropyFit`	`tefi` fit values for each solution
`Lowest.EntropyFit`	The best fitting solution based on `tefi`
`parameter.space`	Parameter values used in search space
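
The selection logic implied by these outputs (compare unique solutions, keep the one with the lowest fit value) can be sketched as follows. All memberships and fit values are made-up placeholders, not output from `EGA.fit` or `tefi`, and the object names are hypothetical.

```r
# Toy memberships for each Walktrap `steps` value (made-up values)
solutions <- list(
  "3" = c(1, 1, 1, 2, 2, 2),
  "4" = c(1, 1, 1, 2, 2, 2),
  "5" = c(1, 1, 2, 2, 3, 3),
  "6" = c(1, 1, 2, 2, 3, 3)
)

# Keep the first parameter value that produced each unique solution
keys <- vapply(solutions, paste, character(1), collapse = "-")
unique_solutions <- solutions[!duplicated(keys)]

# Placeholder fit values standing in for `tefi` (made-up numbers)
fit <- c("3" = -10.2, "5" = -11.7)

# Lower values indicate better fit, so take the minimum
best_steps <- names(which.min(fit))
best_solution <- unique_solutions[[best_steps]]

best_steps
best_solution
```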

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>

References

Entropy fit measures
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (in press). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research.

Simulation for EGA.fit
Jamison, L., Christensen, A. P., & Golino, H. (under review). Optimizing Walktrap's community detection in networks using the Total Entropy Fit Index. PsyArXiv.

Leiden algorithm
Traag, V. A., Waltman, L., & Van Eck, N. J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9(1), 1-12.

Louvain algorithm
Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.

Walktrap algorithm
Pons, P., & Latapy, M. (2006). Computing communities in large networks using random walks. Journal of Graph Algorithms and Applications, 10, 191-218.

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate optimal EGA with Walktrap
fit.walktrap <- EGA.fit(
  data = wmt, algorithm = "walktrap",
  steps = 3:8, # default
  plot.EGA = FALSE # no plot for CRAN checks
)

# Estimate optimal EGA with Leiden and CPM
fit.leiden <- EGA.fit(
  data = wmt, algorithm = "leiden",
  objective_function = "CPM", # default
  # resolution_parameter = seq.int(0, max(abs(network)), 0.01),
  # For CPM, the default max resolution parameter
  # is set to the largest absolute edge in the network
  plot.EGA = FALSE # no plot for CRAN checks
)

# Estimate optimal EGA with Leiden and modularity
fit.leiden <- EGA.fit(
  data = wmt, algorithm = "leiden",
  objective_function = "modularity",
  resolution_parameter = seq.int(0, 2, 0.05),
  # default for modularity
  plot.EGA = FALSE # no plot for CRAN checks
)

## Not run: 
# Estimate optimal EGA with Louvain
fit.louvain <- EGA.fit(
  data = wmt, algorithm = "louvain",
  resolution_parameter = seq.int(0, 2, 0.05), # default
  plot.EGA = FALSE # no plot for CRAN checks
)
## End(Not run)

`EGA` Network of `wmt2` Data

Description

EGA results from ega.wmt <- EGA(wmt2[,7:24]) for the Wiener Matrizen-Test (WMT-2)

Usage

data(ega.wmt)

Format

A list with 8 objects (see Value in EGA)

Examples

data("ega.wmt")

S3 Plot Methods for `EGAnet`

Description

General usage for plots created by EGAnet's S3 methods. Plots across the EGAnet package leverage GGally's ggnet2 and ggplot2's ggplot.

Most plots allow the full usage of the gg* series functionality and therefore plotting arguments should be referenced through those packages rather than here in EGAnet.

The sections below list the functions and their usage for the S3 plot methods. The plot methods are intended to be generic and without many arguments so that nearly all arguments are passed to ggnet2 and ggplot.

There are some constraints placed on certain plots to keep the EGAnet style throughout the (network) plots in the package, so be aware that if some settings are not changing your plot output, then these settings might be fixed to maintain the EGAnet style

General Usage

plot(x, ...)

plot.dynEGA(x, base = 1, id = NULL, ...)

plot.dynEGA.Group(x, base = 1, ...)

plot.dynEGA.Individual(x, base = 1, id = NULL, ...)

plot.hierEGA(
  x, plot.type = c("multilevel", "separate"),
  color.match = FALSE, ...
)

plot.invariance(x, p_type = c("p", "p_BH"), p_value = 0.05, ...)

plot.TEFI.compare(x, base.name, comparison.name, base.color, comparison.color, ...)

General Arguments

x — EGAnet object with available S3 plot method (see full list below)
color.palette — Character (vector). Either a character (length = 1) from the pre-defined palettes in color_palette_EGA or character (length = total number of communities) using HEX codes (see Color Palettes and Examples sections)
layout — Character (length = 1). Layouts can be set using gplot.layout and the ending layout name; for example, gplot.layout.circle can be set in these functions using layout = "circle" or mode = "circle" (see Examples)
base — Numeric (length = 1). Plot to be used as the base for the configuration of the networks. Uses the number of the order in which the plots are input. Defaults to 1 or the first plot
id — Numeric index(es) or character name(s). IDs to use when plotting dynEGA level = "individual". Defaults to NULL or 4 IDs drawn at random
plot.type — Character (length = 1). Whether hierEGA networks should be plotted in a stacked, "multilevel" fashion or as "separate" plots. Defaults to "multilevel"
color.match — Boolean (length = 1). Whether lower order community colors in the hierEGA plot should be "matched" and used as the border color for the higher order communities. Defaults to FALSE
p_type — Character (length = 1). Type of p-value when plotting invariance. Defaults to "p" or uncorrected p-value. Set to "p_BH" for the Benjamini-Hochberg corrected p-value
p_value — Numeric (length = 1). The p-value to use alongside p_type when plotting invariance. Defaults to 0.05
base.name — Character (length = 1). A string to label the base structure in the plot. Defaults to "Base"
comparison.name — Character (length = 1). A string to label the comparison structure in the plot. Defaults to "Comparison"
base.color — Character (length = 1). A string specifying the color of the base structure in the plot. Hex codes can be used. Defaults to "blue"
comparison.color — Character (length = 1). A string specifying the color of the comparison structure in the plot. Hex codes can be used. Defaults to "red"
... — Additional arguments to pass on to ggnet2 and gplot.layout (see Examples)

`*EGA` Plots

bootEGA, dynEGA, EGA, EGA.estimate, EGA.fit, hierEGA, invariance, riEGA

All Available S3 Plot Methods

boot.ergoInfo, bootEGA, dynEGA, dynEGA.Group, dynEGA.Individual, dynEGA.Population, EGA, EGA.estimate, EGA.fit, hierEGA, infoCluster, invariance, itemStability, riEGA

Color Palettes

color_palette_EGA will implement some color palettes in EGAnet. The main EGAnet style palette is "polychrome". This palette currently has 40 colors but there will likely be a need to expand it further (e.g., hierEGA demands a lot of colors).

The color.palette argument will also accept HEX code colors that are the same length as the number of communities in the plot.
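
Before plotting, a custom HEX palette can be checked against the number of communities. The helper `check_palette` below is hypothetical (not part of EGAnet) and assumes six-digit HEX codes; the membership vector is a toy example.

```r
# Hypothetical helper, not part of EGAnet
check_palette <- function(color.palette, wc) {
  # Number of communities, ignoring disconnected (NA) nodes
  n_communities <- length(unique(wc[!is.na(wc)]))

  # Six-digit HEX codes such as "#232D4B"
  is_hex <- grepl("^#[0-9A-Fa-f]{6}$", color.palette)

  all(is_hex) && length(color.palette) == n_communities
}

# Toy membership vector with two communities
wc <- c(1, 1, 1, 2, 2, 2)

check_palette(c("#232D4B", "#F84C1E"), wc)            # two colors, two communities
check_palette(c("#232D4B", "#F84C1E", "#FFFFFF"), wc) # one color too many
```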

In any network plots, the color.palette argument can be used to select color palettes from color_palette_EGA as well as those in the color scheme of RColorBrewer

For more worked examples than below, see Plots in {EGAnet}

Examples


# Using different arguments in {GGally}'s `ggnet2`
plot(ega.wmt, node.size = 6, edge.size = 4)

# Using a different layout in {sna}'s `gplot.layout`
plot(ega.wmt, layout = "circle") # 'layout' argument
plot(ega.wmt, mode = "circle") # 'mode' argument

# Using different color palettes with `color_palette_EGA`

## Pre-defined palette
plot(ega.wmt, color.palette = "blue.ridge2")

## University of Virginia colors
plot(ega.wmt, color.palette = c("#232D4B", "#F84C1E"))

## Vanderbilt University colors
## (with additional {GGally} `ggnet2` argument)
plot(
  ega.wmt, color.palette = c("#FFFFFF", "#866D4B"),
  label.color = "#000000"
)

Exploratory Graph Model

Description

Function to fit the Exploratory Graph Model

Usage

EGM(
  data,
  EGM.model = c("standard", "EGA"),
  communities = NULL,
  structure = NULL,
  search = FALSE,
  p.in = NULL,
  p.out = NULL,
  opt = c("AIC", "BIC", "CFI", "chisq", "logLik", "RMSEA", "SRMR", "TEFI", "TEFI.adj",
    "TLI"),
  constrained = TRUE,
  verbose = TRUE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix
`EGM.model`	Character vector (length = 1). Sets the procedure to conduct `EGM`. Available options: `"EGA"` (default) — Applies `EGA` to obtain the (sparse) regularized network structure, communities, and memberships `"standard"` — Applies the standard EGM model which estimates communities based on the non-regularized empirical partial correlation matrix and sparsity is set using `p.in` and `p.out`
`communities`	Numeric vector (length = 1). Number of communities to use for the `"standard"` type of EGM. Defaults to `NULL`. Providing no input will use the communities and memberships output from the Walktrap algorithm (`cluster_walktrap`) based on the empirical non-regularized partial correlation matrix
`structure`	Numeric or character vector (length = `ncol(data)`). Can be theoretical factors or the structure detected by `EGA`. Defaults to `NULL`
`search`	Boolean (length = 1). Whether a search over parameters should be conducted. Defaults to `FALSE`. Set to `TRUE` to select a model over a variety of parameters that optimizes the `opt` objective
`p.in`	Numeric vector (length = 1). Probability that a node is randomly linked to other nodes in the same community. Within-community edges are set to zero based on `quantile(x, prob = 1 - p.in)` ensuring the lowest edge values are set to zero (i.e., most probable to not be randomly connected). Only used for `EGM.model = "standard"`. Defaults to `NULL` but must be set
`p.out`	Numeric vector (length = 1). Probability that a node is randomly linked to other nodes not in the same community. Between-community edges are set to zero based on `quantile(x, prob = 1 - p.out)` ensuring the lowest edge values are set to zero (i.e., most probable to not be randomly connected). Only used for `EGM.model = "standard"` and `search = FALSE`. Defaults to `NULL` but must be set
`opt`	Character vector (length = 1). Fit index used to select among models when searching (only applies to `search = TRUE`). Available options include: `"AIC"` `"BIC"` `"CFI"` `"chisq"` `"logLik"` `"RMSEA"` `"SRMR"` `"TEFI"` `"TEFI.adj"` `"TLI"` Defaults to `"SRMR"`
`constrained`	Boolean (length = 1). Whether memberships of the communities should be added as a constraint when optimizing the network loadings. Defaults to `TRUE` which ensures assigned loadings are guaranteed to never be smaller than any cross-loadings. Set to `FALSE` to freely estimate each loading similar to exploratory factor analysis
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, `community.unidimensional`, `EGA`, and `net.loads`
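
The quantile rule described for `p.in` and `p.out` can be sketched in base R. The helper `sparsify_edges` is hypothetical (not part of EGAnet), the edge weights are toy values, and the use of absolute values with a strict comparison is an assumption; the package's internal handling may differ.

```r
# Hypothetical helper, not part of EGAnet
sparsify_edges <- function(weights, p) {
  # Cutoff at the (1 - p) quantile of the absolute edge weights
  cutoff <- quantile(abs(weights), probs = 1 - p, names = FALSE)

  # Edges weaker than the cutoff are set to zero
  weights[abs(weights) < cutoff] <- 0
  weights
}

# Toy within-community edge weights
within_edges <- c(0.05, 0.10, 0.20, 0.30, 0.40)

# Keep roughly 80% of the edges; the weakest are zeroed
sparse_edges <- sparsify_edges(within_edges, p = 0.80)
sparse_edges
```

Under this sketch, a probability of 1 leaves every edge in place, and lower probabilities remove progressively more of the weakest edges.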

Author(s)

Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>

Examples

# Get depression data
data <- depression[,24:44]

# Estimate EGM (using EGA)
egm_ega <- EGM(data)

# Estimate EGM (using EGA) specifying communities
egm_ega_communities <- EGM(data, communities = 3)

# Estimate EGM (using EGA) specifying structure
egm_ega_structure <- EGM(
  data, structure = c(
    1, 1, 1, 2, 1, 1, 1,
    1, 1, 1, 3, 2, 2, 2,
    2, 3, 3, 3, 3, 3, 2
  )
)

# Estimate EGM (using standard)
egm_standard <- EGM(
  data, EGM.model = "standard",
  communities = 3, # specify number of communities
  p.in = 0.95, # probability of edges *in* each community
  p.out = 0.80 # probability of edges *between* each community
)

## Not run: 
# Estimate EGM (using EGA search)
egm_ega_search <- EGM(
  data, EGM.model = "EGA", search = TRUE
)

# Estimate EGM (using EGA search and AIC criterion)
egm_ega_search_AIC <- EGM(
  data, EGM.model = "EGA", search = TRUE, opt = "AIC"
)

# Estimate EGM (using search)
egm_search <- EGM(
  data, EGM.model = "standard", search = TRUE,
  communities = 3, # need communities or structure
  p.in = 0.95 # only need 'p.in'
)
## End(Not run)

Compare `EGM` with EFA

Description

Estimates an EGM based on EGA and uses the number of communities as the number of dimensions in exploratory factor analysis (EFA) using fa

Usage

EGM.compare(data, constrained = FALSE, rotation = "geominQ", ...)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix
`constrained`	Boolean (length = 1). Whether memberships of the communities should be added as a constraint when optimizing the network loadings. Defaults to `FALSE` to freely estimate each loading similar to exploratory factor analysis. Note: This default differs from `EGM`. Constraining loadings puts EGM at a deficit relative to EFA and therefore biases the comparability between the methods. It's best to leave the default of unconstrained when using this function.
`rotation`	Character. A rotation to use to obtain a simpler structure for EFA. For a list of available rotations, see `rotations`. Defaults to `"geominQ"`
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, `community.unidimensional`, `EGA`, `EGM`, `net.loads`, and `fa`

Author(s)

Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>

Examples

# Get depression data
data <- depression[,24:44]

# Compare EGM (using EGA) with EFA
## Not run: 
results <- EGM.compare(data)

# Print summary
summary(results)
## End(Not run)

Time-delay Embedding

Description

Reorganizes a single observed time series into an embedded matrix. The embedded matrix is constructed with replicates of an individual time series that are offset from each other in time. The function requires two parameters, one that specifies the number of observations to be used (i.e., the number of embedded dimensions) and the other that specifies the number of observations to offset successive embeddings

Usage

Embed(x, E, tau)

Arguments

`x`	Numeric vector. An observed time series to be reorganized into a time-delayed embedded matrix.
`E`	Numeric (length = 1). Number of embedded dimensions or the number of observations to be used. `E = 5`, for example, will generate a matrix with five columns corresponding to five consecutive observations across each row of the embedded matrix
`tau`	Numeric (length = 1). Number of observations to offset successive embeddings. A tau of one uses adjacent observations. Default is `tau = 1`

Value

Returns a numeric matrix
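
To make the layout of the embedded matrix concrete, the base R sketch below builds one under the description above: each row holds `E` observations spaced `tau` apart, and successive rows start one observation later. The function `embed_sketch` is hypothetical and is not the package's `Embed`; details such as column naming may differ.

```r
# Hypothetical re-implementation for illustration; not EGAnet's `Embed`
embed_sketch <- function(x, E, tau = 1) {
  x <- as.numeric(x)

  # Rows available once the last column's offset is accounted for
  n_rows <- length(x) - (E - 1) * tau
  stopifnot(n_rows >= 1)

  # Column j holds the series shifted by (j - 1) * tau observations
  start <- seq_len(n_rows)
  columns <- lapply(seq_len(E), function(j) x[start + (j - 1) * tau])

  do.call(cbind, columns)
}

# A time series with 8 time points
time_series <- 49:56

# Five embedded dimensions with adjacent observations
embedded <- embed_sketch(time_series, E = 5, tau = 1)
embedded

# Three embedded dimensions using every second observation
embedded_tau2 <- embed_sketch(1:8, E = 3, tau = 2)
embedded_tau2
```

With 8 observations, `E = 5`, and `tau = 1`, the sketch yields 4 rows: the first runs from 49 to 53 and the last from 52 to 56.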

Author(s)

Pascal Deboeck <pascal.deboeck at psych.utah.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>

Examples

# A time series with 8 time points
time_series <- 49:56

# Time series embedding
Embed(time_series, E = 5, tau = 1)

Entropy Fit Index

Description

Computes the fit of a dimensionality structure using empirical entropy. Lower values suggest better fit of a structure to the data.

Usage

entropyFit(data, structure)

Arguments

`data`	Matrix or data frame. Contains variables to be used in the analysis
`structure`	Numeric or character vector (length = `ncol(data)`). A vector representing the structure (numbers or labels for each item). Can be theoretical factors or the structure detected by `EGA`

Value

Returns a list containing:

`Total.Correlation`	The total correlation of the dataset
`Total.Correlation.MM`	Miller-Madow correction for the total correlation of the dataset
`Entropy.Fit`	The Entropy Fit Index
`Entropy.Fit.MM`	Miller-Madow correction for the Entropy Fit Index
`Average.Entropy`	The average entropy of the dataset
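
For readers unfamiliar with the Miller-Madow correction named in these outputs, the sketch below shows the general estimator for a single discrete variable: the empirical Shannon entropy plus (m - 1) / (2n), where m is the number of observed categories and n is the sample size. The function `shannon_entropy` is hypothetical, and this is not the Entropy Fit Index itself; how `entropyFit` bins the data and combines entropies is not shown here.

```r
# Hypothetical helper, not part of EGAnet
shannon_entropy <- function(x, miller_madow = FALSE) {
  # Observed category counts (empty categories are dropped)
  counts <- as.numeric(table(x))
  counts <- counts[counts > 0]
  n <- sum(counts)

  # Empirical (plug-in) entropy in nats
  p <- counts / n
  H <- -sum(p * log(p))

  # Miller-Madow bias correction: (m - 1) / (2n)
  if (miller_madow) {
    H <- H + (length(counts) - 1) / (2 * n)
  }
  H
}

# Toy variable with two equally frequent categories
x <- c(1, 1, 2, 2)

H_plugin <- shannon_entropy(x)
H_mm <- shannon_entropy(x, miller_madow = TRUE)
c(plugin = H_plugin, miller_madow = H_mm)
```

The correction is always non-negative, so the corrected estimate is at least as large as the plug-in estimate, with the gap shrinking as the sample size grows.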

Author(s)

Hudson F. Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen at gmail.com> and Robert Moulder <[email protected]>

References

Initial formalization and simulation
Golino, H., Moulder, R. G., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Nesselroade, J., Sadana, R., Thiyagarajan, J. A., & Boker, S. M. (2020). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research.

Examples

# Load data
wmt <- wmt2[,7:24]

## Not run: 
# Estimate EGA model
ega.wmt <- EGA(data = wmt)
## End(Not run)

# Compute entropy indices
entropyFit(data = wmt, structure = ega.wmt$wc)
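
Several returned values carry a Miller-Madow correction. As a rough illustration of what that correction does, the sketch below computes the empirical entropy of a single discrete variable and adds the standard Miller-Madow bias term, (m - 1) / (2n), where m is the number of observed categories and n the sample size. `entropy_mm` is a hypothetical helper; how `entropyFit` applies the correction to the total correlation and the Entropy Fit Index is not shown here.

```r
# Hypothetical helper (not part of EGAnet): entropy of one discrete variable
entropy_mm <- function(x) {
  counts <- as.numeric(table(x))
  counts <- counts[counts > 0]  # ignore empty categories
  n <- sum(counts)
  p <- counts / n

  # Empirical (plug-in) entropy in nats
  plug_in <- -sum(p * log(p))

  # Miller-Madow bias term: (observed categories - 1) / (2 * sample size)
  correction <- (length(counts) - 1) / (2 * n)

  c(entropy = plug_in, entropy.MM = plug_in + correction)
}

h <- entropy_mm(c("a", "a", "b", "b"))
h
```

For two equally frequent categories observed four times, the plug-in entropy is log(2) and the correction adds 1/8.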

Ergodicity Information Index

Description

Computes the Ergodicity Information Index

Usage

ergoInfo(
  dynEGA.object,
  use = c("edge.list", "unweighted", "weighted"),
  shuffles = 5000
)

Arguments

dynEGA.object

A dynEGA.ind.pop object

use

Character (length = 1). Which network element is used to compute the algorithmic complexity: the list of edges or the weights of the network. Defaults to use = "unweighted". Current options are:

"edge.list": Calculates the algorithmic complexity using the list of edges
"unweighted": Calculates the algorithmic complexity using the binary weights of the encoded prime transformed network. 0 = edge absent and 1 = edge present
"weighted": Calculates the algorithmic complexity using the weights of the encoded prime-weight transformed network

shuffles

Numeric. Number of shuffles used to compute the Kolmogorov complexity. Defaults to 5000

Value

Returns a list containing:

`PrimeWeight`	The prime-weight encoding of the individual networks
`PrimeWeight.pop`	The prime-weight encoding of the population network
`Kcomp`	The Kolmogorov complexity of the prime-weight encoded individual networks
`Kcomp.pop`	The Kolmogorov complexity of the prime-weight encoded population network
`complexity`	The complexity metric proposed by Santoro and Nicosia (2020)
`EII`	The Ergodicity Information Index

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander Christensen <[email protected]>

Examples

# Obtain data
sim.dynEGA <- sim.dynEGA # bypasses CRAN checks

## Not run: 
# Dynamic EGA individual and population structure
dyn.ega1 <- dynEGA.ind.pop(
  data = sim.dynEGA[,-26], n.embed = 5, tau = 1,
  delta = 1, id = 25, use.derivatives = 1,
  ncores = 2, corr = "pearson"
)

# Compute empirical ergodicity information index
eii <- ergoInfo(dyn.ega1)
## End(Not run)

Frobenius Norm (Similarity)

Description

Computes the Frobenius Norm (Ulitzsch et al., 2023)

Usage

frobenius(network1, network2)

Arguments

`network1`	Matrix or data frame. Network to be compared
`network2`	Matrix or data frame. Second network to be compared

Value

Returns Frobenius Norm

Author(s)

Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>

References

Simulation Study
Ulitzsch, E., Khanna, S., Rhemtulla, M., & Domingue, B. W. (2023). A graph theory based similarity metric enables comparison of subpopulation psychometric networks. Psychological Methods.

Examples

# Obtain wmt2 data
wmt <- wmt2[,7:24]

# Set seed (for reproducibility)
set.seed(1234)

# Split data
split1 <- sample(
  1:nrow(wmt), floor(nrow(wmt) / 2)
)
split2 <- setdiff(1:nrow(wmt), split1)

# Obtain split data
data1 <- wmt[split1,]
data2 <- wmt[split2,]

# Perform EBICglasso
glas1 <- EBICglasso.qgraph(data1)
glas2 <- EBICglasso.qgraph(data2)

# Frobenius norm
frobenius(glas1, glas2)
# 0.7070395
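
For orientation, the quantity underlying this metric is the Frobenius norm of the difference between two networks, which base R computes with `norm(type = "F")`. The sketch below uses two made-up 2 x 2 networks and stops at the raw norm. It assumes nothing about how `frobenius` rescales that norm into the similarity it returns, since the rescaling is not described in this manual.

```r
# Two made-up symmetric networks (illustrative values only)
A <- matrix(c(0, 0.3, 0.3, 0), nrow = 2)
B <- matrix(c(0, 0.1, 0.1, 0), nrow = 2)

# Frobenius norm of the difference: square root of the summed squared
# element-wise differences
frob_diff <- norm(A - B, type = "F")

# Same quantity written out by hand
frob_manual <- sqrt(sum((A - B)^2))

c(norm = frob_diff, manual = frob_manual)
```

Both expressions give sqrt(0.08), because the two off-diagonal entries each differ by 0.2.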

Generalized Total Entropy Fit Index using Von Neumann's entropy (Quantum Information Theory) for correlation matrices

Description

Computes the fit (Generalized TEFI) of a hierarchical or correlated bifactor dimensionality structure (or hierEGA objects) using Von Neumann's entropy when the input is a correlation matrix. Lower values suggest better fit of a structure to the data.

Usage

genTEFI(data, structure = NULL, verbose = TRUE)

Arguments

data

Matrix, data frame, or hierEGA object. Can be raw data or correlation matrix

structure

For higher-order and correlated bifactor structures, structure should be a list containing:

lower_order: A vector (length = ncol(data)) representing the first-order structure (numbers or labels for each item in each first-order factor or community)
higher_order: A vector (length = ncol(data) or number of lower_order communities) representing the second-order structure (numbers or labels for each item in each second-order factor or community)

verbose

Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to TRUE to see all messages and warnings for every function call. Set to FALSE to ignore messages and warnings

Value

Returns a three-column data frame containing the Generalized Total Entropy Fit Index using Von Neumann's entropy (VN.Entropy.Fit; first column), the TEFI for the first-order factors (Lower.Order.VN; second column), and the equivalent for the second-order factors (Higher.Order.VN; third column).

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

Examples

# Example using network scores
opt.hier <- hierEGA(
  data = optimism, scores = "network",
  plot.EGA = FALSE # No plot for CRAN checks
)

# Compute the Generalized Total Entropy Fit Index
genTEFI(opt.hier)
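
To make the entropy behind this index more concrete, the sketch below computes Von Neumann's entropy of a correlation matrix in base R. It assumes the usual convention of rescaling the matrix to unit trace so that it behaves like a density matrix; `vn_entropy` is a hypothetical helper, and the way `genTEFI` combines such entropies across lower- and higher-order structures is not reproduced here.

```r
# Hypothetical helper (not part of EGAnet): Von Neumann entropy of a
# correlation matrix
vn_entropy <- function(R) {
  # Rescale to unit trace (assumed density-matrix convention)
  rho <- R / sum(diag(R))

  # Entropy depends only on the eigenvalues of the rescaled matrix
  lambda <- eigen(rho, symmetric = TRUE, only.values = TRUE)$values

  # Drop numerically zero eigenvalues, whose contribution is zero in the limit
  lambda <- lambda[lambda > 1e-12]

  -sum(lambda * log(lambda))
}

R_independent <- diag(3)          # three uncorrelated variables
R_redundant <- matrix(1, 3, 3)    # three perfectly correlated variables

vn_entropy(R_independent)
vn_entropy(R_redundant)
```

Uncorrelated variables give the maximum entropy, log(3) for three variables, while perfectly correlated variables give an entropy of zero.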

Generalized Local Linear Approximation

Description

Estimates the derivatives of a time series using generalized local linear approximation (GLLA). GLLA is a filtering method for estimating derivatives from data that uses time delay embedding and a variant of Savitzky-Golay filtering to accomplish the task.

Usage

glla(x, n.embed, tau, delta, order)

Arguments

`x`	Numeric vector. An observed time series
`n.embed`	Numeric (length = 1). Number of embedded dimensions (the number of observations to be used in the `Embed` function)
`tau`	Numeric (length = 1). Number of observations to offset successive embeddings in the `Embed` function. A `tau` of one uses adjacent observations. Default is `1`
`delta`	Numeric (length = 1). The time between successive observations in the time series. Default is `1`
`order`	Numeric (length = 1). The maximum order of the derivative to be estimated. For example, `order = 2` will return a matrix with three columns containing the estimates of the observed scores and the first and second derivatives for each row of the embedded matrix (i.e., the reorganization of the time series implemented via the `Embed` function)

Value

Returns a matrix containing n columns in which n is one plus the maximum order of the derivatives to be estimated via generalized local linear approximation

Author(s)

Hudson Golino <hfg9s at virginia.edu>

References

GLLA implementation
Boker, S. M., Deboeck, P. R., Edler, C., & Keel, P. K. (2010). Generalized local linear approximation of derivatives from time series. In S.-M. Chow, E. Ferrer, & F. Hsieh (Eds.), The Notre Dame series on quantitative methodology. Statistical methods for modeling human dynamics: An interdisciplinary dialogue (pp. 161-178). Routledge/Taylor & Francis Group.

Filtering procedure
Savitzky, A., & Golay, M. J. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8), 1627-1639.

Examples

# A time series with 8 time points
tseries <- 49:56
deriv.tseries <- glla(tseries, n.embed = 4, tau = 1, delta = 1, order = 2)
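
The estimation can be sketched in base R as a least-squares projection of each embedded row onto a Taylor basis, following the weight-matrix construction usually attributed to Boker et al. (2010). `glla_sketch` and its column labels are hypothetical, and the package's `glla` may differ in details.

```r
# Hypothetical helper (not part of EGAnet): GLLA via a Taylor-basis projection
glla_sketch <- function(x, n.embed, tau = 1, delta = 1, order = 2) {
  # Time-delay embedding: one row per window of n.embed observations
  n_rows <- length(x) - (n.embed - 1) * tau
  index <- outer(seq_len(n_rows), (seq_len(n.embed) - 1) * tau, "+")
  embedded <- matrix(x[as.vector(index)], nrow = n_rows, ncol = n.embed)

  # Centered time offsets of the columns within each window
  offsets <- (seq_len(n.embed) - mean(seq_len(n.embed))) * tau * delta

  # Column k + 1 holds offsets^k / k!, the Taylor basis for derivative k
  L <- sapply(0:order, function(k) offsets^k / factorial(k))

  # Least-squares weights that map a window onto the basis coefficients
  W <- L %*% solve(crossprod(L))

  estimates <- embedded %*% W
  colnames(estimates) <- paste0("order", 0:order)  # illustrative labels
  estimates
}

tseries <- 49:56
deriv.sketch <- glla_sketch(tseries, n.embed = 4, tau = 1, delta = 1, order = 2)
deriv.sketch
```

Because the example series increases by exactly one unit per time point, the sketch returns a first derivative of 1 and a second derivative of 0 in every row, with the zeroth-order column giving the window midpoints (50.5 through 54.5).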

Hierarchical `EGA`

Description

Estimates EGA using the lower-order solution of the Louvain algorithm (cluster_louvain) to identify the lower-order dimensions and then uses factor or network loadings to estimate factor or network scores, which are used to estimate the higher-order dimensions (for more details, see Jiménez et al., 2023).

Usage

hierEGA(
  data,
  loading.method = c("original", "revised"),
  rotation = NULL,
  scores = c("factor", "network"),
  loading.structure = c("simple", "full"),
  impute = c("mean", "median", "none"),
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  lower.algorithm = "louvain",
  higher.algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  plot.EGA = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis (does not accept correlation matrices)
`loading.method`	Character (length = 1). Sets network loading calculation based on implementation described in `"original"` (Christensen & Golino, 2021) or the `"revised"` (Christensen et al., 2024) implementation. Defaults to `"revised"`
`rotation`	Character. A rotation to use to obtain a simpler structure. See `rotations` for the available options. Defaults to `NULL` or no rotation. When a rotation is set, `scores` estimation will be based on the rotated loadings rather than the unrotated loadings
`scores`	Character (length = 1). How should scores for the higher-order structure be estimated? Defaults to `"network"` for network scores computed using the `net.scores` function. Set to `"factor"` for factor scores computed using `fa`. Factor scores will be based on EFA (as in Jiménez et al., 2023). Factor scores use the number of communities from `EGA`, but the estimated factor loadings may not align with these communities. As a result, plots using factor scores may show higher-order factors that do not completely map onto the lower-order communities. Look at `$hierarchical$higher_order$lower_loadings` to determine the composition of the lower-order factors.
`loading.structure`	Character (length = 1). Whether simple structure or the saturated loading matrix should be used when computing scores (`scores = "network"` only). Defaults to `"simple"`. `"simple"` structure more closely mirrors traditional hierarchical factor analytic methods such as CFA; `"full"` structure more closely mirrors EFA methods. Simple structure is the more conservative (established) approach and is therefore the default. Treat `"full"` as experimental because it has not yet been properly vetted and validated
`impute`	Character (length = 1). If there are any missing data, then imputation can be implemented. Available options: `"none"`: Default. No imputation is performed. `"mean"`: The mean value of each variable is used to replace missing data for that variable. `"median"`: The median value of each variable is used to replace missing data for that variable
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"`: Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details). `"cor_auto"`: Uses `cor_auto` to compute correlations. Arguments can be passed along to the function. `"pearson"`: Pearson's correlation is computed for all variables regardless of categories. `"spearman"`: Spearman's rank-order correlation is computed for all variables regardless of categories. For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"`: Computes correlation for all available cases between two variables. `"listwise"`: Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"`: Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details. `"glasso"`: Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details. `"TMFG"`: Computes the TMFG method. See `TMFG` for more details
`lower.algorithm`	Character or `cluster_*` function (length = 1). Defaults to the lower-order `"louvain"` with most common consensus clustering (1000 iterations; see `community.consensus` for more details). Louvain with consensus clustering is strongly recommended. Using any other algorithm is considered experimental because other algorithms have not been designed to capture lower-order communities
`higher.algorithm`	Character or `cluster_*` function (length = 1). Defaults to `"louvain"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"`: See `cluster_leiden` for more details. `"louvain"`: By default, `"louvain"` will implement the higher-order (`order = "higher"`) Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise. `"walktrap"`: See `cluster_walktrap` for more details. Using `algorithm` will set only `higher.algorithm`; `lower.algorithm` will default to Louvain with most common consensus clustering (1000 iterations)
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"`: Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned in the check is 2 or fewer, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation. `"LE"`: Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation. `"louvain"`: Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`plot.EGA`	Boolean. If `TRUE`, returns a plot of the network and its estimated dimensions. Defaults to `TRUE`
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, `EGA`, and `rotations`

Value

Returns a list of lists containing:

`lower_order`	`EGA` results for the lower order structure
`higher_order`	`EGA` results for the higher order structure
`parameters`	A list containing `lower_loadings` and `lower_scores` that were used to estimate scores and the higher order `EGA` results, respectively
`dim.variables`	A data frame with variable names and their lower and higher order assignments
`TEFI`	Generalized TEFI using `tefi`
`plot.hierEGA`	Plot output if `plot.EGA = TRUE`

Author(s)

Marcos Jiménez <marcosjnezhquez@gmailcom>, Francisco J. Abad <[email protected]>, Eduardo Garcia-Garzon <[email protected]>, Hudson Golino <[email protected]>, Alexander P. Christensen <[email protected]>, and Luis Eduardo Garrido <[email protected]>

References

Hierarchical EGA simulation
Jiménez, M., Abad, F. J., Garcia-Garzon, E., Golino, H., Christensen, A. P., & Garrido, L. E. (2023). Dimensionality assessment in bifactor structures with multiple general factors: A network psychometrics approach. Psychological Methods.

3+ level hierarchical EGA
Samo, A., Christensen, A. P., Abad, F. J., Garrido, L. E., Garcia-Garzon, E., Golino, H., & McAbee, S. T. (2023). Building the structure of personality from the bottom-up using Hierarchical Exploratory Graph Analysis. PsyArXiv.

Conceptual implementation
Golino, H., Thiyagarajan, J. A., Sadana, R., Teles, M., Christensen, A. P., & Boker, S. M. (2020). Investigating the broad domains of intrinsic capacity, functional ability and environment: An exploratory graph analysis approach for improving analytical methodologies for measuring healthy aging. PsyArXiv.

Revised network loadings
Christensen, A. P., Golino, H., Abad, F. J., & Garrido, L. E. (2024). Revised network loadings. PsyArXiv.

Examples

# Example using network scores
opt.hier <- hierEGA(
  data = optimism, scores = "network",
  plot.EGA = FALSE # No plot for CRAN checks
)


# Plot multilevel plot
plot(opt.hier, plot.type = "multilevel")

# Plot multilevel plot with higher order
# border color matching the corresponding
# lower order color
plot(opt.hier, color.match = TRUE)

# Plot levels separately
plot(opt.hier, plot.type = "separate")

Convert igraph network to matrix

Description

Converts an igraph network to a matrix

Usage

igraph2matrix(igraph_network, diagonal = 0)

Arguments

`igraph_network`	igraph network object
`diagonal`	Numeric (length = 1). Value to be placed on the diagonal of `network`. Defaults to `0`

Value

Returns the network in matrix format

Author(s)

Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>

Examples

# Convert network to {igraph}
igraph_network <- convert2igraph(ega.wmt$network)

# Convert network back to matrix
igraph2matrix(igraph_network)

Information Theoretic Mixture Clustering for `dynEGA`

Description

Performs hierarchical clustering using Jensen-Shannon distance followed by the Louvain algorithm with consensus clustering. The method iteratively identifies smaller and smaller clusters until there is no change in the clusters identified

Usage

infoCluster(dynEGA.object, plot.cluster = TRUE, ...)

Arguments

`dynEGA.object`	A `dynEGA` or a `dynEGA.ind.pop` object that is used to match the arguments of the EII object
`plot.cluster`	Boolean (length = 1). Should plot of optimal and hierarchical clusters be output? Defaults to `TRUE`. Set to `FALSE` to not plot
`...`	Additional arguments to be passed on to `jsd`

Value

Returns a list containing:

`clusters`	A vector corresponding to the cluster each participant belongs to
`clusterTree`	The dendrogram from `hclust` (the hierarchical clustering)
`clusterPlot`	Plot output from results
`JSD`	Jensen-Shannon Distance

Author(s)

Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>

Examples

# Obtain data
sim.dynEGA <- sim.dynEGA # bypasses CRAN checks

## Not run: 
# Dynamic EGA individual and population structure
dyn.ega1 <- dynEGA.ind.pop(
  data = sim.dynEGA, n.embed = 5, tau = 1,
  delta = 1, id = 25, use.derivatives = 1,
  ncores = 2, corr = "pearson"
)

# Perform information-theoretic clustering
clust1 <- infoCluster(dynEGA.object = dyn.ega1)
## End(Not run)

Information Theory Metrics

Description

A general function to compute several different information theory metrics

Usage

information(
  data,
  base = 2.718282,
  bins = floor(sqrt(nrow(data)/5)),
  statistic = c("entropy", "joint.entropy", "conditional.entropy", "total.correlation",
    "dual.total.correlation", "o.information")
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`base`	Numeric (length = 1). Base of logarithm to use for entropy. Common options include: `2` (bits), `2.718282` (nats), and `10` (bans). Defaults to `exp(1)` or `2.718282`
`bins`	Numeric (length = 1). Number of bins if data are not discrete. Defaults to `floor(sqrt(nrow(data) / 5))`
`statistic`	Character. Information theory statistics to compute. Available options: `"entropy"`: Shannon's entropy (Shannon, 1948) for each variable in `data`. Values range from `0` to `log(k)` where `k` is the number of categories for the variable. `"joint.entropy"`: Shared uncertainty over all variables in `data`. Values range from the maximum of the individual entropies to the sum of individual entropies. `"conditional.entropy"`: Uncertainty remaining after considering all other variables in `data`. Values range from `0` to the individual entropy of the conditioned variable. `"total.correlation"`: Generalization of mutual information to more than two variables (Watanabe, 1960). Quantifies the redundancy of information in `data`. Values range from `0` to the sum of individual entropies minus the maximum of the individual entropies. `"dual.total.correlation"`: "Shared randomness" or total uncertainty remaining in the `data` (Han, 1978). Values range from `0` to joint entropy. `"o.information"`: Quantifies the extent to which the `data` is represented by lower-order (`> 0`; redundancy) or higher-order (`< 0`; synergy) constraint (Crutchfield, 1994). By default, all statistics are computed

Value

Returns a list containing only the requested statistics

Author(s)

Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Shannon's entropy
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423.

Formalization of total correlation
Watanabe, S. (1960). Information theoretical analysis of multivariate correlation. IBM Journal of Research and Development, 4, 66-82.

Applied implementation of total correlation
Felix, L. M., Mansur-Alves, M., Teles, M., Jamison, L., & Golino, H. (2021). Longitudinal impact and effects of booster sessions in a cognitive training program for healthy older adults. Archives of Gerontology and Geriatrics, 94, 104337.

Formalization of dual total correlation
Han, T. S. (1978). Nonnegative entropy measures of multivariate symmetric correlations. Information and Control, 36, 133-156.

Formalization of O-information
Crutchfield, J. P. (1994). The calculi of emergence: Computation, dynamics and induction. Physica D: Nonlinear Phenomena, 75(1-3), 11-54.

Applied implementation of O-information
Marinazzo, D., Van Roozendaal, J., Rosas, F. E., Stella, M., Comolatti, R., Colenbier, N., Stramaglia, S., & Rosseel, Y. (2024). An information-theoretic approach to build hypergraphs in psychometrics. Behavior Research Methods, 1-23.

Examples

# All measures
information(wmt2[,7:24])

# One measure
information(wmt2[,7:24], statistic = "joint.entropy")
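
The relationship between the individual entropies, the joint entropy, and the total correlation can be checked by hand on a toy example. The sketch below assumes data that are already discrete (so no binning is needed) and uses hypothetical helpers, `shannon` and `total_correlation`, rather than the package's internals.

```r
# Hypothetical helpers (not part of EGAnet) for already-discrete data
shannon <- function(x, base = exp(1)) {
  p <- as.numeric(table(x)) / length(x)
  p <- p[p > 0]
  -sum(p * log(p, base = base))
}

total_correlation <- function(data, base = exp(1)) {
  data <- as.data.frame(data)

  # Entropy of each variable on its own
  individual <- vapply(data, shannon, numeric(1), base = base)

  # Joint entropy: treat each distinct row pattern as one category
  joint <- shannon(interaction(data, drop = TRUE), base = base)

  # Total correlation: sum of individual entropies minus joint entropy
  sum(individual) - joint
}

redundant <- data.frame(x = c(0, 0, 1, 1), y = c(0, 0, 1, 1))
independent <- data.frame(x = c(0, 0, 1, 1), y = c(0, 1, 0, 1))

tc_redundant <- total_correlation(redundant, base = 2)
tc_independent <- total_correlation(independent, base = 2)
c(redundant = tc_redundant, independent = tc_independent)
```

With `base = 2`, two identical binary variables share exactly one bit (total correlation of 1), whereas two independent binary variables share none (total correlation of 0).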

Intelligence Data

Description

A response matrix (n = 1152) of the International Cognitive Ability Resource (ICAR) intelligence battery developed by Condon and Revelle (2016).

Usage

data(intelligenceBattery)

Format

A 1185x125 response matrix

Examples

data("intelligenceBattery")

Measurement Invariance of `EGA` Structure

Description

Estimates configural invariance using bootEGA on all data (across groups) first. After configural invariance is established, metric invariance is tested using the community structure that established configural invariance (see Details for more information on this process).

Usage

invariance(
  data,
  groups,
  structure = NULL,
  iter = 500,
  configural.threshold = 0.7,
  configural.type = c("parametric", "resampling"),
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  ncores,
  seed = NULL,
  verbose = TRUE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`groups`	Numeric or character vector (length = `nrow(data)`). Group membership corresponding to each case in data
`structure`	Numeric or character vector (length = `ncol(data)`). A vector representing the structure (numbers or labels for each item). Can be theoretical factors or the structure detected by `EGA`. If supplied, then configural invariance check is skipped (i.e., configural invariance is assumed based on the given structure)
`iter`	Numeric (length = 1). Number of iterations to perform for the permutation. Defaults to `500` (recommended)
`configural.threshold`	Numeric (length = 1). Value to use as a threshold in `itemStability` to determine which items should be removed during configural invariance (see Details for more information). Defaults to `0.70` (recommended)
`configural.type`	Character (length = 1). Type of bootstrap to use for configural invariance in `bootEGA`. Defaults to `"parametric"`
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"`: Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details). `"cor_auto"`: Uses `cor_auto` to compute correlations. Arguments can be passed along to the function. `"pearson"`: Pearson's correlation is computed for all variables regardless of categories. `"spearman"`: Spearman's rank-order correlation is computed for all variables regardless of categories. For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"`: Computes correlation for all available cases between two variables. `"listwise"`: Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"`: Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details. `"glasso"`: Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details. `"TMFG"`: Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"`: See `cluster_leiden` for more details. `"louvain"`: By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise. `"walktrap"`: See `cluster_walktrap` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"`: Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned in the check is 2 or fewer, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation. `"LE"`: Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation. `"louvain"`: Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`ncores`	Numeric (length = 1). Number of cores to use in computing results. Defaults to `ceiling(parallel::detectCores() / 2)` or half of your computer's processing power. Set to `1` to not use parallel computing. If you're unsure how many cores your computer has, then type: `parallel::detectCores()`
`seed`	Numeric (length = 1). Defaults to `NULL` or random results. Set for reproducible results. See Reproducibility and PRNG for more details on random number generation in EGAnet
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`...`	Additional arguments that can be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, `EGA`, `bootEGA`, and `net.loads`

Details

In traditional psychometrics, measurement invariance is tested sequentially, moving from more flexible (more free parameters) to more rigid (fewer free parameters) structures. Measurement invariance in network psychometrics is no different.

Configural Invariance

To establish configural invariance, the data are collapsed across groups and a common sample structure is identified using bootEGA and itemStability. If some variables have a replication less than 0.70 in their assigned dimension, then they are considered unstable and therefore not invariant. These variables are removed and this process is repeated until all items are considered stable (replication values greater than 0.70) or there are no variables left. If configural invariance cannot be established, then the results from the last run are returned and metric invariance is not tested (because configural invariance is not met). Importantly, if any variables are removed, then configural invariance is not met for the original structure. Any removal would suggest that only partial configural invariance is met.
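The removal loop described above can be sketched in base R. This is only an illustration of the control flow: `toy_stability()` is a hypothetical stand-in for running bootEGA and itemStability (its replication values are made up), `prune_unstable()` is not a package function, and treating values strictly below the cutoff as unstable is an assumption.

```r
# Hypothetical stand-in for bootEGA() + itemStability(): returns made-up
# replication proportions for whichever variables are still in the analysis
toy_stability <- function(vars) {
  base <- c(V1 = 0.95, V2 = 0.90, V3 = 0.55, V4 = 0.88, V5 = 0.72, V6 = 0.65)
  # Pretend the remaining items become slightly more stable after each removal
  pmin(base[vars] + 0.02 * (length(base) - length(vars)), 1)
}

# Remove unstable variables until all are stable or none are left
prune_unstable <- function(vars, stability_fun, cutoff = 0.70) {
  removed <- character(0)
  while (length(vars) > 0) {
    stability <- stability_fun(vars)
    unstable <- names(stability)[stability < cutoff]
    if (length(unstable) == 0) break
    removed <- c(removed, unstable)
    vars <- setdiff(vars, unstable)
  }
  list(retained = vars, removed = removed)
}

pruned <- prune_unstable(paste0("V", 1:6), toy_stability)
pruned$retained # variables carried forward to metric invariance
pruned$removed  # any removal implies only partial configural invariance
```

In this toy run, `V3` and `V6` are dropped in the first pass and the remaining four variables are stable in the second pass.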

Metric Invariance

The variables that remain after configural invariance are submitted to metric invariance. First, a network is estimated for each group and then network loadings (net.loads) are computed using the assigned community memberships (determined during configural invariance). Then, the difference between the assigned loadings of the groups is computed. This difference represents the empirical values. Second, the group memberships are permuted and networks are estimated based on these permuted groups for iter times. Then, network loadings are computed and the difference between the assigned loadings of the groups is computed, resulting in a null distribution. The empirical difference is then compared against the null distribution using a two-tailed p-value based on the number of null distribution differences that are greater and less than the empirical differences for each variable. Both uncorrected and false discovery rate corrected p-values are returned in the results. Uncorrected p-values are flagged for significance along with the direction of group differences.
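The permutation logic can be illustrated with a minimal base R sketch. To stay self-contained it uses a difference in group means as a stand-in for the difference in network loadings, and it computes the two-tailed p-value as the share of null differences at least as extreme in absolute value as the empirical one. Both choices are assumptions made for illustration and may differ from what `invariance` does internally.

```r
set.seed(42)

# Toy data: one variable measured in two groups of 50
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 0.8))
groups <- rep(1:2, each = 50)

# Stand-in statistic (invariance() uses network loading differences instead)
stat_diff <- function(x, g) mean(x[g == 1]) - mean(x[g == 2])

# Empirical difference with the observed group labels
empirical <- stat_diff(x, groups)

# Null distribution: shuffle the group labels iter times
iter <- 500
null <- replicate(iter, stat_diff(x, sample(groups)))

# Two-tailed p-value: proportion of null differences at least as extreme
p_value <- mean(abs(null) >= abs(empirical))
p_value
```

The sign of `empirical` gives the direction of the group difference that accompanies the p-value.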

Three or More Groups

When there are 3 or more groups, the function performs metric invariance testing by comparing all possible pairs of groups. Specifically:

Pairwise Comparisons: The function generates all possible unique group pairings and computes the differences in network loadings for each pair. The same community structure, derived from configural invariance or provided by the user, is used for all groups.
Permutation Testing: For each group pair, permutation tests are conducted to assess the statistical significance of the observed differences in loadings. p-values are calculated based on the proportion of permuted differences that are greater than or equal to the observed difference.
Result Compilation: The function compiles the results for each pair including both uncorrected (p) and FDR-corrected (Benjamini-Hochberg; p_BH) p-values, and the direction of differences. It returns a summary of the findings for all pairwise comparisons.

This approach allows for a detailed examination of metric invariance across multiple groups, ensuring that all potential differences are thoroughly assessed while maintaining the ability to identify specific group differences.
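As a rough base R sketch of the pairwise bookkeeping, the block below enumerates all unique group pairs with `combn` and applies the Benjamini-Hochberg correction with `p.adjust`. The p-values are made-up placeholders, the object names are hypothetical, and applying the correction within each pair is an assumption rather than a statement about the function's internals.

```r
# Three hypothetical groups
group_labels <- c("A", "B", "C")

# All unique group pairings: A-B, A-C, B-C
pairs <- combn(group_labels, 2, simplify = FALSE)

# Placeholder uncorrected p-values for four variables (not real results)
toy_p <- c(0.001, 0.020, 0.040, 0.300)

# One block of results per pair, with BH-corrected p-values
results <- do.call(rbind, lapply(pairs, function(pair) {
  data.frame(
    comparison = paste(pair, collapse = "-"),
    variable = paste0("V", seq_along(toy_p)),
    p = toy_p,
    p_BH = p.adjust(toy_p, method = "BH")
  )
}))

results
```

With three groups there are three comparisons, so the compiled data frame has one row per variable per pair.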

For more details, see Jamison, Christensen, and Golino (2024)

Value

Returns a list containing:

`configural.results`	`bootEGA` results from the final run that produced configural invariance. If configural invariance is not established, then the results from the final run are returned
`memberships`	Original memberships provided in `structure` or from `EGA` if `structure = NULL`
`EGA`	Original `EGA` results for the full sample
`groups`	A list containing: `EGA` — `EGA` results for each group `loadings` — Network loadings (`net.loads`) for each group `loadingsDifference` — Difference between the dominant loadings of each group
`permutation`	A list containing: `groups` — Permuted groups across iterations `loadings` — Network loadings (`net.loads`) for each group for each permutation `loadingsDifference` — Difference between the dominant loadings of each group for each permutation
`results`	Data frame of the results (which are printed)

Author(s)

Laura Jamison <[email protected]>, Hudson F. Golino <hfg9s at virginia.edu>, and Alexander P. Christensen <[email protected]>

References

Original implementation
Jamison, L., Christensen, A. P., & Golino, H. F. (2024). Metric invariance in exploratory graph analysis via permutation testing. Methodology, 20(2), 144-186.

Examples

# Load data
wmt <- wmt2[-1,7:24]

# Groups
groups <- rep(1:2, each = nrow(wmt) / 2)

## Not run: 
# Measurement invariance
results <- invariance(wmt, groups, ncores = 2)

# Plot with uncorrected alpha = 0.05
plot(results, p_type = "p", p_value = 0.05)

# Plot with BH-corrected alpha = 0.10
plot(results, p_type = "p_BH", p_value = 0.10)
## End(Not run)

Item Stability Statistics from `bootEGA`

Description

Based on the bootEGA results, this function computes and plots the number of times a variable is estimated in the same dimension as originally estimated by an empirical EGA structure or a theoretical/input structure. The output also contains each variable's replication frequency (i.e., the proportion of bootstraps in which a variable appeared in each dimension).
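The replication proportions can be illustrated with a small base R sketch. It assumes a toy matrix of already homogenized memberships with bootstrap replicates in rows and variables in columns; the layout and the object names are assumptions for illustration, not the internal representation used by `itemStability`.

```r
# Toy homogenized memberships: 4 bootstrap replicates (rows) x 4 variables
boot_memberships <- rbind(
  c(1, 1, 2, 2),
  c(1, 1, 2, 1),
  c(1, 2, 2, 2),
  c(1, 1, 2, 2)
)
colnames(boot_memberships) <- c("V1", "V2", "V3", "V4")

# Assigned structure (empirical EGA memberships or a theoretical structure)
assigned <- c(V1 = 1, V2 = 1, V3 = 2, V4 = 2)

# Proportion of replicates in which each variable lands in its assigned dimension
empirical_dimensions <- colMeans(sweep(boot_memberships, 2, assigned, "=="))

# Proportion of replicates in which each variable lands in each dimension
dims <- sort(unique(assigned))
all_dimensions <- sapply(dims, function(d) colMeans(boot_memberships == d))
colnames(all_dimensions) <- paste0("Dim", dims)

empirical_dimensions
all_dimensions
```

Here `V1` and `V3` replicate in every toy replicate, while `V2` and `V4` replicate in three of four.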

Usage

itemStability(bootega.obj, IS.plot = TRUE, structure = NULL, ...)

Arguments

`bootega.obj`	A `bootEGA` object
`IS.plot`	Boolean (length = 1). Should the plot be produced for `item.replication`? Defaults to `TRUE`
`structure`	Numeric (length = number of variables). A theoretical or pre-defined structure. Defaults to `NULL` or the empirical `EGA` result in the `bootega.obj`
`...`	Deprecated arguments from previous versions of `itemStability`

Value

Returns a list containing:

membership

A list containing:

empirical — A vector of the empirical memberships from the empirical EGA result
bootstrap — A matrix of the homogenized memberships from the replicate samples in the bootEGA results
structure — A vector of the structure used in the analysis. If structure = NULL, then this output will be the same as empirical

item.stability

A list containing:

empirical.dimensions — A vector of the proportion of times each item replicated within the structure defined by structure
all.dimensions — A matrix of the proportion of times each item replicated in each of the structure defined dimensions

plot

Plot output if IS.plot = TRUE

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

Examples

# Load data
wmt <- wmt2[,7:24]

## Not run: 
# Standard EGA example
boot.wmt <- bootEGA(
  data = wmt, iter = 500,
  type = "parametric", ncores = 2
)
## End(Not run)

# Standard item stability
wmt.is <- itemStability(boot.wmt)

## Not run: 
# EGA fit example
boot.wmt.fit <- bootEGA(
  data = wmt, iter = 500,
  EGA.type = "EGA.fit",
  type = "parametric", ncores = 2
)

# EGA fit item stability
wmt.is.fit <- itemStability(boot.wmt.fit)

# Hierarchical EGA example
boot.wmt.hier <- bootEGA(
  data = wmt, iter = 500,
  EGA.type = "hierEGA",
  type = "parametric", ncores = 2
)

# Hierarchical EGA item stability
wmt.is.hier <- itemStability(boot.wmt.hier)

# Random-intercept EGA example
boot.wmt.ri <- bootEGA(
  data = wmt, iter = 500,
  EGA.type = "riEGA",
  type = "parametric", ncores = 2
)

# Random-intercept EGA item stability
wmt.is.ri <- itemStability(boot.wmt.ri)
## End(Not run)

Jensen-Shannon Distance

Description

Computes the Jensen-Shannon Distance between two networks

Usage

jsd(network1, network2, method = c("kld", "spectral"), signed = TRUE)

Arguments

`network1`	Matrix or data frame. Network to be compared
`network2`	Matrix or data frame. Second network to be compared
`method`	Character (length = 1). Method to compute Jensen-Shannon Distance. Defaults to `"spectral"`. Available options: `"kld"` — Uses Kullback-Leibler Divergence `"spectral"` — Uses eigenvalues of combinatorial Laplacian matrix to compute Von Neumann entropy
`signed`	Boolean (length = 1). Should networks remain signed? Defaults to `TRUE`

Value

Returns Jensen-Shannon Distance
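For intuition about the `"spectral"` option, the following base R sketch computes a Jensen-Shannon distance from the Von Neumann entropy of trace-normalized combinatorial Laplacians. It is a simplified illustration under stated assumptions: absolute edge weights are used, log base 2 is used, and the helper names are hypothetical. It is not the package's implementation and is not expected to reproduce the values returned by `jsd`.

```r
# Combinatorial Laplacian (degree matrix minus adjacency), rescaled to trace 1
rescaled_laplacian <- function(A) {
  A <- abs(as.matrix(A))
  diag(A) <- 0
  L <- diag(rowSums(A)) - A
  L / sum(diag(L))
}

# Von Neumann entropy from the eigenvalues (zero eigenvalues contribute nothing)
vn_entropy <- function(rho) {
  lambda <- eigen(rho, symmetric = TRUE, only.values = TRUE)$values
  lambda <- lambda[lambda > 1e-12]
  -sum(lambda * log2(lambda))
}

# Distance = square root of the Jensen-Shannon divergence between the two networks
spectral_jsd <- function(A1, A2) {
  rho1 <- rescaled_laplacian(A1)
  rho2 <- rescaled_laplacian(A2)
  mixture <- (rho1 + rho2) / 2
  divergence <- vn_entropy(mixture) - (vn_entropy(rho1) + vn_entropy(rho2)) / 2
  sqrt(max(0, divergence))
}

# Two toy symmetric networks with four nodes each
A1 <- matrix(c(0, .3, .2, 0,  .3, 0, .4, 0,  .2, .4, 0, .1,  0, 0, .1, 0), 4, 4)
A2 <- matrix(c(0, .1, 0, .4,  .1, 0, .2, 0,  0, .2, 0, .3,  .4, 0, .3, 0), 4, 4)

spectral_jsd(A1, A2) # distance between different networks
spectral_jsd(A1, A1) # identical networks have zero distance
```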

Author(s)

Hudson Golino <hfg9s at virginia.edu> & Alexander P. Christensen <alexander.christensen at Vanderbilt.Edu>

Examples

# Obtain wmt2 data
wmt <- wmt2[,7:24]

# Set seed (for reproducibility)
set.seed(1234)

# Split data
split1 <- sample(
  1:nrow(wmt), floor(nrow(wmt) / 2)
)
split2 <- setdiff(1:nrow(wmt), split1)

# Obtain split data
data1 <- wmt[split1,]
data2 <- wmt[split2,]

# Perform EBICglasso
glas1 <- EBICglasso.qgraph(data1)
glas2 <- EBICglasso.qgraph(data2)

# Spectral JSD
jsd(glas1, glas2)
# 0.1595893

# Spectral JSS (similarity)
1 - jsd(glas1, glas2)
# 0.8404107

# Jensen-Shannon Divergence
jsd(glas1, glas2, method = "kld")
# 0.1393621

Loadings Comparison Test

Description

An algorithm to identify whether data were generated from a factor or network model using factor and network loadings. The algorithm uses heuristics based on theory and simulation. These heuristics were then submitted to several deep learning neural networks with 240,000 samples per model with varying parameters.

Usage

LCT(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  iter = 100,
  seed = NULL,
  verbose = TRUE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details `"louvain"` — By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise `"walktrap"` — See `cluster_walktrap` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"` — Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned in this check is 2 or fewer, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation `"LE"` — Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation `"louvain"` — Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`iter`	Numeric (length = 1). Number of replicate samples to be drawn from a multivariate normal distribution (uses `MASS::mvrnorm`). Defaults to `100` (recommended)
`seed`	Numeric (length = 1). Defaults to `NULL` or random results. Set for reproducible results. See Reproducibility and PRNG for more details on random number generation in `EGAnet`
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`...`	Additional arguments that can be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, and `EGA`

Value

Returns a list containing:

`empirical`	Prediction of model based on empirical dataset only
`bootstrap`	Prediction of model based on means of the loadings across the bootstrap replicate samples
`proportion`	Proportions of models suggested across bootstraps

Author(s)

Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>

References

Model training and validation
Christensen, A. P., & Golino, H. (2021). Factor or network model? Predictions from neural networks. Journal of Behavioral Data Science, 1(1), 85-126.

Examples

# Get data
data <- psych::bfi[,1:25]

## Not run: 
# Compute LCT
## Factor model
LCT(data)
## End(Not run)

Computes the (Signed) Modularity Statistic

Description

Computes (signed) modularity statistic given a network and community structure. Allows the resolution parameter to be set

Usage

modularity(network, memberships, resolution = 1, signed = FALSE)

Arguments

`network`	Matrix or data frame. A symmetric matrix representing a network
`memberships`	Numeric (length = `ncol(network)`). A numeric vector of integer values corresponding to each node's community membership
`resolution`	Numeric (length = 1). A parameter that adjusts modularity to prefer smaller (`resolution` > 1) or larger (0 < `resolution` < 1) communities. Defaults to `1` (standard modularity computation)
`signed`	Boolean (length = 1). Whether signed or absolute modularity should be computed. The most common modularity metric is defined by positive values only. Gomez et al. (2009) introduced a signed version of modularity that will discount modularity for edges with negative values. This property isn't always desired for psychometric networks. If `TRUE`, then this signed modularity metric will be computed. If `FALSE`, then the absolute value of the edges in the network (using `abs`) will be used to compute modularity. Defaults to `FALSE`

Value

Returns the modularity statistic
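To make the role of `resolution` concrete, here is a base R sketch of the standard (absolute value) modularity with a resolution parameter, Q = (1 / 2m) * sum over i, j of [A_ij - resolution * k_i * k_j / 2m] for nodes i and j in the same community. The helper `toy_modularity()` is hypothetical and covers only the `signed = FALSE` case; the package's own computation may handle details differently.

```r
toy_modularity <- function(network, memberships, resolution = 1) {
  A <- abs(as.matrix(network)) # absolute edge weights (signed = FALSE case)
  diag(A) <- 0
  k <- rowSums(A)              # node strength
  two_m <- sum(A)              # twice the total edge weight
  same <- outer(memberships, memberships, "==") # same-community indicator
  sum((A - resolution * outer(k, k) / two_m) * same) / two_m
}

# Toy network: two disconnected triangles
A <- matrix(0, 6, 6)
A[1:3, 1:3] <- 1
A[4:6, 4:6] <- 1
diag(A) <- 0

toy_modularity(A, c(1, 1, 1, 2, 2, 2))
# 0.5

# Placing every node in one community gives zero modularity
toy_modularity(A, rep(1, 6))
# 0
```

Raising `resolution` increases the penalty term, which lowers the modularity of large communities and therefore favors smaller ones.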

Author(s)

Alexander P. Christensen <[email protected]> with assistance from GPT-4

References

Gomez, S., Jensen, P., & Arenas, A. (2009). Analysis of community structure in networks of correlated data. Physical Review E, 80(1), 016114.

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate EGA
ega.wmt <- EGA(wmt, model = "glasso")

# Compute standard (absolute values) modularity
modularity(
  network = ega.wmt$network,
  memberships = ega.wmt$wc,
  signed = FALSE
)
# 0.1697952

# Compute signed modularity
modularity(
  network = ega.wmt$network,
  memberships = ega.wmt$wc,
  signed = TRUE
)
# 0.1701946

Network Loadings

Description

Computes the between- and within-community strength of each variable for each community

Usage

net.loads(
  A,
  wc,
  loading.method = c("original", "revised"),
  scaling = 2,
  rotation = NULL,
  ...
)

Arguments

`A`	Network matrix, data frame, or `EGA` object
`wc`	Numeric or character vector (length = `ncol(A)`). A vector of community assignments. If input into `A` is an `EGA` object, then `wc` is automatically detected
`loading.method`	Character (length = 1). Sets network loading calculation based on implementation described in `"original"` (Christensen & Golino, 2021) or the `"revised"` (Christensen et al., 2024) implementation. Defaults to `"revised"`
`scaling`	Numeric (length = 1). Scaling factor for the magnitude of the `"revised"` network loadings. Defaults to `2`. `10` makes loadings roughly the size of factor loadings when factors are orthogonal
`rotation`	Character. A rotation to use to obtain a simpler structure. For available rotations, see `rotations`. Defaults to `NULL` or no rotation. By setting a rotation, `scores` estimation will be based on the rotated loadings rather than unrotated loadings
`...`	Additional arguments to pass on to `rotations`

Details

Simulation studies have demonstrated that a node's strength centrality is roughly equivalent to factor loadings (Christensen & Golino, 2021; Hallquist, Wright, & Molenaar, 2019). Hallquist and colleagues (2019) found that node strength represented a combination of dominant and cross-factor loadings. This function computes each node's strength within each specified dimension, providing a rough equivalent to factor loadings (including cross-loadings; Christensen & Golino, 2021).
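The idea of a node's strength within each dimension can be sketched in base R as the sum of a node's absolute edge weights to the members of each community. The helper `community_strength()` is hypothetical and deliberately leaves out the standardization, sign handling, and scaling that `net.loads` applies, so its values are not network loadings as returned by the package.

```r
community_strength <- function(A, wc) {
  A <- abs(as.matrix(A))
  diag(A) <- 0
  communities <- sort(unique(wc))
  # For each community, sum each node's edge weights to that community's members
  out <- sapply(communities, function(cm) rowSums(A[, wc == cm, drop = FALSE]))
  dimnames(out) <- list(colnames(A), communities)
  out
}

# Toy network with four nodes in two communities
A <- matrix(0, 4, 4, dimnames = list(paste0("V", 1:4), paste0("V", 1:4)))
A["V1", "V2"] <- A["V2", "V1"] <- 0.5
A["V3", "V4"] <- A["V4", "V3"] <- 0.4
A["V2", "V3"] <- A["V3", "V2"] <- 0.1
A["V1", "V3"] <- A["V3", "V1"] <- -0.2
wc <- c(1, 1, 2, 2)

community_strength(A, wc)
```

Each row holds one node: the entry in its own community's column is the within-community strength (analogous to a dominant loading), and the other entries are between-community strengths (analogous to cross-loadings).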

Value

Returns a list containing:

`unstd`	A matrix of the unstandardized within- and between-community strength values for each node
`std`	A matrix of the standardized within- and between-community strength values for each node
`rotated`	`NULL` if `rotation = NULL`; otherwise, a list containing the rotated standardized network loadings (`loadings`) and correlations between dimensions (`Phi`) from the rotation

Author(s)

Alexander P. Christensen <[email protected]> and Hudson Golino <hfg9s at virginia.edu>

References

Original implementation and simulation
Christensen, A. P., & Golino, H. (2021). On the equivalency of factor and network loadings. Behavior Research Methods, 53, 1563-1580.

Demonstration of node strength similarity to CFA loadings
Hallquist, M., Wright, A. G. C., & Molenaar, P. C. M. (2019). Problems with centrality measures in psychopathology symptom networks: Why network psychometrics cannot escape psychometric theory. Multivariate Behavioral Research, 1-25.

Revised network loadings
Christensen, A. P., Golino, H., Abad, F. J., & Garrido, L. E. (2024). Revised network loadings. PsyArXiv.

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate EGA
ega.wmt <- EGA(
  data = wmt,
  plot.EGA = FALSE # No plot for CRAN checks
)

# Network loadings
net.loads(ega.wmt)

Network Scores

Description

This function computes network scores based on each node's strength within each community in the network (see net.loads). These values are used as "network loadings" for the weights of each variable.

Network scores are computed as a formative composite rather than a reflective factor. This composite representation is consistent with psychometric network theory, which proposes no latent factors.

Scores can be computed as a "simple" structure, which is equivalent to a weighted sum score, or as a "full" structure, which is equivalent to an EFA approach. Conservatively, the "simple" structure approach is recommended until further validation.
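The difference between the two structures can be sketched in base R with a made-up loading matrix. In this simplified illustration, scores are standardized data multiplied by the loadings: the "simple" version keeps only each variable's loading on its assigned community, and the "full" version keeps all loadings. The data, loadings, and object names are placeholders, and the exact weighting used by `net.scores` may differ.

```r
# Toy data: 5 observations on 4 variables
data <- cbind(
  V1 = c(1, 2, 3, 4, 5), V2 = c(2, 1, 4, 3, 5),
  V3 = c(5, 3, 4, 1, 2), V4 = c(4, 5, 3, 2, 1)
)

# Assigned communities and made-up loadings (variables x communities)
wc <- c(1, 1, 2, 2)
loadings <- matrix(
  c(0.45, 0.40, 0.05, 0.10,
    0.08, 0.02, 0.50, 0.35),
  ncol = 2, dimnames = list(colnames(data), c("1", "2"))
)

# Simple structure: zero out every loading outside the assigned community
is_assigned <- outer(wc, seq_len(ncol(loadings)), "==")
simple_loadings <- loadings * is_assigned

# Scores as weighted composites of the standardized variables
z <- scale(data)
simple_scores <- z %*% simple_loadings # weighted sum score per community
full_scores <- z %*% loadings          # cross-loadings also contribute

simple_scores
full_scores
```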

Usage

net.scores(
  data,
  A,
  wc,
  loading.method = c("original", "revised"),
  rotation = NULL,
  scores = c("Anderson", "Bartlett", "components", "Harman", "network", "tenBerge",
    "Thurstone"),
  loading.structure = c("simple", "full"),
  impute = c("mean", "median", "none"),
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`A`	Network matrix, data frame, or `EGA` object
`wc`	Numeric or character vector (length = `ncol(A)`). A vector of community assignments. If input into `A` is an `EGA` object, then `wc` is automatically detected
`loading.method`	Character (length = 1). Sets network loading calculation based on implementation described in `"original"` (Christensen & Golino, 2021) or the `"revised"` (Christensen et al., 2024) implementation. Defaults to `"revised"`
`rotation`	Character. A rotation to use to obtain a simpler structure. For available rotations, see `rotations`. Defaults to `NULL` or no rotation. By setting a rotation, `scores` estimation will be based on the rotated loadings rather than unrotated loadings
`scores`	Character (length = 1). How should scores be estimated? Defaults to `"network"` for network scores. Set to other scoring methods which will be computed using `factor.scores` (see link for arguments and explanations for other methods)
`loading.structure`	Character (length = 1). Whether simple structure or the saturated loading matrix should be used when computing scores. Defaults to `"simple"`. `"simple"` structure more closely mirrors sum scores and CFA; `"full"` structure more closely mirrors EFA. Simple structure is the more "conservative" (established) approach and is therefore the default. Treat `"full"` as experimental because it has not yet been properly vetted and validated
`impute`	Character (length = 1). If there are any missing data, then imputation can be implemented. Available options: `"none"` — Default. No imputation is performed `"mean"` — The mean value of each variable is used to replace missing data for that variable `"median"` — The median value of each variable is used to replace missing data for that variable
`...`	Additional arguments to be passed on to `net.loads` and `factor.scores`

Value

Returns a list containing:

`scores`	A list containing the standardized (`std.scores`) and rotated (`rot.scores`) scores. If `rotation = NULL`, then `rot.scores` will be `NULL`
`loadings`	Output from `net.loads`

Author(s)

Alexander P. Christensen <[email protected]> and Hudson F. Golino <hfg9s at virginia.edu>

References

Original implementation and simulation for loadings
Christensen, A. P., & Golino, H. (2021). On the equivalency of factor and network loadings. Behavior Research Methods, 53, 1563-1580.

Preliminary simulation for scores
Golino, H., Christensen, A. P., Moulder, R., Kim, S., & Boker, S. M. (2021). Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections. Psychometrika.

Revised network loadings
Christensen, A. P., Golino, H., Abad, F. J., & Garrido, L. E. (2024). Revised network loadings. PsyArXiv.

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate EGA
ega.wmt <- EGA(
  data = wmt,
  plot.EGA = FALSE # No plot for CRAN checks
)

# Network scores
net.scores(data = wmt, A = ega.wmt)

Compares Network Structures Using Permutation

Description

A permutation implementation to determine statistical significance of whether the network structures are different from one another

Usage

network.compare(
  base,
  comparison,
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  iter = 1000,
  ncores,
  verbose = TRUE,
  seed = NULL,
  ...
)

Arguments

`base`	Matrix or data frame. Should consist only of variables to be used in the analysis. First dataset
`comparison`	Matrix or data frame. Should consist only of variables to be used in the analysis. Second dataset
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`iter`	Numeric (length = 1). Number of permutations to perform. Defaults to `1000` (recommended)
`ncores`	Numeric (length = 1). Number of cores to use in computing results. Defaults to `ceiling(parallel::detectCores() / 2)` or half of your computer's processing power. Set to `1` to not use parallel computing
`verbose`	Boolean (length = 1). Should progress be displayed? Defaults to `TRUE`. Set to `FALSE` to not display progress
`seed`	Numeric (length = 1). Defaults to `NULL` or random results. Set for reproducible results. See Reproducibility and PRNG for more details on random number generation in `EGAnet`
`...`	Additional arguments that can be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, `EGA`, and `jsd`

Value

Returns a list:

`network`	Data frame with row names of each measure, empirical value (`statistic`), and p-value based on the permutation test (`p.value`)
`edges`	List containing matrices of values for empirical values (`statistic`), p-values (`p.value`), and Benjamini-Hochberg corrected p-values (`p.adjusted`)
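A network-level permutation test of this kind can be sketched in base R. To stay self-contained, the sketch uses plain correlation matrices in place of estimated networks and the raw Frobenius norm of their difference as the statistic; both are simplifying assumptions, and the measures reported by `network.compare` are computed differently.

```r
set.seed(1)

# Two toy datasets with the same four variables
n <- 60
base <- matrix(rnorm(n * 4), n, 4)
comparison <- matrix(rnorm(n * 4), n, 4)

# Statistic: Frobenius norm of the difference between correlation matrices
frobenius_diff <- function(x, y) norm(cor(x) - cor(y), type = "F")

observed <- frobenius_diff(base, comparison)

# Null distribution: pool the cases and reassign them to two groups at random
pooled <- rbind(base, comparison)
iter <- 200
null <- replicate(iter, {
  idx <- sample(nrow(pooled), nrow(base))
  frobenius_diff(pooled[idx, , drop = FALSE], pooled[-idx, , drop = FALSE])
})

# p-value: share of permuted differences at least as large as the observed one
p_value <- mean(null >= observed)
p_value
```

The same resampling scheme applied to each edge separately would yield edge-level p-values, which could then be adjusted with `p.adjust(..., method = "BH")`.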

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Frobenius Norm
Ulitzsch, E., Khanna, S., Rhemtulla, M., & Domingue, B. W. (2023). A graph theory based similarity metric enables comparison of subpopulation psychometric networks. Psychological Methods.

Jensen-Shannon Similarity (1 - Distance)
De Domenico, M., Nicosia, V., Arenas, A., & Latora, V. (2015). Structural reducibility of multilayer networks. Nature Communications, 6(1), 1–9.

Total Network Strength
van Borkulo, C. D., van Bork, R., Boschloo, L., Kossakowski, J. J., Tio, P., Schoevers, R. A., Borsboom, D., & Waldorp, L. J. (2023). Comparing network structures on three aspects: A permutation test. Psychological Methods, 28(6), 1273–1285.

Examples

# Load data
wmt <- wmt2[,7:24]

# Set groups (if necessary)
groups <- rep(1:2, each = nrow(wmt) / 2)

# Groups
group1 <- wmt[groups == 1,]
group2 <- wmt[groups == 2,]

## Not run: 
# Perform comparison
results <- network.compare(group1, group2)

# Print results
print(results)

# Plot edge differences
plot(results)
## End(Not run)

Apply a Network Estimation Method

Description

General function to apply network estimation methods in EGAnet

Usage

network.estimation(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  network.only = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`network.only`	Boolean (length = 1). Whether the network only should be output. Defaults to `TRUE`. Set to `FALSE` to obtain all output for the network estimation method
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate` and the different network estimation methods (see `model` for model specific details)

Value

Returns a matrix populated with a network from the input data

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Graphical Least Absolute Shrinkage and Selection Operator (GLASSO)
Friedman, J., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 432–441.

GLASSO with Extended Bayesian Information Criterion (EBICglasso)
Epskamp, S., & Fried, E. I. (2018). A tutorial on regularized partial correlation networks. Psychological Methods, 23(4), 617–634.

Bayesian Gaussian Graphical Model (BGGM)
Williams, D. R. (2021). Bayesian estimation for Gaussian graphical models: Structure learning, predictability, and network comparisons. Multivariate Behavioral Research, 56(2), 336–352.

Triangulated Maximally Filtered Graph (TMFG)
Massara, G. P., Di Matteo, T., & Aste, T. (2016). Network filtering for big data: Triangulated maximally filtered graph. Journal of Complex Networks, 5, 161-178.

Examples

# Load data
wmt <- wmt2[,7:24]

# EBICglasso (default for EGA functions)
glasso_network <- network.estimation(
  data = wmt, model = "glasso"
)

# TMFG
tmfg_network <- network.estimation(
  data = wmt, model = "TMFG"
)

GLASSO with Non-convex Penalties

Description

The graphical least absolute shrinkage and selection operator (GLASSO) with non-convex regularization penalties

Usage

network.nonconvex(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  penalty = c("iPOT", "LGP", "POP", "SPOT"),
  gamma = NULL,
  lambda = NULL,
  nlambda = 50,
  lambda.min.ratio = 0.01,
  penalize.diagonal = TRUE,
  optimize.over = c("none", "lambda", "both"),
  ic = c("AIC", "AICc", "BIC", "EBIC"),
  ebic.gamma = 0.5,
  fast = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`n`	Numeric (length = 1). Sample size must be provided if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`penalty`	Character (length = 1). Available options: `"iPOT"` — Inverse power of two `"LGP"` — Lambda-gamma power `"POP"` — Plus one Pareto `"SPOT"` — Sigmoid power of two (default)
`gamma`	Numeric (length = 1). Adjusts the shape of the penalty. Defaults: `"iPOT"` = 5 `"LGP"` = 5 `"POP"` = 4 `"SPOT"` = 3
`lambda`	Numeric (length = 1). Adjusts the initial penalty provided to the non-convex penalty function
`nlambda`	Numeric (length = 1). Number of lambda values to test. Defaults to `50`
`lambda.min.ratio`	Numeric (length = 1). Ratio of lowest lambda value compared to maximal lambda. Defaults to `0.01`
`penalize.diagonal`	Boolean (length = 1). Should the diagonal be penalized? Defaults to `TRUE`
`optimize.over`	Character (length = 1). Whether optimization of lambda, gamma, both, or no hyperparameters should be performed. Defaults to `"none"` (no optimization)
`ic`	Character (length = 1). What information criterion should be used for model selection? Available options include: `"AIC"` — Akaike's information criterion: $-2L + 2E$ `"AICc"` — AIC corrected: $AIC + \frac{2E^2 + 2E}{n - E - 1}$ `"BIC"` — Bayesian information criterion: $-2L + E \cdot \log{(n)}$ `"EBIC"` — Extended BIC: $BIC + 4E \cdot \gamma \cdot \log{(E)}$ Term definitions: $n$ — sample size $p$ — number of variables $E$ — edges $S$ — empirical correlation matrix $K$ — estimated inverse covariance matrix (network) $L = \frac{n}{2} \cdot \log \text{det} K - \sum_{i=1}^p (SK)_{ii}$ Defaults to `"BIC"`
`ebic.gamma`	Numeric (length = 1). Value to set the gamma parameter in EBIC (see above). Defaults to `0.50`. Only used if `ic = "EBIC"`
`fast`	Boolean (length = 1). Whether the `glassoFast` version should be used to estimate the GLASSO. Defaults to `TRUE`. The fast results may differ by less than floating point of the original GLASSO implemented by `glasso` and should not impact reproducibility much (set to `FALSE` if concerned)
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`
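
The information criteria listed under `ic` can be illustrated with a short base R sketch. The helper name `information_criteria` and the toy matrices are hypothetical. The log-likelihood is grouped as $\frac{n}{2}(\log \det K - \mathrm{tr}(SK))$, which is an assumption about how the formula above is meant to be read, and the EBIC term follows the formula as printed in this manual.

```r
# Toy 3-variable example: S is an empirical correlation matrix and
# K is a sparse precision matrix (values are for illustration only)
S <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)
K <- solve(S)
K[1, 3] <- K[3, 1] <- 0  # drop one edge to mimic regularization
n <- 200                 # sample size

information_criteria <- function(S, K, n, ebic.gamma = 0.5) {
  # Number of edges (non-zero off-diagonal elements, counted once)
  E <- sum(K[lower.tri(K)] != 0)
  # Gaussian log-likelihood up to a constant (grouping is an assumption)
  L <- (n / 2) * (log(det(K)) - sum(diag(S %*% K)))
  aic <- -2 * L + 2 * E
  bic <- -2 * L + E * log(n)
  c(
    AIC = aic,
    AICc = aic + (2 * E^2 + 2 * E) / (n - E - 1),
    BIC = bic,
    EBIC = bic + 4 * E * ebic.gamma * log(E)  # as printed above
  )
}

ic_values <- information_criteria(S, K, n)
```

Lower values indicate the preferred model, so the same function could be evaluated over a grid of lambda values and the minimum retained.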

Value

A network matrix

Author(s)

Alexander P. Christensen <alexpaulchristensen at gmail.com> and Hudson Golino <hfg9s at virginia.edu>

Examples

# Obtain data
wmt <- wmt2[,7:24]

# Obtain network
awe_network <- network.nonconvex(data = wmt)

Predict New Data based on Network

Description

General function to compute a network's predictive power on new data, following Haslbeck and Waldorp (2018) and Williams and Rodriguez (2022)

This implementation is different from the predictability in the mgm package (Haslbeck), which is based on (regularized) regression. This implementation uses the network directly, converting the partial correlations into an implied precision (inverse covariance) matrix. See Details for more information

Usage

network.predictability(network, original.data, newdata, ordinal.categories = 7)

Arguments

`network`	Matrix or data frame. A partial correlation network
`original.data`	Matrix or data frame. Must consist only of variables to be used to estimate the `network`. See Examples
`newdata`	Matrix or data frame. Must consist of the same variables in the same order as `original.data`. See Examples
`ordinal.categories`	Numeric (length = 1). Up to the number of categories before a variable is considered continuous. Defaults to `7` categories before `8` is considered continuous

Details

This implementation of network predictability proceeds in several steps with important assumptions:

1. Network was estimated using (partial) correlations (not regression like the mgm package!)

2. Original data that was used to estimate the network in 1. is necessary to apply the original scaling to the new data

3. (Linear) regression-like coefficients are obtained by reverse engineering the inverse covariance matrix from the network's partial correlations (i.e., by setting the diagonal of the network to -1 and computing the inverse of the opposite signed partial correlation matrix; see EGAnet:::pcor2inv)

4. Predicted values are obtained by matrix multiplying the new data with these coefficients

5. Dichotomous and polytomous data are given categorical values based on the original data's thresholds and these thresholds are used to convert the continuous predicted values into their corresponding categorical values

6. Evaluation metrics:

dichotomous — "Accuracy" or the percent correctly predicted for the 0s and 1s and "Kappa" or Cohen's Kappa (see References)
polytomous — "Linear Kappa" or linearly weighted Kappa and "Krippendorff's alpha" (see References)
continuous — R-squared ("R2") and root mean square error ("RMSE")
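
Steps 2 to 4 can be sketched in base R. This is an illustration under stated assumptions, not the code of `EGAnet:::pcor2inv` or `network.predictability`: the toy correlation matrix and simulated data are made up, and rescaling the inverted matrix to a unit diagonal with `cov2cor` is an assumption made so that the coefficients apply to standardized data.

```r
# Hypothetical correlation matrix standing in for the training data
R0 <- matrix(c(1.0, 0.5, 0.3,
               0.5, 1.0, 0.4,
               0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)

# Partial correlation network implied by R0 (zero diagonal)
network <- -cov2cor(solve(R0))
diag(network) <- 0

# Step 3: set the diagonal to -1, flip the sign, and invert
P <- network
diag(P) <- -1
implied_R <- cov2cor(solve(-P))  # unit-diagonal rescaling is an assumption
K <- solve(implied_R)

# Regression-like coefficients: row i holds the weights predicting variable i
betas <- -K / diag(K)
diag(betas) <- 0

# Step 2: scale new data using the original data's means and SDs
set.seed(42)
original <- matrix(rnorm(300), ncol = 3)
newdata <- matrix(rnorm(30), ncol = 3)
scaled_new <- scale(
  newdata,
  center = colMeans(original),
  scale = apply(original, 2, sd)
)

# Step 4: predicted values from matrix multiplication
predictions <- scaled_new %*% t(betas)
```

In this toy case the coefficients in each row match the standardized regression weights of that variable on all others, which is the sense in which they are "regression-like".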

Value

Returns a list containing:

`predictions`	Predicted values of `newdata` based on the `network`
`betas`	Beta coefficients derived from the `network`
`results`	Performance metrics for each variable in `newdata`
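
Some of the performance metrics named in Details can be written in a few lines of base R. The helper names and toy vectors are hypothetical, and these are textbook definitions that may differ in detail from the package's internal computations.

```r
# Proportion of exact matches (dichotomous data)
accuracy <- function(observed, predicted) {
  mean(observed == predicted)
}

# Cohen's (unweighted) kappa: agreement corrected for chance
cohen_kappa <- function(observed, predicted) {
  categories <- sort(unique(c(observed, predicted)))
  joint <- table(
    factor(observed, levels = categories),
    factor(predicted, levels = categories)
  ) / length(observed)
  p_observed <- sum(diag(joint))
  p_expected <- sum(rowSums(joint) * colSums(joint))
  (p_observed - p_expected) / (1 - p_expected)
}

# Root mean square error (continuous data)
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

# Toy dichotomous responses and predictions
observed <- c(0, 0, 1, 1, 1, 0, 1, 0)
predicted <- c(0, 1, 1, 1, 0, 0, 1, 0)

accuracy(observed, predicted)
cohen_kappa(observed, predicted)
```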

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

References

Original Implementation of Node Predictability
Haslbeck, J. M., & Waldorp, L. J. (2018). How well do network models predict observations? On the importance of predictability in network models. Behavior Research Methods, 50(2), 853–861.

Derivation of Regression Coefficients Used (Formula 3)
Williams, D. R., & Rodriguez, J. E. (2022). Why overfitting is not (usually) a problem in partial correlation networks. Psychological Methods, 27(5), 822–840.

Cohen's Kappa
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37-46.

Cohen, J. (1968). Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213-220.

Krippendorff's alpha
Krippendorff, K. (2013). Content analysis: An introduction to its methodology (3rd ed.). Thousand Oaks, CA: Sage.

Examples

# Load data
wmt <- wmt2[,7:24]

# Set seed (to reproduce results)
set.seed(42)

# Split data
training <- sample(
  1:nrow(wmt), round(nrow(wmt) * 0.80) # 80/20 split
)

# Set splits
wmt_train <- wmt[training,]
wmt_test <- wmt[-training,]

# EBICglasso (default for EGA functions)
glasso_network <- network.estimation(
  data = wmt_train, model = "glasso"
)

# Check predictability
network.predictability(
  network = glasso_network, original.data = wmt_train,
  newdata = wmt_test
)

Optimism Data

Description

A response matrix (n = 282) containing responses to 10 items of the Revised Life Orientation Test (LOT-R), developed by Scheier, Carver, & Bridges (1994).

Usage

data(optimism)

Format

A 282x10 response matrix

References

Scheier, M. F., Carver, C. S., & Bridges, M. W. (1994). Distinguishing optimism from neuroticism (and trait anxiety, self-mastery, and self-esteem): a reevaluation of the Life Orientation Test. Journal of Personality and Social Psychology, 67, 1063-1078.

Examples

data("optimism")

Computes Polychoric Correlations

Description

A fast implementation of polychoric correlations in C. Uses the Beasley-Springer-Moro algorithm (Beasley & Springer, 1977; Moro, 1995) to estimate the inverse univariate normal CDF, the Drezner-Wesolowsky approximation (Drezner & Wesolowsky, 1990) to estimate the bivariate normal CDF, and Brent's method (Brent, 2013) for optimization of rho

Usage

polychoric.matrix(
  data,
  na.data = c("pairwise", "listwise"),
  empty.method = c("none", "zero", "all"),
  empty.value = c("none", "point_five", "one_over"),
  ...
)

Arguments

`data`	Matrix or data frame. A dataset with all ordinal values (rows = cases, columns = variables). Values are required to be between `0` and `11`. Make any necessary adjustments prior to analysis (e.g., scales from -3 to 3 in increments of 1 should be shifted by adding 4 to all values)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`empty.method`	Character (length = 1). Method for empty cell correction. Available options: `"none"` — Adds no value (`empty.value = "none"`) to the empirical joint frequency table between two variables `"zero"` — Adds `empty.value` to the cells with zero in the joint frequency table between two variables `"all"` — Adds `empty.value` to all in the joint frequency table between two variables
`empty.value`	Character (length = 1). Value to add to the joint frequency table cells. Accepts numeric values between 0 and 1 or specific methods: `"none"` — Adds no value (`0`) to the empirical joint frequency table between two variables `"point_five"` — Adds `0.5` to the cells defined by `empty.method` `"one_over"` — Adds `1 / n` where n equals the number of cells based on `empty.method`. For `empty.method = "zero"`, n equals the number of zero cells
`...`	Not used but made available for easier argument passing
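
The interaction between `empty.method` and `empty.value` can be illustrated on a small joint frequency table in base R. The toy responses are made up, and this sketch only mirrors the argument descriptions above; it does not call the package's C implementation.

```r
# Toy ordinal responses for two variables (categories 0 to 2)
x <- c(0, 0, 1, 1, 2, 2, 2, 1)
y <- c(0, 1, 1, 1, 2, 2, 1, 0)

# Empirical joint frequency table as a plain matrix
joint <- unclass(table(x, y))
zero_cells <- joint == 0

# empty.method = "zero", empty.value = "point_five"
point_five <- joint
point_five[zero_cells] <- point_five[zero_cells] + 0.5

# empty.method = "zero", empty.value = "one_over" (n = number of zero cells)
one_over_zero <- joint
one_over_zero[zero_cells] <- one_over_zero[zero_cells] + 1 / sum(zero_cells)

# empty.method = "all", empty.value = "one_over" (n = number of all cells)
one_over_all <- joint + 1 / length(joint)
```

With `"one_over"`, the total added to the table is always 1, whichever cells receive the correction.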

Value

Returns a polychoric correlation matrix

Author(s)

Alexander P. Christensen <[email protected]> with assistance from GPT-4

References

Beasley-Springer-Moro algorithm
Beasley, J. D., & Springer, S. G. (1977). Algorithm AS 111: The percentage points of the normal distribution. Journal of the Royal Statistical Society. Series C (Applied Statistics), 26(1), 118-121.

Moro, B. (1995). The full monte. Risk 8 (February), 57-58.

Brent optimization
Brent, R. P. (2013). Algorithms for minimization without derivatives. Mineola, NY: Dover Publications, Inc.

Drezner-Wesolowsky bivariate normal approximation
Drezner, Z., & Wesolowsky, G. O. (1990). On the computation of the bivariate normal integral. Journal of Statistical Computation and Simulation, 35(1-2), 101-107.

Examples

# Load data (ensure matrix for missing data example)
wmt <- as.matrix(wmt2[,7:24])

# Compute polychoric correlation matrix
correlations <- polychoric.matrix(wmt)

# Randomly assign missing data
wmt[sample(1:length(wmt), 1000)] <- NA

# Compute polychoric correlation matrix
# with pairwise missing
pairwise_correlations <- polychoric.matrix(
  wmt, na.data = "pairwise"
)

# Compute polychoric correlation matrix
# with listwise missing
listwise_correlations <- polychoric.matrix(
  wmt, na.data = "listwise"
)

Prime Numbers through 100,000

Description

Numeric vector of primes generated from the primes package. Used in the function `ergoInfo`. Not for general use

Usage

data(prime.num)

Format

A numeric vector of prime numbers

Examples

data("prime.num")

Random-Intercept `EGA`

Description

Estimates the number of substantive dimensions after controlling for wording effects. EGA is applied to a residual correlation matrix after subtracting a random intercept factor with equal unstandardized loadings from all the regular and unrecoded reversed items in the database

Usage

riEGA(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  plot.EGA = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis. Must be raw data and not a correlation matrix
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options: `"auto"` — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details) `"cor_auto"` — Uses `cor_auto` to compute correlations. Arguments can be passed along to the function `"cosine"` — Uses `cosine` to compute cosine similarity `"pearson"` — Pearson's correlation is computed for all variables regardless of categories `"spearman"` — Spearman's rank-order correlation is computed for all variables regardless of categories For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options: `"pairwise"` — Computes correlation for all available cases between two variables `"listwise"` — Computes correlation for all complete cases in the dataset
`model`	Character (length = 1). Defaults to `"glasso"`. Available options: `"BGGM"` — Computes the Bayesian Gaussian Graphical Model. Set argument `ordinal.categories` to determine levels allowed for a variable to be considered ordinal. See `?BGGM::estimate` for more details `"glasso"` — Computes the GLASSO with EBIC model selection. See `EBICglasso.qgraph` for more details `"TMFG"` — Computes the TMFG method. See `TMFG` for more details
`algorithm`	Character or `igraph` `cluster_*` function (length = 1). Defaults to `"walktrap"`. Three options are listed below but all are available (see `community.detection` for other options): `"leiden"` — See `cluster_leiden` for more details `"louvain"` — By default, `"louvain"` will implement the Louvain algorithm using the consensus clustering method (see `community.consensus` for more information). This function will implement `consensus.method = "most_common"` and `consensus.iter = 1000` unless specified otherwise `"walktrap"` — See `cluster_walktrap` for more details
`uni.method`	Character (length = 1). What unidimensionality method should be used? Defaults to `"louvain"`. Available options: `"expand"` — Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned in the check is 2 or less, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation `"LE"` — Applies the Leading Eigenvector algorithm (`cluster_leading_eigen`) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation `"louvain"` — Applies the Louvain algorithm (`cluster_louvain`) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either `"consensus.method"` or `"consensus.iter"`
`plot.EGA`	Boolean (length = 1). If `TRUE`, returns a plot of the network and its estimated dimensions. Defaults to `TRUE`
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`, `network.estimation`, `community.detection`, `community.consensus`, and `EGA`

Value

Returns a list containing:

`EGA`	Results from `EGA`
`RI`	A list containing information about the random-intercept model (if the model converged): `fit` — The fit object for the random-intercept model using `cfa` `lavaan.args` — The arguments used in `cfa` `loadings` — Standardized loadings from the random-intercept model `correlation` — Residual correlations after accounting for the random-intercept model
`TEFI`	`tefi` for the estimated structure
`plot.EGA`	Plot output if `plot.EGA = TRUE`

Author(s)

Alejandro Garcia-Pardina <[email protected]>, Francisco J. Abad <[email protected]>, Alexander P. Christensen <[email protected]>, Hudson Golino <hfg9s at virginia.edu>, Luis Eduardo Garrido <[email protected]>, and Robert Moulder <[email protected]>

References

Selection of CFA Estimator
Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17, 354-373.

Examples

# Obtain example data
wmt <- wmt2[,7:24]

# riEGA example
riEGA(data = wmt, plot.EGA = FALSE)
# no plot for CRAN checks

sim.dynEGA Data

Description

Simulated multivariate time series data with 24 variables, 100 individuals, 50 time points per individual, and 2 groups of individuals

Usage

data(sim.dynEGA)

Format

A 5000 x 26 multivariate time series

Details

Data were generated using the simDFM function with the following arguments:

Group 1

simDFM(
  variab = 12, timep = 50, nfact = 2, error = 0.125,
  dfm = "DAFS",
  loadings = EGAnet:::runif_xoshiro(1, min = 0.50, max = 0.70),
  autoreg = 0.80, crossreg = 0.00,
  var.shock = 0.36, cov.shock = 0.18
)

Group 2

simDFM(
  variab = 8, timep = 50, nfact = 3, error = 0.125,
  dfm = "DAFS",
  loadings = EGAnet:::runif_xoshiro(1, min = 0.50, max = 0.70),
  autoreg = 0.80, crossreg = 0.00,
  var.shock = 0.36, cov.shock = 0.18
)

Examples

data("sim.dynEGA")

Simulate data following a Dynamic Factor Model

Description

Function to simulate data following a dynamic factor model (DFM). Two DFMs are currently available: the direct autoregressive factor score model (Engle & Watson, 1981; Nesselroade, McArdle, Aggen, and Meyers, 2002) and the dynamic factor model with random walk factor scores.

Usage

simDFM(
  variab,
  timep,
  nfact,
  error,
  dfm = c("DAFS", "RandomWalk"),
  loadings,
  autoreg,
  crossreg,
  var.shock,
  cov.shock,
  burnin = 1000
)

Arguments

`variab`	Number of variables per factor.
`timep`	Number of time points.
`nfact`	Number of factors.
`error`	Value used to construct the diagonal p x p covariance matrix Q, which generates random errors following a multivariate normal distribution with zero means. The value provided is squared before constructing Q.
`dfm`	A string indicating the dynamical factor model to use. Current options are: `DAFS` — Simulates data using the direct autoregressive factor score model. This is the default method `RandomWalk` — Simulates data using a dynamic factor model with random walk factor scores
`loadings`	Magnitude of the loadings.
`autoreg`	Magnitude of the autoregression coefficients.
`crossreg`	Magnitude of the cross-regression coefficients.
`var.shock`	Magnitude of the random shock variance.
`cov.shock`	Magnitude of the random shock covariance
`burnin`	Number of initial samples to discard when computing the factor scores. Defaults to 1000.
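
The roles of these arguments can be illustrated with a base R sketch of a generic direct autoregressive factor score process. It shows the general model form only (factor scores follow a first-order vector autoregression, and observed variables are loadings times scores plus error) and is not the internal code of `simDFM`. All object names are hypothetical, and a shorter burn-in is used to keep the example quick.

```r
set.seed(1)

# Settings named after the simDFM arguments
variab <- 3; nfact <- 2; timep <- 50; burnin <- 100
error <- 0.05

# Autoregression on the diagonal, no cross-regression
A <- diag(0.8, nfact)

# Random shock covariance: var.shock on the diagonal, cov.shock elsewhere
shock_cov <- matrix(0.18, nrow = nfact, ncol = nfact)
diag(shock_cov) <- 0.36
shock_chol <- chol(shock_cov)

# Simple-structure loading matrix (variab variables per factor)
Lambda <- kronecker(diag(nfact), matrix(0.7, nrow = variab, ncol = 1))

# Factor scores: f_t = A f_{t-1} + shock_t
total <- timep + burnin
scores <- matrix(0, nrow = total, ncol = nfact)
for (t in 2:total) {
  shock <- drop(rnorm(nfact) %*% shock_chol)
  scores[t, ] <- A %*% scores[t - 1, ] + shock
}
scores <- scores[-seq_len(burnin), ]  # discard the burn-in samples

# Observed data: loadings times scores plus errors with variance error^2
errors <- matrix(rnorm(timep * nfact * variab, sd = error), nrow = timep)
observed <- scores %*% t(Lambda) + errors
```

The result has `timep` rows and `variab * nfact` columns, which matches the reading of `variab` as the number of variables per factor.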

Author(s)

Hudson F. Golino <hfg9s at virginia.edu>

References

Engle, R., & Watson, M. (1981). A one-factor multivariate time series model of metropolitan wage rates. Journal of the American Statistical Association, 76(376), 774-781.

Nesselroade, J. R., McArdle, J. J., Aggen, S. H., & Meyers, J. M. (2002). Dynamic factor analysis models for representing process in multivariate time-series. In D. S. Moskowitz & S. L. Hershberger (Eds.), Multivariate applications book series. Modeling intraindividual variability with repeated measures data: Methods and applications, 235-265.

Examples

## Not run: 
# Simulate data from a dynamic factor model
data1 <- simDFM(variab = 5, timep = 50, nfact = 3, error = 0.05,
                dfm = "DAFS", loadings = 0.7, autoreg = 0.8,
                crossreg = 0.1, var.shock = 0.36,
                cov.shock = 0.18, burnin = 1000)
## End(Not run)

Simulate data following an Exploratory Graph Model (`EGM`)

Description

Function to simulate data based on EGM

Usage

simEGM(
  communities,
  variables,
  loadings,
  cross.loadings = 0.01,
  correlations,
  sample.size,
  p.in = 0.95,
  p.out = 0.8,
  max.iterations = 1000
)

Arguments

`communities`	Numeric (length = 1). Number of communities to generate
`variables`	Numeric vector (length = 1 or `communities`). Number of variables per community
`loadings`	Numeric (length = 1). Magnitude of the assigned network loadings. Uses the same magnitude as factor loadings. Uses `runif(n, min = value - 0.025, max = value + 0.025)` for some jitter in the loadings
`cross.loadings`	Numeric (length = 1). Standard deviation of a normal distribution with a mean of zero (`n, mean = 0, sd = value`). Defaults to `0.01`
`correlations`	Numeric (length = 1). Magnitude of the community correlations. Uses `runif(n, min = value - 0.015, max = value + 0.015)` for some jitter in the correlations
`sample.size`	Numeric (length = 1). Number of observations to generate
`p.in`	Numeric (length = 1). Sets the probability of retaining an edge within communities. Single values are applied to all communities. Defaults to `0.95`
`p.out`	Numeric (length = 1 or `communities`). Sets the probability of retaining an edge between communities. Single values are applied to all communities. Defaults to `0.80`
`max.iterations`	Numeric (length = 1). Number of iterations to attempt to get convergence before erroring out. Defaults to `1000`

Author(s)

Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

Examples

simulated <- simEGM(
  communities = 2, variables = 6,
  loadings = 0.55, # use standard factor loading sizes
  correlations = 0.30,
  sample.size = 1000
)

Total Entropy Fit Index using Von Neumann's entropy (Quantum Information Theory) for correlation matrices

Description

Computes the fit (TEFI) of a dimensionality structure using Von Neumann's entropy when the input is a correlation matrix. Lower values suggest better fit of a structure to the data.

Usage

tefi(data, structure = NULL, verbose = TRUE)

Arguments

`data`	Matrix, data frame, or `EGA` class object. Matrix or data frame can be raw data or a correlation matrix. All `EGA` objects are accepted. `hierEGA` input will produce the Generalized TEFI (see `genTEFI`)
`structure`	Numeric or character vector (length = `ncol(data)`). Can be theoretical factors or the structure detected by `EGA`
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `TRUE` to see all messages and warnings for every function call. Set to `FALSE` to ignore messages and warnings

Value

Returns a data frame with columns:

Non-hierarchical Structure

`VN.Entropy.Fit`	The Total Entropy Fit Index using Von Neumann's entropy
`Total.Correlation`	The total correlation of the dataset
`Average.Entropy`	The average entropy of the dataset

Hierarchical Structure

`VN.Entropy.Fit`	The Generalized Total Entropy Fit Index using Von Neumann's entropy
`Lower.Order.VN`	Lower order (only) Total Entropy Fit Index
`Higher.Order.VN`	Higher order (only) Total Entropy Fit Index

Author(s)

Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <[email protected]>, and Robert Moulder <[email protected]>

Examples

# Load data
wmt <- wmt2[,7:24]

# Estimate EGA model
ega.wmt <- EGA(
  data = wmt, model = "glasso",
  plot.EGA = FALSE # no plot for CRAN checks
)

# Compute entropy indices for empirical EGA
tefi(ega.wmt)

# User-defined structure (with `EGA` object)
tefi(ega.wmt, structure = c(rep(1, 5), rep(2, 5), rep(3, 8)))

Compare Total Entropy Fit Index (`tefi`) Between Two Structures

Description

This function computes TEFI values for two different structures using the bootstrapped correlation matrices from `bootEGA` and compares them using a non-parametric bootstrap test. It also visualizes the distributions of TEFI values for both structures.

Usage

tefi.compare(bootega.obj, base, comparison, plot.TEFI = TRUE, ...)

Arguments

`bootega.obj`	A `bootEGA` object
`base`	Numeric (length = columns in original dataset). A vector representing the base structure to be tested
`comparison`	Numeric (length = columns in original dataset). A vector representing the structure to be compared against the `base` structure
`plot.TEFI`	Boolean (length = 1). Whether the TEFI comparison and the p-value should be plotted. Defaults to `TRUE`
`...`	Additional arguments that can be passed on to `plot.EGAnet`. See `Examples` for plotting arguments

Details

The null hypothesis is that the TEFI values obtained in the bootstrapped correlation matrices for the base structure are not lower than the TEFI values obtained in the bootstrapped correlation matrices for the comparison structure. Therefore, the p-value in this bootstrap test can be interpreted as follows:

If the p-value is less than 0.05: TEFI values for the base structure tend to be lower than those for the comparison structure, indicating that the former provides a better fit (lower entropy) than the latter
If the p-value is greater than 0.05: TEFI values for the base structure are not significantly lower than those for the comparison structure, suggesting that both structures may provide similar fits or that the comparison structure might fit better
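
As a rough illustration of the test logic described above, the sketch below computes a one-sided bootstrap p-value as the proportion of bootstrap replicates in which the base structure's TEFI is not lower than the comparison structure's. The helper `boot_p_sketch` and the input values are hypothetical. This is an assumption about how such a test can be computed, not necessarily how `tefi.compare` implements it.

```r
# Hypothetical helper (not part of EGAnet): one-sided bootstrap p-value.
# Each element of the inputs is the TEFI from one bootstrap replicate.
boot_p_sketch <- function(tefi_base, tefi_comparison) {
  stopifnot(length(tefi_base) == length(tefi_comparison))
  # Share of replicates in which the base structure does NOT fit better
  mean(tefi_base >= tefi_comparison)
}

# Made-up TEFI values for four replicates (illustration only)
base_values       <- c(-10.4, -10.1, -10.3, -9.9)
comparison_values <- c(-10.0,  -9.8, -10.1, -10.2)

p_value <- boot_p_sketch(base_values, comparison_values)
p_value
```

A small value means the base structure had the lower TEFI in most replicates. In practice many more replicates would be used (the example in this entry uses `iter = 500`).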

Value

A list containing:

`TEFI.df`	A data frame containing the TEFI values for both structures
`p.value`	The p-value from the non-parametric bootstrap hypothesis test

Author(s)

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <[email protected]>

Examples

# Obtain data
wmt <- wmt2[,7:24]

## Not run: 
# Perform bootstrap EGA
boot.wmt <- bootEGA(
  data = wmt, iter = 500,
  type = "parametric", ncores = 2
)
## End(Not run)

# Perform comparison
comparing_tefi <- tefi.compare(
  boot.wmt,
  base = boot.wmt$EGA$wc, # Compare Walktrap
  comparison = community.detection(
   boot.wmt$EGA$network, algorithm = "louvain"
  ) # With Louvain
)

# Plot options (UVa colors)
plot(
  comparing_tefi,
  base.name = "Walktrap", base.color = "#232D4B",
  comparison.name = "Louvain", comparison.color = "#E57200"
)

Triangulated Maximally Filtered Graph

Description

Applies the Triangulated Maximally Filtered Graph (TMFG) filtering method (see Massara et al., 2016). The TMFG method uses a structural constraint that limits the number of zero-order correlations included in the network (3n - 6, where n is the number of variables). The TMFG algorithm begins by identifying the four variables that have the largest sum of correlations to all other variables. Then, it iteratively adds the variable with the largest sum of three correlations to nodes already in the network until all variables have been added. The resulting structure can be associated with the inverse correlation matrix (i.e., precision matrix) and turned into a GGM (i.e., partial correlation network) using the Local-Global Inversion Method (LoGo; see Barfuss et al., 2016 for more details). See Details for more information.

Usage

TMFG(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  partial = FALSE,
  returnAllResults = FALSE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or correlation matrix
`n`	Numeric (length = 1). Sample size for when a correlation matrix is input into `data`. Defaults to `NULL`. `n` is not necessary and is provided for better functionality in `EGAnet`
`corr`	Character (length = 1). Method to compute correlations. Defaults to `"auto"`. Available options:
  `"auto"`: Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use `ordinal.categories` (see `polychoric.matrix` for more details)
  `"cor_auto"`: Uses `cor_auto` to compute correlations. Arguments can be passed along to the function
  `"cosine"`: Uses `cosine` to compute cosine similarity
  `"pearson"`: Pearson's correlation is computed for all variables regardless of categories
  `"spearman"`: Spearman's rank-order correlation is computed for all variables regardless of categories
  For other similarity measures, compute them first and input them into `data` with the sample size (`n`)
`na.data`	Character (length = 1). How should missing data be handled? Defaults to `"pairwise"`. Available options:
  `"pairwise"`: Computes correlation for all available cases between two variables
  `"listwise"`: Computes correlation for all complete cases in the dataset
`partial`	Boolean (length = 1). Whether partial correlations should be output. Defaults to `FALSE`. The TMFG method is based on the zero-order correlations; the Local-Global Inversion Method (LoGo; see Barfuss et al., 2016 for more details) uses the decomposability of the TMFG network to obtain the inverse covariance structure of the network (which is then converted to partial correlations). Set to `TRUE` to obtain the partial correlations from the LoGo method
`returnAllResults`	Boolean (length = 1). Whether all results should be returned. Defaults to `FALSE` (network only). Set to `TRUE` to access separators and cliques
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments to be passed on to `auto.correlate`

Details

The TMFG method applies a structural constraint on the network, which restrains the network to retain a certain number of edges (3n-6, where n is the number of nodes; Massara et al., 2016). The network is also composed of 3- and 4-node cliques (i.e., sets of connected nodes; a triangle and tetrahedron, respectively). The TMFG method constructs a network using zero-order correlations and the resulting network can be associated with the inverse covariance matrix (yielding a GGM; Barfuss, Massara, Di Matteo, & Aste, 2016). Notably, the TMFG can use any association measure and thus does not assume the data is multivariate normal.

Construction begins by forming a tetrahedron of the four nodes that have the highest sum of correlations that are greater than the average correlation in the correlation matrix. Next, the algorithm iteratively identifies the node that maximizes its sum of correlations to a connected set of three nodes (triangles) already included in the network and then adds that node to the network. The process is completed once every node is connected in the network. In this process, the network automatically generates what's called a planar network. A planar network is a network that could be drawn on a sphere with no edges crossing (often, however, the networks are depicted with edges crossing; Tumminello, Aste, Di Matteo, & Mantegna, 2005).
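
The construction steps above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: `tmfg_sketch` is a hypothetical helper, it assumes a complete correlation matrix with at least four variables, and it uses absolute correlations as weights (an assumption made here for simplicity).

```r
# Hypothetical, simplified TMFG construction (not the EGAnet implementation)
tmfg_sketch <- function(R) {
  R <- as.matrix(R)
  n <- ncol(R)
  stopifnot(n >= 4)
  W <- abs(R)            # assumption: absolute correlations as weights
  diag(W) <- 0

  # Seed tetrahedron: four nodes with the largest sum of above-average weights
  above_avg <- W * (W > mean(W[upper.tri(W)]))
  seed <- order(rowSums(above_avg), decreasing = TRUE)[1:4]

  A <- matrix(0, n, n, dimnames = dimnames(R))
  A[seed, seed] <- R[seed, seed]
  diag(A) <- 0

  triangles <- combn(seed, 3, simplify = FALSE)  # four faces of the tetrahedron
  remaining <- setdiff(seq_len(n), seed)

  while (length(remaining) > 0) {
    # gain[i, j]: summed weight between remaining node i and triangle j
    gain <- vapply(
      triangles,
      function(tri) rowSums(W[remaining, tri, drop = FALSE]),
      numeric(length(remaining))
    )
    gain <- matrix(gain, nrow = length(remaining))
    best <- as.integer(which(gain == max(gain), arr.ind = TRUE)[1, ])
    node <- remaining[best[1]]
    tri <- triangles[[best[2]]]

    # Connect the node to the three nodes of the chosen triangle
    A[node, tri] <- R[node, tri]
    A[tri, node] <- R[tri, node]

    # Replace the used triangle with the three new triangles it creates
    triangles[[best[2]]] <- NULL
    triangles <- c(triangles, list(
      c(node, tri[1:2]), c(node, tri[2:3]), c(node, tri[c(1, 3)])
    ))
    remaining <- setdiff(remaining, node)
  }
  A
}

# Simulated data (illustration only): the edge count should equal 3n - 6
set.seed(123)
R <- cor(matrix(rnorm(300 * 8), nrow = 300))
net <- tmfg_sketch(R)
sum(net[upper.tri(net)] != 0) == 3 * ncol(R) - 6
```

The seed tetrahedron contributes 6 edges and each of the remaining n - 4 nodes adds 3, which gives the 3n - 6 edges stated above.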

Value

Returns a network or list containing:

`network`	The filtered adjacency matrix
`separators`	The separators (3-cliques) in the network
`cliques`	The cliques (4-cliques) in the network

Author(s)

Alexander Christensen <[email protected]>

References

Local-Global Inversion Method
Barfuss, W., Massara, G. P., Di Matteo, T., & Aste, T. (2016). Parsimonious modeling with information filtering networks. Physical Review E, 94, 062306.

Psychometric network introduction to TMFG
Christensen, A. P., Kenett, Y. N., Aste, T., Silvia, P. J., & Kwapil, T. R. (2018). Network structure of the Wisconsin Schizotypy Scales-Short Forms: Examining psychometric network filtering approaches. Behavior Research Methods, 50, 2531-2550.

Triangulated Maximally Filtered Graph
Massara, G. P., Di Matteo, T., & Aste, T. (2016). Network filtering for big data: Triangulated maximally filtered graph. Journal of Complex Networks, 5, 161-178.

Examples

# TMFG filtered network
TMFG(wmt2[,7:24])

# Partial correlations using the LoGo method
TMFG(wmt2[,7:24], partial = TRUE)

Total Correlation

Description

Computes the total correlation of a dataset

Usage

totalCor(data, base = 2.718282)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`base`	Numeric (length = 1). Base to use for entropy. Defaults to `exp(1)` or `2.718282`

Value

Returns a list containing:

`Ind.Entropies`	Individual entropies for each variable
`Joint.Entropy`	The joint entropy of the dataset
`Total.Cor`	The total correlation of the dataset
`Normalized`	Total correlation divided by the sum of the individual entropies minus the maximum of the individual entropies
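
The returned quantities can be illustrated with a small base R sketch. Total correlation (Watanabe, 1960) is the sum of the individual entropies minus the joint entropy. The helpers `entropy_sketch` and `total_cor_sketch` are hypothetical, and the sketch assumes discrete variables with plug-in (relative frequency) entropy estimates, which may differ from what `totalCor` does internally.

```r
# Hypothetical helpers (not part of EGAnet); assume discrete variables
entropy_sketch <- function(counts, base = exp(1)) {
  p <- counts / sum(counts)
  p <- p[p > 0]                       # treat 0 * log(0) as 0
  -sum(p * log(p, base = base))
}

total_cor_sketch <- function(data, base = exp(1)) {
  data <- as.data.frame(data)
  # Entropy of each variable on its own
  ind <- vapply(data, function(x) entropy_sketch(table(x), base), numeric(1))
  # Joint entropy: entropy of the observed response patterns
  patterns <- do.call(paste, c(data, sep = "\r"))
  joint <- entropy_sketch(table(patterns), base)
  tc <- sum(ind) - joint
  list(
    Ind.Entropies = ind,
    Joint.Entropy = joint,
    Total.Cor = tc,
    Normalized = tc / (sum(ind) - max(ind))
  )
}

# Two identical binary variables share all of their information
dependent <- data.frame(x = c(0, 0, 1, 1), y = c(0, 0, 1, 1))
# Two independent binary variables share none
independent <- data.frame(x = c(0, 0, 1, 1), y = c(0, 1, 0, 1))

tc_dependent <- total_cor_sketch(dependent)
tc_independent <- total_cor_sketch(independent)
```

In the first toy dataset the total correlation equals the entropy of one variable and the normalized value is 1. In the second, the total correlation is 0.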

Author(s)

Hudson F. Golino <hfg9s at virginia.edu>

References

Formalization of total correlation
Watanabe, S. (1960). Information theoretical analysis of multivariate correlation. IBM Journal of Research and Development 4, 66-82.

Applied implementation
Felix, L. M., Mansur-Alves, M., Teles, M., Jamison, L., & Golino, H. (2021). Longitudinal impact and effects of booster sessions in a cognitive training program for healthy older adults. Archives of Gerontology and Geriatrics, 94, 104337.

Examples

# Compute total correlation
totalCor(wmt2[,7:24])

Total Correlation Matrix

Description

Computes the pairwise total correlation (totalCor) for a dataset

Usage

totalCorMat(data, base = 2.718282, normalized = FALSE)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis
`base`	Numeric (length = 1). Base to use for entropy. Defaults to `exp(1)` or `2.718282`
`normalized`	Boolean (length = 1). Should the normalized total correlation be computed? Defaults to `FALSE`

Value

Returns a symmetric matrix with pairwise total correlations

Author(s)

Hudson F. Golino <hfg9s at virginia.edu>

References

Formalization of total correlation
Watanabe, S. (1960). Information theoretical analysis of multivariate correlation. IBM Journal of Research and Development 4, 66-82.

Examples

# Compute total correlation matrix
totalCorMat(wmt2[,7:24])

Unique Variable Analysis

Description

Identifies locally dependent (redundant) variables in a multivariate dataset using the EBICglasso.qgraph network estimation method and weighted topological overlap (see Christensen, Garrido, & Golino, 2023 for more details)

Usage

UVA(
  data = NULL,
  network = NULL,
  n = NULL,
  key = NULL,
  uva.method = c("MBR", "EJP"),
  cut.off = 0.25,
  reduce = TRUE,
  reduce.method = c("latent", "mean", "remove", "sum"),
  auto = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`data`	Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix. Defaults to `NULL`
`network`	Symmetric matrix or data frame. A symmetric network. Defaults to `NULL`. If both `data` and `network` are provided, then `UVA` will use the `network` with the `data` (rather than estimating a network from the `data`)
`n`	Numeric (length = 1). Sample size if `data` provided is a correlation matrix. Defaults to `NULL`
`key`	Character vector (length = `ncol(data)`). Item key for labeling variables in the results
`uva.method`	Character (length = 1). Whether the method described in the Christensen, Garrido, and Golino (2023) publication in Multivariate Behavioral Research (`"MBR"`) or the Christensen, Golino, and Silvia (2020) publication in European Journal of Personality (`"EJP"`) should be used. Defaults to `"MBR"`. Based on simulation and accumulating empirical evidence, the methods described in Christensen, Golino, and Silvia (2020), such as adaptive alpha, are outdated. Evidence supports using a single cut-off value (regardless of continuous, polytomous, or dichotomous data; Christensen, Garrido, & Golino, 2023)
`cut.off`	Numeric (length = 1). Cut-off used to determine when pairwise `wto` values are considered locally dependent (or redundant). Must be a value between `0` and `1`. Defaults to `0.25`. This cut-off value is recommended and based on extensive simulation (Christensen, Garrido, & Golino, 2023). Printing the result will provide a gradient of pairwise redundancies at cut-offs of 0.20, 0.25, and 0.30. Use `print` or `summary` on the output rather than adjusting this cut-off value
`reduce`	Logical (length = 1). Whether redundancies should be reduced in data. Defaults to `TRUE`
`reduce.method`	Character (length = 1). Method to reduce redundancies. Available options:
  `"latent"`: Computes latent variables using `cfa` when there are three or more redundant variables. If variables are not all coded in the same direction, then they will be recoded as necessary. A warning will be produced for all variables that are flipped
  `"mean"`: Computes mean of redundant variables. If variables are not all coded in the same direction, then they will be recoded as necessary. A warning will be produced for all variables that are flipped
  `"remove"`: Removes all but one variable from a set of redundant variables
  `"sum"`: Computes sum of redundant variables. If variables are not all coded in the same direction, then they will be recoded as necessary. A warning will be produced for all variables that are flipped
`auto`	Logical (length = 1). Whether `reduce` should occur automatically. Defaults to `TRUE`. For `reduce.method = "remove"`, the automated decision process is as follows:
  Two variables: The variable with the lowest maximum `wto` to all other variables (other than the one it is redundant with) is retained and the other is removed
  Three or more variables: The variable with the highest mean `wto` to all other variables that are redundant with one another is retained and all others are removed
`verbose`	Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to `FALSE` (silent calls). Set to `TRUE` to see all messages and warnings for every function call
`...`	Additional arguments that should be passed on to old versions of `UVA` or to `EGA` and `cfa`
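
To make the role of `cut.off` concrete, the sketch below flags variable pairs whose weighted topological overlap exceeds 0.25. The matrix `w` contains made-up values for four hypothetical items, and the flagging rule is a simplified reading of the argument description rather than the internal logic of `UVA`.

```r
# Made-up wTO matrix for four hypothetical items (illustration only)
items <- paste0("V", 1:4)
w <- matrix(
  c(0.00, 0.31, 0.05, 0.10,
    0.31, 0.00, 0.08, 0.27,
    0.05, 0.08, 0.00, 0.12,
    0.10, 0.27, 0.12, 0.00),
  nrow = 4, byrow = TRUE, dimnames = list(items, items)
)

cut_off <- 0.25

# Look at each pair once (upper triangle) and keep those above the cut-off
idx <- which(w > cut_off & upper.tri(w), arr.ind = TRUE)
redundant <- data.frame(
  node_i = rownames(w)[idx[, "row"]],
  node_j = colnames(w)[idx[, "col"]],
  wto = w[idx]
)
redundant
```

Here the pairs V1-V2 and V2-V4 exceed the cut-off, so V1, V2, and V4 would be candidates for reduction.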

References

Most recent simulation and implementation
Christensen, A. P., Garrido, L. E., & Golino, H. (2023). Unique variable analysis: A network psychometrics method to detect local dependence. Multivariate Behavioral Research.

Conceptual foundation and outdated methods
Christensen, A. P., Golino, H., & Silvia, P. J. (2020). A psychometric network perspective on the validity and validation of personality trait questionnaires. European Journal of Personality, 34(6), 1095-1108.

Weighted topological overlap
Nowick, K., Gernat, T., Almaas, E., & Stubbs, L. (2009). Differences in human and chimpanzee gene expression patterns define an evolving network of transcription factors in brain. Proceedings of the National Academy of Sciences, 106, 22358-22363.

Examples

# Perform UVA
uva.wmt <- UVA(wmt2[,7:24])

# Show summary
summary(uva.wmt)

Entropy Fit Index using Von Neumann's entropy (Quantum Information Theory) for correlation matrices

Description

Computes the fit of a dimensionality structure using Von Neumann's entropy when the input is a correlation matrix. Lower values suggest better fit of a structure to the data

Usage

vn.entropy(data, structure)

Arguments

`data`	Matrix or data frame. Contains variables to be used in the analysis
`structure`	Numeric or character vector (length = `ncol(data)`). A vector representing the structure (numbers or labels for each item). Can be theoretical factors or the structure detected by `EGA`

Value

Returns a list containing:

`VN.Entropy.Fit`	The Entropy Fit Index using Von Neumann's entropy
`Total.Correlation`	The total correlation of the dataset
`Average.Entropy`	The average entropy of the dataset
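
For orientation, the Von Neumann entropy of a correlation matrix can be sketched in base R: the matrix is rescaled to unit trace (a density matrix) and the entropy is computed from its eigenvalues. The helper `vn_entropy_sketch` is hypothetical and shows only this entropy building block. How `vn.entropy` combines such entropies across dimensions into the fit index is not reproduced here.

```r
# Hypothetical helper (not part of EGAnet): Von Neumann entropy of a
# correlation matrix, S = -sum(lambda * log(lambda)), where lambda are
# the eigenvalues of the matrix rescaled to have trace 1
vn_entropy_sketch <- function(R) {
  R <- as.matrix(R)
  rho <- R / sum(diag(R))             # density matrix with unit trace
  lambda <- eigen(rho, symmetric = TRUE, only.values = TRUE)$values
  lambda <- lambda[lambda > 1e-12]    # treat 0 * log(0) as 0
  -sum(lambda * log(lambda))
}

# Uncorrelated variables: maximum entropy for four variables
s_identity <- vn_entropy_sketch(diag(4))

# Perfectly correlated variables: a single non-zero eigenvalue
s_ones <- vn_entropy_sketch(matrix(1, 4, 4))
```

For four uncorrelated variables the entropy equals `log(4)`, and for four perfectly correlated variables it is zero (up to numerical error).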

Author(s)

Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <[email protected]>, and Robert Moulder <[email protected]>

Examples

# Get EGA result
ega.wmt <- EGA(
  data = wmt2[,7:24], model = "glasso",
  plot.EGA = FALSE # no plot for CRAN checks
)

# Compute Von Neumann entropy
vn.entropy(ega.wmt$correlation, ega.wmt$wc)

WMT-2 Data

Description

A response matrix (n = 1185) of the Wiener Matrizen-Test 2 (WMT-2).

Usage

data(wmt2)

Format

A 1185x24 response matrix

Examples

data("wmt2")

Weighted Topological Overlap

Description

Computes weighted topological overlap following the Nowick et al. (2009) definition

Usage

wto(network, signed = TRUE, diagonal.zero = TRUE)

Arguments

`network`	Symmetric matrix or data frame. A symmetric network
`signed`	Boolean (length = 1). Whether the signed version should be used. Defaults to `TRUE`. Use `FALSE` for absolute values
`diagonal.zero`	Boolean (length = 1). Whether diagonal of overlap matrix should be set to zero. Defaults to `TRUE`. Use `FALSE` to allow overlap of a node with itself

Value

A symmetric matrix of weighted topological overlap values between each pair of variables
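
A minimal base R sketch of the signed weighted topological overlap is given below, assuming the commonly cited form of the Nowick et al. (2009) definition: for nodes i and j, the sum of products of their edges to shared neighbours plus their direct edge, divided by the smaller of the two node strengths plus one minus the absolute direct edge. The helper `wto_sketch` is hypothetical and may differ in detail from `wto`.

```r
# Hypothetical helper (not part of EGAnet): signed weighted topological overlap
wto_sketch <- function(network) {
  A <- as.matrix(network)
  diag(A) <- 0
  abs_A <- abs(A)
  strength <- colSums(abs_A)          # node strength from absolute weights

  # (A %*% A)[i, j] sums a_iu * a_uj over all other nodes u (diagonal is zero)
  numerator <- A %*% A + A
  denominator <- outer(strength, strength, pmin) + 1 - abs_A

  omega <- numerator / denominator
  diag(omega) <- 0                    # mirrors diagonal.zero = TRUE
  omega
}

# Toy network: three nodes, all connected with weight 0.5
toy <- matrix(0.5, 3, 3)
diag(toy) <- 0
omega <- wto_sketch(toy)
omega
```

In the toy network each pair has numerator 0.5 * 0.5 + 0.5 = 0.75 and denominator 1 + 1 - 0.5 = 1.5, giving an overlap of 0.5.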

References

Original formalization
Nowick, K., Gernat, T., Almaas, E., & Stubbs, L. (2009). Differences in human and chimpanzee gene expression patterns define an evolving network of transcription factors in brain. Proceedings of the National Academy of Sciences, 106, 22358-22363.

Examples

# Obtain network
network <- network.estimation(wmt2[,7:24], model = "glasso")

# Compute wTO
wto(network)
