Title: | An Exponential Stochastic Block Model for Interaction Lengths |
---|---|
Description: | Given a continuous-time dynamic network, this package allows one to fit a stochastic blockmodel where nodes belonging to the same group create interactions and non-interactions of similar lengths. This package implements the methodology described by R. Rastelli and M. Fop (2019) <arXiv:1901.09828>. |
Authors: | Riccardo Rastelli [aut, cre], Michael Fop [aut] |
Maintainer: | Riccardo Rastelli |
License: | GPL-3 |
Version: | 1.3.5 |
Built: | 2024-11-04 05:49:44 UTC |
Source: | https://github.com/cran/expSBM |
Riccardo Rastelli
Michael Fop
R. Rastelli and M. Fop (2019) "A dynamic stochastic blockmodel for interaction lengths", https://arxiv.org/abs/1901.09828
Evaluates the evidence lower bound for a given dynamic network.
expSBM_ELBO(N, edgelist, Z, lambda, mu, nu, directed = F, trunc = T, verbose = F)
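N | Number of nodes. |
edgelist | A matrix with 4 columns: the first column holds the sender node, the second the receiver, the third either a one or a zero to indicate an interaction or a non-interaction respectively, and the fourth the corresponding exponential length. |
Z | An NxK matrix giving the soft clustering of the nodes into the K latent groups. |
lambda | Mixing proportions of the latent groups. |
mu | A matrix of size KxK with the group-specific parameters for the exponential rates of the interaction lengths. |
nu | A matrix of size KxK with the group-specific parameters for the exponential rates of the non-interaction lengths. |
directed | Logical flag (default FALSE) indicating whether the network is directed. |
trunc | Logical flag (default TRUE) for the truncation option. |
verbose | Logical flag (default FALSE) indicating whether progress output is printed. |

computing_time | Number of seconds required for the evaluation. |
elbo_value | Value of the evidence lower bound for the given variational parameters. |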
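As a minimal sketch of the edgelist format described above, the following base R code builds a toy 4-column matrix for three nodes and checks its layout. The values are made up, the helper check_edgelist is hypothetical (it is not part of the package), and the checks assume that nodes are labelled 1 to N and that lengths are strictly positive.

```r
# Toy edgelist: sender, receiver, type (1 = interaction, 0 = non-interaction), length.
# All values below are invented for illustration.
toy_edgelist <- matrix(c(
  1, 2, 1, 0.5,
  1, 2, 0, 2.0,
  1, 2, 1, 1.5,
  2, 3, 0, 4.0
), ncol = 4, byrow = TRUE)

# Hypothetical helper (not part of expSBM): basic sanity checks on the format.
# Assumes node labels in 1..N and strictly positive lengths.
check_edgelist <- function(edgelist, N) {
  is.matrix(edgelist) &&
    ncol(edgelist) == 4 &&
    all(edgelist[, 1:2] >= 1 & edgelist[, 1:2] <= N) &&
    all(edgelist[, 3] %in% c(0, 1)) &&
    all(edgelist[, 4] > 0)
}

ok <- check_edgelist(toy_edgelist, N = 3)
```

A matrix that passes these checks has the shape expected by the edgelist argument; whether the package performs similar validation internally is not stated in this page.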
Runs the variational expectation maximization algorithm for a given number of latent groups.
expSBM_EM(N, edgelist, Z, lambda, mu, nu, directed = F, trunc = T, tol = 0.001, n_iter_max = 100, verbose = F)
N | Number of nodes. |
edgelist | A matrix with 4 columns: the first column holds the sender node, the second the receiver, the third either a one or a zero to indicate an interaction or a non-interaction respectively, and the fourth the corresponding exponential length. |
Z | An NxK matrix giving the initial soft clustering of the nodes into the K latent groups. |
lambda | Mixing proportions of the latent groups. |
mu | A matrix of size KxK with the group-specific parameters for the exponential rates of the interaction lengths. |
nu | A matrix of size KxK with the group-specific parameters for the exponential rates of the non-interaction lengths. |
directed | Logical flag (default FALSE) indicating whether the network is directed. |
trunc | Logical flag (default TRUE) for the truncation option. |
tol | Stop the maximization if the relative increase in the objective function is not larger than this value. |
n_iter_max | Stop the maximization if the number of iterations is larger than this value. This parameter can be set to zero or one for debugging purposes. |
verbose | Logical flag (default FALSE) indicating whether progress output is printed. |

computing_time | Number of seconds required for the evaluation. |
elbo_values | Stored values of the objective function at each iteration. |
Z_star | Optimal soft clustering of the nodes into the groups. |
lambda_star | Optimal mixing proportions. |
mu_star | Optimal group-specific parameters for the exponential rates of the interaction lengths. |
nu_star | Optimal group-specific parameters for the exponential rates of the non-interaction lengths. |
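The two stopping rules controlled by tol and n_iter_max can be sketched in base R as follows. This is only an illustration: it assumes the relative increase is measured as (new - old) / |old|, which the package may compute differently, and toy_update is a hypothetical stand-in for one iteration of the algorithm.

```r
# Hypothetical objective update: moves the value halfway towards an upper limit of -100.
toy_update <- function(elbo) elbo + 0.5 * (-100 - elbo)

run_until_converged <- function(elbo_start, tol = 0.001, n_iter_max = 100) {
  elbo_values <- elbo_start
  for (iter in seq_len(n_iter_max)) {
    old <- elbo_values[length(elbo_values)]
    new <- toy_update(old)
    elbo_values <- c(elbo_values, new)
    # Assumed definition of relative increase; stop when it is not larger than tol.
    if ((new - old) / abs(old) <= tol) break
  }
  elbo_values
}

elbo_trace <- run_until_converged(-200)
```

With n_iter_max = 0 the loop body never runs and only the starting value is returned, which is the kind of behaviour the debugging remark above suggests.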
library(expSBM)
set.seed(1)
data(high_school)
K <- 4
lambda_init <- rep(1/K, K)
Z_init <- expSBM_init(high_school$edgelist, K, "random")
mu_init <- nu_init <- matrix(1, K, K)
expSBM_EM(N = 327, high_school$edgelist, Z_init, lambda_init, mu_init, nu_init)
Initialization step for the variational expectation maximization algorithm.
expSBM_init(edgelist, K, method = c("random", "SBM_binary", "SBM_gaussian", "spectral"), sbm_density = 0.5, blur_value = 1)
edgelist | A matrix with 4 columns: the first column holds the sender node, the second the receiver, the third either a one or a zero to indicate an interaction or a non-interaction respectively, and the fourth the corresponding exponential length. |
K | Number of latent groups. |
method | Method used to initialise the allocations. Can be one of "random", "SBM_binary", "SBM_gaussian", or "spectral". |
sbm_density | If method is "SBM_binary", the desired density of the thresholded graph, which determines the threshold. |
blur_value | A value between 0 and 1. If 1, the initialization method returns a hard partition where each node belongs to exactly one group. Reducing this value introduces noise, i.e. it gradually transforms the hard clustering into a soft clustering; at 0, each node is equally likely to belong to any of the K given clusters. |
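One way to picture the effect of blur_value is as a convex combination between a hard partition and the uniform allocation 1/K. The sketch below assumes this particular form, which is not confirmed by the source; blur_partition is a hypothetical helper and the labels are made up.

```r
# Hypothetical helper (not part of expSBM); assumes a linear blend with the uniform allocation.
blur_partition <- function(Z_hard, blur_value = 1) {
  K <- ncol(Z_hard)
  # blur_value = 1 keeps the hard partition; blur_value = 0 gives 1/K everywhere.
  blur_value * Z_hard + (1 - blur_value) / K
}

# Hard partition of 4 nodes into K = 2 groups (invented labels).
labels <- c(1, 2, 2, 1)
Z_hard <- diag(2)[labels, ]
Z_soft <- blur_partition(Z_hard, blur_value = 0.8)
```

Under this assumed form every row of the result still sums to one, so it remains a valid soft clustering for any blur_value between 0 and 1.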
All initialisation methods return an NxK matrix indicating the partitioning of the nodes.
The method random initialises the allocation variables uniformly at random.
The method SBM_binary first calculates the aggregated interaction and non-interaction times for each pair of nodes. Then it calculates the proportion of the whole time period during which the nodes were interacting. Next it obtains an adjacency matrix by thresholding these values, i.e. values above a given threshold are replaced by ones and values below the threshold are replaced by zeros. The threshold is chosen by setting the parameter sbm_density, which defines the desired density of the graph. Once the adjacency matrix is obtained, a binary stochastic blockmodel is fitted to the data to obtain the partition.
The method SBM_gaussian aggregates the interaction values and non-interaction values for each pair of nodes. Then it log-transforms both of these quantities and fits a stochastic blockmodel with multivariate Gaussian edges to obtain the partition.
The method spectral first calculates the aggregated interaction and non-interaction times for each pair of nodes. Then it calculates the proportion of the whole time period during which the nodes were interacting. Finally it performs model-based clustering on the Laplacian associated with this weighted matrix.
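The thresholding step of the SBM_binary method can be sketched as follows. This is an illustration, not the package's code: the matrix of interaction proportions is made up, threshold_adjacency is a hypothetical helper, and choosing the threshold as a quantile of the pairwise proportions is an assumption about how a target density might be reached.

```r
# Made-up symmetric matrix: proportion of time each pair of nodes spent interacting.
prop <- matrix(c(
  0.0, 0.8, 0.1, 0.3,
  0.8, 0.0, 0.6, 0.2,
  0.1, 0.6, 0.0, 0.7,
  0.3, 0.2, 0.7, 0.0
), nrow = 4, byrow = TRUE)

# Hypothetical helper (not part of expSBM).
threshold_adjacency <- function(prop, sbm_density = 0.5) {
  pair_values <- prop[upper.tri(prop)]
  # Assumed rule: keep roughly the top sbm_density fraction of pairs.
  threshold <- quantile(pair_values, probs = 1 - sbm_density, names = FALSE)
  adj <- (prop > threshold) * 1
  diag(adj) <- 0
  adj
}

adj <- threshold_adjacency(prop, sbm_density = 0.5)
achieved_density <- sum(adj[upper.tri(adj)]) / choose(nrow(adj), 2)
```

The resulting binary matrix is the kind of input a binary stochastic blockmodel would then be fitted to; the fitting itself is left out of this sketch.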
An NxK matrix indicating the partitioning of the nodes.
library(expSBM)
set.seed(12345)
data(high_school)
K <- 4
lambda_init <- rep(1/K, K)
expSBM_init(high_school$edgelist, K, "random")
Runs the variational expectation maximization algorithm for different numbers of latent groups, and selects the best overall model using the integrated completed likelihood criterion. See reference for a detailed explanation of the procedure.
expSBM_select(K_max, N, edgelist, method = "SBM_gaussian", directed = F, trunc = T, tol = 0.001, n_iter_max = 100, init_blur_value = 1, verbose = F)
K_max | Estimate and compare the models with number of latent groups equal to 1,2,...,K_max. |
N | Number of nodes. |
edgelist | A matrix with 4 columns: the first column holds the sender node, the second the receiver, the third either a one or a zero to indicate an interaction or a non-interaction respectively, and the fourth the corresponding exponential length. |
method | Indicates the method used for the initialisation. Can be one of "random", "SBM_binary", "SBM_gaussian", or "spectral". |
directed | Logical flag (default FALSE) indicating whether the network is directed. |
trunc | Logical flag (default TRUE) for the truncation option. |
tol | Stop the maximization if the relative increase in the objective function is not larger than this value. |
n_iter_max | Stop the maximization if the number of iterations is larger than this value. This parameter can be set to zero or one for debugging purposes. |
init_blur_value | A value from zero to one, indicating whether the initialized partition should be perturbed with noise. The value one means no noise, whereas the value zero means maximum noise, i.e. each node is equally likely to belong to any of the K groups. |
verbose | Logical flag (default FALSE) indicating whether progress output is printed. |

fitted_models | A list with the fitted values for every model considered. |
icl_values | Integrated completed likelihood values for each model considered. |
K_star | Optimal number of latent groups, according to the integrated completed likelihood criterion. |
best_model | Output of the variational expectation maximization algorithm for the best overall model. |
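The selection step can be pictured as picking the candidate with the best integrated completed likelihood. The values below are invented for illustration, and treating the largest value as the best follows the usual convention for this criterion rather than anything stated in this page.

```r
# Made-up ICL values for K = 1, ..., 5 (not results from the package).
icl_values <- c(-5200, -4800, -4650, -4700, -4790)

# Assumed convention: the model with the largest ICL value is preferred.
K_star <- which.max(icl_values)
```

Because the candidates are K = 1, 2, ..., K_max in order, the index returned by which.max coincides with the selected number of groups in this sketch.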
R. Rastelli and M. Fop (2019) "A dynamic stochastic blockmodel for interaction lengths", https://arxiv.org/abs/1901.09828
library(expSBM)
set.seed(1)
data(high_school)
res <- expSBM_select(K_max = 8, N = 327, edgelist = high_school$edgelist, method = "random", tol = 0.01)
The data concern face-to-face interactions among 327 high school students in Marseilles, France, and were collected by means of wearable sensors over a period of 5 days in December 2013. Students wore a sensor badge on their chest, and the instrument recorded when they were facing each other with a time resolution of 20 seconds. Thus, any pair of students was considered to be interacting face-to-face when their two sensors exchanged data packets at any time during the 20-second interval. Additional information on the students is available from the same dataset. Students may have 4 different main specializations: biology (BIO), mathematics and physics (MP), physics and chemistry (PC), and engineering studies (PSI).
data(high_school)
The list contains:
An adjacency list indicating whether any pair of students had at least one interaction during the 5 days of the study.
An edgelist in a format that can be handled by this package.
Clustering variable indicating the program each student is registered to.
Names of the different programs.
Aggregated version of the previous clustering variable, where programs are aggregated into 4 areas.
Names of the different areas; these correspond to biology (BIO), mathematics and physics (MP), physics and chemistry (PC), and engineering studies (PSI).
Clustering given by the sex of the students.
Labels for each of the sex classes.
R. Mastrandrea, J. Fournet, and A. Barrat (2015). "Contact patterns in a high school: A comparison between data collected using wearable sensors, contact diaries and friendship surveys". PLOS ONE 10.9, pp. 1-26.