Tutorial on food webs
tutorial.Rmd
Estimation with colSBM
We load a list of 8 foodwebs. They are binary directed networks with different number of species. First, we are going to model jointly the first \(3\) networks, using the iid-colSBM model.
# global_opts = list(nb_cores = 1L,
# nb_models = 5L,
# nb_init = 10L,
# depth = 2L,
# verbosity = 1,
# spectral_init = FALSE,
# Q_max = 8L,
# plot_details = 1)
set.seed(1234)
res_fw_iid <- estimate_colSBM(netlist = foodwebs[1:3], # A list of networks
colsbm_model = "iid", # The name of the model
directed = TRUE, # Foodwebs are directed networks
net_id = names(foodwebs)[1:3], # Name of the networks
nb_run = 1L, # Number of runs of the algorithm
global_opts = list(verbosity = 0,
plot_details = 0,
Q_max = 8) #Max number of clusters
)
#> Please install the progress-package in order to get a progress bar.
We can look at how the variational bound and the model selection criteria evolve with the number of clusters. Here, the BICL criterion selects Q = 5 blocks.
plot(res_fw_iid)
best_fit <- res_fw_iid$best_fit
Results and analysis
Here are some useful fields to analyze the results.
best_fit
#> Fitted Collection of Simple SBM -- bernoulli variant for 3 networks
#> =====================================================================
#> Dimension = ( 105 58 71 ) - ( 5 ) blocks.
#> BICL = -1965.898 -- #Empty blocks : 0
#> =====================================================================
#> * Useful fields
#> $distribution, $nb_nodes, $nb_clusters, $support, $Z
#> $memberships, $parameters, $BICL, $vbound, $pred_dyads
We can get:
- the estimation of the model parameters
best_fit$parameters
#> $alpha
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 9.607731e-07 8.114266e-01 0.7694572408 1.012126e-09 0.649458439
#> [2,] 3.156278e-10 1.711840e-08 0.0001339098 4.738765e-12 0.007519232
#> [3,] 2.065889e-07 3.349829e-03 0.0583841027 2.548610e-10 0.133917673
#> [4,] 3.816361e-08 1.902550e-01 0.0877403540 7.982200e-11 0.005975504
#> [5,] 2.365337e-08 1.698945e-05 0.0040088852 6.352734e-12 0.008585466
#>
#> $pi
#> $pi[[1]]
#> [1] 0.02564106 0.11738232 0.13009882 0.45191262 0.27496518
#>
#> $pi[[2]]
#> [1] 0.02564106 0.11738232 0.13009882 0.45191262 0.27496518
#>
#> $pi[[3]]
#> [1] 0.02564106 0.11738232 0.13009882 0.45191262 0.27496518
#>
#>
#> $delta
#> [1] 1 1 1
- The block memberships:
best_fit$Z
#> [[1]]
#> Unidentified sp1 FW_009 Terrestrial plant material
#> 1 1
#> Terrestrial invertebrates Achnanthes lanceolata
#> 3 4
#> Batrachospermum Calothrix sp1 FW_009
#> 4 4
#> Cocconeis placentula Cosmarium sp1 FW_009
#> 4 4
#> Cyclotella sp1 FW_009 Cymbella aspera
#> 4 4
#> Cymbella cuspidata Cymbella kappi
#> 4 4
#> Cymbella tumida Diatoma heimale
#> 4 4
#> Epithemia sorex Epithemia turgida
#> 4 4
#> Eunotia serpentina Unidentified sp2 FW_009
#> 4 4
#> Euntoia pectinalis Fragilaria sp1 FW_009
#> 4 4
#> Fragilaria vaucheriae Frustulia rhomboides
#> 4 4
#> Gomphoneis herculeana Gomphonema accuminatum
#> 4 4
#> Gomphonema angustatum Gomphonema truncatum
#> 4 4
#> Unidentified sp3 FW_009 Melosira varians
#> 4 4
#> Navicula avenacea Navicula dicephala
#> 4 4
#> Nitzschia dubia Oedogonium sp1 FW_009
#> 4 4
#> Phormidium sp1 FW_009 Pinnularia mesolepta
#> 4 4
#> Pinnularia viridis Pleaurotaenium sp1 FW_009
#> 4 4
#> Rhoicospenia curvata Rhopalodia sp1 FW_009
#> 4 4
#> Schizothrix sp1 FW_009 Staurastrum sp1 FW_009
#> 4 4
#> Surirella elegans Surirella tenera
#> 4 4
#> Synedra ulna Tabellaria fenestrata
#> 4 4
#> Tabellaria flocculosa Ulothrix sp1 FW_009
#> 4 4
#> Unidentified sp4 FW_009 Unidentified sp5 FW_009
#> 4 4
#> Acroneuria sp1 FW_009 Aelosoma sp1 FW_009
#> 5 3
#> Alloperla sp1 FW_009 Anepeorus sp1 FW_009
#> 5 5
#> Antocha saxicola Baetis
#> 2 3
#> Boyeria vinosa Bryophaenocladius sp1 FW_009
#> 5 3
#> Chauliodes sp1 FW_009 Chimarra atterima
#> 5 3
#> Unidentified sp6 FW_009 Conchapelopia sp1 FW_009
#> 2 5
#> Cryptolabis sp1 FW_009 Cyrnellus sp1 FW_009
#> 5 5
#> Dicrotendipes sp1 FW_009 Diplectrona modesta
#> 3 2
#> Ectopria nervosa Endochironomous sp1 FW_009
#> 2 3
#> Ephemerella sp1 FW_009 Eukieferiella pseudomontana
#> 5 3
#> Eukiefferiella 'dark' type Glossosoma sp1 FW_009
#> 2 5
#> Gyraulus sp1 FW_009 Haploperla brevis
#> 2 5
#> Hexatoma sp1 FW_009 Hydrophilidae
#> 5 5
#> Hydropsyche sp1 FW_009arana Larsia sp1 FW_009
#> 3 5
#> Leucrocuta sp1 FW_009 Leuctra sp1 FW_009
#> 2 3
#> Unidentified sp7 FW_009 Metriocnemus sp1 FW_009
#> 3 3
#> Micrasema sp1 FW_009 Ochthebius sp1 FW_009
#> 5 5
#> Ophiogomphus sp1 FW_009 Paraleptophlebia sp1 FW_009
#> 5 5
#> Paranyctiophylax sp1 FW_009 Polycentropus sp1 FW_009
#> 5 5
#> Probezzia sp1 FW_009 Promoresia sp1 FW_009
#> 5 3
#> Psephenus sp1 FW_009 Pseudolimnolphila sp1 FW_009
#> 2 5
#> Rhyacophila sp1 FW_009 Simulium sp1 FW_009
#> 5 2
#> Sphaerium occidentale Stempelinella sp1 FW_009
#> 2 3
#> Stenelmis crenata Stenelmis sp1 FW_009
#> 2 2
#> Suwalia sp1 FW_009 Tallaperla sp1 FW_009
#> 5 3
#> Tanytarsus Genus A Tricorythodes sp1 FW_009
#> 5 5
#> Notropis heterolepsis Brook trout
#> 5 5
#> Orcnocetes virilis Rhinichthys cataractae
#> 5 5
#> Cambarus bartonii
#> 5
#>
#> [[2]]
#> Unidentified sp1 FW_012_01 Terrestrial invertebrates
#> 1 3
#> Plant material Achnanthes lanceolata
#> 1 4
#> Achnanthes minutissima Audouinella sp1 FW_012_01
#> 4 4
#> Batrachospermum Blue-green algae
#> 4 4
#> Calothrix Cymbella cistula
#> 4 4
#> Cymbella mulleri Diatoma heimale
#> 4 4
#> Epithemia sorex Epithemia turgida
#> 4 4
#> Eunotia pectinalis Eunotia sp1 FW_012_01
#> 4 4
#> Frustulia rhomboides Gomphoneis herculeana
#> 4 4
#> Gomphonema intricatum Gomphonema sp1 FW_012_01
#> 4 4
#> Meridion circulare Navicula avenacea
#> 4 4
#> Pleurotaenium Rhoicosphenia curvata
#> 4 4
#> Stigeoclonium Synedra ulna
#> 4 4
#> Ulothrix Unidentified sp2 FW_012_01
#> 4 4
#> Aelosoma Brachycentrus sp1 FW_012_01
#> 3 2
#> Cambarus bartoni Chauliodes sp1 FW_012_01
#> 5 5
#> Cordulegaster maculata Dicranota sp1 FW_012_01
#> 5 5
#> Ectopria thoracica Epeorus dispar
#> 2 2
#> Glossosoma sp1 FW_012_01 Homoplectra sp1 FW_012_01
#> 5 3
#> Hudsonimya sp1 FW_012_01 Hydropsyche sp1 FW_012_01
#> 5 3
#> Leucrocuta sp1 FW_012_01 Leuctra sp1 FW_012_01
#> 2 3
#> Lumbriculiid oligochaete Parametriocnemus sp1 FW_012_01
#> 3 3
#> Neureclipsis sp1 FW_012_01 Ophiogomphus sp1 FW_012_01
#> 2 5
#> Palpomyia sp1 FW_012_01 Palpomyia sp2 FW_012_01
#> 5 5
#> Promoresia sp1 FW_012_01 Psephenus sp1 FW_012_01
#> 2 2
#> Soliperla sp1 FW_012_01 Stenelmis adult
#> 5 5
#> Stenelmis sp1 FW_012_01 Suwalia sp1 FW_012_01
#> 3 5
#> Tallaperla maria Thaumalea sp1 FW_012_01
#> 3 2
#> Tipula abdominalis Salamander
#> 2 5
#>
#> [[3]]
#> Unidentified sp1 FW_012_02 Terrestrial plants
#> 1 1
#> Terrestrial bugs Achnanthes inflata var. elata
#> 4 4
#> Achnanthes lanceolata Achnanthes linearis
#> 4 4
#> Achnanthes minutissima Auodinella hermanii
#> 4 4
#> Blue Green algae Calothrix
#> 4 4
#> Cocconeis placentula Cymbella kappi
#> 4 4
#> Cymbella mulleri Diatoma heimale
#> 4 4
#> Epithemia turgida Eunotia meisteri
#> 4 4
#> Eunotia pectinalis Fragilaria vaucheriae
#> 4 4
#> Frustulia rhomboides Gomphoneis herculeana
#> 4 4
#> Gomphonema accuminatum Gomphonema angustatum
#> 4 4
#> Gomphonema intricatum Gomphonema tennuellum
#> 4 4
#> Marssoniella Navicula avenacea
#> 4 4
#> Navicula cryptocephala Navicula mutica
#> 4 4
#> Navicula sp1 FW_012_02 Pinnularia viridis
#> 4 4
#> Rhoicosphenia curvata Rhopalodia
#> 4 4
#> Surirella brebbisonii Surirella elegans
#> 4 4
#> Synechoccus Synedra ulna
#> 4 4
#> Ulothrix Unidentified sp2 FW_012_02
#> 4 4
#> Aeolosoma sp1 FW_012_02 Ajax longipes
#> 3 5
#> Amphinemura wui Anchytarsus bicolor
#> 2 2
#> Baetis Hudsonimya
#> 2 5
#> Cricotopus Dicranota
#> 2 5
#> Eukieffidrella pseudomontana Diplectrona modesta
#> 5 5
#> Dixella Dolophilodes
#> 5 5
#> Ectopria thoracica Epeorus dispar
#> 2 2
#> Fatigia pele Hexatoma sp1 FW_012_02
#> 5 5
#> Leuctra Oligo Lumbr. Blue
#> 3 3
#> Oligo. Lumbr. Pink Ophiogomphus
#> 3 5
#> Paraleptophelebia Pericoma
#> 5 2
#> Pilaria Pentaneuri sp1 FW_012_02
#> 5 3
#> Polycentropus maculatus Stenelmis
#> 5 2
#> Tallaperla maria Tanyderid
#> 5 5
#> Conchapelopia Tipula
#> 5 5
#> Wormaldia moesta Crayfish
#> 3 5
#> Salamander
#> 5
- The prediction for each dyads in the networks, here for network
number 3. If your goal is dyad prediction, then you should use
colsbm_model = "delta"
, instead ofcolsbm_model = "iid"
.
best_fit$pred_dyads[[3]][1:10, 1:5]
#> Unidentified sp1 FW_012_02 Terrestrial plants
#> Unidentified sp1 FW_012_02 0.000000e+00 9.630034e-07
#> Terrestrial plants 9.630034e-07 0.000000e+00
#> Terrestrial bugs 5.247760e-08 5.247760e-08
#> Achnanthes inflata var. elata 3.845403e-08 3.845403e-08
#> Achnanthes lanceolata 3.844759e-08 3.844759e-08
#> Achnanthes linearis 3.844759e-08 3.844759e-08
#> Achnanthes minutissima 3.844758e-08 3.844758e-08
#> Auodinella hermanii 3.809306e-08 3.809306e-08
#> Blue Green algae 3.844767e-08 3.844767e-08
#> Calothrix 3.855646e-08 3.855646e-08
#> Terrestrial bugs Achnanthes inflata var. elata
#> Unidentified sp1 FW_012_02 0.130180474 2.964760e-05
#> Terrestrial plants 0.130180474 2.964760e-05
#> Terrestrial bugs 0.000000000 2.968630e-06
#> Achnanthes inflata var. elata 0.008580909 0.000000e+00
#> Achnanthes lanceolata 0.008580562 3.364158e-06
#> Achnanthes linearis 0.008580563 3.364158e-06
#> Achnanthes minutissima 0.008580562 3.364158e-06
#> Auodinella hermanii 0.008136997 3.154766e-06
#> Blue Green algae 0.008580567 3.364158e-06
#> Calothrix 0.008586418 3.363440e-06
#> Achnanthes lanceolata
#> Unidentified sp1 FW_012_02 2.624662e-08
#> Terrestrial plants 2.624662e-08
#> Terrestrial bugs 2.652287e-09
#> Achnanthes inflata var. elata 2.986899e-09
#> Achnanthes lanceolata 0.000000e+00
#> Achnanthes linearis 2.986930e-09
#> Achnanthes minutissima 2.986930e-09
#> Auodinella hermanii 2.801688e-09
#> Blue Green algae 2.986930e-09
#> Calothrix 2.986409e-09
We can also plot the networks individually, with the groups reordered by trophic levels:
p <- gtools::permutations(best_fit$Q, best_fit$Q)
ind <- which.min(
sapply(
seq(nrow(p)),
function(x)
sum((tcrossprod(best_fit$pi[[1]]) * best_fit$alpha )[p[x,],p[x,]][
upper.tri(best_fit$alpha)])))
ord <- p[ind,]
plot(res_fw_iid$best_fit, type = "block", net_id = 1, ord = ord) +
plot(res_fw_iid$best_fit, type = "block", net_id = 2, ord = ord) +
plot(res_fw_iid$best_fit, type = "block", net_id = 3, ord = ord)
Or make different plots to exhibit the mesoscale structure:
plot(res_fw_iid$best_fit, type = "graphon", ord = ord)
plot(res_fw_iid$best_fit, type = "meso", mixture = TRUE, ord = ord)
#> New names:
#> • `` -> `...1`
#> • `` -> `...2`
#> • `` -> `...3`
Clustering of networks
Let simulate some directed networks with a lower triangular structure that looks alike foodwebs.
set.seed(1234)
alpha <- matrix(c(.05, .01, .01, .01,
.3, .05, .01, .01,
.5, .4, .05, .01,
.1, .8, .1, .05), 4, 4, byrow = TRUE)
pi <- c(.1, .2, .6, .1)
sim_net <-
replicate(3,
{X <-
sbm::sampleSimpleSBM(100, blockProp = pi, connectParam = list(mean = alpha),
directed = TRUE)
X$rNetwork
X$networkData}, simplify = FALSE)
set.seed(1234)
net_clust <- clusterize_networks(netlist = c(foodwebs[1:3], sim_net), # A list of networks
colsbm_model = "iid", # The name of the model
directed = TRUE, # Foodwebs are directed networks
net_id = c(names(foodwebs)[1:3], "sim1", "sim2", "sim3"), # Name of the networks
nb_run = 1L, # Nmber of runs of the algorithm
global_opts = list(verbosity = 0,
plot_details = 0,
Q_max = 9) #Max number of clusters
)
#> Please install the progress-package in order to get a progress bar.
#> Please install the progress-package in order to get a progress bar.
#> Please install the progress-package in order to get a progress bar.
#> Please install the progress-package in order to get a progress bar.
#> Please install the progress-package in order to get a progress bar.
#> Please install the progress-package in order to get a progress bar.
#> Please install the progress-package in order to get a progress bar.
We obtain a list of 3 models. The first one is for the full collection, and the second and third one are for the 3 foodwebs and the three collected networks. Here is a plot of the mesoscale structure obtained from the group of simulated networks. We can extract the best partition:
best_partition <- extract_best_partition(net_clust)
The plot of the mesoscale structure of the whole collection is the following:
but then we can compare the mesoscale structures of the 2 groups:
plot(best_partition[[1]], type = "graphon",
ord = order(best_partition[[1]]$alpha %*% best_partition[[1]]$pi[[1]])) +
plot(best_partition[[2]], type = "graphon",
ord = order(best_partition[[2]]$alpha %*% best_partition[[2]]$pi[[1]])) +
plot_layout(guides = "collect") + plot_annotation(tag_levels = "1")