Tutorial on food webs
tutorial.Rmd
Estimation with colSBM
We load a list of 8 foodwebs. They are binary directed networks with different number of species. First, we are going to model jointly the first networks, using the iid-colSBM model.
# global_opts = list(nb_cores = 1L,
# nb_models = 5L,
# nb_init = 10L,
# depth = 2L,
# verbosity = 1,
# spectral_init = FALSE,
# Q_max = 8L,
# plot_details = 1)
set.seed(1234)
res_fw_iid <- estimate_colSBM(
netlist = foodwebs[1:3], # A list of networks
colsbm_model = "iid", # The name of the model
directed = TRUE, # Foodwebs are directed networks
net_id = names(foodwebs)[1:3], # Name of the networks
nb_run = 1L, # Number of runs of the algorithm
global_opts = list(
verbosity = 0,
plot_details = 0,
Q_max = 8
) # Max number of clusters
)
We can look at how the variational bound and the model selection criteria evolve with the number of clusters. Here, the BICL criterion selects Q = 5 blocks.
plot(res_fw_iid)
best_fit <- res_fw_iid$best_fit
Results and analysis
Here are some useful fields to analyze the results.
best_fit
#> Fitted Collection of Simple SBM -- bernoulli variant for 3 networks
#> =====================================================================
#> Dimension = ( 105 58 71 ) - ( 5 ) blocks.
#> BICL = -1966.999 -- #Empty blocks : 0
#> =====================================================================
#> * Useful fields
#> $distribution, $nb_nodes, $nb_clusters, $support, $Z
#> $memberships, $parameters, $BICL, $vbound, $pred_dyads
We can get:
- the estimation of the model parameters
best_fit$parameters
#> $alpha
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 2.045995e-04 0.654004854 7.872244e-01 1.019399e-09 7.976874e-01
#> [2,] 7.547256e-06 0.007953472 6.055094e-03 6.413325e-12 1.116853e-05
#> [3,] 2.703080e-05 0.143400957 7.534554e-02 2.381464e-10 3.577233e-03
#> [4,] 1.014393e-06 0.007758361 9.118897e-02 8.019048e-11 1.825920e-01
#> [5,] 4.187670e-06 0.010907266 3.094711e-06 7.262520e-12 1.470198e-08
#>
#> $pi
#> $pi[[1]]
#> [1] 0.02575797 0.28631455 0.10692669 0.45375528 0.12724551
#>
#> $pi[[2]]
#> [1] 0.02575797 0.28631455 0.10692669 0.45375528 0.12724551
#>
#> $pi[[3]]
#> [1] 0.02575797 0.28631455 0.10692669 0.45375528 0.12724551
#>
#>
#> $delta
#> [1] 1 1 1
- The block memberships:
best_fit$Z
#> [[1]]
#> Unidentified sp1 FW_009 Terrestrial plant material
#> 1 1
#> Terrestrial invertebrates Achnanthes lanceolata
#> 3 4
#> Batrachospermum Calothrix sp1 FW_009
#> 4 4
#> Cocconeis placentula Cosmarium sp1 FW_009
#> 4 4
#> Cyclotella sp1 FW_009 Cymbella aspera
#> 4 4
#> Cymbella cuspidata Cymbella kappi
#> 4 4
#> Cymbella tumida Diatoma heimale
#> 4 4
#> Epithemia sorex Epithemia turgida
#> 4 4
#> Eunotia serpentina Unidentified sp2 FW_009
#> 4 4
#> Euntoia pectinalis Fragilaria sp1 FW_009
#> 4 4
#> Fragilaria vaucheriae Frustulia rhomboides
#> 4 4
#> Gomphoneis herculeana Gomphonema accuminatum
#> 4 4
#> Gomphonema angustatum Gomphonema truncatum
#> 4 4
#> Unidentified sp3 FW_009 Melosira varians
#> 4 4
#> Navicula avenacea Navicula dicephala
#> 4 4
#> Nitzschia dubia Oedogonium sp1 FW_009
#> 4 4
#> Phormidium sp1 FW_009 Pinnularia mesolepta
#> 4 4
#> Pinnularia viridis Pleaurotaenium sp1 FW_009
#> 4 4
#> Rhoicospenia curvata Rhopalodia sp1 FW_009
#> 4 4
#> Schizothrix sp1 FW_009 Staurastrum sp1 FW_009
#> 4 4
#> Surirella elegans Surirella tenera
#> 4 4
#> Synedra ulna Tabellaria fenestrata
#> 4 4
#> Tabellaria flocculosa Ulothrix sp1 FW_009
#> 4 4
#> Unidentified sp4 FW_009 Unidentified sp5 FW_009
#> 4 4
#> Acroneuria sp1 FW_009 Aelosoma sp1 FW_009
#> 2 3
#> Alloperla sp1 FW_009 Anepeorus sp1 FW_009
#> 2 2
#> Antocha saxicola Baetis
#> 5 3
#> Boyeria vinosa Bryophaenocladius sp1 FW_009
#> 2 3
#> Chauliodes sp1 FW_009 Chimarra atterima
#> 2 3
#> Unidentified sp6 FW_009 Conchapelopia sp1 FW_009
#> 5 3
#> Cryptolabis sp1 FW_009 Cyrnellus sp1 FW_009
#> 2 2
#> Dicrotendipes sp1 FW_009 Diplectrona modesta
#> 3 5
#> Ectopria nervosa Endochironomous sp1 FW_009
#> 5 3
#> Ephemerella sp1 FW_009 Eukieferiella pseudomontana
#> 2 3
#> Eukiefferiella 'dark' type Glossosoma sp1 FW_009
#> 5 2
#> Gyraulus sp1 FW_009 Haploperla brevis
#> 5 2
#> Hexatoma sp1 FW_009 Hydrophilidae
#> 2 2
#> Hydropsyche sp1 FW_009arana Larsia sp1 FW_009
#> 3 2
#> Leucrocuta sp1 FW_009 Leuctra sp1 FW_009
#> 5 3
#> Unidentified sp7 FW_009 Metriocnemus sp1 FW_009
#> 3 3
#> Micrasema sp1 FW_009 Ochthebius sp1 FW_009
#> 2 2
#> Ophiogomphus sp1 FW_009 Paraleptophlebia sp1 FW_009
#> 2 2
#> Paranyctiophylax sp1 FW_009 Polycentropus sp1 FW_009
#> 2 2
#> Probezzia sp1 FW_009 Promoresia sp1 FW_009
#> 2 3
#> Psephenus sp1 FW_009 Pseudolimnolphila sp1 FW_009
#> 5 2
#> Rhyacophila sp1 FW_009 Simulium sp1 FW_009
#> 2 5
#> Sphaerium occidentale Stempelinella sp1 FW_009
#> 5 5
#> Stenelmis crenata Stenelmis sp1 FW_009
#> 5 5
#> Suwalia sp1 FW_009 Tallaperla sp1 FW_009
#> 2 3
#> Tanytarsus Genus A Tricorythodes sp1 FW_009
#> 2 2
#> Notropis heterolepsis Brook trout
#> 2 2
#> Orcnocetes virilis Rhinichthys cataractae
#> 2 2
#> Cambarus bartonii
#> 2
#>
#> [[2]]
#> Unidentified sp1 FW_012_01 Terrestrial invertebrates
#> 1 4
#> Plant material Achnanthes lanceolata
#> 1 4
#> Achnanthes minutissima Audouinella sp1 FW_012_01
#> 4 4
#> Batrachospermum Blue-green algae
#> 4 4
#> Calothrix Cymbella cistula
#> 4 4
#> Cymbella mulleri Diatoma heimale
#> 4 4
#> Epithemia sorex Epithemia turgida
#> 4 4
#> Eunotia pectinalis Eunotia sp1 FW_012_01
#> 4 4
#> Frustulia rhomboides Gomphoneis herculeana
#> 4 4
#> Gomphonema intricatum Gomphonema sp1 FW_012_01
#> 4 4
#> Meridion circulare Navicula avenacea
#> 4 4
#> Pleurotaenium Rhoicosphenia curvata
#> 4 4
#> Stigeoclonium Synedra ulna
#> 4 4
#> Ulothrix Unidentified sp2 FW_012_01
#> 4 4
#> Aelosoma Brachycentrus sp1 FW_012_01
#> 3 5
#> Cambarus bartoni Chauliodes sp1 FW_012_01
#> 2 2
#> Cordulegaster maculata Dicranota sp1 FW_012_01
#> 2 2
#> Ectopria thoracica Epeorus dispar
#> 5 5
#> Glossosoma sp1 FW_012_01 Homoplectra sp1 FW_012_01
#> 2 3
#> Hudsonimya sp1 FW_012_01 Hydropsyche sp1 FW_012_01
#> 2 3
#> Leucrocuta sp1 FW_012_01 Leuctra sp1 FW_012_01
#> 5 2
#> Lumbriculiid oligochaete Parametriocnemus sp1 FW_012_01
#> 3 5
#> Neureclipsis sp1 FW_012_01 Ophiogomphus sp1 FW_012_01
#> 5 2
#> Palpomyia sp1 FW_012_01 Palpomyia sp2 FW_012_01
#> 2 2
#> Promoresia sp1 FW_012_01 Psephenus sp1 FW_012_01
#> 5 5
#> Soliperla sp1 FW_012_01 Stenelmis adult
#> 2 2
#> Stenelmis sp1 FW_012_01 Suwalia sp1 FW_012_01
#> 2 2
#> Tallaperla maria Thaumalea sp1 FW_012_01
#> 2 5
#> Tipula abdominalis Salamander
#> 5 2
#>
#> [[3]]
#> Unidentified sp1 FW_012_02 Terrestrial plants
#> 1 1
#> Terrestrial bugs Achnanthes inflata var. elata
#> 4 4
#> Achnanthes lanceolata Achnanthes linearis
#> 4 4
#> Achnanthes minutissima Auodinella hermanii
#> 4 4
#> Blue Green algae Calothrix
#> 4 4
#> Cocconeis placentula Cymbella kappi
#> 4 4
#> Cymbella mulleri Diatoma heimale
#> 4 4
#> Epithemia turgida Eunotia meisteri
#> 4 4
#> Eunotia pectinalis Fragilaria vaucheriae
#> 4 4
#> Frustulia rhomboides Gomphoneis herculeana
#> 4 4
#> Gomphonema accuminatum Gomphonema angustatum
#> 4 4
#> Gomphonema intricatum Gomphonema tennuellum
#> 4 4
#> Marssoniella Navicula avenacea
#> 4 4
#> Navicula cryptocephala Navicula mutica
#> 4 4
#> Navicula sp1 FW_012_02 Pinnularia viridis
#> 4 4
#> Rhoicosphenia curvata Rhopalodia
#> 4 4
#> Surirella brebbisonii Surirella elegans
#> 4 4
#> Synechoccus Synedra ulna
#> 4 4
#> Ulothrix Unidentified sp2 FW_012_02
#> 4 4
#> Aeolosoma sp1 FW_012_02 Ajax longipes
#> 3 2
#> Amphinemura wui Anchytarsus bicolor
#> 5 2
#> Baetis Hudsonimya
#> 5 2
#> Cricotopus Dicranota
#> 5 2
#> Eukieffidrella pseudomontana Diplectrona modesta
#> 2 2
#> Dixella Dolophilodes
#> 2 2
#> Ectopria thoracica Epeorus dispar
#> 5 5
#> Fatigia pele Hexatoma sp1 FW_012_02
#> 2 2
#> Leuctra Oligo Lumbr. Blue
#> 3 3
#> Oligo. Lumbr. Pink Ophiogomphus
#> 3 2
#> Paraleptophelebia Pericoma
#> 2 5
#> Pilaria Pentaneuri sp1 FW_012_02
#> 2 3
#> Polycentropus maculatus Stenelmis
#> 2 5
#> Tallaperla maria Tanyderid
#> 2 2
#> Conchapelopia Tipula
#> 2 2
#> Wormaldia moesta Crayfish
#> 2 2
#> Salamander
#> 2
- The prediction for each dyads in the networks, here for network
number 3. If your goal is dyad prediction, then you should use
colsbm_model = "delta"
, instead ofcolsbm_model = "iid"
.
best_fit$pred_dyads[[3]][1:10, 1:5]
#> Unidentified sp1 FW_012_02 Terrestrial plants
#> Unidentified sp1 FW_012_02 0.000000e+00 2.046018e-04
#> Terrestrial plants 2.046018e-04 0.000000e+00
#> Terrestrial bugs 3.266636e-06 3.266636e-06
#> Achnanthes inflata var. elata 1.023083e-06 1.023083e-06
#> Achnanthes lanceolata 1.014993e-06 1.014993e-06
#> Achnanthes linearis 1.014738e-06 1.014738e-06
#> Achnanthes minutissima 1.016412e-06 1.016412e-06
#> Auodinella hermanii 2.339017e-06 2.339017e-06
#> Blue Green algae 1.015086e-06 1.015086e-06
#> Calothrix 1.106559e-06 1.106559e-06
#> Terrestrial bugs Achnanthes inflata var. elata
#> Unidentified sp1 FW_012_02 0.121088033 4.531901e-05
#> Terrestrial plants 0.121088033 4.531901e-05
#> Terrestrial bugs 0.000000000 4.525853e-06
#> Achnanthes inflata var. elata 0.005987599 0.000000e+00
#> Achnanthes lanceolata 0.005983016 5.117200e-06
#> Achnanthes linearis 0.005982871 5.117149e-06
#> Achnanthes minutissima 0.005983818 5.117480e-06
#> Auodinella hermanii 0.005133356 4.208144e-06
#> Blue Green algae 0.005983068 5.117218e-06
#> Calothrix 0.006035075 5.130186e-06
#> Achnanthes lanceolata
#> Unidentified sp1 FW_012_02 7.141368e-08
#> Terrestrial plants 7.141368e-08
#> Terrestrial bugs 7.256724e-09
#> Achnanthes inflata var. elata 8.223344e-09
#> Achnanthes lanceolata 0.000000e+00
#> Achnanthes linearis 8.221291e-09
#> Achnanthes minutissima 8.221812e-09
#> Auodinella hermanii 6.757226e-09
#> Blue Green algae 8.221399e-09
#> Calothrix 8.241566e-09
We can also plot the networks individually, with the groups reordered by trophic levels:
p <- gtools::permutations(best_fit$Q, best_fit$Q)
ind <- which.min(
sapply(
seq(nrow(p)),
function(x) {
sum((tcrossprod(best_fit$pi[[1]]) * best_fit$alpha)[p[x, ], p[x, ]][
upper.tri(best_fit$alpha)
])
}
)
)
ord <- p[ind, ]
plot(res_fw_iid$best_fit, type = "block", net_id = 1, ord = ord) +
plot(res_fw_iid$best_fit, type = "block", net_id = 2, ord = ord) +
plot(res_fw_iid$best_fit, type = "block", net_id = 3, ord = ord)
Or make different plots to exhibit the mesoscale structure:
plot(res_fw_iid$best_fit, type = "graphon", ord = ord)
plot(res_fw_iid$best_fit, type = "meso", mixture = TRUE, ord = ord)
#> New names:
#> • `` -> `...1`
#> • `` -> `...2`
#> • `` -> `...3`
Clustering of networks
Let simulate some directed networks with a lower triangular structure that looks alike foodwebs.
set.seed(1234)
alpha <- matrix(c(
.05, .01, .01, .01,
.3, .05, .01, .01,
.5, .4, .05, .01,
.1, .8, .1, .05
), 4, 4, byrow = TRUE)
pi <- c(.1, .2, .6, .1)
sim_net <-
replicate(3,
{
X <-
sbm::sampleSimpleSBM(100,
blockProp = pi, connectParam = list(mean = alpha),
directed = TRUE
)
X$rNetwork
X$networkData
},
simplify = FALSE
)
set.seed(1234)
net_clust <- clusterize_networks(
netlist = c(foodwebs[1:3], sim_net), # A list of networks
colsbm_model = "iid", # The name of the model
directed = TRUE, # Foodwebs are directed networks
net_id = c(names(foodwebs)[1:3], "sim1", "sim2", "sim3"), # Name of the networks
nb_run = 3L, # Nmber of runs of the algorithm
global_opts = # List of options
list(
verbosity = 0, # Verbosity level of the algorithm
plot_details = 0, # Monitoring plot of the algorithm
Q_max = 9, # Max number of clusters
backend = "parallel" # Backend for parallel computing
)
)
We obtain a list of 3 models. The first one is for the full collection, and the second and third one are for the 3 foodwebs and the three collected networks. Here is a plot of the mesoscale structure obtained from the group of simulated networks. We can extract the best partition:
best_partition <- extract_best_partition(net_clust)
The plot of the mesoscale structure of the whole collection is the following:
but then we can compare the mesoscale structures of the 2 groups:
plot(best_partition[[1]],
type = "graphon",
ord = order(best_partition[[1]]$alpha %*% best_partition[[1]]$pi[[1]])
) +
plot(best_partition[[2]],
type = "graphon",
ord = order(best_partition[[2]]$alpha %*% best_partition[[2]]$pi[[1]])
) +
plot_layout(guides = "collect") + plot_annotation(tag_levels = "1")