graph2mat.core.data

Tools to create and manipulate data to interact with the models.

It implements the functionality needed to handle (sparse) matrices that are related to a graph. Several aspects make handling this data non-trivial, and therefore make this module useful:

  • Matrices are sparse.

  • Matrices are in a basis which is centered around the points in the graph. Therefore, elements of the matrix correspond to nodes or edges of the graph.

  • Each point might have more than one basis function, so the matrix is divided into blocks (not just single elements) that correspond to nodes or edges of the graph.

  • Different point types might have different basis sizes, which makes the different blocks in the matrix have different shapes.

  • The different block sizes and the sparsity of the matrices pose an extra challenge when batching examples for machine learning.
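
To make the points above concrete, here is a minimal numpy-only sketch of a block-structured matrix on a small graph. It does not use the graph2mat API: the point types, basis sizes, edge list and the flatten_blocks helper are all hypothetical and chosen only for illustration. It shows how node and edge blocks end up with different shapes, and one possible way to pack them into a flat array with offsets so that they can be batched.

```python
import numpy as np

# Hypothetical basis sizes per point type (illustrative, not from graph2mat).
basis_size = {"A": 1, "B": 3}
point_types = ["A", "B", "A"]   # type of each node in the graph
edges = [(0, 1), (1, 2)]        # pairs of nodes connected by an edge

sizes = [basis_size[t] for t in point_types]
rng = np.random.default_rng(0)

# Node blocks are square: node i gets a block of shape (n_i, n_i).
node_blocks = {i: rng.normal(size=(n, n)) for i, n in enumerate(sizes)}

# Edge blocks are rectangular: edge (i, j) gets a block of shape (n_i, n_j).
edge_blocks = {(i, j): rng.normal(size=(sizes[i], sizes[j])) for i, j in edges}


def flatten_blocks(blocks):
    """Pack blocks of different shapes into one 1D array plus offsets.

    Block k occupies values[offsets[k]:offsets[k + 1]].
    """
    flat = [np.ravel(b) for b in blocks]
    offsets = np.cumsum([0] + [f.size for f in flat])
    return np.concatenate(flat), offsets


node_values, node_ptr = flatten_blocks(node_blocks.values())
edge_values, edge_ptr = flatten_blocks(edge_blocks.values())

# Recover the block of node 1 from the flat storage.
n = sizes[1]
block_1 = node_values[node_ptr[1]:node_ptr[2]].reshape(n, n)
```

Flat storage with offsets is only one way to deal with ragged blocks; it is shown here because it needs no padding and concatenates trivially across examples.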

The tools in this submodule are agnostic to the machine learning framework of choice. They are based purely on numpy, with an extra dependency on sisl to handle the sparse matrices. The sisl dependency could eventually be removed if needed.

Modules

basis

Utilities to describe a basis set for a point type.

configuration

Implements classes to store an example of the dataset in memory.

formats

Module defining formats and conversion management.

matrices

Containers to store the raw matrices as a dictionary of blocks.

metrics

Functions to assess performance.

neighborhood

Neighborhood construction.

node_feats

Experimental module for defining node features.

processing

Core of the data processing.

sparse

Conversion between different sparse representations.

table

Storage of global basis information for a group of configurations.