graph2mat.core.data.configuration

Implements classes to store an example of the dataset in memory.

A “configuration” is an object that contains all the information about a given example in the dataset. It contains all the features needed to describe the example (e.g. coordinates, lattice vectors…), and optionally the matrix that corresponds to this example.

In a typical case, your configurations will contain the matrix as a label for training, validating or testing. When doing inference, the configurations will not have an associated matrix, since the matrix is what you are trying to calculate.

Classes

BasisConfiguration(point_types, positions, basis)

Container class to store all the information of an example.

OrbitalConfiguration(point_types, positions, ...)

Stores a distribution of atoms in space, with associated orbitals.

class graph2mat.core.data.configuration.BasisConfiguration(point_types: ndarray, positions: ndarray, basis: Sequence[PointBasis], cell: ndarray | None = None, pbc: tuple | None = None, matrix: BasisMatrix | None = None, weight: float = 1.0, config_type: str | None = 'Default', metadata: Dict[str, Any] | None = None)[source]

Bases: object

Container class to store all the information of an example.

Stores a distribution of points in space, with associated basis functions. Optionally, it can also store an associated matrix.

In a typical case, your configurations will contain the matrix as a label for training, validating or testing. When doing inference, the configurations will not have an associated matrix, since the matrix is what you are trying to calculate.

This is a dataclasses.dataclass. It is purely a container for the information of one example in your dataset.

Parameters:
  • point_types (numpy.ndarray) – Shape (n_points,). The type of each point. Each type can be either a string or an integer, and it should be the type key of a PointBasis object in the basis list.

  • positions (numpy.ndarray) – Shape (n_points, 3). The positions of each point in cartesian coordinates.

  • basis (Sequence[graph2mat.core.data.basis.PointBasis]) – List of PointBasis objects for types that are (possibly) present in the system.

  • cell (numpy.ndarray | None) – Shape (3, 3). The cell vectors that delimit the system, in cartesian coordinates.

  • pbc (tuple | None) – Shape (3,). Whether the system is periodic in each cell direction.

  • matrix (graph2mat.core.data.matrices.basis_matrix.BasisMatrix | None) –

    The matrix associated with the configuration.

    It can be a NumPy array or a SciPy sparse matrix, which will be converted to a BasisMatrix object.

  • weight (float) – The weight of the configuration in the loss.

  • config_type (str | None) – A string that indicates the type of configuration.

  • metadata (Dict[str, Any] | None) – A dictionary with additional metadata related to the configuration.
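As a rough illustration of how these parameters fit together, the sketch below uses a hypothetical stand-in dataclass, ToyConfiguration, that mirrors the documented fields with plain Python lists. It is not graph2mat's class, and the checks in __post_init__ are assumptions added for illustration; the real BasisConfiguration may validate its inputs differently or not at all.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Sequence, Tuple


@dataclass
class ToyConfiguration:
    """Hypothetical stand-in mirroring the documented fields (not graph2mat code)."""

    point_types: Sequence[Any]                        # shape (n_points,)
    positions: Sequence[Sequence[float]]              # shape (n_points, 3)
    basis: Sequence[Any]                              # one entry per possible point type
    cell: Optional[Sequence[Sequence[float]]] = None  # shape (3, 3)
    pbc: Optional[Tuple[bool, bool, bool]] = None     # periodicity per cell direction
    matrix: Optional[Any] = None                      # the label; None for inference
    weight: float = 1.0
    config_type: Optional[str] = "Default"
    metadata: Optional[Dict[str, Any]] = None

    def __post_init__(self):
        # Assumed sanity checks, added only for this sketch.
        if len(self.point_types) != len(self.positions):
            raise ValueError("point_types and positions must have the same length")
        if any(len(p) != 3 for p in self.positions):
            raise ValueError("each position needs 3 cartesian coordinates")


# Training example: the matrix is stored as the label.
train = ToyConfiguration(
    point_types=["A", "B"],
    positions=[[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]],
    basis=["basis for A", "basis for B"],
    cell=[[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]],
    pbc=(False, False, False),
    matrix=[[1.0, 0.2], [0.2, 1.0]],
)

# Inference example: same inputs, but no matrix, since it is what gets predicted.
infer = ToyConfiguration(
    point_types=train.point_types,
    positions=train.positions,
    basis=train.basis,
)
```

The only difference between the two objects is the matrix field, which matches the training versus inference distinction described above.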

__init__(point_types: ndarray, positions: ndarray, basis: Sequence[PointBasis], cell: ndarray | None = None, pbc: tuple | None = None, matrix: BasisMatrix | None = None, weight: float = 1.0, config_type: str | None = 'Default', metadata: Dict[str, Any] | None = None) None
basis: Sequence[PointBasis]

List of PointBasis objects for types that are (possibly) present in the system.

cell: ndarray | None = None

Shape (3, 3). The cell vectors that delimit the system, in cartesian coordinates.

config_type: str | None = 'Default'

A string that indicates the type of configuration.

matrix: BasisMatrix | None = None

The matrix associated with the configuration.

metadata: Dict[str, Any] | None = None

A dictionary with additional metadata related to the configuration.

pbc: tuple | None = None

Shape (3,). Whether the system is periodic in each cell direction.

point_types: ndarray

Shape (n_points,). The type of each point. Each type can be either a string or an integer, and it should be the type key of a PointBasis object in the basis list.

positions: ndarray

Shape (n_points, 3). The positions of each point in cartesian coordinates.

to_sisl_geometry() Geometry[source]

Converts the configuration to a sisl Geometry.

weight: float = 1.0

The weight of the configuration in the loss.
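How the weight enters the loss is not specified on this page. One common convention, assumed here purely for illustration, is a weighted average of per-configuration errors. The function name weighted_mean_loss is hypothetical and is not part of graph2mat.

```python
from typing import Sequence


def weighted_mean_loss(errors: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-configuration errors (an assumed convention)."""
    if len(errors) != len(weights):
        raise ValueError("errors and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one weight must be nonzero")
    return sum(e * w for e, w in zip(errors, weights)) / total_weight


# Two configurations with the default weight of 1.0 and one that counts double.
loss = weighted_mean_loss(errors=[0.5, 1.0, 2.0], weights=[1.0, 1.0, 2.0])
```

Under this convention, a configuration with weight 2.0 contributes to the loss as if it appeared twice in the dataset.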

class graph2mat.core.data.configuration.OrbitalConfiguration(point_types: ndarray, positions: ndarray, basis: Atoms, cell: ndarray | None = None, pbc: tuple | None = None, matrix: OrbitalMatrix | None = None, weight: float = 1.0, config_type: str | None = 'Default', metadata: Dict[str, Any] | None = None)[source]

Bases: BasisConfiguration

Stores a distribution of atoms in space, with associated orbitals.

Optionally, it can also store an associated matrix.

In a typical case, your configurations will contain the matrix as a label for training, validating or testing. When doing inference, the configurations will not have an associated matrix, since the matrix is what you are trying to calculate.

This is a version of BasisConfiguration for atomic systems, where points are atoms.

Parameters:
  • point_types (numpy.ndarray) – Shape (n_points,). The type of each point. Each type can be either a string or an integer, and it should be the type key of a PointBasis object in the basis list.

  • positions (numpy.ndarray) – Shape (n_points, 3). The positions of each point in cartesian coordinates.

  • basis (sisl.Atoms) – Atoms that are (possibly) present in the system.

  • cell (numpy.ndarray | None) – Shape (3, 3). The cell vectors that delimit the system, in cartesian coordinates.

  • pbc (tuple | None) – Shape (3,). Whether the system is periodic in each cell direction.

  • matrix (graph2mat.core.data.matrices.physics.orbital_matrix.OrbitalMatrix | None) –

    The matrix associated with the configuration.

    It can be a NumPy array or a SciPy sparse matrix, which will be converted to a BasisMatrix object.

  • weight (float) – The weight of the configuration in the loss.

  • config_type (str | None) – A string that indicates the type of configuration.

  • metadata (Dict[str, Any] | None) – A dictionary with additional metadata related to the configuration.

__init__(point_types: ndarray, positions: ndarray, basis: Atoms, cell: ndarray | None = None, pbc: tuple | None = None, matrix: OrbitalMatrix | None = None, weight: float = 1.0, config_type: str | None = 'Default', metadata: Dict[str, Any] | None = None) None
property atom_types: ndarray

Alias for point_types.

property atoms: Atoms

Alias for basis.
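The alias pattern can be sketched with two hypothetical stand-in dataclasses, ToyBase and ToyOrbital, which are not graph2mat code. The aliases are written as read-only properties here; whether the real properties also define setters is not stated on this page.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class ToyBase:
    """Hypothetical stand-in for the generic configuration."""

    point_types: Sequence[Any]
    basis: Sequence[Any]


@dataclass
class ToyOrbital(ToyBase):
    """Hypothetical stand-in for the atomic specialization."""

    @property
    def atom_types(self) -> Sequence[Any]:
        # Alias: returns the very same object stored in point_types.
        return self.point_types

    @property
    def atoms(self) -> Sequence[Any]:
        # Alias: returns the very same object stored in basis.
        return self.basis


config = ToyOrbital(point_types=[0, 0, 1], basis=["C", "H"])
```

Because the aliases return the stored objects themselves, no data is copied and both names always stay in sync.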

basis: Atoms

Atoms that are (possibly) present in the system.

cell: ndarray | None = None

Shape (3, 3). The cell vectors that delimit the system, in cartesian coordinates.

config_type: str | None = 'Default'

A string that indicates the type of configuration.

classmethod from_geometry(geometry: Geometry, **kwargs) OrbitalConfiguration[source]

Initializes an OrbitalConfiguration object from a sisl geometry.

Note that the created object will not have an associated matrix, unless it is passed explicitly as a keyword argument.

Parameters:
  • geometry (sisl.Geometry) – The geometry to associate with the OrbitalConfiguration.

  • **kwargs – Additional arguments to be passed to the OrbitalConfiguration constructor.

classmethod from_matrix(matrix: SparseOrbital, geometry: Geometry | None = None, labels: bool = True, **kwargs) OrbitalConfiguration[source]

Initializes an OrbitalConfiguration object from a sisl matrix.

Parameters:
  • matrix (sisl.SparseOrbital) – The matrix to associate with the OrbitalConfiguration. This matrix should have an associated geometry, which is used unless geometry is passed explicitly.

  • geometry (sisl.Geometry, optional) – The geometry to associate with the OrbitalConfiguration. If None, the geometry of the matrix will be used.

  • labels (bool) – Whether to process the labels from the matrix. If False, only the atomic structure is read, which is likely the input of your model.

  • **kwargs – Additional arguments to be passed to the OrbitalConfiguration constructor.

classmethod from_run(runfilepath: str | Path, geometry_path: str | Path | None = None, out_matrix: Literal['density_matrix', 'hamiltonian', 'energy_density_matrix', 'dynamical_matrix'] | None = None, basis: Atoms | None = None) OrbitalConfiguration[source]

Initializes an OrbitalConfiguration object from the main input file of a run.

Parameters:
  • runfilepath – The path of the main input file. For example, in SIESTA this is the path to the “.fdf” file.

  • geometry_path – The path to the geometry file. If None, the geometry will be read from the run file.

  • out_matrix – The matrix to be read from the output of the run. The configuration object will contain the matrix. If it is None, then no matrices are read from the output. This is the case when trying to predict matrices, since you don’t have the output yet.

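  • cls – Class to initialize, should be a subclass of OrbitalConfiguration. Since from_run is a classmethod, this is the class the method is called on, not an argument you pass explicitly.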

  • basis – The basis to use for the configuration. If None, the basis of the read geometry will be used.

matrix: OrbitalMatrix | None = None

The matrix associated with the configuration.

metadata: Dict[str, Any] | None = None

A dictionary with additional metadata related to the configuration.

classmethod new(obj: Geometry | SparseOrbital | str | Path, labels: bool = True, **kwargs) OrbitalConfiguration[source]

Creates a new OrbitalConfiguration.

This is just a dispatcher that will call the appropriate method to create the object depending on the type of the input.

Parameters:
  • obj – The object from which to create the OrbitalConfiguration.

  • labels – Whether to find labels (the matrix) to be assigned to the configuration.

  • **kwargs – Additional arguments to be passed to the constructor of the OrbitalConfiguration.
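The dispatch described above can be sketched as follows. Everything in this block is a hypothetical stand-in: ToyGeometry and ToyMatrix take the place of sisl.Geometry and sisl.SparseOrbital, and ToyConfig only records which constructor was chosen. The order of the type checks and the way labels is forwarded are assumptions, not the library's actual implementation.

```python
from pathlib import Path


class ToyGeometry:
    """Hypothetical stand-in for sisl.Geometry."""


class ToyMatrix:
    """Hypothetical stand-in for sisl.SparseOrbital."""


class ToyConfig:
    """Hypothetical stand-in that records which constructor was used."""

    def __init__(self, source: str, labels: bool):
        self.source = source
        self.labels = labels

    @classmethod
    def from_geometry(cls, geometry, **kwargs):
        # A bare geometry carries no matrix, so there are no labels to read.
        return cls("geometry", labels=False)

    @classmethod
    def from_matrix(cls, matrix, labels=True, **kwargs):
        return cls("matrix", labels=labels)

    @classmethod
    def from_run(cls, runfilepath, labels=True, **kwargs):
        return cls("run", labels=labels)

    @classmethod
    def new(cls, obj, labels=True, **kwargs):
        # Choose the constructor from the type of the input.
        if isinstance(obj, ToyGeometry):
            return cls.from_geometry(obj, **kwargs)
        if isinstance(obj, ToyMatrix):
            return cls.from_matrix(obj, labels=labels, **kwargs)
        if isinstance(obj, (str, Path)):
            return cls.from_run(obj, labels=labels, **kwargs)
        raise TypeError(f"cannot create a configuration from {type(obj).__name__}")
```

A single entry point like this lets calling code handle geometries, matrices and run files uniformly, for example ToyConfig.new(Path("RUN.fdf")) and ToyConfig.new(ToyGeometry()) in the same loop.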

pbc: tuple | None = None

Shape (3,). Whether the system is periodic in each cell direction.

point_types: ndarray

Shape (n_points,). The type of each point. Each type can be either a string or an integer, and it should be the type key of a PointBasis object in the basis list.

positions: ndarray

Shape (n_points, 3). The positions of each point in cartesian coordinates.

weight: float = 1.0

The weight of the configuration in the loss.