graph2mat.core.data.formats

Module defining formats and conversion management.

Handling sparse matrices associated with 3D point clouds with basis functions is not always straightforward. Each task (e.g. training an ML model, computing a property…) may have a data format that suits it best. For users and developers alike, converting from any format to any other target format can be a pain. In graph2mat, we try to centralize this task by:

Having a class, Formats, that contains all the formats that we support.
Having a class, ConversionManager, that manages the conversions between these formats.

An instance of ConversionManager is available at graph2mat.conversions.

Classes

ConversionManager()	Manages the conversions between formats.
Formats()	Class holding all known formats.

class graph2mat.core.data.formats.ConversionManager[source]

Bases: object

Manages the conversions between formats.

This class centralizes the handling of conversions between formats. It uses the formats defined in the Formats class.

Examples

The conversion manager needs to be instantiated in order to be used:

conversions = ConversionManager()

Notice that by doing that, you get an empty conversion manager (with no implemented conversions). graph2mat already provides an instantiated conversion manager with all the implemented conversions. It can be imported like:

from graph2mat import conversions

Then, converters can be registered using the register_converter method:

import numpy as np
import scipy.sparse
from graph2mat.core.data.formats import Formats

def my_converter(data: np.ndarray) -> scipy.sparse.coo_matrix:
    ...

conversions.register_converter(Formats.NUMPY, Formats.SCIPY_COO, my_converter)

Or using the converter decorator:

@conversions.converter(Formats.NUMPY, Formats.SCIPY_COO)
def my_converter(data: np.ndarray) -> scipy.sparse.coo_matrix:
    ...

The registered converters can be retrieved using the get_converter method:

converter = conversions.get_converter(Formats.NUMPY, Formats.SCIPY_COO)

# Matrix as a numpy array
array = np.random.rand(10, 10)
# Use the converter to get the matrix as a scipy sparse COO matrix
sparse_coo = converter(array)

The converters are also registered as attributes of the conversion manager, named <source>_to_<target>. For example, to get the numpy to scipy COO converter, one can also do:

converter = conversions.numpy_to_scipy_coo
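
To make the equivalence between the two access styles concrete, here is a minimal, self-contained sketch. ToyManager is a hypothetical stand-in written for illustration only; it is not graph2mat's implementation, and it uses plain Python containers instead of the real formats.

```python
from typing import Callable


class ToyManager:
    """Hypothetical stand-in for a conversion manager (not graph2mat's code)."""

    def __init__(self) -> None:
        self._converters: dict[tuple[str, str], Callable] = {}

    def register_converter(self, source: str, target: str, func: Callable) -> None:
        self._converters[(source, target)] = func
        # Also expose the converter as an attribute named "<source>_to_<target>".
        setattr(self, f"{source}_to_{target}", func)

    def get_converter(self, source: str, target: str) -> Callable:
        return self._converters[(source, target)]


manager = ToyManager()
manager.register_converter("list", "tuple", tuple)

# Both access styles return the same function object.
by_lookup = manager.get_converter("list", "tuple")
by_attribute = manager.list_to_tuple
```

In this sketch the attribute is just a second name for the stored function, so either style can be used interchangeably.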

Note

When writing code that should be future-proof (e.g. inside a package), we recommend using get_converter with the formats retrieved from Formats. In the very unlikely event that a format name changes, the error raised will be more informative.

See also

Formats: The class that defines all formats.

__init__()[source]

add_callback(callback: Callable[[str, str, Callable, ConversionManager], Any], retroactive: bool = False)[source]

Add a function that will be called every time a new converter is registered.

Parameters:

callback – The callback function. It will receive the source format, target format, the converter function being registered and the ConversionManager instance.
retroactive (bool) – If True, the callback will be called for all converters that have been previously registered.
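
The effect of retroactive can be illustrated with a small toy model. The ToyManager class below is a hypothetical stand-in that only mimics the documented callback signature (source, target, converter, manager); the real class may implement this differently.

```python
from typing import Any, Callable


class ToyManager:
    """Hypothetical stand-in used only to illustrate callback semantics."""

    def __init__(self) -> None:
        self._converters: dict[tuple[str, str], Callable] = {}
        self._callbacks: list[Callable] = []

    def add_callback(self, callback: Callable[..., Any], retroactive: bool = False) -> None:
        self._callbacks.append(callback)
        if retroactive:
            # Replay converters registered before this callback was added.
            for (source, target), func in self._converters.items():
                callback(source, target, func, self)

    def register_converter(self, source: str, target: str, func: Callable) -> None:
        self._converters[(source, target)] = func
        # Notify every callback about the new converter.
        for callback in self._callbacks:
            callback(source, target, func, self)


seen: list[str] = []


def log_registration(source, target, converter, manager):
    seen.append(f"{source}->{target}")


manager = ToyManager()
manager.register_converter("list", "tuple", tuple)
manager.add_callback(log_registration, retroactive=True)  # replays list->tuple
manager.register_converter("tuple", "list", list)  # triggers the callback
```

Without retroactive=True, the callback in this sketch would only have seen the second registration.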

converter(source: str, target: str, exists_ok: bool = False) → Callable[[Callable], Callable][source]

converter(converter: Callable, exists_ok: bool = False) → Callable

Decorator to register a converter while defining a function.

Examples

There are two ways to use this decorator:

As a decorator with two arguments:

@converter("source_format", "target_format")
def my_converter(data):
    ...

Where source_format and target_format are strings representing the input and output formats of the converter.

As a no-argument decorator:

@converter
def my_converter(data: np.ndarray) -> scipy.sparse.coo_matrix:
    ...

In which case the source and target formats are inferred from the function signature.
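
One way such inference could work is by reading the function's type annotations and mapping each annotated type to a format name. The sketch below is an assumption about the general technique, not graph2mat's actual code; ALIASES, REGISTRY and this standalone converter function are hypothetical names.

```python
from typing import Callable, get_type_hints

# Hypothetical table mapping Python types to format names.
ALIASES: dict[type, str] = {list: "list", tuple: "tuple"}

# Hypothetical registry keyed by (source, target).
REGISTRY: dict[tuple[str, str], Callable] = {}


def converter(func: Callable) -> Callable:
    """Register func, reading the formats from its type annotations."""
    hints = get_type_hints(func)
    # The return annotation gives the target format.
    target = ALIASES[hints.pop("return")]
    # The first remaining annotation belongs to the input argument.
    source = ALIASES[next(iter(hints.values()))]
    REGISTRY[(source, target)] = func
    return func


@converter
def list_to_tuple(data: list) -> tuple:
    return tuple(data)
```

This only works when the function is fully annotated and each annotated type is known to the alias table, which is why the two-argument form remains useful.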

get_available_sources(target: str) → list[str][source]

For a given format, return all formats it can be converted from.

Parameters: target – The target format.

get_available_targets(source: str) → list[str][source]

For a given format, return all formats it can be converted to.

Parameters: source – The source format.
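
The two lookups are mirror images of each other. The sketch below shows the idea on a plain list of (source, target) pairs. The format strings come from Formats, but the set of registered pairs is invented for illustration and does not reflect which converters graph2mat actually ships.

```python
# Hypothetical set of registered (source, target) pairs.
REGISTERED = [
    ("numpy", "scipy_coo"),
    ("numpy", "scipy_csr"),
    ("scipy_coo", "scipy_csr"),
]


def get_available_targets(source: str) -> list[str]:
    """All formats that `source` can be converted to."""
    return [t for s, t in REGISTERED if s == source]


def get_available_sources(target: str) -> list[str]:
    """All formats that `target` can be converted from."""
    return [s for s, t in REGISTERED if t == target]


targets_of_numpy = get_available_targets("numpy")
sources_of_csr = get_available_sources("scipy_csr")
```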

get_converter(source: str, target: str) → Callable[source]

Get a converter function between two formats.

It raises a KeyError if no converter is found.

Parameters:

source – The source format.
target – The target format.

Returns:

The converter function for the given formats.

Return type:

converter

has_converter(source: str, target: str) → bool[source]

Check if a converter exists between two formats.

Parameters:

source – The source format.
target – The target format.
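
Since get_converter raises a KeyError when no converter exists, has_converter can serve as a guard. The helper below relies only on the two documented methods; convert_or_none and StubManager are hypothetical names, and the stub exists only so the sketch runs without graph2mat installed.

```python
from typing import Any, Callable, Optional


def convert_or_none(manager: Any, source: str, target: str, data: Any) -> Optional[Any]:
    """Convert data if the manager knows how, otherwise return None."""
    if not manager.has_converter(source, target):
        return None
    return manager.get_converter(source, target)(data)


class StubManager:
    """Hypothetical stand-in exposing only the two documented methods."""

    def __init__(self, converters: dict[tuple[str, str], Callable]) -> None:
        self._converters = converters

    def has_converter(self, source: str, target: str) -> bool:
        return (source, target) in self._converters

    def get_converter(self, source: str, target: str) -> Callable:
        # Raises KeyError if the pair is missing, like the documented method.
        return self._converters[(source, target)]


stub = StubManager({("list", "tuple"): tuple})
converted = convert_or_none(stub, "list", "tuple", [1, 2])
missing = convert_or_none(stub, "tuple", "set", (1, 2))
```

Catching the KeyError from get_converter directly is an equally valid style; the guard simply makes the missing-converter case explicit.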

register_converter(source: str, target: str, converter: Callable, exists_ok: bool = False, autodef: bool = False)[source]

Register a converter function between two formats.

Parameters:

source – The source format.
target – The target format.
converter – The function that converts from source to target.
exists_ok – If False, raises a KeyError if a converter from source to target already exists.
autodef – Whether this is an automatically generated converter. Only set to True by internal calls, should not be set by the user.

register_expanded_converter(old_source: str, old_target: str, expansion_source: str, expansion_target: str, expansion: Callable)[source]

Registers a converter that is an expansion of an existing one.

This function takes care of modifying the signature, docstring, etc., so that the user still sees helpful information when inspecting the converter, e.g. in a Jupyter notebook or in the documentation.

The function will automatically detect whether the original converter is to be expanded to the left or to the right, and will create a new converter that chains the original one with the expansion.

Parameters:

old_source – The source format of the original converter.
old_target – The target format of the original converter.
expansion_source – The source format of the expansion.
expansion_target – The target format of the expansion.
expansion – The function that expands the original converter.
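
The left/right detection and chaining can be sketched as follows. The rule used here (compare the expansion's formats with the original converter's endpoints) is an assumption about how such detection could work, and expand_converter is a hypothetical helper. The real method additionally rewrites the signature and docstring, which this sketch omits.

```python
from typing import Callable


def expand_converter(
    old_source: str,
    old_target: str,
    old: Callable,
    expansion_source: str,
    expansion_target: str,
    expansion: Callable,
) -> tuple[str, str, Callable]:
    """Chain old with expansion, returning (source, target, chained)."""
    if expansion_source == old_target:
        # Expansion to the right: old_source -> old_target -> expansion_target
        return old_source, expansion_target, lambda data: expansion(old(data))
    if expansion_target == old_source:
        # Expansion to the left: expansion_source -> old_source -> old_target
        return expansion_source, old_target, lambda data: old(expansion(data))
    raise ValueError("The expansion does not connect to the original converter")


# Original converter: "str" -> "int" (toy formats, not graph2mat formats).
right = expand_converter("str", "int", int, "int", "float", float)
left = expand_converter("str", "int", int, "bytes", "str", bytes.decode)
```

Here right converts "str" to "float" by applying the original converter first, while left converts "bytes" to "int" by applying the expansion first.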

class graph2mat.core.data.formats.Formats[source]

Bases: object

Class holding all known formats.

These are referenced by the conversion manager to understand what a function converts from and to.

BASISCONFIGURATION = 'basisconfiguration': The format for graph2mat’s BasisConfiguration class.

BASISMATRIX = 'basismatrix': The format for graph2mat’s BasisMatrix.

BASISMATRIXDATA = 'basismatrixdata': The format for graph2mat’s BasisMatrixData class.

BLOCK_DICT = 'block_dict': The format for a row block dictionary.

NODESEDGES = 'nodesedges': Pseudoformat, arrays for edge and node values, without a container.

NUMPY = 'numpy': Numpy array

ORBITALCONFIGURATION = 'orbitalconfiguration': The format for graph2mat’s OrbitalConfiguration class.

SCIPY_COO = 'scipy_coo': Scipy sparse COO matrix/array

SCIPY_CSR = 'scipy_csr': Scipy sparse CSR matrix/array

SISL = 'sisl': Sisl SparseOrbital class

SISL_DM = 'sisl_DM': Sisl DensityMatrix class

SISL_EDM = 'sisl_EDM': Sisl EnergyDensityMatrix class

SISL_GEOMETRY = 'sisl_geometry': Sisl Geometry class, doesn’t contain matrix information.

SISL_H = 'sisl_H': Sisl Hamiltonian class

SISL_SILE = 'sisl_sile': Pseudoformat, path to a file from which sisl can read a matrix’s data.

TORCH = 'torch': Torch tensor

TORCH_BASISMATRIXDATA = 'torch_basismatrixdata': The format for graph2mat’s TorchBasisMatrixData class.

TORCH_COO = 'torch_coo': Torch sparse COO tensor

TORCH_CSR = 'torch_csr': Torch sparse CSR tensor

TORCH_NODESEDGES = 'torch_nodesedges': Pseudoformat, same as NODESEDGES but in torch tensors.

classmethod add_alias(fmt: str, *aliases: Any)[source]

Add an alias for a format.

Parameters:

fmt – The format name.
aliases – The aliases that will be associated with the format. They don’t need to be strings, they can be e.g. a class.
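
Because aliases only need to be usable as lookup keys, a class works as well as a string. A minimal sketch of the idea, using a hypothetical module-level table and resolve helper; the choice of dict as an alias for 'block_dict' is invented for illustration.

```python
from typing import Any

# Hypothetical alias table: alias -> format name.
_aliases: dict[Any, str] = {}


def add_alias(fmt: str, *aliases: Any) -> None:
    """Associate every alias with the given format name."""
    for alias in aliases:
        _aliases[alias] = fmt


def resolve(fmt_or_alias: Any) -> Any:
    """Return the format name for an alias, or the input if it is not an alias."""
    return _aliases.get(fmt_or_alias, fmt_or_alias)


add_alias("numpy", "np", "ndarray")  # string aliases
add_alias("block_dict", dict)  # a class used as an alias
```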

classmethod string_to_attr_name(format_string: str) → str[source]

Get the attribute name that corresponds to a given format string.

This function is quite slow.

Parameters: format_string – The format string.
Returns: The attribute name.
Return type: attr_name
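
A reverse lookup of this kind can be written as a linear scan over the class attributes, which would account for the slowness noted above, although the actual implementation may differ. The sketch below uses a hypothetical ToyFormats class holding three of the documented format strings and shows how a caller might build a dictionary once when many lookups are needed.

```python
class ToyFormats:
    """Hypothetical subset of the documented formats."""

    NUMPY = "numpy"
    SCIPY_COO = "scipy_coo"
    SISL_H = "sisl_H"


def string_to_attr_name(format_string: str) -> str:
    """Find the attribute whose value equals format_string (linear scan)."""
    for name, value in vars(ToyFormats).items():
        if not name.startswith("_") and value == format_string:
            return name
    raise ValueError(f"Unknown format string: {format_string!r}")


# For repeated lookups, build the reverse mapping once.
ATTR_NAME_BY_STRING = {
    value: name
    for name, value in vars(ToyFormats).items()
    if not name.startswith("_")
}

attr_name = string_to_attr_name("sisl_H")
```

Note that the attribute name is not always the upper-cased string: 'sisl_H' maps to SISL_H, but so would a hypothetical 'sisl_h', so a lookup is safer than calling upper().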