Cutoffs and cluster filters

In this tutorial the usage of cutoffs and cluster filters will be discussed.


A cluster of lattice sites (i,j,k,…) is defined to be inside a cutoff if all pairwise interatomic distances in the cluster are less than the cutoff.

Cutoffs can be set for each expansion order as well as for each n-body interaction. This can be seen as a matrix of cutoffs, i.e.






















The order of a cluster (i.j,k,…) is given by the number of lattice sites in the cluster, whereas the body (n-body interaction) is given by the number of unique lattice sites in the cluster. In order to use a cutoff matrix in hiphive one must use the Cutoffs object. A simple example is shown below for which a third-order ClusterSpace is constructed but only including two-body terms.

>>> from import bulk
>>> from hiphive import ClusterSpace
>>> from hiphive.cutoffs import Cutoffs
>>> prim = bulk('Al', a=4.0)
>>> cutoff_matrix = [
...     [5.0, 5.0],  # 2-body
...     [0.0, 0.0]]  # 3-body
>>> cutoffs = Cutoffs(cutoff_matrix)
>>> cs = ClusterSpace(prim, cutoffs)

While the majority of parameters in a higher-order FCP is typically associated with three and four-body interactions, they are typically much weaker and less relevant than two-body interactions. The usage of a cutoff matrix provides a more fine-grained approach and can thus greatly reduce the number of parameters in a ForceConstantPotential, simplifying model construction and improving computational performance.

Cluster filters

Hiphive allows users to use custom cluster filters. These can be used to reduce the number of clusters that full within the given cutoffs. The cluster filter has one main function, which given a cluster (i,j,k,…) is to return True or False depending on whether the cluster should be kept or not. The BaseClusterFilter code looks like this:

class BaseClusterFilter:
    def __init__(self):

    def setup(self, atoms):
        """ The filter is passed the environment of the primitive cell. """
        self._atoms = atoms

    def __call__(self, cluster):
        """ Returns True or False when a cluster is proposed. """
        return True

Where the setup function can be used to pre-define or pre-calculate quantities that are needed for the subsequent filter. Filtering can be based on both geometrical aspects, e.g., include a cluster if it involves atoms close to a defect. It can also be based on the atomic species, for example the following filter will only consider higher order clusters if they contain at least one oxygen atom.:

class MyOxygenFilter(BaseClusterFilter):

    def __call__(self, cluster):
        """ Returns True or False when a cluster is proposed. """
        order = len(cluster)
        if order == 2:
            return True
            if any(self._atoms[i].symbol == 'O' for i in cluster):
                return True
                return False

cf = MyOxygenFilter()
cs = ClusterSpace(prim, [6.0, 6.0, 6.0], cluster_filter=cf)

In examples/advanced_topics/cluster_filters an example is provided that demonstrates how to generate a higher-order model for atoms on the surface while “bulk” atoms are treated harmonically.

Please note that the cutoffs from the Cutoffs object are always enforced first, after which the cluster filter is applied.

Source code

Usage of max triplet distance filter

import numpy as np

from import fcc100
from ase.calculators.emt import EMT
from ase.optimize import BFGS

from hiphive.cutoffs import BaseClusterFilter
from hiphive import ClusterSpace, StructureContainer
from trainstation import CrossValidationEstimator
from hiphive.structure_generation import generate_rattled_structures
from hiphive.utilities import prepare_structures

class SurfaceFilter(BaseClusterFilter):

    def setup(self, atoms):
        self.atoms = atoms
        dist_tol = 1e-5
        z_pos = atoms.positions[:, 2]
        z_max = np.max(z_pos)
        z_min = np.min(z_pos)

        top_layer = np.where(z_pos + dist_tol > z_max)[0].tolist()
        bot_layer = np.where(z_pos - dist_tol < z_min)[0].tolist()
        self.surface_indices = top_layer + bot_layer

    def __call__(self, cluster):
        order = len(cluster)
        if order <= 2:
            return True
            return any(c in self.surface_indices for c in cluster)

def evaluate_cs(cs, structures):
    sc = StructureContainer(cs)
    for x in structures:
    cve = CrossValidationEstimator(sc.get_fit_data(), n_splits=10)
    return cve.summary

# setup atoms and filter
atoms = fcc100('Ni', (6, 6, 6), a=4.05, vacuum=20, orthogonal=True)
atoms.pbc = True
calc = EMT()

# relax structure
dyn = BFGS(atoms)
converged =, steps=1000)

# setup cluster filter
sf = SurfaceFilter()

# build clusterspace
cutoffs1 = [6.0]
cutoffs2 = [6.0, 3.0, 3.0]
cs1 = ClusterSpace(atoms, cutoffs1, cluster_filter=None)
cs2 = ClusterSpace(atoms, cutoffs2, cluster_filter=None)
cs2_sf = ClusterSpace(atoms, cutoffs2, cluster_filter=sf)

print('Degrees of freedom second order                :', cs1.n_dofs)
print('Degrees of freedom fourth order                :', cs2.n_dofs)
print('Degrees of freedom fourth order with filter    :', cs2_sf.n_dofs)

# testing
structures = generate_rattled_structures(atoms, n_structures=25, rattle_std=0.03)
structures = prepare_structures(structures, atoms, calc)

# second order only
cve1 = evaluate_cs(cs1, structures)
cve2 = evaluate_cs(cs2, structures)
cve2_sf = evaluate_cs(cs2_sf, structures)