# Cutoffs and cluster filters¶

In this tutorial the usage of cutoffs and cluster filters will be discussed.

## Cutoffs¶

A cluster of lattice sites (i,j,k,…) is defined to be inside a
cutoff if *all* pairwise interatomic distances in the cluster are less
than the cutoff.

Cutoffs can be set for each expansion order as well as for each
*n*-body interaction. This can be seen as a matrix of cutoffs, i.e.

body/order

2

3

4

5

6

2

6.0

6.0

6.0

5.0

5.0

3

5.0

5.0

5.0

4.0

4

4.0

4.0

3.5

The order of a cluster (i.j,k,…) is given by the number of lattice
sites in the cluster, whereas the body (*n*-body interaction) is given
by the number of unique lattice sites in the cluster.
In order to use a cutoff matrix in hiphive one must use the `Cutoffs`

object.
A simple example is shown below for which a third-order `ClusterSpace`

is constructed but only including two-body terms.

```
>>> from ase.build import bulk
>>> from hiphive import ClusterSpace
>>> from hiphive.cutoffs import Cutoffs
>>> prim = bulk('Al', a=4.0)
>>> cutoff_matrix = [
... [5.0, 5.0], # 2-body
... [0.0, 0.0]] # 3-body
>>> cutoffs = Cutoffs(cutoff_matrix)
>>> cs = ClusterSpace(prim, cutoffs)
```

While the majority of parameters in a higher-order FCP is typically associated with three and four-body interactions, they are typically much weaker and less relevant than two-body interactions.
The usage of a cutoff matrix provides a more fine-grained approach and can thus greatly reduce the number of parameters in a `ForceConstantPotential`

, simplifying model construction and improving computational performance.

## Cluster filters¶

Hiphive allows users to use custom cluster filters. These can be used
to reduce the number of clusters that full within the given
cutoffs. The cluster filter has one main function, which given a
cluster (i,j,k,…) is to return True or False depending on whether
the cluster should be kept or not.
The `BaseClusterFilter`

code looks like this:

```
class BaseClusterFilter:
def __init__(self):
pass
def setup(self, atoms):
""" The filter is passed the environment of the primitive cell. """
self._atoms = atoms
def __call__(self, cluster):
""" Returns True or False when a cluster is proposed. """
return True
```

Where the setup function can be used to pre-define or pre-calculate quantities that are needed for the subsequent filter. Filtering can be based on both geometrical aspects, e.g., include a cluster if it involves atoms close to a defect. It can also be based on the atomic species, for example the following filter will only consider higher order clusters if they contain at least one oxygen atom.:

```
class MyOxygenFilter(BaseClusterFilter):
def __call__(self, cluster):
""" Returns True or False when a cluster is proposed. """
order = len(cluster)
if order == 2:
return True
else:
if any(self._atoms[i].symbol == 'O' for i in cluster):
return True
else:
return False
cf = MyOxygenFilter()
cs = ClusterSpace(prim, [6.0, 6.0, 6.0], cluster_filter=cf)
```

In `examples/advanced_topics/cluster_filters`

an example is provided that
demonstrates how to generate a higher-order model for atoms on the
surface while “bulk” atoms are treated harmonically.

Please note that the cutoffs from the `Cutoffs`

object are always enforced first, *after* which the cluster filter is applied.

## Source code¶

Usage of max triplet distance filter

`examples/advanced_topics/cluster_filters/surface_filter.py`

```
import numpy as np
from ase.build import fcc100
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from hiphive.cutoffs import BaseClusterFilter
from hiphive import ClusterSpace, StructureContainer
from trainstation import CrossValidationEstimator
from hiphive.structure_generation import generate_rattled_structures
from hiphive.utilities import prepare_structures
class SurfaceFilter(BaseClusterFilter):
def setup(self, atoms):
self.atoms = atoms
dist_tol = 1e-5
z_pos = atoms.positions[:, 2]
z_max = np.max(z_pos)
z_min = np.min(z_pos)
top_layer = np.where(z_pos + dist_tol > z_max)[0].tolist()
bot_layer = np.where(z_pos - dist_tol < z_min)[0].tolist()
self.surface_indices = top_layer + bot_layer
def __call__(self, cluster):
order = len(cluster)
if order <= 2:
return True
else:
return any(c in self.surface_indices for c in cluster)
def evaluate_cs(cs, structures):
sc = StructureContainer(cs)
for x in structures:
sc.add_structure(x)
cve = CrossValidationEstimator(sc.get_fit_data(), n_splits=10)
cve.validate()
print(cve)
return cve.summary
# setup atoms and filter
atoms = fcc100('Ni', (6, 6, 6), a=4.05, vacuum=20, orthogonal=True)
atoms.pbc = True
calc = EMT()
# relax structure
atoms.set_calculator(calc)
dyn = BFGS(atoms)
converged = dyn.run(fmax=0.0001, steps=1000)
# setup cluster filter
sf = SurfaceFilter()
# build clusterspace
cutoffs1 = [6.0]
cutoffs2 = [6.0, 3.0, 3.0]
cs1 = ClusterSpace(atoms, cutoffs1, cluster_filter=None)
cs2 = ClusterSpace(atoms, cutoffs2, cluster_filter=None)
cs2_sf = ClusterSpace(atoms, cutoffs2, cluster_filter=sf)
print('Degrees of freedom second order :', cs1.n_dofs)
print('Degrees of freedom fourth order :', cs2.n_dofs)
print('Degrees of freedom fourth order with filter :', cs2_sf.n_dofs)
# testing
structures = generate_rattled_structures(atoms, n_structures=25, rattle_std=0.03)
structures = prepare_structures(structures, atoms, calc)
# second order only
cve1 = evaluate_cs(cs1, structures)
cve2 = evaluate_cs(cs2, structures)
cve2_sf = evaluate_cs(cs2_sf, structures)
```