# -*- coding: utf-8 -*-
""" NeuralGas

Inspired by Self-Organizing Maps (SOMs), Neural Gas (*NG*) is an unsupervised adaptive algorithm introduced by [Martinetz1991]_.
Neural Gas does not assume a preconstructed lattice, so the adaptation cannot be based on the distances between
neighboring neurons (as in SOMs) because, by definition, there are no neighbors.

The adaptation-convergence is driven by a stochastic gradient function with a soft-max adaptation rule that minimizes the
average distortion error.

First, we construct a number of neurons ( :math:`N_{\\vec{w}}` ) with random weights ( :math:`\\vec{w}` ).
Then we train the model by feeding it feature vectors sequentially drawn from the distribution :math:`P(t)`.
When a new feature vector :math:`\\vec{x}` is presented to the model, we sort all neurons' weights (:math:`N_{\\vec{w}}`) based on their
Euclidean distance from :math:`\\vec{x}` and assign each neuron a rank :math:`k` (with :math:`k = 0` for the closest). Then, the adaptation is done by:

.. math::
    \\vec{w} \\leftarrow \\vec{w} + e(t) \\cdot h(k) \\cdot (\\vec{x} - \\vec{w}), \\forall \\vec{w} \\in N_{\\vec{w}}

where,

.. math::
    h(k) = \\exp\\left( \\frac{-k}{\\sigma^2(t)} \\right)

    \\sigma^2(t) = \\lambda_i \\left( \\frac{\\lambda_f}{\\lambda_i} \\right)^{\\frac{t}{T_{max}}}

    e(t) = e_i \\left( \\frac{e_f}{e_i} \\right)^{\\frac{t}{T_{max}}}


The parameter :math:`\\lambda` governs the neighborhood range through :math:`\\sigma^2(t)`, while the parameter :math:`e` governs the
learning rate; both decay from their initial (:math:`i`) to their final (:math:`f`) values over the course of the training.

After the presentation of a feature vector, we increase the iteration counter :math:`t` and repeat
until all desired criteria are met, or :math:`t = T_{max}`.
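
For illustration, a single adaptation step might be sketched in NumPy as follows (hypothetical names: ``protos``
holds the weights :math:`N_{\\vec{w}}`, ``x`` the current sample :math:`\\vec{x}`, and fixed values stand in for
:math:`e(t)` and :math:`\\sigma^2(t)`)::

    import numpy as np

    rng = np.random.RandomState(0)
    protos = rng.rand(10, 2)        # neuron weights, randomly initialized
    x = rng.rand(2)                 # feature vector drawn from P(t)
    e, sigma2 = 0.5, 10.0           # e(t) and sigma^2(t) at the current t

    # Rank the neurons by their distance from x; rank 0 is the closest.
    k = np.argsort(np.argsort(np.linalg.norm(protos - x, axis=1)))
    h = np.exp(-k / sigma2)                    # neighborhood function h(k)
    protos += e * h[:, None] * (x - protos)    # pull every neuron toward x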

With these prototypes, we can represent all the input feature vectors :math:`\\vec{x}` using a Nearest Neighbor rule.
The quality of this encoding can be measured by the normalized distortion error:

.. math::
    \\frac{ \\sum_{t=1}^T \\left \\| X(t) - X^\\ast(t) \\right \\|^2 }{ \\sum_{t=1}^T \\left \\| X(t) - \\overline{X} \\right \\|^2 }

where

.. math::
    \\overline{X} = \\frac{1}{T} \\sum_{t=1}^T{X(t)}

:math:`T` is the number of input patterns; in :math:`X` the input patterns are stored; :math:`X^\\ast` contains
the approximated patterns as produced by the Nearest Neighbor rule.
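
A minimal sketch of this measure (hypothetical arrays: ``data`` stores the input patterns :math:`X` and
``protos`` the reference prototypes)::

    import numpy as np

    rng = np.random.RandomState(0)
    data = rng.rand(100, 2)                 # input patterns X
    protos = rng.rand(10, 2)                # reference prototypes

    # X*: approximate each pattern by its nearest prototype.
    d = np.linalg.norm(data[:, None, :] - protos[None, :, :], axis=2)
    approx = protos[np.argmin(d, axis=1)]

    distortion = (np.sum(np.linalg.norm(data - approx, axis=1) ** 2)
                  / np.sum(np.linalg.norm(data - data.mean(axis=0), axis=1) ** 2))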

Notes
-----
For faster convergence, we can also draw the initial random weights from the given probability distribution :math:`P(t)`.
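
A one-line sketch of such an initialization, sampling the initial weights directly from the data
(hypothetical names)::

    protos = data[rng.choice(len(data), n_protos)]

This mirrors how ``fit`` below seeds the codebook.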

|

-----

.. [Martinetz1991] Martinetz, T., & Schulten, K. (1991). A "neural-gas" network learns topologies. University of Illinois at Urbana-Champaign.
.. [Laskaris2004] Laskaris, N. A., Fotopoulos, S., & Ioannides, A. A. (2004). Mining information from event-related recordings. IEEE Signal Processing Magazine, 21(3), 66-77.

"""
# Author: Avraam Marimpis <avraam.marimpis@gmail.com>

import numpy as np
from sklearn.metrics.pairwise import pairwise_distances
from sklearn.neighbors import NearestNeighbors

from .cluster import BaseCluster


class NeuralGas(BaseCluster):
    """ Neural Gas

    Parameters
    ----------
    n_protos : int
        The number of prototypes.

    iterations : int
        The maximum number of iterations.

    epsilon : list of length 2
        The initial and final neighborhood ranges.

    lrate : list of length 2
        The initial and final learning rates.

    n_jobs : int
        Number of parallel jobs (will be passed to scikit-learn).

    metric : string
        One of the valid options defined for the function
        http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.
        Valid options include:

         - euclidean
         - cityblock
         - l1
         - cosine

    rng : object or None
        An object of type numpy.random.RandomState.


    Attributes
    ----------
    protos : array-like, shape(n_protos, n_features)
        The prototypical vectors.

    distortion : float
        The normalized distortion error.

    Notes
    -----
    Slightly based on http://webloria.loria.fr/~rougier/downloads/ng.py
    """

    def __init__(
        self,
        n_protos=10,
        iterations=1024,
        epsilon=None,
        lrate=None,
        n_jobs=1,
        metric="euclidean",
        rng=None,
    ):
        self.n_protos = n_protos
        self.iterations = iterations

        # Fall back to the default schedules when none are given.
        if epsilon is None:
            epsilon = [10, 0.001]
        self.epsilon_i, self.epsilon_f = epsilon

        if lrate is None:
            lrate = [0.5, 0.005]
        self.lrate_i, self.lrate_f = lrate

        self.n_jobs = n_jobs
        self.protos = None
        self.distortion = 0.0

        if rng is None:
            self.rng = np.random.RandomState()
        else:
            self.rng = rng

        self.metric = metric

        self.__symbols = None
        self.__encoding = None

    def fit(self, data):
        """ Learn data, and construct a vector codebook.

        Parameters
        ----------
        data : real array-like, shape(n_samples, n_features)
            Data matrix, each row represents a sample.

        Returns
        -------
        self : object
            The instance itself.
        """
        [n_samples, _] = data.shape

        # Seed the codebook with randomly drawn input samples.
        self.protos = data[self.rng.choice(n_samples, self.n_protos), :]

        # Denominator of the normalized distortion error.
        avg_p = np.mean(data, 0).reshape(1, -1)
        dist_from_avg_p = np.sum(pairwise_distances(avg_p, data))

        for iteration in range(self.iterations):
            # Present a sample drawn from the input distribution.
            sample = data[self.rng.choice(n_samples, 1), :]

            # Exponentially decay the learning rate and neighborhood range.
            t = iteration / float(self.iterations)
            lrate = self.lrate_i * (self.lrate_f / float(self.lrate_i)) ** t
            epsilon = self.epsilon_i * (self.epsilon_f / float(self.epsilon_i)) ** t

            # Rank the prototypes by their distance from the sample.
            D = pairwise_distances(
                sample, self.protos, metric=self.metric, n_jobs=self.n_jobs
            )
            I = np.argsort(np.argsort(D))

            # Soft-max adaptation rule; closer prototypes move further.
            H = np.exp(-I / epsilon).ravel()
            diff = sample - self.protos

            for proto_id in range(self.n_protos):
                self.protos[proto_id, :] += lrate * H[proto_id] * diff[proto_id, :]

        # Normalized distortion error of the final codebook.
        nbrs = NearestNeighbors(n_neighbors=1, algorithm="auto").fit(self.protos)
        distances, _ = nbrs.kneighbors(data)
        self.distortion = np.sum(distances) / dist_from_avg_p

        return self
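

# A minimal usage sketch (assumed data shapes; not part of the original module):
# quantize 1000 random 2-D samples into 10 prototypes and inspect the
# normalized distortion error. Because of the relative import above, this
# must be run in the package context.
if __name__ == "__main__":
    demo_rng = np.random.RandomState(0)
    demo_data = demo_rng.rand(1000, 2)

    ng = NeuralGas(n_protos=10, iterations=1024, rng=demo_rng).fit(demo_data)

    print(ng.protos.shape)  # (10, 2)
    print(ng.distortion)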