syntropy.knn package

syntropy.knn.differential_entropy(data, k, idxs=None, p=inf, noise_level=0.0)[source]

Computes the differential entropy using the Kozachenko-Leonenko estimator.

\[\hat{H}(X) = -\psi(k)+\psi(N) + (1/N)\sum_{i=1}^{N}\log d_i\]
Parameters:
  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...] | None) – Indices of channels to use. If None, all channels are used. The default is None.

  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.

  • noise_level (float) – The standard deviation of the noise to add to the data. The default is 0.0

Returns:

  • NDArray[np.floating] – The local differential entropy for each sample.

  • float – The expected differential entropy over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Delattre, S., & Fournier, N. (2017). On the Kozachenko–Leonenko entropy estimator. Journal of Statistical Planning and Inference, 185, 69–93. https://doi.org/10.1016/j.jspi.2017.01.004

Kozachenko, L. F., & Leonenko, N. N. (1987). Sample Estimate of the Entropy of a Random Vector. Problems of Information Transmission, 23(2), 9.
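The estimator formula above can be illustrated with a minimal, standard-library-only sketch for one-dimensional data. It is not the syntropy implementation: the names `digamma_int` and `kl_entropy_1d` are hypothetical, the search is brute force rather than tree based, and d_i is assumed to be twice the distance from sample i to its k-th nearest neighbor.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy_1d(samples, k):
    """Kozachenko-Leonenko entropy estimate (in nats) for 1-D samples.

    Assumption: d_i is twice the distance to the k-th nearest neighbor.
    """
    n = len(samples)
    log_sum = 0.0
    for i, x in enumerate(samples):
        # Brute-force distances from sample i to every other sample.
        dists = sorted(abs(x - y) for j, y in enumerate(samples) if j != i)
        log_sum += math.log(2.0 * dists[k - 1])
    return -digamma_int(k) + digamma_int(n) + log_sum / n


random.seed(0)
uniform = [random.random() for _ in range(400)]
h_unit = kl_entropy_1d(uniform, k=3)
# Rescaling the data by a factor a shifts the estimate by log(a).
h_scaled = kl_entropy_1d([4.0 * x for x in uniform], k=3)
```

The true differential entropy of a uniform variable on [0, 1] is 0 nats, so `h_unit` should be close to zero, and `h_scaled - h_unit` should equal log(4) up to floating-point error.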

syntropy.knn.mutual_information(idxs_x, idxs_y, k, data, algorithm=1, p=inf)[source]

A wrapper function for the two KSG mutual information functions.

See also

mutual_information_1

Using the KSG-1 algorithm.

mutual_information_2

Using the KSG-2 algorithm.

Parameters:
  • idxs_x (tuple[int, ...]) – Indices of the x-variable.

  • idxs_y (tuple[int, ...]) – Indices of the y-variable.

  • k (int) – Number of nearest neighbors

  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

  • algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1

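  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.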

Returns:

  • NDArray[np.floating] – The local mutual information for each sample.

  • float – The expected mutual information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

syntropy.knn.conditional_mutual_information(idxs_x, idxs_y, idxs_z, k, data, p=inf)[source]

Computes the conditional mutual information I(X;Y|Z) using the KSG algorithm 1 as described in Frenzel & Pompe (2007).

The conditional mutual information is estimated using:

\[\hat{I}(X;Y|Z) = \psi(k) - \langle \psi(n_{xz}+1) + \psi(n_{yz}+1) - \psi(n_z+1) \rangle\]

where n_{xz}, n_{yz}, and n_z are the counts of neighbors within the epsilon-ball in the respective subspaces.

Parameters:
  • idxs_x (tuple[int, ...]) – Indices of the x-variable.

  • idxs_y (tuple[int, ...]) – Indices of the y-variable.

  • idxs_z (tuple[int, ...]) – Indices of the conditioning variable z.

  • k (int) – Number of nearest neighbors

  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

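  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.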

Returns:

  • NDArray[np.floating] – The local conditional mutual information for each sample.

  • float – The expected conditional mutual information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Frenzel, S., & Pompe, B. (2007). Partial Mutual Information for Coupling Analysis of Multivariate Time Series. Physical Review Letters, 99(20), 204101. https://doi.org/10.1103/PhysRevLett.99.204101

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
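As an illustration of the estimator formula above, here is a brute-force sketch using only the standard library. It assumes scalar x, y, and z, the Chebyshev (max) norm, and strict inequality when counting neighbors inside the epsilon-ball. The function names are hypothetical and the code is not taken from syntropy.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_cmi(x, y, z, k):
    """Average of psi(k) - [psi(n_xz+1) + psi(n_yz+1) - psi(n_z+1)] over samples."""
    n = len(x)
    total = 0.0
    for i in range(n):
        dx = [abs(x[j] - x[i]) for j in range(n)]
        dy = [abs(y[j] - y[i]) for j in range(n)]
        dz = [abs(z[j] - z[i]) for j in range(n)]
        # Max-norm distance to the k-th neighbor in the joint (x, y, z) space.
        joint = sorted(max(dx[j], dy[j], dz[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Neighbor counts strictly inside the epsilon-ball in each subspace.
        n_xz = sum(1 for j in range(n) if j != i and max(dx[j], dz[j]) < eps)
        n_yz = sum(1 for j in range(n) if j != i and max(dy[j], dz[j]) < eps)
        n_z = sum(1 for j in range(n) if j != i and dz[j] < eps)
        total += digamma_int(n_xz + 1) + digamma_int(n_yz + 1) - digamma_int(n_z + 1)
    return digamma_int(k) - total / n


random.seed(2)
n_samples = 300
z = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
# Common driver: x and y depend on each other only through z.
x_cd = [v + 0.5 * random.gauss(0.0, 1.0) for v in z]
y_cd = [v + 0.5 * random.gauss(0.0, 1.0) for v in z]
cmi_common_driver = knn_cmi(x_cd, y_cd, z, k=4)
# Direct coupling: y depends on x, and z is unrelated noise.
x_dc = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
y_dc = [v + 0.3 * random.gauss(0.0, 1.0) for v in x_dc]
cmi_direct = knn_cmi(x_dc, y_dc, z, k=4)
```

Conditioning on the common driver should leave an estimate near zero, while the directly coupled pair should keep a clearly positive value.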

syntropy.knn.total_correlation(data, k, idxs=None, algorithm=1)[source]

A wrapper function for the two TC functions.

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

  • algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

See also

total_correlation_1

The TC computed using algorithm 1.

total_correlation_2

The TC computed using algorithm 2.

Returns:

  • NDArray[np.floating] – The local total correlation for each sample.

  • float – The expected total correlation over all samples

syntropy.knn.dual_total_correlation(data, k, idxs=None)[source]

Compute dual total correlation using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/DualTotalCorrelationCalculatorKraskov.java

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local dual total correlation for each sample.

  • float – The expected dual total correlation over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Abdallah, S. A., & Plumbley, M. D. (2012). A measure of statistical complexity based on predictive information with application to finite spin systems. Physics Letters A, 376(4), 275–281. https://doi.org/10.1016/j.physleta.2011.10.066

Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), 032305. https://doi.org/10.1103/PhysRevE.100.032305

syntropy.knn.s_information(data, k, idxs=None)[source]

Compute S-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/SInfoCalculatorKraskov.java

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local S-information for each sample.

  • float – The expected S-information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), 032305. https://doi.org/10.1103/PhysRevE.100.032305

Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w

syntropy.knn.o_information(data, k, idxs=None)[source]

Compute O-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/OInfoCalculatorKraskov.java

O-information quantifies the balance between redundancy (positive values) and synergy (negative values) in multivariate information.

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local O-information for each sample.

  • float – The expected O-information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), 032305. https://doi.org/10.1103/PhysRevE.100.032305

Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w
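To make the sign convention concrete, the sketch below evaluates the four multivariate measures in closed form for zero-mean Gaussian variables, using the definitions from Rosas et al. (2019): O-information is total correlation minus dual total correlation, and S-information is their sum. This is an analytic illustration rather than the KSG estimator, and the helpers `det`, `drop`, and `gaussian_measures` are hypothetical names.

```python
import math


def det(m):
    """Determinant by Laplace expansion (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for c in range(len(m)):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total


def drop(cov, i):
    """Covariance matrix with variable i removed."""
    return [
        [v for b, v in enumerate(row) if b != i]
        for a, row in enumerate(cov)
        if a != i
    ]


def gaussian_measures(cov):
    """TC, DTC, O-information and S-information (nats) of a Gaussian."""
    n = len(cov)
    log_det = math.log(det(cov))
    # TC = sum_i H(X_i) - H(X)
    tc = 0.5 * (sum(math.log(cov[i][i]) for i in range(n)) - log_det)
    # DTC = sum_i H(X_{-i}) - (n - 1) H(X)
    dtc = 0.5 * (
        sum(math.log(det(drop(cov, i))) for i in range(n)) - (n - 1) * log_det
    )
    return {"tc": tc, "dtc": dtc, "o": tc - dtc, "s": tc + dtc}


# Three equally correlated variables: redundancy-dominated.
redundant = gaussian_measures([[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]])
# X3 = X1 + X2 + noise (variance 0.1), X1 and X2 independent: synergy-dominated.
synergistic = gaussian_measures([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.1]])
```

The equicorrelated system gives a small positive O-information (about 0.085 nats), while the system in which the third variable is a noisy sum of two independent ones gives a negative value.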

Submodules

syntropy.knn.multivariate_mi module

syntropy.knn.multivariate_mi.total_correlation(data, k, idxs=None, algorithm=1)[source]

A wrapper function for the two TC functions.

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

  • algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

See also

total_correlation_1

The TC computed using algorithm 1.

total_correlation_2

The TC computed using algorithm 2.

Returns:

  • NDArray[np.floating] – The local total correlation for each sample.

  • float – The expected total correlation over all samples

syntropy.knn.multivariate_mi.total_correlation_1(data, k, idxs=None)[source]

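Computes the Kraskov, Stögbauer, and Grassberger (KSG) estimate of the total correlation using the first algorithm presented in Kraskov et al. (2004).

\[\hat{TC}(X) = \psi(k) + (m-1)\psi(N) - \langle \psi(n_{x_{1}}+1) + \ldots + \psi(n_{x_{m}}+1)\rangle\]

where m is the number of variables, N is the number of samples, and n_{x_{j}} is the count of neighbors within the epsilon-ball in the j-th marginal subspace.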
Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local total correlation for each sample.

  • float – The expected total correlation over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138

Watanabe, S. (1960). Information Theoretical Analysis of Multivariate Correlation. IBM Journal of Research and Development, 4(1), 66–82. https://doi.org/10.1147/rd.41.0066
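A brute-force sketch of the first KSG algorithm for m variables may help show how the neighbor counts enter the estimate. It uses the form ψ(k) + (m-1)ψ(N) minus the average of ψ(n_{x_j}+1) summed over the variables, as given in Kraskov et al. (2004), with the max norm and strict neighbor counting. The function names are hypothetical, and only the averaged value is returned, not the per-sample local values.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg1_total_correlation(data, k):
    """data is a list of m rows (variables), each holding n samples."""
    m, n = len(data), len(data[0])
    total = 0.0
    for i in range(n):
        # Per-variable absolute distances from sample i to every sample.
        d = [[abs(row[j] - row[i]) for j in range(n)] for row in data]
        # Max-norm distance to the k-th neighbor in the joint space.
        joint = sorted(max(d[v][j] for v in range(m)) for j in range(n) if j != i)
        eps = joint[k - 1]
        for v in range(m):
            n_v = sum(1 for j in range(n) if j != i and d[v][j] < eps)
            total += digamma_int(n_v + 1)
    return digamma_int(k) + (m - 1) * digamma_int(n) - total / n


random.seed(3)
n_samples = 300
independent = [[random.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(3)]
driver = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
# Three noisy copies of one driver share a large amount of information.
coupled = [[z + 0.3 * random.gauss(0.0, 1.0) for z in driver] for _ in range(3)]
tc_independent = ksg1_total_correlation(independent, k=4)
tc_coupled = ksg1_total_correlation(coupled, k=4)
```

The input follows the (n_variables, n_samples) layout used throughout this module. Independent variables should give an estimate near zero, and the three noisy copies a clearly positive one.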

syntropy.knn.multivariate_mi.total_correlation_2(data, k, idxs=None)[source]

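Computes the Kraskov, Stögbauer, and Grassberger (KSG) estimate of the total correlation using the second algorithm presented in Kraskov et al. (2004).

\[\hat{TC}(X) = \psi(k) - \frac{m-1}{k} + (m-1)\psi(N) - \langle \psi(n_{x_{1}}) + \ldots + \psi(n_{x_{m}}) \rangle\]

where m is the number of variables, N is the number of samples, and n_{x_{j}} is the count of neighbors in the j-th marginal subspace.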
Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local total correlation for each sample.

  • float – The expected total correlation over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138

Watanabe, S. (1960). Information Theoretical Analysis of Multivariate Correlation. IBM Journal of Research and Development, 4(1), 66–82. https://doi.org/10.1147/rd.41.0066

syntropy.knn.multivariate_mi.dual_total_correlation(data, k, idxs=None)[source]

Compute dual total correlation using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/DualTotalCorrelationCalculatorKraskov.java

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local dual total correlation for each sample.

  • float – The expected dual total correlation over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Abdallah, S. A., & Plumbley, M. D. (2012). A measure of statistical complexity based on predictive information with application to finite spin systems. Physics Letters A, 376(4), 275–281. https://doi.org/10.1016/j.physleta.2011.10.066

Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), 032305. https://doi.org/10.1103/PhysRevE.100.032305

syntropy.knn.multivariate_mi.s_information(data, k, idxs=None)[source]

Compute S-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/SInfoCalculatorKraskov.java

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local S-information for each sample.

  • float – The expected S-information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), 032305. https://doi.org/10.1103/PhysRevE.100.032305

Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w

syntropy.knn.multivariate_mi.o_information(data, k, idxs=None)[source]

Compute O-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/OInfoCalculatorKraskov.java

O-information quantifies the balance between redundancy (positive values) and synergy (negative values) in multivariate information.

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)

Returns:

  • NDArray[np.floating] – The local O-information for each sample.

  • float – The expected O-information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), 032305. https://doi.org/10.1103/PhysRevE.100.032305

Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w

syntropy.knn.shannon module

syntropy.knn.shannon.differential_entropy(data, k, idxs=None, p=inf, noise_level=0.0)[source]

Computes the differential entropy using the Kozachenko-Leonenko estimator.

\[\hat{H}(X) = -\psi(k)+\psi(N) + (1/N)\sum_{i=1}^{N}\log d_i\]
Parameters:
  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

  • idxs (tuple[int, ...] | None) – Indices of channels to use. If None, all channels are used. The default is None.

  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.

  • noise_level (float) – The standard deviation of the noise to add to the data. The default is 0.0

Returns:

  • NDArray[np.floating] – The local differential entropy for each sample.

  • float – The expected differential entropy over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Delattre, S., & Fournier, N. (2017). On the Kozachenko–Leonenko entropy estimator. Journal of Statistical Planning and Inference, 185, 69–93. https://doi.org/10.1016/j.jspi.2017.01.004

Kozachenko, L. F., & Leonenko, N. N. (1987). Sample Estimate of the Entropy of a Random Vector. Problems of Information Transmission, 23(2), 9.

syntropy.knn.shannon.mutual_information(idxs_x, idxs_y, k, data, algorithm=1, p=inf)[source]

A wrapper function for the two KSG mutual information functions.

See also

mutual_information_1

Using the KSG-1 algorithm.

mutual_information_2

Using the KSG-2 algorithm.

Parameters:
  • idxs_x (tuple[int, ...]) – Indices of the x-variable.

  • idxs_y (tuple[int, ...]) – Indices of the y-variable.

  • k (int) – Number of nearest neighbors

  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

  • algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1

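  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.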

Returns:

  • NDArray[np.floating] – The local mutual information for each sample.

  • float – The expected mutual information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

syntropy.knn.shannon.mutual_information_1(idxs_x, idxs_y, k, data, p=inf)[source]

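Computes the Kraskov, Stögbauer, and Grassberger (KSG) estimate of the bivariate mutual information using the first algorithm presented in Kraskov et al. (2004).

\[\hat{I}(X;Y) = \psi(k) + \psi(N) - \langle \psi(n_x+1) + \psi(n_y+1)\rangle\]

where n_x and n_y are the counts of neighbors within the epsilon-ball in the x and y subspaces, and N is the number of samples.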
Parameters:
  • idxs_x (tuple[int, ...]) – Indices of the x-variable.

  • idxs_y (tuple[int, ...]) – Indices of the y-variable.

  • k (int) – Number of nearest neighbors

  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

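  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.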

Returns:

  • NDArray[np.floating] – The local mutual information for each sample.

  • float – The expected mutual information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
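The first KSG algorithm can be sketched with brute-force neighbor searches and the standard library alone. The sketch assumes scalar x and y, the max norm, and strict inequality when counting marginal neighbors, and it uses the form ψ(k) + ψ(N) minus the average of ψ(n_x+1) + ψ(n_y+1) from Kraskov et al. (2004). The function names are hypothetical and this is not the syntropy code.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg1_mutual_information(x, y, k):
    """KSG algorithm 1 estimate (nats), averaged over all samples."""
    n = len(x)
    total = 0.0
    for i in range(n):
        dx = [abs(x[j] - x[i]) for j in range(n)]
        dy = [abs(y[j] - y[i]) for j in range(n)]
        # Max-norm distance to the k-th neighbor in the joint (x, y) space.
        joint = sorted(max(dx[j], dy[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Marginal neighbors strictly closer than eps.
        n_x = sum(1 for j in range(n) if j != i and dx[j] < eps)
        n_y = sum(1 for j in range(n) if j != i and dy[j] < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(300)]
ys_independent = [random.gauss(0.0, 1.0) for _ in range(300)]
ys_coupled = [v + 0.1 * random.gauss(0.0, 1.0) for v in xs]
mi_independent = ksg1_mutual_information(xs, ys_independent, k=4)
mi_coupled = ksg1_mutual_information(xs, ys_coupled, k=4)
```

For the coupled pair the analytic Gaussian value is 0.5 log(101), roughly 2.3 nats, so the estimate should be well above zero, while the independent pair should stay near zero.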

syntropy.knn.shannon.mutual_information_2(idxs_x, idxs_y, k, data, p=inf)[source]

Computes the Kraskov, Stögbauer, and Grassberger (KSG) estimate of the bivariate mutual information using the second algorithm presented in Kraskov et al. (2004).

\[\hat{I}(X;Y) = \psi(k) - \frac{1}{k} + \psi(N) - \langle \psi(n_x) + \psi(n_y) \rangle\]

where n_x and n_y are the counts of neighbors in the x and y subspaces, and N is the number of samples.
Parameters:
  • idxs_x (tuple[int, ...]) – Indices of the x-variable.

  • idxs_y (tuple[int, ...]) – Indices of the y-variable.

  • k (int) – Number of nearest neighbors

  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

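  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.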

Returns:

  • NDArray[np.floating] – The local mutual information for each sample.

  • float – The expected mutual information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
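The second KSG algorithm differs from the first in how the marginal search radii are chosen: each marginal uses the largest extent of the k joint neighbors along that axis, and neighbors on the boundary are included. The sketch below follows that description from Kraskov et al. (2004) with brute-force searches. It assumes scalar x and y and the max norm, and its function names are hypothetical.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg2_mutual_information(x, y, k):
    """KSG algorithm 2 estimate (nats), averaged over all samples."""
    n = len(x)
    total = 0.0
    for i in range(n):
        dx = [abs(x[j] - x[i]) for j in range(n)]
        dy = [abs(y[j] - y[i]) for j in range(n)]
        # The k nearest neighbors of sample i under the joint max norm.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: max(dx[j], dy[j]),
        )
        neighbours = order[:k]
        # Per-axis extents of the rectangle spanned by those neighbors.
        eps_x = max(dx[j] for j in neighbours)
        eps_y = max(dy[j] for j in neighbours)
        # Marginal counts include points on the boundary (<=).
        n_x = sum(1 for j in range(n) if j != i and dx[j] <= eps_x)
        n_y = sum(1 for j in range(n) if j != i and dy[j] <= eps_y)
        total += digamma_int(n_x) + digamma_int(n_y)
    return digamma_int(k) - 1.0 / k + digamma_int(n) - total / n


random.seed(4)
xs = [random.gauss(0.0, 1.0) for _ in range(300)]
ys_independent = [random.gauss(0.0, 1.0) for _ in range(300)]
ys_coupled = [v + 0.1 * random.gauss(0.0, 1.0) for v in xs]
mi2_independent = ksg2_mutual_information(xs, ys_independent, k=4)
mi2_coupled = ksg2_mutual_information(xs, ys_coupled, k=4)
```

Because all k joint neighbors lie inside both marginal ranges, n_x and n_y are at least k, so the digamma arguments are always positive.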

syntropy.knn.shannon.conditional_mutual_information(idxs_x, idxs_y, idxs_z, k, data, p=inf)[source]

Computes the conditional mutual information I(X;Y|Z) using the KSG algorithm 1 as described in Frenzel & Pompe (2007).

The conditional mutual information is estimated using:

\[\hat{I}(X;Y|Z) = \psi(k) - \langle \psi(n_{xz}+1) + \psi(n_{yz}+1) - \psi(n_z+1) \rangle\]

where n_{xz}, n_{yz}, and n_z are the counts of neighbors within the epsilon-ball in the respective subspaces.

Parameters:
  • idxs_x (tuple[int, ...]) – Indices of the x-variable.

  • idxs_y (tuple[int, ...]) – Indices of the y-variable.

  • idxs_z (tuple[int, ...]) – Indices of the conditioning variable z.

  • k (int) – Number of nearest neighbors

  • data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)

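  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.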

Returns:

  • NDArray[np.floating] – The local conditional mutual information for each sample.

  • float – The expected conditional mutual information over all samples

Return type:

tuple[ndarray[tuple[Any, …], dtype[floating]], float]

References

Frenzel, S., & Pompe, B. (2007). Partial Mutual Information for Coupling Analysis of Multivariate Time Series. Physical Review Letters, 99(20), 204101. https://doi.org/10.1103/PhysRevLett.99.204101

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138

syntropy.knn.utils module

syntropy.knn.utils.build_tree_and_get_distances(data, k, p=inf)[source]

Builds the KNN tree and returns the indices of, and distances to, the k-nearest neighbors of each point.

Parameters:
  • data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)

  • k (int) – Number of nearest neighbors

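  • p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.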

Returns:

  • cKDTree – The KNN tree constructed from data.

  • NDArray[np.floating] – The indices of each of the k-nearest neighbors.

  • NDArray[np.integer] – The distances to each of the k-nearest neighbors (using the max norm).

Return type:

tuple[cKDTree, ndarray[tuple[Any, …], dtype[floating]], ndarray[tuple[Any, …], dtype[integer]]]

syntropy.knn.utils.get_counts_from_tree(tree, data, eps, strict=True, p=inf)[source]
Parameters:
  • tree (cKDTree)

  • data (ndarray[tuple[Any, ...], dtype[floating]])

  • eps (ndarray[tuple[Any, ...], dtype[floating]])

  • strict (bool)

  • p (float)

Return type:

ndarray[tuple[Any, …], dtype[integer]]