syntropy.knn package
- syntropy.knn.differential_entropy(data, k, idxs=None, p=inf, noise_level=0.0)
Computes the differential entropy using the Kozachenko-Leonenko estimator.
\[\hat{H}(X) = -\psi(k)+\psi(N) + (1/N)\sum_{i=1}^{N}\log d_i\]
- Parameters:
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...] | None) – Indices of channels to use. If None, all channels are used. The default is None.
p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.
noise_level (float) – The standard deviation of the noise to add to the data. The default is 0.0
- Returns:
NDArray[np.floating] – The local differential entropy for each sample.
float – The expected differential entropy over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Delattre, S., & Fournier, N. (2017). On the Kozachenko–Leonenko entropy estimator. Journal of Statistical Planning and Inference, 185, 69–93. https://doi.org/10.1016/j.jspi.2017.01.004
Kozachenko, L. F., & Leonenko, N. N. (1987). Sample Estimate of the Entropy of a Random Vector. Problems of Information Transmission, 23(2), 9.
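A minimal sketch of a Kozachenko-Leonenko style estimate may help make the roles of k, N and the neighbor distances concrete. It is a brute-force, standard-library-only illustration and not syntropy's implementation. The names `digamma_int` and `kl_entropy` are hypothetical, points are passed as a list of tuples rather than an (n_variables, n_samples) array, and the distance term uses the common max-norm form (d/N) Σ log(2 ε_i), which may differ from how d_i is defined in the library.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy(points, k):
    """Return (local entropies, mean entropy) in nats, using the max norm."""
    n = len(points)
    dim = len(points[0])
    const = digamma_int(n) - digamma_int(k)
    local = []
    for i, x in enumerate(points):
        # Chebyshev distance from point i to every other point, sorted.
        dists = sorted(
            max(abs(a - b) for a, b in zip(x, y))
            for j, y in enumerate(points)
            if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Note: duplicate points give eps == 0, and log(0) raises an error.
        local.append(const + dim * math.log(2.0 * eps))
    return local, sum(local) / n


rng = random.Random(0)
gauss_points = [(rng.gauss(0.0, 1.0),) for _ in range(500)]
local_h, mean_h = kl_entropy(gauss_points, k=4)
true_h = 0.5 * math.log(2.0 * math.pi * math.e)  # entropy of N(0, 1)
print(f"estimate={mean_h:.3f}  analytic={true_h:.3f}")
```

The estimate should land close to the analytic Gaussian entropy. The sketch fails on exact duplicate samples, which is one plausible reason for a jitter option such as `noise_level`.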
- syntropy.knn.mutual_information(idxs_x, idxs_y, k, data, algorithm=1, p=inf)
A wrapper function for the two KSG mutual information functions.
See also
mutual_information_1 – Using the KSG-1 algorithm.
mutual_information_2 – Using the KSG-2 algorithm.
- Parameters:
idxs_x (tuple[int, ...]) – Indices of the x-variable.
idxs_y (tuple[int, ...]) – Indices of the y-variable.
k (int) – Number of nearest neighbors
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1
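p (float) – The order of the norm to use in the nearest neighbor lookup. The default is np.inf (Chebyshev norm).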
- Returns:
NDArray[np.floating] – The local mutual information for each sample.
float – The expected mutual information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
- syntropy.knn.conditional_mutual_information(idxs_x, idxs_y, idxs_z, k, data, p=inf)
Computes the conditional mutual information I(X;Y|Z) using the KSG algorithm 1 as described in Frenzel & Pompe (2007).
The conditional mutual information is estimated using:
\[\hat{I}(X;Y|Z) = \psi(k) - \langle \psi(n_{xz}+1) + \psi(n_{yz}+1) - \psi(n_z+1) \rangle\]
where n_{xz}, n_{yz}, and n_z are the counts of neighbors within the epsilon-ball in the respective subspaces.
- Parameters:
idxs_x (tuple[int, ...]) – Indices of the x-variable.
idxs_y (tuple[int, ...]) – Indices of the y-variable.
idxs_z (tuple[int, ...]) – Indices of the conditioning variable z.
k (int) – Number of nearest neighbors
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
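p (float) – The order of the norm to use in the nearest neighbor lookup. The default is np.inf (Chebyshev norm).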
- Returns:
NDArray[np.floating] – The local conditional mutual information for each sample.
float – The expected conditional mutual information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Frenzel, S., & Pompe, B. (2007). Partial Mutual Information for Coupling Analysis of Multivariate Time Series. Physical Review Letters, 99(20), 204101. https://doi.org/10.1103/PhysRevLett.99.204101
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
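The neighbor-counting formula ψ(k) − ⟨ψ(n_xz+1) + ψ(n_yz+1) − ψ(n_z+1)⟩ can be sketched by brute force as follows. This is an illustration for scalar X, Y and Z with the max norm, written with the standard library only. It is not the library's code, and `digamma_int` and `ksg_conditional_mi` are hypothetical names.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg_conditional_mi(xs, ys, zs, k):
    """Return (local CMI values, mean CMI) in nats for scalar samples."""
    n = len(xs)
    psi_k = digamma_int(k)
    local = []
    for i in range(n):
        dx = [abs(v - xs[i]) for v in xs]
        dy = [abs(v - ys[i]) for v in ys]
        dz = [abs(v - zs[i]) for v in zs]
        # Distance to the k-th neighbor in the joint (x, y, z) space.
        joint = sorted(max(dx[j], dy[j], dz[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Counts of strictly closer points in each subspace, excluding i.
        others = [j for j in range(n) if j != i]
        n_xz = sum(1 for j in others if max(dx[j], dz[j]) < eps)
        n_yz = sum(1 for j in others if max(dy[j], dz[j]) < eps)
        n_z = sum(1 for j in others if dz[j] < eps)
        local.append(
            psi_k
            - digamma_int(n_xz + 1)
            - digamma_int(n_yz + 1)
            + digamma_int(n_z + 1)
        )
    return local, sum(local) / n


rng = random.Random(1)
zs = [rng.gauss(0.0, 1.0) for _ in range(400)]
xs = [z + rng.gauss(0.0, 1.0) for z in zs]
ys_ci = [z + rng.gauss(0.0, 1.0) for z in zs]         # X and Y share only Z
ys_dep = [x + 0.5 * rng.gauss(0.0, 1.0) for x in xs]  # Y depends on X directly

_, cmi_ci = ksg_conditional_mi(xs, ys_ci, zs, k=4)
_, cmi_dep = ksg_conditional_mi(xs, ys_dep, zs, k=4)
print(f"common driver only: {cmi_ci:.3f}   direct coupling: {cmi_dep:.3f}")
```

In the first case X and Y are conditionally independent given Z, so the estimate should sit near zero. In the second case the direct coupling should produce a clearly positive value.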
- syntropy.knn.total_correlation(data, k, idxs=None, algorithm=1)
A wrapper function for the two TC functions.
See also
total_correlation_1 – The TC computed using algorithm 1.
total_correlation_2 – The TC computed using algorithm 2.
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...] | None) – Indices of variables to use (-1 means all)
algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1
- Returns:
NDArray[np.floating] – The local total correlation for each sample.
float – The expected total correlation over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
- syntropy.knn.dual_total_correlation(data, k, idxs=None)
Compute dual total correlation using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/DualTotalCorrelationCalculatorKraskov.java
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local dual total correlation for each sample.
float – The expected dual total correlation over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Abdallah, S. A., & Plumbley, M. D. (2012). A measure of statistical complexity based on predictive information with application to finite spin systems. Physics Letters A, 376(4), 275–281. https://doi.org/10.1016/j.physleta.2011.10.066
Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), Article 3. https://doi.org/10.1103/PhysRevE.100.032305
- syntropy.knn.s_information(data, k, idxs=None)
Compute S-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/SInfoCalculatorKraskov.java
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local S-information for each sample.
float – The expected S-information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), Article 3. https://doi.org/10.1103/PhysRevE.100.032305
Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w
- syntropy.knn.o_information(data, k, idxs=None)
Compute O-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/OInfoCalculatorKraskov.java
O-information quantifies the balance between redundancy (positive values) and synergy (negative values) in multivariate information.
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local O-information for each sample.
float – The expected O-information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), Article 3. https://doi.org/10.1103/PhysRevE.100.032305
Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w
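For orientation, the multivariate quantities in this package are related as follows. These are the standard definitions from Rosas et al. (2019) for a system X = (X_1, …, X_m), where X_{-i} denotes all variables except X_i. They summarise what is being estimated, not how the KSG-based estimators compute it internally.

\[TC(X) = \sum_{i=1}^{m} H(X_i) - H(X)\]
\[DTC(X) = H(X) - \sum_{i=1}^{m} H(X_i \mid X_{-i})\]
\[\Sigma(X) = TC(X) + DTC(X), \qquad \Omega(X) = TC(X) - DTC(X)\]

Under these definitions the O-information Ω is positive when the total correlation exceeds the dual total correlation (redundancy-dominated) and negative when the dual total correlation is larger (synergy-dominated), while the S-information Σ sums the two.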
Submodules
syntropy.knn.multivariate_mi module
- syntropy.knn.multivariate_mi.total_correlation(data, k, idxs=None, algorithm=1)
A wrapper function for the two TC functions.
See also
total_correlation_1 – The TC computed using algorithm 1.
total_correlation_2 – The TC computed using algorithm 2.
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...] | None) – Indices of variables to use (-1 means all)
algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1
- Returns:
NDArray[np.floating] – The local total correlation for each sample.
float – The expected total correlation over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
- syntropy.knn.multivariate_mi.total_correlation_1(data, k, idxs=None)
Computes the Kraskov, Stögbauer, Grassberger estimate of the total correlation using the first algorithm presented in Kraskov et al. (2004).
\[\hat{TC}(X) = \psi(k) + (m-1)\psi(N) - \langle \psi(n_{x_{1}}+1) + \ldots + \psi(n_{x_{m}}+1)\rangle\]
where m is the number of variables and N is the number of samples.
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local total correlation for each sample.
float – The expected total correlation over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
Watanabe, S. (1960). Information Theoretical Analysis of Multivariate Correlation. IBM Journal of Research and Development, 4(1), Article 1. https://doi.org/10.1147/rd.41.0066
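A brute-force sketch of the first KSG algorithm for total correlation, in the form given by Kraskov et al. (2004), ψ(k) + (m−1)ψ(N) − ⟨Σ_j ψ(n_{x_j}+1)⟩, is shown below for m scalar variables. It uses only the standard library and the max norm. The function names are hypothetical and the code is illustrative rather than a copy of the library's implementation.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg1_total_correlation(variables, k):
    """variables: m sequences of N scalar samples. Returns (local, mean) in nats."""
    m = len(variables)
    n = len(variables[0])
    psi_k, psi_n = digamma_int(k), digamma_int(n)
    local = []
    for i in range(n):
        # Absolute difference to sample i, separately for each variable.
        diffs = [[abs(v[j] - v[i]) for j in range(n)] for v in variables]
        # k-th smallest max-norm distance in the joint space, excluding i.
        joint = sorted(max(d[j] for d in diffs) for j in range(n) if j != i)
        eps = joint[k - 1]
        marginal_term = 0.0
        for d in diffs:
            # Strictly closer points along this one variable, excluding i.
            n_var = sum(1 for j in range(n) if j != i and d[j] < eps)
            marginal_term += digamma_int(n_var + 1)
        local.append(psi_k + (m - 1) * psi_n - marginal_term)
    return local, sum(local) / n


rng = random.Random(2)
n_samples = 400
common = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
coupled = [[c + 0.5 * rng.gauss(0.0, 1.0) for c in common] for _ in range(3)]
independent = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(3)]

_, tc_coupled = ksg1_total_correlation(coupled, k=4)
_, tc_independent = ksg1_total_correlation(independent, k=4)
print(f"coupled: {tc_coupled:.3f}   independent: {tc_independent:.3f}")
```

Three variables driven by a shared Gaussian source should give a clearly positive estimate, while independent variables should give a value near zero.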
- syntropy.knn.multivariate_mi.total_correlation_2(data, k, idxs=None)
Computes the Kraskov, Stögbauer, Grassberger estimate of the total correlation using the second algorithm presented in Kraskov et al. (2004).
\[\hat{TC}(X) = \psi(k) - \frac{m-1}{k} + (m-1)\psi(N) - \langle \psi(n_{x_{1}}) + \ldots + \psi(n_{x_{m}}) \rangle\]
where m is the number of variables and N is the number of samples.
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local total correlation for each sample.
float – The expected total correlation over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
Watanabe, S. (1960). Information Theoretical Analysis of Multivariate Correlation. IBM Journal of Research and Development, 4(1), Article 1. https://doi.org/10.1147/rd.41.0066
- syntropy.knn.multivariate_mi.dual_total_correlation(data, k, idxs=None)
Compute dual total correlation using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/DualTotalCorrelationCalculatorKraskov.java
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local dual total correlation for each sample.
float – The expected dual total correlation over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Abdallah, S. A., & Plumbley, M. D. (2012). A measure of statistical complexity based on predictive information with application to finite spin systems. Physics Letters A, 376(4), 275–281. https://doi.org/10.1016/j.physleta.2011.10.066
Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), Article 3. https://doi.org/10.1103/PhysRevE.100.032305
- syntropy.knn.multivariate_mi.s_information(data, k, idxs=None)
Compute S-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/SInfoCalculatorKraskov.java
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local S-information for each sample.
float – The expected S-information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), Article 3. https://doi.org/10.1103/PhysRevE.100.032305
Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w
- syntropy.knn.multivariate_mi.o_information(data, k, idxs=None)
Compute O-information using KSG estimation. Code adapted from JIDT https://github.com/jlizier/jidt/blob/master/java/source/infodynamics/measures/continuous/kraskov/OInfoCalculatorKraskov.java
O-information quantifies the balance between redundancy (positive values) and synergy (negative values) in multivariate information.
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...]) – Indices of variables to use (-1 means all)
- Returns:
NDArray[np.floating] – The local O-information for each sample.
float – The expected O-information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Rosas, F., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information. Physical Review E, 100(3), Article 3. https://doi.org/10.1103/PhysRevE.100.032305
Varley, T. F., Pope, M., Faskowitz, J., & Sporns, O. (2023). Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04843-w
syntropy.knn.shannon module
- syntropy.knn.shannon.differential_entropy(data, k, idxs=None, p=inf, noise_level=0.0)
Computes the differential entropy using the Kozachenko-Leonenko estimator.
\[\hat{H}(X) = -\psi(k)+\psi(N) + (1/N)\sum_{i=1}^{N}\log d_i\]
- Parameters:
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
idxs (tuple[int, ...] | None) – Indices of channels to use. If None, all channels are used. The default is None.
p (float) – The order of the norm to use in the nearest neighbor lookup. If p = 2, the norm is Euclidean. If p = np.inf, the norm is Chebyshev. The default is np.inf.
noise_level (float) – The standard deviation of the noise to add to the data. The default is 0.0
- Returns:
NDArray[np.floating] – The local differential entropy for each sample.
float – The expected differential entropy over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Delattre, S., & Fournier, N. (2017). On the Kozachenko–Leonenko entropy estimator. Journal of Statistical Planning and Inference, 185, 69–93. https://doi.org/10.1016/j.jspi.2017.01.004
Kozachenko, L. F., & Leonenko, N. N. (1987). Sample Estimate of the Entropy of a Random Vector. Problems of Information Transmission, 23(2), 9.
- syntropy.knn.shannon.mutual_information(idxs_x, idxs_y, k, data, algorithm=1, p=inf)
A wrapper function for the two KSG mutual information functions.
See also
mutual_information_1 – Using the KSG-1 algorithm.
mutual_information_2 – Using the KSG-2 algorithm.
- Parameters:
idxs_x (tuple[int, ...]) – Indices of the x-variable.
idxs_y (tuple[int, ...]) – Indices of the y-variable.
k (int) – Number of nearest neighbors
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
algorithm (int) – Whether to use algorithm 1 or 2. Defaults to 1
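p (float) – The order of the norm to use in the nearest neighbor lookup. The default is np.inf (Chebyshev norm).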
- Returns:
NDArray[np.floating] – The local mutual information for each sample.
float – The expected mutual information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
- syntropy.knn.shannon.mutual_information_1(idxs_x, idxs_y, k, data, p=inf)
Computes the Kraskov, Stögbauer, Grassberger estimate of the bivariate mutual information using the first algorithm presented in Kraskov et al. (2004).
\[\hat{I}(X;Y) = \psi(k) + \psi(N) - \langle \psi(n_x+1) + \psi(n_y+1)\rangle\]
- Parameters:
idxs_x (tuple[int, ...]) – Indices of the x-variable.
idxs_y (tuple[int, ...]) – Indices of the y-variable.
k (int) – Number of nearest neighbors
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
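p (float) – The order of the norm to use in the nearest neighbor lookup. The default is np.inf (Chebyshev norm).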
- Returns:
NDArray[np.floating] – The local mutual information for each sample.
float – The expected mutual information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
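The first KSG algorithm, ψ(k) + ψ(N) − ⟨ψ(n_x+1) + ψ(n_y+1)⟩ in the form given by Kraskov et al. (2004), can be sketched by brute force for scalar X and Y. This is a standard-library-only illustration under the max norm, with hypothetical function names. It takes two plain sequences instead of index tuples into a data array, so it does not mirror the library's call signature.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg1_mutual_information(xs, ys, k):
    """Return (local MI values, mean MI) in nats for scalar samples."""
    n = len(xs)
    psi_k, psi_n = digamma_int(k), digamma_int(n)
    local = []
    for i in range(n):
        dx = [abs(x - xs[i]) for x in xs]
        dy = [abs(y - ys[i]) for y in ys]
        # k-th smallest max-norm distance in the joint space, excluding i.
        joint = sorted(max(dx[j], dy[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Strictly closer points in each marginal, excluding i itself.
        n_x = sum(1 for j in range(n) if j != i and dx[j] < eps)
        n_y = sum(1 for j in range(n) if j != i and dy[j] < eps)
        local.append(psi_k + psi_n - digamma_int(n_x + 1) - digamma_int(n_y + 1))
    return local, sum(local) / n


rho = 0.8
rng = random.Random(3)
xs = [rng.gauss(0.0, 1.0) for _ in range(400)]
ys = [rho * x + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0) for x in xs]

local_mi, mean_mi = ksg1_mutual_information(xs, ys, k=4)
true_mi = -0.5 * math.log(1.0 - rho**2)  # analytic MI of a bivariate Gaussian
print(f"estimate={mean_mi:.3f}  analytic={true_mi:.3f}")
```

For a bivariate Gaussian with correlation 0.8 the analytic value is about 0.51 nats, which gives a rough check on the estimate.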
- syntropy.knn.shannon.mutual_information_2(idxs_x, idxs_y, k, data, p=inf)
Computes the Kraskov, Stögbauer, Grassberger estimate of the bivariate mutual information using the second algorithm presented in Kraskov et al. (2004).
\[\hat{I}(X;Y) = \psi(k) - \frac{1}{k} + \psi(N) - \langle \psi(n_x) + \psi(n_y) \rangle\]
- Parameters:
idxs_x (tuple[int, ...]) – Indices of the x-variable.
idxs_y (tuple[int, ...]) – Indices of the y-variable.
k (int) – Number of nearest neighbors
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
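p (float) – The order of the norm to use in the nearest neighbor lookup. The default is np.inf (Chebyshev norm).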
- Returns:
NDArray[np.floating] – The local mutual information for each sample.
float – The expected mutual information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
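The second KSG algorithm differs from the first in how the marginal counts are taken: each sample gets its own rectangle, spanned by its k joint-space neighbors, and the estimate is ψ(k) − 1/k + ψ(N) − ⟨ψ(n_x) + ψ(n_y)⟩ (Kraskov et al., 2004). The brute-force sketch below illustrates this for scalar X and Y using only the standard library. The function names are hypothetical and it is not taken from the library's source.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg2_mutual_information(xs, ys, k):
    """Return (local MI values, mean MI) in nats for scalar samples."""
    n = len(xs)
    psi_k, psi_n = digamma_int(k), digamma_int(n)
    local = []
    for i in range(n):
        dx = [abs(x - xs[i]) for x in xs]
        dy = [abs(y - ys[i]) for y in ys]
        # Indices of the k nearest joint-space neighbors (max norm), excluding i.
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: max(dx[j], dy[j]))[:k]
        # Per-axis half-widths of the rectangle spanned by those neighbors.
        eps_x = max(dx[j] for j in nearest)
        eps_y = max(dy[j] for j in nearest)
        # Points inside or on the rectangle's extent along each axis.
        n_x = sum(1 for j in others if dx[j] <= eps_x)
        n_y = sum(1 for j in others if dy[j] <= eps_y)
        local.append(psi_k - 1.0 / k + psi_n - digamma_int(n_x) - digamma_int(n_y))
    return local, sum(local) / n


rho = 0.8
rng = random.Random(4)
xs = [rng.gauss(0.0, 1.0) for _ in range(400)]
ys = [rho * x + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0) for x in xs]

local_mi2, mean_mi2 = ksg2_mutual_information(xs, ys, k=4)
true_mi = -0.5 * math.log(1.0 - rho**2)  # analytic MI of a bivariate Gaussian
print(f"estimate={mean_mi2:.3f}  analytic={true_mi:.3f}")
```

Because the k neighbors themselves always fall inside the rectangle, n_x and n_y are at least k, so ψ is never evaluated at zero.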
- syntropy.knn.shannon.conditional_mutual_information(idxs_x, idxs_y, idxs_z, k, data, p=inf)
Computes the conditional mutual information I(X;Y|Z) using the KSG algorithm 1 as described in Frenzel & Pompe (2007).
The conditional mutual information is estimated using:
\[\hat{I}(X;Y|Z) = \psi(k) - \langle \psi(n_{xz}+1) + \psi(n_{yz}+1) - \psi(n_z+1) \rangle\]
where n_{xz}, n_{yz}, and n_z are the counts of neighbors within the epsilon-ball in the respective subspaces.
- Parameters:
idxs_x (tuple[int, ...]) – Indices of the x-variable.
idxs_y (tuple[int, ...]) – Indices of the y-variable.
idxs_z (tuple[int, ...]) – Indices of the conditioning variable z.
k (int) – Number of nearest neighbors
data (NDArray[np.floating]) – Numpy array of shape (n_variables, n_samples)
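p (float) – The order of the norm to use in the nearest neighbor lookup. The default is np.inf (Chebyshev norm).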
- Returns:
NDArray[np.floating] – The local conditional mutual information for each sample.
float – The expected conditional mutual information over all samples
- Return type:
tuple[ndarray[tuple[Any, …], dtype[floating]], float]
References
Frenzel, S., & Pompe, B. (2007). Partial Mutual Information for Coupling Analysis of Multivariate Time Series. Physical Review Letters, 99(20), 204101. https://doi.org/10.1103/PhysRevLett.99.204101
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138. https://doi.org/10.1103/PhysRevE.69.066138
syntropy.knn.utils module
- syntropy.knn.utils.build_tree_and_get_distances(data, k, p=inf)
Builds the KNN tree and returns the indices of, and distances to, the k nearest neighbors of each point.
- Parameters:
data (NDArray[np.floating]) – Data array of shape (n_variables, n_samples)
k (int) – Number of nearest neighbors
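p (float) – The order of the norm to use in the nearest neighbor lookup. The default is np.inf (Chebyshev norm).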
- Returns:
cKDTree – The KNN tree constructed from data.
NDArray[np.floating] – The distances to each of the k nearest neighbors (using the max norm).
NDArray[np.integer] – The indices of each of the k nearest neighbors.
- Return type:
tuple[cKDTree, ndarray[tuple[Any, …], dtype[floating]], ndarray[tuple[Any, …], dtype[integer]]]
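To show what a k-nearest-neighbor lookup over an (n_variables, n_samples) array produces, here is a brute-force stand-in that avoids any tree structure. It is a standard-library sketch with a hypothetical name (`knn_brute_force`). It does not build a cKDTree, and its return order (distances, then indices) is the sketch's own choice rather than a statement about the library.

```python
import math


def knn_brute_force(data, k, p=math.inf):
    """data: n_variables rows of n_samples values. Returns (distances, indices)."""
    points = list(zip(*data))  # transpose: one tuple per sample
    n = len(points)

    def minkowski(a, b):
        diffs = [abs(u - v) for u, v in zip(a, b)]
        if math.isinf(p):
            return max(diffs)  # Chebyshev (max) norm
        return sum(d**p for d in diffs) ** (1.0 / p)

    distances, indices = [], []
    for i in range(n):
        # Rank every other sample by its distance to sample i.
        ranked = sorted(
            (minkowski(points[i], points[j]), j) for j in range(n) if j != i
        )
        distances.append([d for d, _ in ranked[:k]])
        indices.append([j for _, j in ranked[:k]])
    return distances, indices


data = [
    [0.0, 1.0, 3.0, 7.0],  # variable 0 across four samples
    [0.0, 0.5, 0.0, 1.0],  # variable 1 across four samples
]
distances, indices = knn_brute_force(data, k=2)
print(indices)    # [[1, 2], [0, 2], [1, 0], [2, 1]]
print(distances)  # [[1.0, 3.0], [1.0, 2.0], [2.0, 3.0], [4.0, 6.0]]
```

Each sample is excluded from its own neighbor list here. A tree-based query that includes the query point would typically need k + 1 neighbors to get the same result.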