Algorithms for the analysis of node dynamics¶
Active Information Storage (AIS)¶
- class idtxl.active_information_storage.ActiveInformationStorage[source]
Estimate active information storage in individual processes.
Estimate active information storage (AIS) in individual processes of the network. To perform AIS estimation call analyse_network() on the whole network or a set of nodes or call analyse_single_process() to estimate AIS for a single process. See docstrings of the two functions for more information.
References:
Lizier, J. T., Prokopenko, M., & Zomaya, A. Y. (2012). Local measures of information storage in complex distributed computation. Inform Sci, 208, 39–54. http://doi.org/10.1016/j.ins.2012.04.016
Wibral, M., Lizier, J. T., Vögler, S., Priesemann, V., & Galuske, R. (2014). Local active information storage as a tool to understand distributed neural information processing. Front Neuroinf, 8, 1. http://doi.org/10.3389/fninf.2014.00001
Faes, L., Nollo, G., & Porta, A. (2011). Information-based detection of nonlinear Granger causality in multivariate processes via a nonuniform embedding technique. Phys Rev E, 83, 1–15. http://doi.org/10.1103/PhysRevE.83.051112
- Attributes:
- process_set : list
list with indices of analyzed processes
- settings : dict
analysis settings
- current_value : tuple
index of the current value in AIS estimation, (idx process, idx sample)
- selected_vars_full : list of tuples
samples in the past state, (idx process, idx sample)
- ais : float
raw AIS value
- sign : bool
true if AIS is significant
- pvalue : float
p-value of AIS
- analyse_network(settings, data, processes='all')[source]
Estimate active information storage for multiple network processes.
Estimate active information storage for all or a subset of processes in the network.
- Note:
For a detailed description of the algorithm and settings see documentation of the analyse_single_process() method and references in the class docstring.
Example:
>>> data = Data()
>>> data.generate_mute_data(100, 5)
>>> settings = {
>>>     'cmi_estimator': 'JidtKraskovCMI',
>>>     'n_perm_max_stat': 200,
>>>     'n_perm_min_stat': 200,
>>>     'max_lag': 5,
>>>     'tau': 1
>>>     }
>>> processes = [1, 2, 3]
>>> network_analysis = ActiveInformationStorage()
>>> results = network_analysis.analyse_network(settings, data, processes)
- Args:
- settings : dict
parameters for estimation and statistical testing, see documentation of analyse_single_process() for details, settings can further contain
verbose : bool [optional] - toggle console output (default=True)
fdr_correction : bool [optional] - correct results on the network level, see documentation of stats.ais_fdr() for details (default=True)
- data : Data instance
raw data for analysis
- processes : list of int | 'all'
indices of processes (default='all'); if 'all', AIS is estimated for all processes; if list of int, AIS is estimated for the processes specified in the list.
- Returns:
- ResultsSingleProcessAnalysis instance
results of network AIS estimation, see documentation of ResultsSingleProcessAnalysis()
- analyse_single_process(settings, data, process)[source]
Estimate active information storage for a single process.
Estimate active information storage for one process in the network. Uses non-uniform embedding found through information maximisation. This is done in three steps (see Lizier and Faes for details; a schematic sketch follows the list):
1. Find all relevant samples in the process's own past by iteratively adding candidate samples that have significant conditional mutual information (CMI) with the current value (conditional on all samples that were added previously).
2. Prune the final conditional set by testing the CMI between each sample in the final set and the current value, conditional on all other samples in the final set.
3. Calculate AIS using the final set of candidates as the past state (calculate MI between samples in the past and the current value); test for statistical significance using a permutation test.
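The following sketch outlines this three-step loop in plain Python. It is illustrative only: estimate_cmi(), estimate_mi(), max_stat_significant() and min_stat_significant() are hypothetical stand-ins for the configured CMI/MI estimator and IDTxl's maximum- and minimum-statistic permutation tests.

def select_past_state(process, current_sample, max_lag, tau):
    # Candidate past samples from the process's own history.
    candidates = [(process, current_sample - lag)
                  for lag in range(1, max_lag + 1, tau)]
    selected = []
    # Step 1: greedy forward selection of informative past samples.
    while candidates:
        cmi = {c: estimate_cmi(c, conditioning=selected) for c in candidates}
        best = max(cmi, key=cmi.get)
        if not max_stat_significant(best, selected):
            break  # no remaining candidate adds significant information
        selected.append(best)
        candidates.remove(best)
    # Step 2: prune samples that became redundant during selection.
    for c in list(selected):
        rest = [s for s in selected if s != c]
        if not min_stat_significant(c, rest):
            selected.remove(c)
    # Step 3: AIS is the MI between the selected past state and the
    # current value, tested against permutation surrogates.
    return estimate_mi(selected), selected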
- Note:
For a further description of the algorithm see references in the class docstring.
- Args:
- settings : dict
parameters for estimator use and statistics:
cmi_estimator : str - estimator to be used for CMI and MI calculation (for estimator settings see the documentation in the estimators_* modules)
max_lag : int - maximum temporal search depth for candidates in the process's past, in samples
tau : int [optional] - spacing between candidates in the process's past, in samples (default=1)
n_perm_* : int [optional] - number of permutations, where * can be 'max_stat', 'min_stat', 'mi' (default=500)
alpha_* : float [optional] - critical alpha level for statistical significance, where * can be 'max_stat', 'min_stat', 'mi' (default=0.05)
add_conditionals : list of tuples | str [optional] - force the estimator to add these conditionals when estimating AIS; can either be a list of variables, where each variable is described as (idx process, lag with respect to the current value), or a string: 'faes' for the Faes method (see references)
permute_in_time : bool [optional] - force surrogate creation by shuffling realisations in time instead of shuffling replications; see documentation of Data.permute_samples() for further settings (default=False)
verbose : bool [optional] - toggle console output (default=True)
write_ckp : bool [optional] - enable checkpointing, writes analysis state to disk every time a variable is selected; resume a crashed analysis using network_analysis.resume_checkpoint() (default=False)
filename_ckp : str [optional] - checkpoint file name (without extension) (default='./idtxl_checkpoint')
- data : Data instance
raw data for analysis
- process : int
index of process
- Returns:
- ResultsSingleProcessAnalysis instance
results of AIS estimation, see documentation of ResultsSingleProcessAnalysis()
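Example (a usage sketch mirroring the analyse_network() example above; the settings values are illustrative, not recommendations):
>>> data = Data()
>>> data.generate_mute_data(100, 5)
>>> settings = {
>>>     'cmi_estimator': 'JidtKraskovCMI',
>>>     'n_perm_max_stat': 200,
>>>     'n_perm_min_stat': 200,
>>>     'max_lag': 5,
>>>     'tau': 1
>>>     }
>>> process_analysis = ActiveInformationStorage()
>>> results = process_analysis.analyse_single_process(settings, data, process=1)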
Multivariate Partial Information Decomposition (PID)¶
- class idtxl.multivariate_pid.MultivariatePID[source]
Perform partial information decomposition for individual processes.
Perform partial information decomposition (PID) for multiple source processes (up to 4 sources) and a target process in the network. Estimate unique, shared, and synergistic information in the multiple sources about the target. Call analyse_network() on the whole network or a set of nodes or call analyse_single_target() to estimate PID for a single process. See docstrings of the two functions for more information.
References:
Williams, P. L., & Beer, R. D. (2010). Nonnegative Decomposition of Multivariate Information, 1–14. Retrieved from http://arxiv.org/abs/1004.2515
Makkeh, A., Gutknecht, A., & Wibral, M. (2020). A differentiable measure for shared information, 1–27. Retrieved from http://arxiv.org/abs/2002.03356
- Attributes:
- target : int
index of target process
- sources : array type
indices of the multiple source processes
- settings : dict
analysis settings
- results : dict
estimated PID
- analyse_network(settings, data, targets, sources)[source]
Estimate partial information decomposition for network nodes.
Estimate, for multiple nodes (target processes), the partial information decomposition (PID) for multiple source processes (up to 4 sources) and each of these target processes in the network.
- Note:
For a detailed description of the algorithm and settings see documentation of the analyse_single_target() method and references in the class docstring.
Example:
>>> n = 20
>>> alph = 2
>>> s1 = np.random.randint(0, alph, n)
>>> s2 = np.random.randint(0, alph, n)
>>> s3 = np.random.randint(0, alph, n)
>>> target1 = np.logical_xor(s1, s2).astype(int)
>>> target = np.logical_xor(target1, s3).astype(int)
>>> data = Data(np.vstack((s1, s2, s3, target)), 'ps', normalise=False)
>>> settings = {
>>>     'lags_pid': [[1, 1, 1], [3, 2, 7]],
>>>     'verbose': False,
>>>     'pid_estimator': 'SxPID'}
>>> targets = [0, 1]
>>> sources = [[1, 2, 3], [0, 2, 3]]
>>> pid_analysis = MultivariatePID()
>>> results = pid_analysis.analyse_network(settings, data, targets, sources)
- Args:
- settings : dict
parameters for estimation and statistical testing, see documentation of analyse_single_target() for details, can contain
lags_pid : list of lists of ints [optional] - lags in samples between sources and target (default=[[1, 1, …, 1], [1, 1, …, 1], …])
- data : Data instance
raw data for analysis
- targets : list of int
indices of target processes
- sources : list of lists
indices of the multiple source processes for each target, e.g., [[0, 1, 2], [1, 0, 3]]; all lists must be of the same length, and the list of lists must have the same length as targets
- Returns:
- ResultsMultivariatePID instance
results of network inference, see documentation of ResultsMultivariatePID()
- analyse_single_target(settings, data, target, sources)[source]
Estimate partial information decomposition for a network node.
Estimate partial information decomposition (PID) for multiple source processes (up to 4 sources) and a target process in the network.
- Note:
For a description of the algorithm and the method see references in the class and estimator docstrings.
Example:
>>> n = 20
>>> alph = 2
>>> s1 = np.random.randint(0, alph, n)
>>> s2 = np.random.randint(0, alph, n)
>>> s3 = np.random.randint(0, alph, n)
>>> target1 = np.logical_xor(s1, s2).astype(int)
>>> target = np.logical_xor(target1, s3).astype(int)
>>> data = Data(np.vstack((s1, s2, s3, target)), 'ps', normalise=False)
>>> settings = {
>>>     'verbose': False,
>>>     'pid_estimator': 'SxPID',
>>>     'lags_pid': [2, 3, 1]}
>>> pid_analysis = MultivariatePID()
>>> results = pid_analysis.analyse_single_target(settings=settings, data=data, target=0, sources=[1, 2, 3])
- Args:
- settings : dict
parameters for estimator use and statistics:
pid_estimator : str - estimator to be used for PID estimation (for estimator settings see the documentation in the estimators_pid modules)
lags_pid : list of ints [optional] - lags in samples between sources and target (default=[1, 1, …, 1])
verbose : bool [optional] - toggle console output (default=True)
- data : Data instance
raw data for analysis
- target : int
index of target process
- sources : list of ints
indices of the multiple source processes for the target
- Returns:
- ResultsMultivariatePID instance
results of network inference, see documentation of ResultsMultivariatePID()
Bivariate Partial Information Decomposition (PID)¶
- class idtxl.bivariate_pid.BivariatePID[source]
Perform partial information decomposition for individual processes.
Perform partial information decomposition (PID) for two source processes and one target process in the network. Estimate unique, shared, and synergistic information in the two sources about the target. Call analyse_network() on the whole network or a set of nodes or call analyse_single_target() to estimate PID for a single process. See docstrings of the two functions for more information.
References:
Williams, P. L., & Beer, R. D. (2010). Nonnegative Decomposition of Multivariate Information, 1–14. Retrieved from http://arxiv.org/abs/1004.2515
Bertschinger, N., Rauh, J., Olbrich, E., Jost, J., & Ay, N. (2014). Quantifying Unique Information. Entropy, 16(4), 2161–2183. http://doi.org/10.3390/e16042161
- Attributes:
- target : int
index of target process
- sources : array type
pair of indices of source processes
- settings : dict
analysis settings
- results : dict
estimated PID
- analyse_network(settings, data, targets, sources)[source]
Estimate partial information decomposition for network nodes.
Estimate partial information decomposition (PID) for multiple nodes in the network.
- Note:
For a detailed description of the algorithm and settings see documentation of the analyse_single_target() method and references in the class docstring.
Example:
>>> n = 20
>>> alph = 2
>>> x = np.random.randint(0, alph, n)
>>> y = np.random.randint(0, alph, n)
>>> z = np.logical_xor(x, y).astype(int)
>>> data = Data(np.vstack((x, y, z)), 'ps', normalise=False)
>>> settings = {
>>>     'lags_pid': [[1, 1], [3, 2], [0, 0]],
>>>     'alpha': 0.1,
>>>     'alph_s1': alph,
>>>     'alph_s2': alph,
>>>     'alph_t': alph,
>>>     'max_unsuc_swaps_row_parm': 60,
>>>     'num_reps': 63,
>>>     'max_iters': 1000,
>>>     'pid_estimator': 'SydneyPID'}
>>> targets = [0, 1, 2]
>>> sources = [[1, 2], [0, 2], [0, 1]]
>>> pid_analysis = BivariatePID()
>>> results = pid_analysis.analyse_network(settings, data, targets, sources)
- Args:
- settings : dict
parameters for estimation and statistical testing, see documentation of analyse_single_target() for details, can contain
lags_pid : list of lists of ints [optional] - lags in samples between sources and target (default=[[1, 1], [1, 1], …])
- data : Data instance
raw data for analysis
- targets : list of int
indices of target processes
- sources : list of lists
indices of the two source processes for each target, e.g., [[0, 2], [1, 0]]; must have the same length as targets
- Returns:
- ResultsPID instance
results of network inference, see documentation of ResultsPID()
- analyse_single_target(settings, data, target, sources)[source]
Estimate partial information decomposition for a network node.
Estimate partial information decomposition (PID) for a target node in the network.
- Note:
For a description of the algorithm and the method see references in the class and estimator docstrings.
Example:
>>> n = 20
>>> alph = 2
>>> x = np.random.randint(0, alph, n)
>>> y = np.random.randint(0, alph, n)
>>> z = np.logical_xor(x, y).astype(int)
>>> data = Data(np.vstack((x, y, z)), 'ps', normalise=False)
>>> settings = {
>>>     'alpha': 0.1,
>>>     'alph_s1': alph,
>>>     'alph_s2': alph,
>>>     'alph_t': alph,
>>>     'max_unsuc_swaps_row_parm': 60,
>>>     'num_reps': 63,
>>>     'max_iters': 1000,
>>>     'pid_estimator': 'SydneyPID',
>>>     'lags_pid': [2, 3]}
>>> pid_analysis = BivariatePID()
>>> results = pid_analysis.analyse_single_target(settings=settings, data=data, target=0, sources=[1, 2])
- Args:
- settings : dict
parameters for estimator use and statistics:
pid_estimator : str - estimator to be used for PID estimation (for estimator settings see the documentation in the estimators_pid modules)
lags_pid : list of ints [optional] - lags in samples between sources and target (default=[1, 1])
verbose : bool [optional] - toggle console output (default=True)
- data : Data instance
raw data for analysis
- target : int
index of target process
- sources : list of ints
indices of the two source processes for the target
- Returns:
- ResultsPID instance
results of network inference, see documentation of ResultsPID()
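For the XOR examples above, neither source alone carries information about the target, so a correct decomposition assigns (approximately) zero unique and zero shared information and about one bit of synergy. The following sanity check on the raw data is independent of IDTxl; the mi() helper is a plain plug-in estimate written for this illustration:

import numpy as np

def mi(a, b):
    # Plug-in mutual information (in bits) from the joint histogram.
    bins = [len(np.unique(a)), len(np.unique(b))]
    joint = np.histogram2d(a, b, bins=bins)[0] / len(a)
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (pa * pb)[nz]))

rng = np.random.default_rng(0)
x = rng.integers(0, 2, 10000)
y = rng.integers(0, 2, 10000)
z = np.logical_xor(x, y).astype(int)
print(mi(x, z), mi(y, z))  # each ~0 bits: no unique or shared information
print(mi(2 * x + y, z))    # joint sources: ~1 bit, carried as synergy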
History-dependence estimator for neural spiking data¶
Embedding optimization¶
- class idtxl.embedding_optimization_ais_Rudelt.OptimizationRudelt(settings=None)[source]
Optimization of embedding parameters of spike times using the history dependence estimators
References:
- [1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
- Args:
- settings : dict
- estimation_method : string
The method to be used to estimate the history dependence: 'bbc' or 'shuffling'.
- embedding_step_size : float
Step size delta t (in seconds) with which the window is slid through the data. (default: 0.005)
- embedding_number_of_bins_set : list of integer values
Set of values for d, the number of bins in the embedding. (default: [1, 2, 3, 4, 5])
- embedding_past_range_set : list of floating-point values
Set of values for T, the past range (in seconds) to be used for embeddings. (default: [0.005, 0.00561, 0.00629, 0.00706, 0.00792, 0.00889, 0.00998, 0.01119, 0.01256, 0.01409, 0.01581, 0.01774, 0.01991, 0.02233, 0.02506, 0.02812, 0.03155, 0.0354, 0.03972, 0.04456, 0.05, 0.0561, 0.06295, 0.07063, 0.07924, 0.08891, 0.09976, 0.11194, 0.12559, 0.14092, 0.15811, 0.17741, 0.19905, 0.22334, 0.25059, 0.28117, 0.31548, 0.35397, 0.39716, 0.44563, 0.5, 0.56101, 0.62946, 0.70627, 0.79245, 0.88914, 0.99763, 1.11936, 1.25594, 1.40919, 1.58114, 1.77407, 1.99054, 2.23342, 2.50594, 2.81171, 3.15479, 3.53973, 3.97164, 4.45625, 5.0])
- embedding_scaling_exponent_set : dict
Set of values for kappa, the scaling exponent for the bins in the embedding. Should be a Python dictionary with the three entries 'number_of_scalings', 'min_first_bin_size' and 'min_step_for_scaling'. (defaults: {'number_of_scalings': 10, 'min_first_bin_size': 0.005, 'min_step_for_scaling': 0.01})
- bbc_tolerance : float
The tolerance for the Bayesian Bias Criterion. Influences which embeddings are discarded from the analysis. (default: 0.05)
- return_averaged_R : bool
Return R_tot as the average over R(T) for T in [T_D, T_max], instead of R_tot = R(T_D). If set to True, the setting for number_of_bootstraps_R_tot (see below) is ignored and set to 0, and CI bounds are not calculated. (default: True)
- timescale_minimum_past_range : float
Minimum past range T_0 (in seconds) to take into consideration for the estimation of the information timescale tau_R. (default: 0.01)
- number_of_bootstraps_R_max : int
The number of bootstrap re-shuffles that should be used to determine the optimal embedding. (Bootstrap the estimates of R_max to determine R_tot.) These are computed during the 'history-dependence' task because they are essential to obtain R_tot. (default: 250)
- number_of_bootstraps_R_tot : int
The number of bootstrap re-shuffles that should be used to estimate the confidence interval of the optimal embedding. (Bootstrap the estimates of R_tot = R(T_D) to obtain a confidence interval for R_tot.) These are computed during the 'confidence-intervals' task. The setting return_averaged_R (see above) needs to be set to False for this setting to take effect. (default: 250)
- number_of_bootstraps_nonessential : int
The number of bootstrap re-shuffles that should be used to estimate the confidence intervals for embeddings other than the optimal one. (Bootstrap the estimates of R(T) for all other T.) These are not necessary for the main analysis and therefore default to 0.
- symbol_block_length : int
The number of symbols that should be drawn in each block for bootstrap resampling. If set to None (recommended), the length is chosen automatically based on heuristics. (default: None)
- bootstrap_CI_use_sd : bool
Most of the time, normally-distributed bootstrap replications were observed, so it is sufficient (and more efficient) to compute confidence intervals based on the standard deviation. (default: True)
- bootstrap_CI_percentile_lo : float
The lower percentile for the confidence interval. This has no effect if bootstrap_CI_use_sd is set to True. (default: 2.5)
- bootstrap_CI_percentile_hi : float
The upper percentile for the confidence interval. This has no effect if bootstrap_CI_use_sd is set to True. (default: 97.5)
- analyse_auto_MI : bool
Perform calculation of the auto mutual information of the spike train. (default: True) If set to True:
- auto_MI_bin_size_set : list of floating-point values
Set of values for the sizes of the bins (in seconds). (default: [0.005, 0.01, 0.025, 0.05, 0.25, 0.5])
- auto_MI_max_delay : int
The maximum delay (in seconds) between the past bin and the response. (default: 5)
- visualization : bool
Create .eps output images showing the optimization values and graphs for the history dependence and the auto mutual information. (default: False) If set to True:
- output_path : String
Path where the .eps images should be saved
- output_prefix : String
Prefix of the output images, e.g., <output_prefix>_process0.eps
- debug : bool
Show values while calculating. (default: False)
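Example: a minimal usage sketch, assuming spike-time data has been loaded into a Data_spiketime instance; the settings values are illustrative, all keys are documented above.
>>> from idtxl.embedding_optimization_ais_Rudelt import OptimizationRudelt
>>> settings = {
>>>     'estimation_method': 'shuffling',
>>>     'embedding_past_range_set': [0.005, 0.05, 0.5, 5.0],
>>>     'embedding_number_of_bins_set': [1, 3, 5],
>>>     'embedding_step_size': 0.005,
>>>     'number_of_bootstraps_R_max': 250,
>>>     'analyse_auto_MI': True}
>>> optimization = OptimizationRudelt(settings)
>>> results = optimization.optimize(data, processes=[0])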
- analyse_auto_MI(spike_times)[source]
Get the auto MI for the spike times. If it is available from file, load it, else compute it.
- check_inputs()[source]
Check input settings for completeness
- compute_CIs(data, target_R='R_max', symbol_block_length=None)[source]
Compute bootstrap replications of the history dependence estimate which can be used to obtain confidence intervals.
- Args:
- data : data_spiketime object
Input data
- target_R : String
One of 'R_max', 'R_tot' or 'nonessential'. If set to R_max, replications of R are produced for the T at which R is maximised. If set to R_tot, replications of R are produced for T = T_D (cf get_temporal_depth_T_D). If set to nonessential, replications of R are produced for each T (one embedding per T, cf get_embeddings_that_maximise_R). These are not otherwise used in the analysis and are probably only useful if the resulting plot is visually inspected, so in most cases it can be set to zero.
- symbol_block_length : int
The number of symbols that should be drawn in each block for bootstrap resampling. If set to None (recommended), the length is chosen automatically based on heuristics.
- get_auto_MI(spike_times, bin_size, number_of_delays)[source]
Compute the auto mutual information in the neuron’s activity, a measure closely related to history dependence.
- get_bootstrap_history_dependence(data, embedding, number_of_bootstraps, symbol_block_length=None)[source]
For a given embedding, return bootstrap replications for R.
- get_embeddings(embedding_past_range_set, embedding_number_of_bins_set, embedding_scaling_exponent_set)[source]
Get all combinations of parameters T, d, k, based on the sets of selected parameters.
- get_embeddings_that_maximise_R(bbc_tolerance=None, dependent_var='T', get_as_list=False)[source]
For each T (or d), get the embedding for which R is maximised.
For the bbc estimator, the bbc_tolerance is applied here, i.e., get the unbiased embeddings that maximise R.
- get_history_dependence(data, process)[source]
Estimate the history dependence for each embedding for the given process.
- get_information_timescale_tau_R()[source]
Get the information timescale tau_R, a characteristic timescale of history dependence similar to an autocorrelation time.
- get_past_range(number_of_bins_d, first_bin_size, scaling_k)[source]
Get the past range T of the embedding, based on the parameters d, tau_1 and k.
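A sketch of this relation, assuming the exponentially scaled bins of Rudelt et al. (2021), where bin j has width tau_1 * 10**((j - 1) * k); the formula is taken from the reference, not from IDTxl's source:

import numpy as np

def past_range(number_of_bins_d, first_bin_size, scaling_k):
    # T is the summed width of d exponentially scaled bins;
    # scaling_k = 0 recovers the uniform embedding.
    return np.sum([first_bin_size * 10 ** ((j - 1) * scaling_k)
                   for j in range(1, number_of_bins_d + 1)])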
- get_set_of_scalings(past_range_T, number_of_bins_d, number_of_scalings, min_first_bin_size, min_step_for_scaling)[source]
Get scaling exponents comprising the uniform embedding, the embedding for which the first bin has a length of min_first_bin_size (in seconds), and linearly spaced scaling factors in between, such that number_of_scalings scalings are obtained in total.
- get_temporal_depth_T_D(get_R_thresh=False)[source]
Get the temporal depth T_D, the past range for the ‘optimal’ embedding parameters.
Given the maximal history dependence R at each past range T (cf get_embeddings_that_maximise_R), first find the smallest T at which R is maximised (cf get_max_R_T). If bootstrap replications for this R are available, get the smallest T at which this R minus one standard deviation of the bootstrap estimates is attained.
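Schematically (all names hypothetical; max_R maps each past range T to the maximal history dependence found at that T):

import numpy as np

Ts = np.array(sorted(max_R))
Rs = np.array([max_R[T] for T in Ts])
T_D = Ts[np.argmax(Rs)]  # smallest T at which R is maximised
# With bootstrap replications available, T_D is instead the smallest T
# whose R reaches max(Rs) minus one standard deviation of the replications.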
- optimize(data, processes='all')[source]
Optimize the embedding parameters of spike time data using the Rudelt history dependence estimator.
References:
- [1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
- Args:
- data : Data_spiketime instance
raw data for analysis
- processes : list of int
indices of processes; spike times are optimized separately for each process specified in the list.
- Returns:
- ResultsSingleProcessRudelt instance
results of Rudelt optimization, see documentation of ResultsSingleProcessRudelt()
- if visualization in settings was set to True (see class OptimizationRudelt):
- .eps images are created for each optimized process containing:
optimized values for the process
graph for the history dependence
graph for auto mutual information (if calculated)
- optimize_single_run(data, process)[source]
Optimize a single realisation of spike time data for the given process.
- Args:
- dataData_spiketime instance
raw data for analysis
- process : int
index of process
- Returns:
- DotDict
with the following keys
- Process : int
Process that was optimized
- estimation_method : String
Estimation method that was used for optimization
- T_D : float
Estimated optimal value for the temporal depth T_D
- tau_R : float
Information timescale tau_R, a characteristic timescale of history dependence similar to an autocorrelation time.
- R_tot : float
Estimated value for the total history dependence R_tot
- AIS_tot : float
Estimated value for the total active information storage
- opt_number_of_bins_d : int
Number of bins d for the embedding that yields (R̂tot, T̂D)
- opt_scaling_k : int
Scaling exponent κ for the embedding that yields (R̂tot, T̂D)
- opt_first_bin_size : int
Size of the first bin τ1 for the embedding that yields (R̂tot, T̂D)
- history_dependence : array with floating-point values
Estimated history dependence for each embedding
- firing_rate : float
Firing rate of the neuron/spike train
- recording_length : float
Length of the recording (in seconds)
- H_spiking : float
Entropy of the spike times
- if analyse_auto_MI was set to True, additionally:
- auto_MI : dict
numpy array of MI values for each delay
- auto_MI_delays : list of int
list of delays depending on the given auto_MI_bin_sizes and auto_MI_max_delay
- remove_subresults_single_process()[source]
Delete results of single-process analysis from the instance.
Estimators¶
Provide HDE estimators.
- class idtxl.estimators_Rudelt.RudeltAbstractEstimator(settings=None)[source]
Abstract class for implementation of nsb and plugin estimators from Rudelt.
Abstract class for implementation of nsb and plugin estimators; child classes implement estimators for mutual information (MI).
References:
- [1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
- Args:
- settings : dict
- embedding_step_size : float [optional]
Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).
- normalise : bool [optional]
rebase spike times to zero (default=True)
- return_averaged_R : bool [optional]
If set to True, compute R̂tot as the average over R̂(T) for T ∈ [T̂D, Tmax] instead of R̂tot = R(T̂D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)
- get_median_number_of_spikes_per_bin(raw_symbols)[source]
Given raw symbols (in which the number of spikes per bin is counted, i.e., not necessarily a binary quantity), get the median number of spikes for each bin, among all symbols obtained by the embedding.
- get_multiplicities(symbol_counts, alphabet_size)[source]
Get the multiplicities of some given symbol counts.
To estimate the entropy of a system, it is only important how often a symbol/ event occurs (the probability that it occurs), not what it represents. Therefore, computations can be simplified by summarizing symbols by their frequency, as represented by the multiplicities.
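For example, if three symbols were observed 2, 2 and 5 times, the multiplicities record that two symbols occurred twice and one symbol occurred five times (the symbol values below are hypothetical):

from collections import Counter

symbol_counts = {0b101: 2, 0b011: 2, 0b000: 5}  # symbol -> count
multiplicities = Counter(symbol_counts.values())
print(multiplicities)  # Counter({2: 2, 5: 1})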
- get_past_range(number_of_bins_d, first_bin_size, scaling_k)[source]
Get the past range T of the embedding, based on the parameters d, tau_1 and k.
- get_raw_symbols(spike_times, embedding, first_bin_size)[source]
Get the raw symbols (in which the number of spikes per bin is counted, i.e., not necessarily a binary quantity), as obtained by applying the embedding.
- get_symbol_counts(symbol_array)[source]
Count how often symbols occur
- get_window_delimiters(number_of_bins_d, scaling_k, first_bin_size)[source]
Get delimiters of the window, used to describe the embedding. The window includes both the past embedding and the response.
The delimiters are times, relative to the first bin, that separate two consecutive bins.
- is_analytic_null_estimator()[source]
Indicate if estimator supports analytic surrogates.
Return true if the estimator implements estimate_surrogates_analytic() where data is formatted as per the estimate method for this estimator.
- Returns:
bool
- is_parallel()[source]
Indicate if estimator supports parallel estimation over chunks.
Return true if the estimator supports parallel estimation over chunks, where a chunk is one independent data set.
- Returns:
bool
- symbol_array_to_binary(spikes_in_window, number_of_bins_d)[source]
Given an array of 1s and 0s, representing spikes and the absence thereof, read the array as a binary number to obtain a (base 10) integer.
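A sketch of the conversion; the bit order (first array entry = most significant bit) is an assumption for illustration:

def symbol_array_to_binary(spikes_in_window, number_of_bins_d):
    # Read the 0/1 array as a binary number, e.g. [1, 0, 1] -> 5.
    return sum(int(spikes_in_window[i]) << (number_of_bins_d - 1 - i)
               for i in range(number_of_bins_d))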
- symbol_binary_to_array(symbol_binary, number_of_bins_d)[source]
Given a binary representation of a symbol (cf symbol_array_to_binary), convert it back into its array-representation.
- class idtxl.estimators_Rudelt.RudeltAbstractNSBEstimator(settings=None)[source]
Abstract class for implementation of NSB estimators from Rudelt.
Abstract class for implementation of Nemenman-Shafee-Bialek (NSB) estimators, child classes implement nsb estimators for mutual information (MI).
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
References:
- [1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)
- [2]: I. Nemenman, F. Shafee, W. Bialek: Entropy and inference, revisited. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, Cambridge, MA, 2002. MIT Press.
- Args:
- settings : dict
- embedding_step_size : float [optional]
Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).
- normalise : bool [optional]
rebase spike times to zero (default=True)
- return_averaged_R : bool [optional]
If set to True, compute R̂tot as the average over R̂(T) for T ∈ [T̂D, Tmax] instead of R̂tot = R(T̂D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)
- H1(beta, mk, K, N)[source]
Compute the first moment (expectation value) of the entropy H.
H is the entropy one obtains with a symmetric Dirichlet prior with concentration parameter beta and a multinomial likelihood.
- alpha_ML(mk, K1, N)[source]
Compute first guess for the beta_MAP (cf get_beta_MAP) parameter via the posterior of a Dirichlet process.
- d2_log_rho(beta, mk, K, N)[source]
Second derivative of the logarithm of the Dirichlet multinomial likelihood.
- d2_log_rho_xi(beta, mk, K, N)[source]
Second derivative of the logarithm of the nsb (unnormalized) posterior.
- d2_xi(beta, K)[source]
Second derivative of xi(beta) (cf d_xi).
- d3_xi(beta, K)[source]
Third derivative of xi(beta) (cf d_xi).
- d_log_rho(beta, mk, K, N)[source]
First derivative of the logarithm of the Dirichlet multinomial likelihood.
- d_log_rho_xi(beta, mk, K, N)[source]
First derivative of the logarithm of the nsb (unnormalized) posterior.
- d_xi(beta, K)[source]
First derivative of xi(beta).
xi(beta) is the entropy of the system when no data has been observed; d_xi is the prior for the nsb estimator.
- get_beta_MAP(mk, K, N)[source]
Get the maximum a posteriori (MAP) value for beta.
Provides the location of the peak, around which we integrate.
beta_MAP is the value for beta for which the posterior of the NSB estimator is maximised (or, equivalently, of the logarithm thereof, as computed here).
- get_integration_bounds(mk, K, N)[source]
Find the integration bounds for the estimator.
Typically the posterior is a delta-like distribution, so it is sufficient to integrate around this peak. (If not, this function is not called.)
- log_likelihood_DP_alpha(a, K1, N)[source]
Alpha-dependent terms of the log-likelihood of a Dirichlet Process.
- nsb_entropy(mk, K, N)[source]
Estimate the entropy of a system using the NSB estimator.
- Parameters
mk – multiplicities
K – number of possible symbols/ state space of the system
N – total number of observed symbols
- rho(beta, mk, K, N)[source]
rho(beta, data) is the Dirichlet multinomial likelihood.
rho(beta, data), together with d_xi(beta), makes up the posterior for the nsb estimator.
- unnormalized_posterior(beta, mk, K, N)[source]
The (unnormalized) posterior in the nsb estimator.
Product of the likelihood rho and the prior d_xi; the normalizing factor is given by the marginal likelihood
- class idtxl.estimators_Rudelt.RudeltBBCEstimator(settings=None)[source]
Bayesian bias criterion (BBC) Estimator using NSB and Plugin estimator
Calculate the mutual information (MI) of one variable depending on its past using the NSB and plugin estimators, and check whether the bias criterion is passed. See parent class for references.
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
- Args:
- settings : dict
- embedding_step_size : float [optional]
Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).
- normalise : bool [optional]
rebase spike times to zero (default=True)
- return_averaged_R : bool [optional]
If set to True, compute R̂tot as the average over R̂(T) for T ∈ [T̂D, Tmax] instead of R̂tot = R(T̂D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)
- bayesian_bias_criterion(R_nsb, R_plugin, bbc_tolerance)[source]
Get whether the Bayesian bias criterion (bbc) is passed.
- Parameters
R_nsb – history dependence computed with NSB estimator
R_plugin – history dependence computed with plugin estimator
bbc_tolerance – tolerance for the Bayesian bias criterion
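A self-contained sketch of the criterion; the relative-difference form of the bbc term is an assumption based on Rudelt et al. (2021), cf get_bbc_term:

def get_bbc_term(R_nsb, R_plugin):
    # Relative deviation between the two history-dependence estimates.
    return abs(R_nsb - R_plugin) / R_nsb

def bayesian_bias_criterion(R_nsb, R_plugin, bbc_tolerance):
    # The embedding passes the BBC if both estimates agree within tolerance.
    return get_bbc_term(R_nsb, R_plugin) < bbc_tolerance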
- estimate(symbol_array, past_symbol_array, current_symbol_array, bbc_tolerance=None)[source]
Calculate the mutual information (MI) of one variable depending on its past using the NSB and plugin estimators, and check whether the bias criterion is passed.
- Args:
- symbol_array : 1D numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- past_symbol_array : numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- current_symbol_array : numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- Returns:
- I (float)
MI (AIS)
- R (float)
MI / H_uncond (History dependence)
- bbc_term (float)
bbc tolerance-independent term of the Bayesian bias criterion (bbc)
- get_bbc_term(R_nsb, R_plugin)[source]
Get the bbc tolerance-independent term of the Bayesian bias criterion (bbc).
- Parameters
R_nsb – history dependence computed with NSB estimator
R_plugin – history dependence computed with plugin estimator
- class idtxl.estimators_Rudelt.RudeltNSBEstimatorSymbolsMI(settings=None)[source]
History dependence NSB estimator
Calculate the mutual information (MI) of one variable depending on its past using NSB estimator. See parent class for references.
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
- Args:
- settings : dict
- embedding_step_size : float [optional]
Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).
- normalise : bool [optional]
rebase spike times to zero (default=True)
- return_averaged_R : bool [optional]
If set to True, compute R̂tot as the average over R̂(T) for T ∈ [T̂D, Tmax] instead of R̂tot = R(T̂D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)
- estimate(symbol_array, past_symbol_array, current_symbol_array)[source]
Estimate mutual information using NSB estimator.
- Args:
- symbol_array : 1D numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- past_symbol_array : numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- current_symbol_array : numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- Returns:
- I (float)
MI (AIS)
- R (float)
MI / H_uncond (History dependence)
- nsb_estimator(symbol_counts, past_symbol_counts, alphabet_size, alphabet_size_past, H_uncond)[source]
Estimate the entropy of a system using the NSB estimator.
- class idtxl.estimators_Rudelt.RudeltPluginEstimatorSymbolsMI(settings=None)[source]
Plugin History dependence estimator
Calculate the mutual information (MI) of one variable depending on its past using plugin estimator. See parent class for references.
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
- Args:
- settings : dict
- embedding_step_size : float [optional]
Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).
- normalise : bool [optional]
rebase spike times to zero (default=True)
- return_averaged_R : bool [optional]
If set to True, compute R̂tot as the average over R̂(T) for T ∈ [T̂D, Tmax] instead of R̂tot = R(T̂D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)
- estimate(symbol_array, past_symbol_array, current_symbol_array)[source]
Estimate mutual information using plugin estimator.
- Args:
- symbol_array : 1D numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- past_symbol_array : numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- current_symbol_array : numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- Returns:
- I (float)
MI (AIS)
- R (float)
MI / H_uncond (History dependence)
- plugin_entropy(mk, N)[source]
Estimate the entropy of a system using the Plugin estimator.
(In principle this is the same function as utl.get_shannon_entropy, only here it is a function of the multiplicities, not the probabilities.)
- Parameters
mk – multiplicities
N – total number of observed symbols
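A sketch in terms of the multiplicities: each of the mk[K] symbols observed K times contributes probability K/N. This is an assumed reconstruction (written here with the natural logarithm), not IDTxl's source:

import numpy as np

def plugin_entropy(mk, N):
    # mk: dict mapping a count K to the number of symbols observed K times.
    # H = -sum_K mk[K] * (K/N) * log(K/N)
    return -sum(m * (K / N) * np.log(K / N) for K, m in mk.items())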
- plugin_estimator(symbol_counts, past_symbol_counts, alphabet_size, alphabet_size_past, H_uncond)[source]
Estimate the entropy of a system using the plugin estimator.
- class idtxl.estimators_Rudelt.RudeltShufflingEstimator(settings=None)[source]
Estimate the history dependence in a spike train using the shuffling estimator.
See parent class for references.
Implemented in IDTxl by Michael Lindner, Göttingen, 2021.
- estimate(symbol_array)[source]
Estimate the history dependence in a spike train using the shuffling estimator.
- Args:
- symbol_array : 1D numpy array
realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)
- Returns:
- I (float)
MI (AIS)
- R (float)
MI / H_uncond (History dependence)
- get_H0_X_past_cond_X(marginal_probabilities, number_of_bins_d, P_X_uncond)[source]
Compute H_0(X_past | X), the estimate of the entropy for the past symbols given a response, under the assumption that activity in the past contributes independently towards the response.
- get_H0_X_past_cond_X_eq_x(marginal_probabilities, number_of_bins_d)[source]
Compute H_0(X_past | X = x), cf get_H0_X_past_cond_X.
- get_H_X_past_cond_X(P_X_uncond, P_X_past_cond_X)[source]
Compute H(X_past | X), the plug-in estimate of the conditional entropy for the past symbols, conditioned on the response X, given their probabilities.
- get_H_X_past_uncond(P_X_past_uncond)[source]
Compute H(X_past), the plug-in estimate of the entropy for the past symbols, given their probabilities.
- get_P_X_past_cond_X(past_symbol_counts, number_of_symbols)[source]
Compute P(X_past | X), the probability of the past activity conditioned on the response X using the plug-in estimator.
- get_P_X_past_uncond(past_symbol_counts, number_of_symbols)[source]
Compute P(X_past), the probability of the past activity using the plug-in estimator.
- get_P_X_uncond(number_of_symbols)[source]
Compute P(X), the probability of the current activity using the plug-in estimator.
- get_marginal_frequencies_of_spikes_in_bins(symbol_counts, number_of_bins_d)[source]
Compute for each past bin 1…d the sum of spikes found in that bin across all observed symbols.
- get_shuffled_symbol_counts(symbol_counts, past_symbol_counts, number_of_bins_d, number_of_symbols)[source]
Simulate new data by, for each past bin 1…d, permuting the activity across all observed past symbols (for a given response X). The marginal probability of observing a spike given the response is thus preserved for each past bin.
- shuffling_MI(symbol_counts, number_of_bins_d)[source]
Estimate the mutual information between current and past activity in a spike train using the shuffling estimator.
To obtain the shuffling estimate, compute the plug-in estimate and a correction term to reduce its bias.
For the plug-in estimate:
Extract the past_symbol_counts from the symbol_counts.
I_plugin = H(X_past) - H(X_past | X)
Notation:
X: current activity, aka response
X_past: past activity
P_X_uncond: P(X)
P_X_past_uncond: P(X_past)
P_X_past_cond_X: P(X_past | X)
H_X_past_uncond: H(X_past)
H_X_past_cond_X: H(X_past | X)
I_plugin: plugin estimate of I(X_past; X)
For the correction term:
- Simulate additional data under the assumption that activity in the past contributes independently towards the current activity.
- Compute the entropy under the assumptions of the model, which due to its simplicity is easy to sample, so the estimate is unbiased.
- Compute the entropy using the plug-in estimate, whose bias is similar to that of the plug-in estimate on the original data.
- Compute the correction term as the difference between the unbiased and biased terms.
Notation:
P0_sh_X_past_cond_X: P_0,sh(X_past | X), equiv. to P(X_past | X) on the shuffled data
H0_X_past_cond_X: H_0(X_past | X), based on the model of independent contributions
H0_sh_X_past_cond_X: H_0,sh(X_past | X), based on P0_sh_X_past_cond_X, i.e., the plug-in estimate
I_corr: the correction term to reduce the bias of I_plugin
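Putting the pieces together as plain Python (schematic only; the variables mirror the notation above, and the sign convention of the correction is an assumption based on this description):

# The plug-in conditional entropy is biased downward, so I_plugin is
# biased upward; the correction estimates that bias on shuffled data,
# for which the model entropy H0_X_past_cond_X is known.
I_plugin = H_X_past_uncond - H_X_past_cond_X
I_corr = H0_X_past_cond_X - H0_sh_X_past_cond_X  # unbiased minus biased
I_shuffling = I_plugin - I_corr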
- Args:
- symbol_counts : iterable
the activity of a spike train is embedded into symbols, whose occurrences are counted (cf emb.get_symbol_counts)
- number_of_bins_d : int
the number of bins of the embedding