Algorithms for the analysis of node dynamics

Active Information Storage (AIS)

class idtxl.active_information_storage.ActiveInformationStorage[source]

Estimate active information storage in individual processes.

Estimate active information storage (AIS) in individual processes of the network. To perform AIS estimation call analyse_network() on the whole network or a set of nodes or call analyse_single_process() to estimate AIS for a single process. See docstrings of the two functions for more information.

References:

  • Lizier, J. T., Prokopenko, M., & Zomaya, A. Y. (2012). Local measures of information storage in complex distributed computation. Inform Sci, 208, 39–54. http://doi.org/10.1016/j.ins.2012.04.016

  • Wibral, M., Lizier, J. T., Vögler, S., Priesemann, V., & Galuske, R. (2014). Local active information storage as a tool to understand distributed neural information processing. Front Neuroinf, 8, 1. http://doi.org/10.3389/fninf.2014.00001

  • Faes, L., Nollo, G., & Porta, A. (2011). Information-based detection of nonlinear Granger causality in multivariate processes via a nonuniform embedding technique. Phys Rev E, 83, 1–15. http://doi.org/10.1103/PhysRevE.83.051112

Attributes:
process_set : list

list with indices of analyzed processes

settings : dict

analysis settings

current_value : tuple

index of the current value in AIS estimation, (idx process, idx sample)

selected_vars_full : list of tuples

samples in the past state, (idx process, idx sample)

ais : float

raw AIS value

sign : bool

true if AIS is significant

pvalue : float

p-value of AIS

analyse_network(settings, data, processes='all')[source]

Estimate active information storage for multiple network processes.

Estimate active information storage for all or a subset of processes in the network.

Note:

For a detailed description of the algorithm and settings see documentation of the analyse_single_process() method and references in the class docstring.

Example:

>>> data = Data()
>>> data.generate_mute_data(100, 5)
>>> settings = {
>>>     'cmi_estimator': 'JidtKraskovCMI',
>>>     'n_perm_max_stat': 200,
>>>     'n_perm_min_stat': 200,
>>>     'max_lag': 5,
>>>     'tau': 1
>>>     }
>>> processes = [1, 2, 3]
>>> network_analysis = ActiveInformationStorage()
>>> results = network_analysis.analyse_network(settings, data,
>>>                                            processes)
Args:
settings : dict

parameters for estimation and statistical testing, see documentation of analyse_single_process() for details, settings can further contain

  • verbose : bool [optional] - toggle console output (default=True)

  • fdr_correction : bool [optional] - correct results on the network level, see documentation of stats.ais_fdr() for details (default=True)

data : Data instance

raw data for analysis

processes : list of int | ‘all’

index of processes (default=’all’); if ‘all’, AIS is estimated for all processes; if list of int, AIS is estimated for the processes specified in the list.

Returns:
ResultsSingleProcessAnalysis instance

results of network AIS estimation, see documentation of ResultsSingleProcessAnalysis()

analyse_single_process(settings, data, process)[source]

Estimate active information storage for a single process.

Estimate active information storage for one process in the network. Uses non-uniform embedding found through information maximisation. This is done in three steps (see Lizier and Faes for details):

  1. Find all relevant samples in the process’s own past, by iteratively adding candidate samples that have significant conditional mutual information (CMI) with the current value (conditional on all samples that were added previously)

  2. Prune the final conditional set by testing the CMI between each sample in the final set and the current value, conditional on all other samples in the final set

  3. Calculate AIS using the final set of candidates as the past state (calculate MI between samples in the past and the current value); test for statistical significance using a permutation test
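
Example:

A minimal single-process sketch in the style of the analyse_network() example above; the settings values are illustrative, and the import paths follow the IDTxl package layout:

>>> from idtxl.data import Data
>>> from idtxl.active_information_storage import ActiveInformationStorage
>>> data = Data()
>>> data.generate_mute_data(1000, 5)
>>> settings = {
>>>     'cmi_estimator': 'JidtKraskovCMI',
>>>     'max_lag': 5,
>>>     'tau': 1
>>>     }
>>> process_analysis = ActiveInformationStorage()
>>> results = process_analysis.analyse_single_process(settings, data,
>>>                                                   process=0)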

Note:

For a further description of the algorithm see references in the class docstring.

Args:
settings : dict

parameters for estimator use and statistics:

  • cmi_estimator : str - estimator to be used for CMI and MI calculation (for estimator settings see the documentation in the estimators_* modules)

  • max_lag : int - maximum temporal search depth for candidates in the process’s past, in samples

  • tau : int [optional] - spacing between candidates in the process’s past, in samples (default=1)

  • n_perm_* : int [optional] - number of permutations, where * can be ‘max_stat’, ‘min_stat’, ‘mi’ (default=500)

  • alpha_* : float [optional] - critical alpha level for statistical significance, where * can be ‘max_stat’, ‘min_stat’, ‘mi’ (default=0.05)

  • add_conditionals : list of tuples | str [optional] - force the estimator to add these conditionals when estimating AIS; can either be a list of variables, where each variable is described as (idx process, lag wrt current value), or a string: ‘faes’ for the Faes method (see references)

  • permute_in_time : bool [optional] - force surrogate creation by shuffling realisations in time instead of shuffling replications; see documentation of Data.permute_samples() for further settings (default=False)

  • verbose : bool [optional] - toggle console output (default=True)

  • write_ckp : bool [optional] - enable checkpointing, writes analysis state to disk every time a variable is selected; resume crashed analysis using network_analysis.resume_checkpoint() (default=False)

  • filename_ckp : str [optional] - checkpoint file name (without extension) (default=’./idtxl_checkpoint’)

data : Data instance

raw data for analysis

process : int

index of process

Returns:
ResultsSingleProcessAnalysis instance

results of AIS estimation, see documentation of ResultsSingleProcessAnalysis()
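
The returned object collects the per-process attributes listed in the class docstring (ais, sign, pvalue, selected_vars_full). A hedged access sketch; the accessor name get_single_process follows the ResultsSingleProcessAnalysis API, but treat it and the fdr argument as assumptions:

>>> res = results.get_single_process(process=0, fdr=False)
>>> res.ais            # raw AIS value for process 0
>>> res.selected_vars  # selected past variables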

Multivariate Partial Information Decomposition (PID)

class idtxl.multivariate_pid.MultivariatePID[source]

Perform partial information decomposition for individual processes.

Perform partial information decomposition (PID) for multiple source processes (up to 4 sources) and a target process in the network. Estimate unique, shared, and synergistic information in the multiple sources about the target. Call analyse_network() on the whole network or a set of nodes or call analyse_single_target() to estimate PID for a single process. See docstrings of the two functions for more information.

Attributes:
target : int

index of target process

sources : array type

indices of the multiple source processes

settings : dict

analysis settings

results : dict

estimated PID

analyse_network(settings, data, targets, sources)[source]

Estimate partial information decomposition for network nodes.

Estimate, for multiple nodes (target processes), the partial information decomposition (PID) for multiple source processes (up to 4 sources) and each of these target processes in the network.

Note:

For a detailed description of the algorithm and settings see documentation of the analyse_single_target() method and references in the class docstring.

Example:

>>> n = 20
>>> alph = 2
>>> s1 = np.random.randint(0, alph, n)
>>> s2 = np.random.randint(0, alph, n)
>>> s3 = np.random.randint(0, alph, n)
>>> target1 = np.logical_xor(s1, s2).astype(int)
>>> target  = np.logical_xor(target1, s3).astype(int)
>>> data = Data(np.vstack((s1, s2, s3, target)), 'ps',
>>>             normalise=False)
>>> settings = {
>>>     'lags_pid': [[1, 1, 1], [3, 2, 7]],
>>>     'verbose': False,
>>>     'pid_estimator': 'SxPID'}
>>> targets = [0, 1]
>>> sources = [[1, 2, 3], [0, 2, 3]]
>>> pid_analysis = MultivariatePID()
>>> results = pid_analysis.analyse_network(settings, data, targets,
>>>                                        sources)
Args:
settings : dict

parameters for estimation and statistical testing, see documentation of analyse_single_target() for details, can contain

  • lags_pid : list of lists of ints [optional] - lags in samples between sources and target (default=[[1, 1, …, 1], [1, 1, …, 1], …])

data : Data instance

raw data for analysis

targets : list of int

indices of target processes

sources : list of lists

indices of the multiple source processes for each target, e.g., [[0, 1, 2], [1, 0, 3]]; all lists must be of the same length, and the list of lists must have the same length as targets

Returns:
ResultsMultivariatePID instance

results of network inference, see documentation of ResultsMultivariatePID()

analyse_single_target(settings, data, target, sources)[source]

Estimate partial information decomposition for a network node.

Estimate partial information decomposition (PID) for multiple source processes (up to 4 sources) and a target process in the network.

Note:

For a description of the algorithm and the method see references in the class and estimator docstrings.

Example:

>>> n = 20
>>> alph = 2
>>> s1 = np.random.randint(0, alph, n)
>>> s2 = np.random.randint(0, alph, n)
>>> s3 = np.random.randint(0, alph, n)
>>> target1 = np.logical_xor(s1, s2).astype(int)
>>> target  = np.logical_xor(target1, s3).astype(int)
>>> data = Data(np.vstack((s1, s2, s3, target)), 'ps',
>>>             normalise=False)
>>> settings = {
>>>     'verbose': False,
>>>     'pid_estimator': 'SxPID',
>>>     'lags_pid': [2, 3, 1]}
>>> pid_analysis = MultivariatePID()
>>> results = pid_analysis.analyse_single_target(settings=settings,
>>>                                              data=data,
>>>                                              target=0,
>>>                                              sources=[1, 2, 3])

Args:
settings : dict

parameters for estimator use and statistics:

  • pid_estimator : str - estimator to be used for PID estimation (for estimator settings see the documentation in the estimators_pid modules)

  • lags_pid : list of ints [optional] - lags in samples between sources and target (default=[1, 1, …, 1])

  • verbose : bool [optional] - toggle console output (default=True)

data : Data instance

raw data for analysis

target : int

index of target process

sources : list of ints

indices of the multiple source processes for the target

Returns:
ResultsMultivariatePID instance

results of network inference, see documentation of ResultsMultivariatePID()

Bivariate Partial Information Decomposition (PID)

class idtxl.bivariate_pid.BivariatePID[source]

Perform partial information decomposition for individual processes.

Perform partial information decomposition (PID) for two source processes and one target process in the network. Estimate unique, shared, and synergistic information in the two sources about the target. Call analyse_network() on the whole network or a set of nodes or call analyse_single_target() to estimate PID for a single process. See docstrings of the two functions for more information.

References:

  • Williams, P. L., & Beer, R. D. (2010). Nonnegative Decomposition of Multivariate Information, 1–14. Retrieved from http://arxiv.org/abs/1004.2515

  • Bertschinger, N., Rauh, J., Olbrich, E., Jost, J., & Ay, N. (2014). Quantifying Unique Information. Entropy, 16(4), 2161–2183. http://doi.org/10.3390/e16042161

Attributes:
target : int

index of target process

sources : array type

pair of indices of source processes

settings : dict

analysis settings

results : dict

estimated PID

analyse_network(settings, data, targets, sources)[source]

Estimate partial information decomposition for network nodes.

Estimate partial information decomposition (PID) for multiple nodes in the network.

Note:

For a detailed description of the algorithm and settings see documentation of the analyse_single_target() method and references in the class docstring.

Example:

>>> n = 20
>>> alph = 2
>>> x = np.random.randint(0, alph, n)
>>> y = np.random.randint(0, alph, n)
>>> z = np.logical_xor(x, y).astype(int)
>>> data = Data(np.vstack((x, y, z)), 'ps', normalise=False)
>>> settings = {
>>>     'lags_pid': [[1, 1], [3, 2], [0, 0]],
>>>     'alpha': 0.1,
>>>     'alph_s1': alph,
>>>     'alph_s2': alph,
>>>     'alph_t': alph,
>>>     'max_unsuc_swaps_row_parm': 60,
>>>     'num_reps': 63,
>>>     'max_iters': 1000,
>>>     'pid_estimator': 'SydneyPID'}
>>> targets = [0, 1, 2]
>>> sources = [[1, 2], [0, 2], [0, 1]]
>>> pid_analysis = BivariatePID()
>>> results = pid_analysis.analyse_network(settings, data, targets,
>>>                                        sources)
Args:
settings : dict

parameters for estimation and statistical testing, see documentation of analyse_single_target() for details, can contain

  • lags_pid : list of lists of ints [optional] - lags in samples between sources and target (default=[[1, 1], [1, 1], …])

data : Data instance

raw data for analysis

targets : list of int

indices of target processes

sources : list of lists

indices of the two source processes for each target, e.g., [[0, 2], [1, 0]]; must have the same length as targets

Returns:
ResultsPID instance

results of network inference, see documentation of ResultsPID()

analyse_single_target(settings, data, target, sources)[source]

Estimate partial information decomposition for a network node.

Estimate partial information decomposition (PID) for a target node in the network.

Note:

For a description of the algorithm and the method see references in the class and estimator docstrings.

Example:

>>> n = 20
>>> alph = 2
>>> x = np.random.randint(0, alph, n)
>>> y = np.random.randint(0, alph, n)
>>> z = np.logical_xor(x, y).astype(int)
>>> data = Data(np.vstack((x, y, z)), 'ps', normalise=False)
>>> settings = {
>>>     'alpha': 0.1,
>>>     'alph_s1': alph,
>>>     'alph_s2': alph,
>>>     'alph_t': alph,
>>>     'max_unsuc_swaps_row_parm': 60,
>>>     'num_reps': 63,
>>>     'max_iters': 1000,
>>>     'pid_estimator': 'SydneyPID',
>>>     'lags_pid': [2, 3]}
>>> pid_analysis = BivariatePID()
>>> results = pid_analysis.analyse_single_target(settings=settings,
>>>                                              data=data,
>>>                                              target=0,
>>>                                              sources=[1, 2])

Args:
settings : dict

parameters for estimator use and statistics:

  • pid_estimator : str - estimator to be used for PID estimation (for estimator settings see the documentation in the estimators_pid modules)

  • lags_pid : list of ints [optional] - lags in samples between sources and target (default=[1, 1])

  • verbose : bool [optional] - toggle console output (default=True)

data : Data instance

raw data for analysis

target : int

index of target process

sources : list of ints

indices of the two source processes for the target

Returns:
ResultsPID instance

results of network inference, see documentation of ResultsPID()

History-dependence estimator for neural spiking data

Embedding optimization

class idtxl.embedding_optimization_ais_Rudelt.OptimizationRudelt(settings=None)[source]

Optimization of embedding parameters of spike times using the history dependence estimators

References:

[1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)

[2]: https://github.com/Priesemann-Group/hdestimator

implemented in idtxl by Michael Lindner, Göttingen 2021

Args:
settings : dict
  • estimation_method : string

    The method to be used to estimate the history dependence: ‘bbc’ or ‘shuffling’.

  • embedding_step_size : float

    Step size delta t (in seconds) with which the window is slid through the data. (default: 0.005)

  • embedding_number_of_bins_set : list of integer values

    Set of values for d, the number of bins in the embedding. (default: [1, 2, 3, 4, 5])

  • embedding_past_range_set : list of floating-point values

    Set of values for T, the past range (in seconds) to be used for embeddings. (default: [0.005, 0.00561, 0.00629, 0.00706, 0.00792, 0.00889, 0.00998, 0.01119, 0.01256, 0.01409, 0.01581, 0.01774, 0.01991, 0.02233, 0.02506, 0.02812, 0.03155, 0.0354, 0.03972, 0.04456, 0.05, 0.0561, 0.06295, 0.07063, 0.07924, 0.08891, 0.09976, 0.11194, 0.12559, 0.14092, 0.15811, 0.17741, 0.19905, 0.22334, 0.25059, 0.28117, 0.31548, 0.35397, 0.39716, 0.44563, 0.5, 0.56101, 0.62946, 0.70627, 0.79245, 0.88914, 0.99763, 1.11936, 1.25594, 1.40919, 1.58114, 1.77407, 1.99054, 2.23342, 2.50594, 2.81171, 3.15479, 3.53973, 3.97164, 4.45625, 5.0])

  • embedding_scaling_exponent_set : dict

    Set of values for kappa, the scaling exponent for the bins in the embedding. Should be a python dictionary with the three entries ‘number_of_scalings’, ‘min_first_bin_size’ and ‘min_step_for_scaling’. (defaults: {‘number_of_scalings’: 10, ‘min_first_bin_size’: 0.005, ‘min_step_for_scaling’: 0.01})

  • bbc_tolerance : float

    The tolerance for the Bayesian Bias Criterion. Influences which embeddings are discarded from the analysis. (default: 0.05)

  • return_averaged_R : bool

    Return R_tot as the average over R(T) for T in [T_D, T_max], instead of R_tot = R(T_D). If set to True, the setting for number_of_bootstraps_R_tot (see below) is ignored and set to 0 and CI bounds are not calculated. (default: True)

  • timescale_minimum_past_range : float

    Minimum past range T_0 (in seconds) to take into consideration for the estimation of the information timescale tau_R. (default: 0.01)

  • number_of_bootstraps_R_max : int

    The number of bootstrap re-shuffles that should be used to determine the optimal embedding. (Bootstrap the estimates of R_max to determine R_tot.) These are computed during the ‘history-dependence’ task because they are essential to obtain R_tot. (default: 250)

  • number_of_bootstraps_R_tot : int

    The number of bootstrap re-shuffles that should be used to estimate the confidence interval of the optimal embedding. (Bootstrap the estimates of R_tot = R(T_D) to obtain a confidence interval for R_tot.). These are computed during the ‘confidence-intervals’ task. The setting return_averaged_R (see above) needs to be set to False for this setting to take effect. (default: 250)

  • number_of_bootstraps_nonessential : int

    The number of bootstrap re-shuffles that should be used to estimate the confidence intervals for embeddings other than the optimal one. (Bootstrap the estimates of R(T) for all other T.) (These are not necessary for the main analysis and therefore default to 0.)

  • symbol_block_length : int

    The number of symbols that should be drawn in each block for bootstrap resampling. If it is set to None (recommended), the length is automatically chosen based on heuristics. (default: None)

  • bootstrap_CI_use_sd : bool

    Most of the time we observed normally-distributed bootstrap replications, so it is sufficient (and more efficient) to compute confidence intervals based on the standard deviation. (default: True)

  • bootstrap_CI_percentile_lo : float

    The lower percentile for the confidence interval. This has no effect if bootstrap_CI_use_sd is set to True. (default: 2.5)

  • bootstrap_CI_percentile_hi : float

    The upper percentile for the confidence interval. This has no effect if bootstrap_CI_use_sd is set to True. (default: 97.5)

  • analyse_auto_MI : bool

    Perform calculation of the auto mutual information of the spike train (default: True). If set to True:

    • auto_MI_bin_size_set : list of floating-point values

      Set of values for the sizes of the bins (in seconds). (default: [0.005, 0.01, 0.025, 0.05, 0.25, 0.5])

    • auto_MI_max_delay : int

      The maximum delay (in seconds) between the past bin and the response. (default: 5)

  • visualization : bool

    Create .eps output images showing the optimization values and graphs for the history dependence and the auto mutual information (default: False). If set to True:

    • output_path : String

      Path where the .eps images should be saved

    • output_prefix : String

      Prefix of the output images e.g. <output_prefix>_process0.eps

  • debug : bool

    show values while calculating (default: False)
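
Example:

A usage sketch: the settings shown are an illustrative subset of the options above, and it is assumed that spike-time data has already been loaded into a Data_spiketime instance named data:

>>> from idtxl.embedding_optimization_ais_Rudelt import OptimizationRudelt
>>> settings = {
>>>     'estimation_method': 'shuffling',
>>>     'embedding_past_range_set': [0.005, 0.05, 0.5, 5.0],
>>>     'embedding_number_of_bins_set': [1, 3, 5],
>>>     'number_of_bootstraps_R_max': 10,
>>>     'visualization': False}
>>> optimization = OptimizationRudelt(settings)
>>> results = optimization.optimize(data, processes=[0])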

analyse_auto_MI(spike_times)[source]

Get the auto MI for the spike times. If it is available from file, load it, else compute it.

check_inputs()[source]

Check input settings for completeness

compute_CIs(data, target_R='R_max', symbol_block_length=None)[source]

Compute bootstrap replications of the history dependence estimate which can be used to obtain confidence intervals.

Args:
data : Data_spiketime object

Input data

target_R : String

One of ‘R_max’, ‘R_tot’ or ‘nonessential’. If set to R_max, replications of R are produced for the T at which R is maximised. If set to R_tot, replications of R are produced for T = T_D (cf get_temporal_depth_T_D). If set to nonessential, replications of R are produced for each T (one embedding per T, cf get_embeddings_that_maximise_R). These are not otherwise used in the analysis and are probably only useful if the resulting plot is visually inspected, so in most cases it can be set to zero.

symbol_block_length : int

The number of symbols that should be drawn in each block for bootstrap resampling. If it is set to None (recommended), the length is automatically chosen based on heuristics.

get_auto_MI(spike_times, bin_size, number_of_delays)[source]

Compute the auto mutual information in the neuron’s activity, a measure closely related to history dependence.

get_bootstrap_history_dependence(data, embedding, number_of_bootstraps, symbol_block_length=None)[source]

For a given embedding, return bootstrap replications for R.

get_embeddings(embedding_past_range_set, embedding_number_of_bins_set, embedding_scaling_exponent_set)[source]

Get all combinations of parameters T, d, k, based on the sets of selected parameters.

get_embeddings_that_maximise_R(bbc_tolerance=None, dependent_var='T', get_as_list=False)[source]

For each T (or d), get the embedding for which R is maximised.

For the bbc estimator, the bbc_tolerance is applied here, i.e., get the unbiased embeddings that maximise R.

get_history_dependence(data, process)[source]

Estimate the history dependence for each embedding for the given process.

get_information_timescale_tau_R()[source]

Get the information timescale tau_R, a characteristic timescale of history dependence similar to an autocorrelation time.

get_past_range(number_of_bins_d, first_bin_size, scaling_k)[source]

Get the past range T of the embedding, based on the parameters d, tau_1 and k.
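
Under the exponential bin-scaling scheme used by these estimators, bin j (for j = 1…d) is assumed to have width tau_1 * 10^((j-1)*k), so the past range is the sum of the bin widths. A minimal sketch under that assumption:

import numpy as np

def past_range(number_of_bins_d, first_bin_size, scaling_k):
    # bin widths tau_j = tau_1 * 10**((j - 1) * k), j = 1..d
    widths = first_bin_size * 10.0 ** (np.arange(number_of_bins_d) * scaling_k)
    return np.sum(widths)

# For a uniform embedding (k = 0) this reduces to T = d * tau_1:
assert np.isclose(past_range(5, 0.005, 0.0), 0.025)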

get_set_of_scalings(past_range_T, number_of_bins_d, number_of_scalings, min_first_bin_size, min_step_for_scaling)[source]

Get scaling exponents such that both the uniform embedding and the embedding for which the first bin has a length of min_first_bin_size (in seconds) are included, together with linearly spaced scaling factors in between, such that in total number_of_scalings scalings are obtained.

get_temporal_depth_T_D(get_R_thresh=False)[source]

Get the temporal depth T_D, the past range for the ‘optimal’ embedding parameters.

Given the maximal history dependence R at each past range T (cf get_embeddings_that_maximise_R), first find the smallest T at which R is maximised (cf get_max_R_T). If bootstrap replications for this R are available, get the smallest T at which this R minus one standard deviation of the bootstrap estimates is attained.
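
The first selection step reduces to taking the smallest past range at which the maximum of R is attained; a conceptual sketch of that step (omitting the bootstrap-based refinement), assuming past ranges are sorted in ascending order:

import numpy as np

def temporal_depth(past_ranges, max_R_per_range):
    # np.argmax returns the first index at which the maximum occurs,
    # i.e., the smallest T at which R is maximised
    return past_ranges[np.argmax(max_R_per_range)]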

optimize(data, processes='all')[source]

Optimize the embedding parameters of spike time data using the Rudelt history dependence estimator.

References:

[1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)

[2]: https://github.com/Priesemann-Group/hdestimator

implemented in idtxl by Michael Lindner, Göttingen 2021

Args:
data : Data_spiketime instance

raw data for analysis

processes : list of int | ‘all’

indices of processes (default=’all’); embedding parameters are optimized separately for each process specified in the list

Returns:
ResultsSingleProcessRudelt instance

results of Rudelt optimization, see documentation of ResultsSingleProcessRudelt()

if visualization in settings was set True (see class OptimizationRudelt):
.eps images are created for each optimized process, containing:
  • optimized values for the process

  • graph for the history dependence

  • graph for auto mutual information (if calculated)

optimize_single_run(data, process)[source]

Optimize a single realisation of spike time data for the given process.

Args:
data : Data_spiketime instance

raw data for analysis

process : int

index of process

Returns:
DotDict

with the following keys

Process : int

Process that was optimized

estimation_method : String

Estimation method that was used for optimization

T_D : float

Estimated optimal value for the temporal depth T_D

tau_R : float

Information timescale tau_R, a characteristic timescale of history dependence similar to an autocorrelation time

R_tot : float

Estimated value for the total history dependence R_tot

AIS_tot : float

Estimated value for the total active information storage

opt_number_of_bins_d : int

Number of bins d for the embedding that yields (R_tot, T_D)

opt_scaling_k : int

Scaling exponent κ for the embedding that yields (R_tot, T_D)

opt_first_bin_size : int

Size of the first bin τ1 for the embedding that yields (R_tot, T_D)

history_dependence : array with floating-point values

Estimated history dependence for each embedding

firing_rate : float

Firing rate of the neuron/spike train

recording_length : float

Length of the recording (in seconds)

H_spiking : float

Entropy of the spike times

if analyse_auto_MI was set to True additionally:
auto_MI : dict

numpy array of MI values for each delay

auto_MI_delays : list of int

list of delays depending on the given auto_MI_bin_sizes and auto_MI_max_delay

remove_subresults_single_process()[source]

Delete the single-process sub-results from the instance.

Estimators

Provide HDE estimators.

class idtxl.estimators_Rudelt.RudeltAbstractEstimator(settings=None)[source]

Abstract class for implementation of nsb and plugin estimators from Rudelt.

Abstract class for implementation of nsb and plugin estimators; child classes implement estimators for mutual information (MI).

References:

[1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)

[2]: https://github.com/Priesemann-Group/hdestimator

implemented in idtxl by Michael Lindner, Göttingen 2021

Args:
settings : dict
  • embedding_step_size : float [optional]

    Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).

  • normalise : bool [optional]

    rebase spike times to zero (default=True)

  • return_averaged_R : bool [optional]

    If set to True, compute R_tot as the average over R(T) for T in [T_D, T_max] instead of R_tot = R(T_D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)

get_median_number_of_spikes_per_bin(raw_symbols)[source]

Given raw symbols (in which the number of spikes per bin is counted, i.e., not necessarily a binary quantity), get the median number of spikes for each bin, among all symbols obtained by the embedding.

get_multiplicities(symbol_counts, alphabet_size)[source]

Get the multiplicities of some given symbol counts.

To estimate the entropy of a system, it is only important how often a symbol/event occurs (the probability with which it occurs), not what it represents. Therefore, computations can be simplified by summarizing symbols by their frequency, as represented by the multiplicities.
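
For illustration (hypothetical values, not the module's internal data structures): if two symbols each occur three times and one symbol occurs once, the multiplicities are {3: 2, 1: 1}:

from collections import Counter

symbol_counts = {7: 3, 2: 3, 5: 1}                # symbol -> occurrence count
multiplicities = Counter(symbol_counts.values())  # occurrence count -> number of symbols
# multiplicities == Counter({3: 2, 1: 1})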

get_past_range(number_of_bins_d, first_bin_size, scaling_k)[source]

Get the past range T of the embedding, based on the parameters d, tau_1 and k.

get_raw_symbols(spike_times, embedding, first_bin_size)[source]

Get the raw symbols (in which the number of spikes per bin is counted, i.e., not necessarily a binary quantity), as obtained by applying the embedding.

get_symbol_counts(symbol_array)[source]

Count how often symbols occur

get_window_delimiters(number_of_bins_d, scaling_k, first_bin_size)[source]

Get delimiters of the window, used to describe the embedding. The window includes both the past embedding and the response.

The delimiters are times, relative to the first bin, that separate two consecutive bins.

is_analytic_null_estimator()[source]

Indicate if estimator supports analytic surrogates.

Return true if the estimator implements estimate_surrogates_analytic() where data is formatted as per the estimate method for this estimator.

Returns:

bool

is_parallel()[source]

Indicate if estimator supports parallel estimation over chunks.

Return true if the estimator supports parallel estimation over chunks, where a chunk is one independent data set.

Returns:

bool

symbol_array_to_binary(spikes_in_window, number_of_bins_d)[source]

Given an array of 1s and 0s, representing spikes and the absence thereof, read the array as a binary number to obtain a (base 10) integer.
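
A one-line sketch of that conversion; the bit ordering shown (first array entry as the most significant bit) is an assumption:

spikes_in_window = [1, 0, 1]  # 1 = spike in bin, 0 = no spike
symbol = int(''.join(str(b) for b in spikes_in_window), 2)  # -> 5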

symbol_binary_to_array(symbol_binary, number_of_bins_d)[source]

Given a binary representation of a symbol (cf symbol_array_to_binary), convert it back into its array-representation.

class idtxl.estimators_Rudelt.RudeltAbstractNSBEstimator(settings=None)[source]

Abstract class for implementation of NSB estimators from Rudelt.

Abstract class for implementation of Nemenman-Shafee-Bialek (NSB) estimators, child classes implement nsb estimators for mutual information (MI).

implemented in idtxl by Michael Lindner, Göttingen 2021

References:

[1]: L. Rudelt, D. G. Marx, M. Wibral, V. Priesemann: Embedding optimization reveals long-lasting history dependence in neural spiking activity, 2021, PLOS Computational Biology, 17(6)

[2]: I. Nemenman, F. Shafee, W. Bialek: Entropy and inference, revisited. In T.G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, Cambridge, MA, 2002. MIT Press.

Args:
settings : dict
  • embedding_step_size : float [optional]

    Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).

  • normalise : bool [optional]

    rebase spike times to zero (default=True)

  • return_averaged_R : bool [optional]

    If set to True, compute R_tot as the average over R(T) for T in [T_D, T_max] instead of R_tot = R(T_D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)

H1(beta, mk, K, N)[source]

Compute the first moment (expectation value) of the entropy H.

H is the entropy one obtains with a symmetric Dirichlet prior with concentration parameter beta and a multinomial likelihood.

alpha_ML(mk, K1, N)[source]

Compute first guess for the beta_MAP (cf get_beta_MAP) parameter via the posterior of a Dirichlet process.

d2_log_rho(beta, mk, K, N)[source]

Second derivative of the logarithm of the Dirichlet multinomial likelihood.

d2_log_rho_xi(beta, mk, K, N)[source]

Second derivative of the logarithm of the nsb (unnormalized) posterior.

d2_xi(beta, K)[source]

Second derivative of xi(beta) (cf d_xi).

d3_xi(beta, K)[source]

Third derivative of xi(beta) (cf d_xi).

d_log_rho(beta, mk, K, N)[source]

First derivative of the logarithm of the Dirichlet multinomial likelihood.

d_log_rho_xi(beta, mk, K, N)[source]

First derivative of the logarithm of the nsb (unnormalized) posterior.

d_xi(beta, K)[source]

First derivative of xi(beta).

xi(beta) is the expected entropy of the system when no data has been observed. d_xi is the prior for the nsb estimator.
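
In the NSB literature, the prior expected entropy under a symmetric Dirichlet prior with concentration beta over K symbols is xi(beta) = psi(K*beta + 1) - psi(beta + 1), with psi the digamma function. A sketch of xi and d_xi using SciPy, offered as an illustration of the standard formulas rather than this module's exact implementation:

from scipy.special import polygamma

def xi(beta, K):
    # prior expectation of the entropy; polygamma(0, x) is the digamma function
    return polygamma(0, K * beta + 1) - polygamma(0, beta + 1)

def d_xi(beta, K):
    # first derivative of xi; polygamma(1, x) is the trigamma function
    return K * polygamma(1, K * beta + 1) - polygamma(1, beta + 1)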

get_beta_MAP(mk, K, N)[source]

Get the maximum a posteriori (MAP) value for beta.

Provides the location of the peak, around which we integrate.

beta_MAP is the value for beta for which the posterior of the NSB estimator is maximised (or, equivalently, of the logarithm thereof, as computed here).

get_integration_bounds(mk, K, N)[source]

Find the integration bounds for the estimator.

Typically it is a delta-like distribution, so it is sufficient to integrate around this peak. (If not, this function is not called.)

log_likelihood_DP_alpha(a, K1, N)[source]

Alpha-dependent terms of the log-likelihood of a Dirichlet Process.

nsb_entropy(mk, K, N)[source]

Estimate the entropy of a system using the NSB estimator.

Parameters
  • mk – multiplicities

  • K – number of possible symbols/ state space of the system

  • N – total number of observed symbols

rho(beta, mk, K, N)[source]

rho(beta, data) is the Dirichlet multinomial likelihood.

rho(beta, data), together with d_xi(beta), makes up the posterior for the nsb estimator.

unnormalized_posterior(beta, mk, K, N)[source]

The (unnormalized) posterior in the nsb estimator.

Product of the likelihood rho and the prior d_xi; the normalizing factor is given by the marginal likelihood

class idtxl.estimators_Rudelt.RudeltBBCEstimator(settings=None)[source]

Bayesian bias criterion (BBC) Estimator using NSB and Plugin estimator

Calculate the mutual information (MI) of one variable depending on its past using the nsb and plugin estimators and check if the bias criterion is passed. See parent class for references.

implemented in idtxl by Michael Lindner, Göttingen 2021

Args:
settings : dict
  • embedding_step_size : float [optional]

    Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).

  • normalise : bool [optional]

    rebase spike times to zero (default=True)

  • return_averaged_R : bool [optional]

    If set to True, compute R_tot as the average over R(T) for T in [T_D, T_max] instead of R_tot = R(T_D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)

bayesian_bias_criterion(R_nsb, R_plugin, bbc_tolerance)[source]

Get whether the Bayesian bias criterion (bbc) is passed.

Parameters
  • R_nsb – history dependence computed with NSB estimator

  • R_plugin – history dependence computed with plugin estimator

  • bbc_tolerance – tolerance for the Bayesian bias criterion

estimate(symbol_array, past_symbol_array, current_symbol_array, bbc_tolerance=None)[source]

Calculate the mutual information (MI) of one variable depending on its past using the nsb and plugin estimators and check if the bias criterion is passed.

Args:
symbol_array : 1D numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

past_symbol_array : numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

current_symbol_array : numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

Returns:
I (float)

MI (AIS)

R (float)

MI / H_uncond (History dependence)

bbc_term (float)

bbc tolerance-independent term of the Bayesian bias criterion (bbc)

get_bbc_term(R_nsb, R_plugin)[source]

Get the bbc tolerance-independent term of the Bayesian bias criterion (bbc).

Parameters
  • R_nsb – history dependence computed with NSB estimator

  • R_plugin – history dependence computed with plugin estimator

class idtxl.estimators_Rudelt.RudeltNSBEstimatorSymbolsMI(settings=None)[source]

History dependence NSB estimator

Calculate the mutual information (MI) of one variable depending on its past using the NSB estimator. See parent class for references.

implemented in idtxl by Michael Lindner, Göttingen 2021

Args:
settings : dict
  • embedding_step_size : float [optional]

    Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).

  • normalise : bool [optional]

    rebase spike times to zero (default=True)

  • return_averaged_R : bool [optional]

    If set to True, compute R_tot as the average over R(T) for T in [T_D, T_max] instead of R_tot = R(T_D). If set to True, the setting for number_of_bootstraps_R_tot is ignored and set to 0 (default=True)

estimate(symbol_array, past_symbol_array, current_symbol_array)[source]

Estimate mutual information using the NSB estimator.

Args:
symbol_array : 1D numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

past_symbol_array : numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

current_symbol_array : numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

Returns:
I (float)

MI (AIS)

R (float)

MI / H_uncond (History dependence)

nsb_estimator(symbol_counts, past_symbol_counts, alphabet_size, alphabet_size_past, H_uncond)[source]

Estimate the entropy of a system using the NSB estimator.

class idtxl.estimators_Rudelt.RudeltPluginEstimatorSymbolsMI(settings=None)[source]

Plugin History dependence estimator

Calculate the mutual information (MI) of one variable depending on its past using the plugin estimator. See parent class for references.

implemented in idtxl by Michael Lindner, Göttingen 2021

Args:
settings : dict
  • embedding_step_size : float [optional] - Step size delta t (in seconds) with which the window is slid through the data (default = 0.005).

  • normalise : bool [optional] - rebase spike times to zero (default=True)

  • return_averaged_R : bool [optional] - if set to True, compute R_tot as the average over R(T) for T in [T_D, T_max] instead of R_tot = R(T_D) (default=True)

estimate(symbol_array, past_symbol_array, current_symbol_array)[source]

Estimate mutual information using the plugin estimator.

Args:
symbol_array : 1D numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

past_symbol_array : numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

current_symbol_array : numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

Returns:
I (float)

MI (AIS)

R (float)

MI / H_uncond (History dependence)

plugin_entropy(mk, N)[source]

Estimate the entropy of a system using the Plugin estimator.

(In principle this is the same function as utl.get_shannon_entropy, only here it is a function of the multiplicities, not the probabilities.)

Parameters
  • mk – multiplicities

  • N – total number of observed symbols
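
A sketch of the plug-in estimate computed from multiplicities; the log base (bits) is an assumption:

import numpy as np

def plugin_entropy(mk, N):
    # mk: dict mapping occurrence count n -> number of symbols observed n times
    n = np.array(list(mk.keys()), dtype=float)
    m = np.array(list(mk.values()), dtype=float)
    p = n / N                        # plug-in probability of each such symbol
    return -np.sum(m * p * np.log2(p))

# Example: two symbols seen 3 times each, one symbol seen once (N = 7)
print(plugin_entropy({3: 2, 1: 1}, 7))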

plugin_estimator(symbol_counts, past_symbol_counts, alphabet_size, alphabet_size_past, H_uncond)[source]

Estimate the entropy of a system using the plugin estimator.

class idtxl.estimators_Rudelt.RudeltShufflingEstimator(settings=None)[source]

Estimate the history dependence in a spike train using the shuffling estimator.

See parent class for references.

implemented in idtxl by Michael Lindner, Göttingen 2021

estimate(symbol_array)[source]

Estimate the history dependence in a spike train using the shuffling estimator.

Args:
symbol_array : 1D numpy array

realisations of symbols based on current and past states. (first output of get_realisations_symbol from data_spiketimes object)

Returns:
I (float)

MI (AIS)

R (float)

MI / H_uncond (History dependence)

get_H0_X_past_cond_X(marginal_probabilities, number_of_bins_d, P_X_uncond)[source]

Compute H_0(X_past | X), the estimate of the entropy for the past symbols given a response, under the assumption that activity in the past contributes independently towards the response.

get_H0_X_past_cond_X_eq_x(marginal_probabilities, number_of_bins_d)[source]

Compute H_0(X_past | X = x), cf get_H0_X_past_cond_X.

get_H_X_past_cond_X(P_X_uncond, P_X_past_cond_X)[source]

Compute H(X_past | X), the plug-in estimate of the conditional entropy for the past symbols, conditioned on the response X, given their probabilities.

get_H_X_past_uncond(P_X_past_uncond)[source]

Compute H(X_past), the plug-in estimate of the entropy for the past symbols, given their probabilities.

get_P_X_past_cond_X(past_symbol_counts, number_of_symbols)[source]

Compute P(X_past | X), the probability of the past activity conditioned on the response X using the plug-in estimator.

get_P_X_past_uncond(past_symbol_counts, number_of_symbols)[source]

Compute P(X_past), the probability of the past activity using the plug-in estimator.

get_P_X_uncond(number_of_symbols)[source]

Compute P(X), the probability of the current activity using the plug-in estimator.

get_marginal_frequencies_of_spikes_in_bins(symbol_counts, number_of_bins_d)[source]

Compute for each past bin 1…d the sum of spikes found in that bin across all observed symbols.

get_shuffled_symbol_counts(symbol_counts, past_symbol_counts, number_of_bins_d, number_of_symbols)[source]

Simulate new data by, for each past bin 1…d, permuting the activity across all observed past_symbols (for a given response X). The marginal probability of observing a spike given the response is thus preserved for each past bin.

shuffling_MI(symbol_counts, number_of_bins_d)[source]

Estimate the mutual information between current and past activity in a spike train using the shuffling estimator.

To obtain the shuffling estimate, compute the plug-in estimate and a correction term to reduce its bias.

For the plug-in estimate:

  • Extract the past_symbol_counts from the symbol_counts.

  • I_plugin = H(X_past) - H(X_past | X)

Notation:

  • X: current activity, aka response

  • X_past: past activity

  • P_X_uncond: P(X)

  • P_X_past_uncond: P(X_past)

  • P_X_past_cond_X: P(X_past | X)

  • H_X_past_uncond: H(X_past)

  • H_X_past_cond_X: H(X_past | X)

  • I_plugin: plugin estimate of I(X_past; X)

For the correction term:

  • Simulate additional data under the assumption that activity in the past contributes independently towards the current activity.

  • Compute the entropy under the assumptions of the model, which, due to its simplicity, is easy to sample and yields an unbiased estimate.

  • Compute the entropy using the plug-in estimate, whose bias is similar to that of the plug-in estimate on the original data.

  • Compute the correction term as the difference between the unbiased and biased terms.

Notation:

  • P0_sh_X_past_cond_X: P_0,sh(X_past | X), equiv. to P(X_past | X) on the shuffled data

  • H0_X_past_cond_X: H_0(X_past | X), based on the model of independent contributions

  • H0_sh_X_past_cond_X: H_0,sh(X_past | X), based on P0_sh_X_past_cond_X, i.e., the plug-in estimate

  • I_corr: the correction term to reduce the bias of I_plugin

Args:
symbol_counts : iterable

the activity of a spike train is embedded into symbols, whose occurrences are counted (cf emb.get_symbol_counts)

number_of_bins_d : int

the number of bins of the embedding
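
To make the plug-in step above concrete, a minimal sketch of I_plugin = H(X_past) - H(X_past | X) computed directly from counts; the data layout is hypothetical, not the module's internal one:

from collections import Counter
import numpy as np

def entropy(probabilities):
    p = np.array([q for q in probabilities if q > 0])
    return -np.sum(p * np.log2(p))

# past_counts_by_response[x] counts past symbols observed with response x
past_counts_by_response = {0: Counter({'00': 5, '01': 3}),
                           1: Counter({'01': 4, '11': 4})}
N = sum(sum(c.values()) for c in past_counts_by_response.values())

# H(X_past): pool past-symbol counts over all responses
pooled = Counter()
for c in past_counts_by_response.values():
    pooled.update(c)
H_X_past_uncond = entropy([n / N for n in pooled.values()])

# H(X_past | X): conditional entropies weighted by P(X = x)
H_X_past_cond_X = sum(
    (sum(c.values()) / N) * entropy([n / sum(c.values()) for n in c.values()])
    for c in past_counts_by_response.values())

I_plugin = H_X_past_uncond - H_X_past_cond_X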