PowerLaws.jl

Documentation for PowerLaws.jl

PowerLaws.DistributionComparison Type
DistributionComparison

Fields

  • data: array of data on which the distributions are compared
  • loglikelihoodratio: log-likelihood ratio of the data
  • sig_level: significance level
  • xmin: smallest element used when comparing the distributions
  • Vteststat: Vuong test statistic
  • Vpval: p-value from the Vuong test
  • Vpreffdistr: preferred distribution according to the Vuong test
  • C_b: Clarke test b value, the sum of positive values in the log-likelihood ratio
  • Cpval: p-value from the Clarke test
  • Cpreffdistr: preferred distribution according to the Clarke test

PowerLaws.DistributionComparison Method
DistributionComparison

Computes the Vuong test and the Clarke test for non-nested distributions. Such a comparison is necessary because a power-law distribution can be fitted to any data set, so a successful fit alone does not show that the power law is the better model. The implementation follows the paper "Non nested model selection for spatial count regression models with application to health insurance".

Arguments

  • d1: First distribution to be compared.
  • d2: Second distribution to be compared.
  • data: Data to be compared.
  • sig_level: Significance level. Default is 0.05.

Returns

  • DistributionComparison: Struct containing all necessary information about comparison.
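
The two statistics can be illustrated with a minimal sketch. It is not the package's implementation: the helper name `vuong_clarke_sketch` and its arguments are hypothetical, the Vuong statistic uses the textbook form sqrt(n) * mean(llr) / std(llr), and the Clarke b value is taken as the number of positive pointwise log-likelihood ratios. The p-values are omitted because they need a normal and a binomial CDF.

```julia
using Statistics

# Hypothetical helper, not part of PowerLaws.jl.
# logpdf1 and logpdf2 are functions returning the log-density of each model at x.
function vuong_clarke_sketch(logpdf1, logpdf2, data)
    # Pointwise log-likelihood ratios between the two models
    llr = [logpdf1(x) - logpdf2(x) for x in data]
    n = length(llr)
    # Vuong statistic: scaled mean of the ratios over their (uncorrected) standard deviation
    vuong = sqrt(n) * mean(llr) / std(llr; corrected=false)
    # Clarke b: how many observations favour the first model
    clarke_b = count(>(0), llr)
    return (vuong=vuong, clarke_b=clarke_b)
end

# Example: exponential densities with rate 1 and rate 2
res = vuong_clarke_sketch(x -> -x, x -> log(2) - 2x, [0.5, 1.0, 2.0, 4.0])
```

A large positive Vuong statistic, or a b value well above half the sample size, favours the first distribution; values of the opposite kind favour the second.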

PowerLaws.array_bins Method
array_bins(arr::AbstractArray{T})::Dict{T,Int64} where {T <: Real}

Create a dictionary from a sorted array arr where the keys are the unique elements and the values are the indices at which these elements first appear in the array.
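
The behaviour described above can be sketched as follows. The function name `first_index_bins` is hypothetical and the body is only one plausible way to obtain such a dictionary, not necessarily how the package does it.

```julia
# Hypothetical re-implementation for illustration only.
function first_index_bins(arr::AbstractArray{T}) where {T<:Real}
    bins = Dict{T,Int}()
    for (i, x) in enumerate(arr)
        # get! stores i only if x has not been seen before,
        # so each key keeps the index of its first occurrence
        get!(bins, x, i)
    end
    return bins
end

bins = first_index_bins([1, 1, 2, 3, 3, 3])
```

For the sorted input `[1, 1, 2, 3, 3, 3]` the keys are 1, 2 and 3, mapped to the indices 1, 3 and 4.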

PowerLaws.bootstrap Method

bootstrap

Bootstrap method for estimating the parameters of a power-law distribution. It can be used to quantify the uncertainty in the estimate of xmin. More information can be found in the paper "Power-law distributions in empirical data".

Arguments

  • data::AbstractArray: Data to be tested.
  • d::UnivariateDistribution: Distribution to be tested (ContinuousPowerLaw or DiscretePowerLaw).
  • no_of_sims::Int64: Number of simulations. Default is 10.
  • xmins::AbstractArray: Array of xmins to be tested, default is data.
  • xmax::Int64: Maximum value of data to be considered. Default is 1e5.
  • seed::Int64: Seed for random number generator. Default is 0.

Returns

  • statistic::Array{Tuple{UnivariateDistribution, Float64}}: Array of tuples containing the distribution and the Kolmogorov-Smirnov distance between the data and the distribution.
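
The resampling idea behind a bootstrap can be sketched in a few lines. This is a simplified illustration, not the package's algorithm: the names `alpha_mle_sketch` and `bootstrap_alpha_sketch` are hypothetical, xmin is held fixed instead of being re-estimated for every resample, and only the continuous maximum-likelihood estimate α = 1 + n / Σ log(xᵢ / xmin) is used.

```julia
using Random

# Continuous power-law MLE for the exponent, given a fixed xmin (hypothetical helper)
alpha_mle_sketch(x, xmin) = 1 + length(x) / sum(log.(x ./ xmin))

# Hypothetical helper: resample the tail with replacement and refit each resample
function bootstrap_alpha_sketch(data, xmin; no_of_sims=10, seed=0)
    rng = MersenneTwister(seed)          # seeded generator for reproducibility
    tail = filter(>=(xmin), data)        # only values at or above xmin enter the fit
    return [alpha_mle_sketch(rand(rng, tail, length(tail)), xmin) for _ in 1:no_of_sims]
end

sample = [1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 13.0]
estimates = bootstrap_alpha_sketch(sample, 1.0; no_of_sims=20, seed=1)
```

The spread of `estimates` gives a rough picture of how uncertain the fitted exponent is for this sample.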

PowerLaws.bootstrap_p Method

bootstrap_p

Performs a bootstrap hypothesis test to determine whether a power-law distribution is a plausible model for the data. Inspired by the documentation of the R package poweRlaw.

Arguments

  • data::AbstractArray: Data to be tested.
  • d::UnivariateDistribution: Distribution to be tested (ContinuousPowerLaw or DiscretePowerLaw).
  • no_of_sims::Int64: Number of simulations. Default is 10.
  • xmins::AbstractArray: Array of xmins to be tested, default is data.
  • xmax::Int64: Maximum value of data to be considered. Default is 1e5.
  • seed::Int64: Seed for random number generator. Default is 0.

Returns

  • statistic::Array{Tuple{UnivariateDistribution, Float64}}: Array of tuples containing the distribution and the Kolmogorov-Smirnov distance between the data and the distribution.
  • P::Float64: p-value of the hypothesis test.
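
How a p-value can be derived from the simulated distances is sketched below, assuming the usual convention that the p-value is the fraction of simulated Kolmogorov-Smirnov distances at least as large as the observed one. The function name `bootstrap_pvalue_sketch` is hypothetical, and whether the package uses exactly this rule is an assumption.

```julia
# Hypothetical helper: share of simulated KS distances that are >= the observed one
function bootstrap_pvalue_sketch(ks_observed, ks_simulated)
    return count(>=(ks_observed), ks_simulated) / length(ks_simulated)
end

p = bootstrap_pvalue_sketch(0.1, [0.05, 0.1, 0.2, 0.3])
```

Under this convention a small p-value means the observed data fit the power law worse than most synthetic power-law samples do, which speaks against the power-law hypothesis.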

PowerLaws.estimate_parameters Method

estimate_parameters

Estimate x_min and α for a given data set, using the Kolmogorov-Smirnov distance between the data and the fitted distribution as the goodness-of-fit criterion.

Parameters

  • data::AbstractArray: Array of data which should be fit to a distribution.
  • distribution::Type: Distribution type, i.e. ContinuousPowerLaw or DiscretePowerLaw.
  • xmins::AbstractArray: If not specified, all unique values in data are taken as possible x_mins. If specified, only values in xmins are considered when finding the best x_min.
  • xmax::Int64: Maximum value considered in calculations. Values above xmax are ignored, for example when calculating the Kolmogorov-Smirnov distance.

Returns

  • best_fit::distribution: A distribution of the type distribution fitted to the given parameters.
  • KS::Float64: The Kolmogorov-Smirnov distance between the data and the fitted distribution.
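
A common way to perform such an estimate is to scan the candidate x_min values, fit α on each tail, and keep the candidate with the smallest Kolmogorov-Smirnov distance. The sketch below follows that recipe for the continuous case only. The names `tail_ks_sketch` and `scan_xmin_sketch` are hypothetical, xmax is ignored, and the code is an assumption about the general approach rather than a copy of the package's logic.

```julia
# Hypothetical helper: KS distance of a sorted tail against the continuous power-law CDF
function tail_ks_sketch(tail, xmin, alpha)
    n = length(tail)
    model = 1 .- (tail ./ xmin) .^ (1 - alpha)
    # Compare the model CDF with the empirical CDF just after and just before each point
    return max(maximum(abs.((1:n) ./ n .- model)),
               maximum(abs.(model .- (0:n-1) ./ n)))
end

# Hypothetical helper: try every candidate xmin and keep the best fit
function scan_xmin_sketch(data; xmins=unique(data))
    sorted = sort(data)
    best = (xmin=NaN, alpha=NaN, ks=Inf)
    for xmin in xmins
        tail = filter(>=(xmin), sorted)
        s = sum(log.(tail ./ xmin))
        s > 0 || continue                # skip tails where the MLE is undefined
        alpha = 1 + length(tail) / s     # continuous maximum-likelihood exponent
        ks = tail_ks_sketch(tail, xmin, alpha)
        if ks < best.ks
            best = (xmin=float(xmin), alpha=alpha, ks=ks)
        end
    end
    return best
end

observations = [1.0, 1.2, 1.5, 2.0, 2.7, 3.5, 5.0, 9.0, 20.0]
fit = scan_xmin_sketch(observations)
```

Passing a shorter `xmins` array restricts the scan to those candidates, which mirrors the role of the xmins argument described above.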

PowerLaws.kolmogorov_smirnov_test Function

kolmogorov_smirnov_test

Compute the Kolmogorov-Smirnov distance between the given data and distribution.

Arguments

  • dat::AbstractArray: Data to be tested.
  • d::UnivariateDistribution: Distribution to be tested (ContinuousPowerLaw or DiscretePowerLaw).
  • xmin::Number: Minimum value of data to be considered.
  • xmax::Int64: Maximum value of data to be considered. Default is 1e5.

Returns

  • KS::Float64: Kolmogorov-Smirnov distance between the data and the distribution. It is the maximum absolute difference between the empirical and theoretical cumulative distribution functions (CDFs).
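
The distance defined above can be computed with a short, self-contained sketch. The name `ks_distance_sketch` is hypothetical and the function takes a plain CDF function instead of a distribution object, so it illustrates the definition rather than the package's own signature. The xmin and xmax cut-offs are left out.

```julia
# Hypothetical helper: maximum gap between the empirical CDF of data and a model CDF
function ks_distance_sketch(data, cdf_model)
    xs = sort(data)
    n = length(xs)
    model = cdf_model.(xs)
    upper = (1:n) ./ n        # empirical CDF just after each data point
    lower = (0:n-1) ./ n      # empirical CDF just before each data point
    return max(maximum(abs.(upper .- model)), maximum(abs.(model .- lower)))
end

# Continuous power-law CDF with assumed parameters xmin = 1.0 and alpha = 2.5
xmin, alpha = 1.0, 2.5
powerlaw_cdf(x) = 1 - (x / xmin)^(1 - alpha)
ks = ks_distance_sketch([1.0, 1.3, 2.0, 4.0, 10.0], powerlaw_cdf)
```

Checking both sides of each step of the empirical CDF is what makes the result the true supremum of the difference rather than only its value at the data points.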