PowerLaws.jl

Documentation for PowerLaws.jl

PowerLaws.DistributionComparison — Type

DistributionComparison

Fields

data: array of data against which the distributions are compared
loglikelihoodratio: log likelihood ratio of data
sig_level: significance level
xmin: smallest data value used when comparing the distributions
Vteststat: Vuong test statistic
Vpval: p-value from Vuong test
Vpreffdistr: preferred distribution according to Vuong test
C_b: Clarke test b value, the number of positive values in the pointwise log-likelihood ratio
Cpval: p-value from Clarke test
Cpreffdistr: preferred distribution according to Clarke test

source

PowerLaws.DistributionComparison — Method

DistributionComparison

This function computes the Vuong test and the Clarke test for non-nested distributions. Such a comparison is necessary because a power law distribution can be fitted to any data set, whether or not it describes the data well. The implementation follows the paper "Non nested model selection for spatial count regression models with application to health insurance".

Arguments

d1: First distribution to be compared.
d2: Second distribution to be compared.
data: Data to be compared.
sig_level: Significance level. Default is 0.05.

Returns

DistributionComparison: Struct containing all necessary information about comparison.
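
As an illustration of the two statistics described above, here is a minimal sketch that is independent of the package internals. The function name `vuong_clarke` and its callable arguments `logpdf1` and `logpdf2` are hypothetical and not part of PowerLaws.jl. The Vuong statistic is the standardized mean of the pointwise log-likelihood ratios (approximately standard normal under the null hypothesis), and Clarke's b counts the points that favor the first model (Binomial(n, 0.5) under the null hypothesis). P-values are omitted to keep the sketch dependency-free.

```julia
using Statistics

# Hypothetical helper, not the package API.
# `logpdf1` and `logpdf2` return the log density of each candidate model at x.
function vuong_clarke(logpdf1, logpdf2, data)
    l = [logpdf1(x) - logpdf2(x) for x in data]      # pointwise log-likelihood ratios
    n = length(l)
    vuong = sqrt(n) * mean(l) / std(l; corrected = false)
    b = count(>(0), l)                               # points favoring model 1
    return (loglikelihoodratio = sum(l), vuong = vuong, b = b)
end

# Compare Exponential(rate 1) against Exponential(rate 2) on a tiny sample.
r = vuong_clarke(x -> -x, x -> log(2) - 2x, [0.5, 1.0, 1.5, 2.0])
```

A positive `vuong` value and a `b` above n/2 both point toward the first model; whether the preference is significant depends on the chosen significance level.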

source

Distributions.fit_mle — Method

fit_mle(::Type{ContinuousPowerLaw}, x::AbstractArray{<:Real})

Fit a ContinuousPowerLaw distribution to the data using maximum likelihood estimation (MLE). The x_min value is the minimum value of the data.
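
For orientation, the textbook closed-form MLE for a continuous power law with density proportional to x^(-α) for x ≥ x_min is α = 1 + n / Σ ln(x_i / x_min). The sketch below implements that formula with x_min set to the data minimum, as described above. The function name `continuous_alpha_mle` is hypothetical, and it is an assumption that the package uses exactly this expression.

```julia
# Hypothetical helper illustrating the standard closed-form estimator.
function continuous_alpha_mle(x::AbstractVector{<:Real})
    xmin = minimum(x)                      # x_min is the smallest observation
    n = length(x)
    α = 1 + n / sum(log.(x ./ xmin))       # closed-form MLE of the exponent
    return (α = α, xmin = xmin)
end

fit = continuous_alpha_mle([1.0, exp(1.0), exp(2.0)])
```

In this toy sample the logarithms sum to 3 with n = 3, so the estimate is α = 2.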

source

Distributions.fit_mle — Method

fit_mle(::Type{DiscretePowerLaw}, x::AbstractArray{<:Real})

Fits a discrete power law distribution to the data using an approximation to the maximum likelihood estimation (MLE). The x_min value is the minimum value of the data.
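
One widely used approximation for the discrete case shifts x_min by one half: α ≈ 1 + n / Σ ln(x_i / (x_min - 1/2)). The sketch below implements that formula. Which approximation the package actually applies is not stated here, so treat this as an assumption; the function name `discrete_alpha_approx` is hypothetical.

```julia
# Hypothetical helper: approximate MLE for a discrete power law exponent.
function discrete_alpha_approx(x::AbstractVector{<:Integer})
    xmin = minimum(x)                              # x_min is the smallest observation
    n = length(x)
    α = 1 + n / sum(log.(x ./ (xmin - 0.5)))       # continuity-corrected estimator
    return (α = α, xmin = xmin)
end

fit = discrete_alpha_approx([1, 2, 3])
```

The approximation is generally considered more accurate for larger x_min; for very small x_min a numerical maximization of the exact likelihood may be preferable.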

source

PowerLaws.array_bins — Method

array_bins(arr::AbstractArray{T})::Dict{T,Int64} where {T <: Real}

Create a dictionary from a sorted array arr where the keys are the unique elements and the values are the indices at which these elements first appear in the array.
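
The behavior described above can be sketched in a few lines. The function name `first_indices` is hypothetical and this is not the package source, only a reimplementation of the documented behavior.

```julia
# Map each unique element of a sorted array to the index of its first occurrence.
function first_indices(arr::AbstractVector{T}) where {T<:Real}
    bins = Dict{T,Int}()
    for (i, v) in enumerate(arr)
        get!(bins, v, i)        # stores i only if v has not been seen yet
    end
    return bins
end

bins = first_indices([1, 1, 2, 3, 3, 3])
```

For the input above, 1 first appears at index 1, 2 at index 3, and 3 at index 4.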

source

PowerLaws.bootstrap — Method

bootstrap

Bootstrap method for estimating the parameters of a power law distribution. Bootstrapping can be used to quantify the uncertainty in the estimate of xmin. More information can be found in the paper "Power-law distributions in empirical data".

Arguments

data::AbstractArray: Data to be tested.
d::UnivariateDistribution: Distribution to be tested (ContinuousPowerLaw or DiscretePowerLaw).
no_of_sims::Int64: Number of simulations. Default is 10.
xmins::AbstractArray: Array of xmins to be tested, default is data.
xmax::Int64: Maximum value of data to be considered. Default is 1e5.
seed::Int64: Seed for random number generator. Default is 0.

Returns

statistic::Array{Tuple{UnivariateDistribution, Float64}}: Array of tuples containing the distribution and the Kolmogorov-Smirnov distance between the data and the distribution.
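
The general idea of the bootstrap can be sketched as follows: resample the data with replacement, re-run the estimator on each resample, and inspect the spread of the results. The names `bootstrap_estimates` and `alpha_mle` are hypothetical. For brevity, this sketch re-estimates only α with x_min fixed to each resample's minimum, which is a simplification of the procedure documented above.

```julia
using Random, Statistics

# Hypothetical helper: apply `estimator` to `no_of_sims` resamples of `data`.
function bootstrap_estimates(estimator, data; no_of_sims = 10, seed = 0)
    rng = MersenneTwister(seed)            # seeded for reproducibility
    n = length(data)
    return [estimator(data[rand(rng, 1:n, n)]) for _ in 1:no_of_sims]
end

# Simplified estimator (assumption): continuous MLE with x_min = minimum(x).
alpha_mle(x) = 1 + length(x) / sum(log.(x ./ minimum(x)))

data = [1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 13.0]
ests = bootstrap_estimates(alpha_mle, data; no_of_sims = 100)
spread = std(ests)                          # bootstrap estimate of the uncertainty
```

Passing the estimator as the first argument keeps the resampling loop generic, so a fuller estimator that also scans x_min could be plugged in unchanged.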

source

PowerLaws.bootstrap_p — Method

bootstrap_p

Performs a bootstrapping hypothesis test to determine whether a power law distribution is plausible. Inspired by the documentation of the R package poweRlaw.

Arguments

data::AbstractArray: Data to be tested.
d::UnivariateDistribution: Distribution to be tested (ContinuousPowerLaw or DiscretePowerLaw).
no_of_sims::Int64: Number of simulations. Default is 10.
xmins::AbstractArray: Array of xmins to be tested, default is data.
xmax::Int64: Maximum value of data to be considered. Default is 1e5.
seed::Int64: Seed for random number generator. Default is 0.

Returns

statistic::Array{Tuple{UnivariateDistribution, Float64}}: Array of tuples containing the distribution and the Kolmogorov-Smirnov distance between the data and the distribution.
P::Float64: p-value of the hypothesis test.
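
A simplified sketch of such a test is shown here: fit the data, simulate synthetic samples from the fitted power law, refit each sample, and report the fraction of simulations whose Kolmogorov-Smirnov distance is at least as large as the observed one. All names (`fit_alpha`, `ks_distance`, `bootstrap_pvalue`) are hypothetical. The sketch assumes a continuous power law and keeps x_min fixed at the data minimum, whereas the documented function also accepts candidate xmins and an xmax.

```julia
using Random

# Continuous power law MLE for a fixed x_min.
fit_alpha(x, xmin) = 1 + length(x) / sum(log.(x ./ xmin))

# KS distance between the sample and the CDF F(x) = 1 - (x / x_min)^(1 - α).
function ks_distance(x, α, xmin)
    s = sort(x); n = length(s)
    maximum(max(abs(i / n - F), abs((i - 1) / n - F))
            for (i, F) in enumerate(1 .- (s ./ xmin) .^ (1 - α)))
end

function bootstrap_pvalue(data; no_of_sims = 10, seed = 0)
    rng = MersenneTwister(seed)
    xmin = minimum(data)
    n = length(data)
    α = fit_alpha(data, xmin)
    ks_obs = ks_distance(data, α, xmin)
    exceed = 0
    for _ in 1:no_of_sims
        # Inverse-transform sampling from the fitted power law.
        sim = xmin .* (1 .- rand(rng, n)) .^ (-1 / (α - 1))
        ks_sim = ks_distance(sim, fit_alpha(sim, xmin), xmin)
        exceed += ks_sim >= ks_obs
    end
    return exceed / no_of_sims
end

p = bootstrap_pvalue([1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 13.0]; no_of_sims = 50)
```

A small p-value suggests that the observed deviation from a power law is larger than what sampling noise alone would typically produce.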

source

PowerLaws.estimate_parameters — Method

estimate_parameters

Estimate x_min and α for a given data set by choosing the candidate x_min whose fitted distribution has the smallest Kolmogorov-Smirnov distance to the data.

Parameters

data::AbstractArray: Array of data which should be fit to a distribution.
distribution::Type: Distribution type, i.e. ContinuousPowerLaw or DiscretePowerLaw.
xmins::AbstractArray: If not specified, all unique values in data are taken as possible x_mins. If specified, only values in xmins are considered when finding the best x_min.
xmax::Int64: Maximum value considered in calculations. Values above xmax are ignored, for example when calculating the Kolmogorov-Smirnov distance.

Returns

best_fit::distribution: A distribution of type distribution, fitted to the data with the best x_min and α.
KS::Float64: The Kolmogorov-Smirnov distance between the data and the fitted distribution.
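
The search over candidate x_min values can be sketched as a simple scan. The function name `scan_xmin` is hypothetical. The sketch assumes a continuous power law, ignores xmax, and returns a named tuple rather than a distribution object.

```julia
# Hypothetical helper: pick the x_min whose fit minimizes the KS distance.
function scan_xmin(data::AbstractVector{<:Real}; xmins = unique(data))
    best = (α = NaN, xmin = NaN, ks = Inf)
    for xmin in xmins
        tail = sort(filter(>=(xmin), data))        # observations at or above x_min
        s = sum(log.(tail ./ xmin))
        s > 0 || continue                          # skip tails with no spread
        n = length(tail)
        α = 1 + n / s                              # continuous MLE on the tail
        ks = maximum(max(abs(i / n - F), abs((i - 1) / n - F))
                     for (i, F) in enumerate(1 .- (tail ./ xmin) .^ (1 - α)))
        if ks < best.ks
            best = (α = α, xmin = float(xmin), ks = ks)
        end
    end
    return best
end

best = scan_xmin([1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 13.0])
```

Restricting `xmins` to a smaller set of candidates reduces the cost of the scan, which otherwise refits the tail once per unique value.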

source

PowerLaws.kolmogorov_smirnov_test — Function

kolmogorov_smirnov_test

Calculate the Kolmogorov-Smirnov distance between the given data and a distribution.

Arguments

dat::AbstractArray: Data to be tested.
d::UnivariateDistribution: Distribution to be tested (ContinuousPowerLaw or DiscretePowerLaw).
xmin::Number: Minimum value of data to be considered.
xmax::Int64: Maximum value of data to be considered. Default is 1e5.

Returns

KS::Float64: Kolmogorov-Smirnov distance between the data and the distribution. It is the maximum absolute difference between the empirical and theoretical cumulative distribution functions (CDFs).
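
To make the definition concrete, here is a minimal sketch for a continuous power law, whose CDF is F(x) = 1 - (x / x_min)^(1 - α). The names `powerlaw_cdf` and `ks_distance` are hypothetical. Unlike the documented function, the sketch takes α directly instead of a distribution object and ignores xmax.

```julia
# Theoretical CDF of a continuous power law for x ≥ x_min.
powerlaw_cdf(x, α, xmin) = 1 - (x / xmin)^(1 - α)

# Maximum absolute gap between the empirical and theoretical CDFs.
function ks_distance(data, α, xmin)
    tail = sort(filter(>=(xmin), data))    # only values at or above x_min
    n = length(tail)
    d = 0.0
    for (i, x) in enumerate(tail)
        F = powerlaw_cdf(x, α, xmin)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, abs(i / n - F), abs((i - 1) / n - F))
    end
    return d
end

ks = ks_distance([1.0, 2.0, 4.0], 2.0, 1.0)
```

For this sample the theoretical CDF values are 0, 0.5, and 0.75, and the largest gap to the empirical CDF is 1/3, reached at the first observation.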

source