DiscretizeDistributions.jl
A Julia package for converting continuous and discrete probability distributions into discrete representations with interval-based support using IntervalArithmetic.jl.
The package provides functions to discretize univariate distributions into DiscreteNonParametric distributions where the support consists of IntervalArithmetic.Interval objects. Each interval [a, b) represents a probability mass over that range, computed using the cumulative distribution function (CDF) for continuous distributions or aggregated probability mass function (PMF) for discrete distributions.
Alternative Packages
In Julia, more lightweight discretization approaches exist. This package constructs a new distribution that matches the discrete approximation, which should make it faster to simulate from. However, this approach carries more overhead than wrapping the existing pdf, so the following alternatives are recommended when fitting censored or discretized distributions (e.g. with Turing or another package):
- pdf/logpdf/cdf/logcdf methods from [StatsDiscretizations.jl](https://github.com/nignatiadis/StatsDiscretizations.jl/tree/master)
- [CensoredDistributions.jl](https://github.com/EpiAware/CensoredDistributions.jl), which can also account for double censoring and truncation
In R:
- distcrete in the [distcrete](https://github.com/reconhub/distcrete) package
- discretize in the [actuar](https://gitlab.com/vigou3/actuar) package
Limitations
- Finite support: Infinite distributions are truncated using quantile bounds (default 0.1% and 99.9%)
- Discrete distribution quirks: Discretizing already-discrete distributions has some limitations and edge cases
- Non-integer discrete values: Discrete distributions with non-integer support may behave unexpectedly
- Numeric means: Not all distributions have an exact mean available (e.g. truncated Gamma). The :unbiased method needs the mean, so where possible a backup numeric mean is calculated using the trapezoid rule with trapezoid_points evaluation points.
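To make the backup numeric mean concrete, here is a minimal sketch of a trapezoid-rule mean over a finite support. The helper name trapezoid_mean and its n_points keyword are hypothetical and not part of the package API, and the package's own implementation may differ. A truncated Normal is used instead of a truncated Gamma only because its true mean is known by symmetry, which makes the result easy to check.

```julia
using Distributions

# Hypothetical helper (not the package's API): approximate E[X] for a
# distribution with finite support [lo, hi] using the trapezoid rule.
function trapezoid_mean(d, lo, hi; n_points=10_000)
    xs = range(lo, hi; length=n_points)
    ys = xs .* pdf.(d, xs)          # integrand x * f(x)
    h = step(xs)
    # Trapezoid rule: full weight on interior points, half weight on the ends
    return h * (sum(ys) - (first(ys) + last(ys)) / 2)
end

# Symmetric about 1, so the exact mean is 1
d = truncated(Normal(1.0, 1.0), -2.0, 4.0)
approx_mean = trapezoid_mean(d, -2.0, 4.0)
println(approx_mean)
```

More evaluation points reduce the integration error at the cost of more pdf evaluations.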
Future Work
- Develop better warnings for incompatible distributions
- Support for multivariate distributions
API Overview
The package provides three main discretize methods:
- Fixed intervals: discretize(dist, interval_width) - Creates uniform intervals of specified width
- Custom boundaries: discretize(dist, boundaries) - Uses custom interval boundaries
- Pre-constructed intervals: discretize(dist, intervals) - Uses pre-built Interval objects
All methods return a DiscreteNonParametric distribution with support determined by the method parameter.
Method Parameter
The discretize functions accept a method parameter that controls the output format:
- :interval (default): Returns IntervalArithmetic.Interval objects as support points
- :left_aligned: Returns left endpoints of intervals as point masses
- :centred: Returns interval midpoints as point masses
- :right_aligned: Returns right endpoints of intervals as point masses
- :unbiased: Returns mean-preserving point masses (requires equal interval widths)
using Distributions, DiscretizeDistributions
normal_dist = Normal(0, 1)
# Different output methods
intervals = discretize(normal_dist, 0.5; method=:interval) # Interval objects
left_points = discretize(normal_dist, 0.5; method=:left_aligned) # Left endpoints
center_points = discretize(normal_dist, 0.5; method=:centred) # Midpoints
right_points = discretize(normal_dist, 0.5; method=:right_aligned) # Right endpoints
Unbiased Method
The :unbiased method provides a mean-preserving discretization, designed to minimize the difference between the mean of the original distribution and the mean of the discretized distribution. It follows the unbiased method of the discretize function in the R package actuar.
# Unbiased discretization - preserves mean
normal_dist = Normal(2.0, 1.0)
unbiased_discrete = discretize(normal_dist, 0.2; method=:unbiased)
# Compare means
println("Original mean: ", mean(normal_dist)) # 2.0
println("Unbiased mean: ", mean(unbiased_discrete)) # ≈ 2.0
println("Centered mean: ", mean(discretize(normal_dist, 0.2; method=:centred)))
Both approaches preserve the mean, but :unbiased gives more control over the support. It places mass on every grid point [min, min + interval, ..., max] or [lower_quantile, lower_quantile + interval, ..., upper_quantile], whereas :centred (which also maintains the mean) necessarily places mass on the midpoints [min + interval/2, min + 3*interval/2, ..., max - interval/2]. The :unbiased method requires the mean of the given distribution to be defined. Where Distributions provides no analytical mean (but the mean is not undefined in general), a numeric mean is calculated instead.
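As an illustration of how grid-point masses can preserve the mean, the sketch below applies the "unbiased" formulas as described in the documentation of actuar's discretize, which are written in terms of the limited expected value E[min(X, x)]. This reflects my reading of the actuar documentation rather than this package's source, so treat it as an assumption. An Exponential(1) distribution is used because its limited expected value has the closed form 1 - exp(-x); the helper lev is hypothetical.

```julia
using Distributions

d = Exponential(1.0)
lev(x) = 1 - exp(-x)          # E[min(X, x)] for Exponential(1), closed form

h = 0.5                       # equal grid spacing (required by the method)
xs = collect(0.0:h:10.0)      # grid points that will carry the mass
m = length(xs)

masses = similar(xs)
# First grid point
masses[1] = (lev(xs[1]) - lev(xs[2])) / h + 1 - cdf(d, xs[1])
# Interior grid points
for j in 2:m-1
    masses[j] = (2 * lev(xs[j]) - lev(xs[j-1]) - lev(xs[j+1])) / h
end
# Last grid point
masses[m] = (lev(xs[m]) - lev(xs[m-1])) / h - 1 + cdf(d, xs[m])

total_mass = sum(masses)                  # equals F(10) - F(0)
discrete_mean = sum(xs .* masses)         # mean contribution on [0, 10]
partial_mean = 1 - exp(-10.0) - 10.0 * exp(-10.0)   # integral of x*exp(-x) over [0, 10]
```

Before any renormalisation, the masses sum to the probability covered by the grid, and their first moment matches the partial expectation of the original distribution over the same range.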
Working with Results
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Discretize a normal distribution
normal_dist = Normal(0, 1)
interval_dist = discretize(normal_dist, 0.5)
# The result has interval support
support(interval_dist) # Vector of Interval{Float64} objects
probs(interval_dist) # Corresponding probabilities
# Convert to point-based distributions
left_aligned = left_align_distribution(interval_dist) # Use left endpoints
centered = centred_distribution(interval_dist) # Use midpoints
right_aligned = right_align_distribution(interval_dist) # Use right endpoints
Mathematical Details
Continuous Distributions
For continuous distributions, discretisation computes probability masses using the cumulative distribution function (CDF):
\[P(X' \in [a_i, a_{i+1})) = F(a_{i+1}) - F(a_i)\]
where F(x) is the CDF of the continuous distribution X.
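The CDF difference can be reproduced directly with Distributions.jl. The following is a minimal sketch on a finite grid, not the package's internal code; the variable names are illustrative.

```julia
using Distributions

d = Normal(0, 1)
edges = collect(-3.0:0.5:3.0)        # boundaries a_1 < a_2 < ... < a_n
masses = diff(cdf.(d, edges))        # F(a_{i+1}) - F(a_i) for each interval
covered = cdf(d, 3.0) - cdf(d, -3.0) # total probability covered by the grid
println(sum(masses), " vs ", covered)
```

Because the grid stops at ±3, the masses sum to the covered probability rather than to 1, which is why a normalisation step (or infinite tail intervals) is needed.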
Discrete Distributions
For discrete distributions, probability masses are aggregated over intervals using the probability mass function (PMF):
\[P(X' \in [a_i, a_{i+1})) = \sum_{k=\lceil a_i \rceil}^{\lfloor a_{i+1} \rfloor - 1} P(X = k) + P(X = \lfloor a_i \rfloor) \times (\lceil a_i \rceil - a_i) + P(X = \lfloor a_{i+1} \rfloor) \times (a_{i+1} - \lfloor a_{i+1} \rfloor)\]
All resulting discrete distributions are normalized to ensure probabilities sum to 1.
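The aggregation formula above can be transcribed term by term. The helper interval_mass below is hypothetical and is a direct reading of the formula, not the package's implementation. The example only exercises integer boundaries, where both fractional terms vanish and the sum reduces to adding up the PMF.

```julia
using Distributions

# Hypothetical helper: term-by-term transcription of the formula above
function interval_mass(d, a, b)
    # Whole integers strictly inside [a, b)
    inner = sum(pdf(d, k) for k in ceil(Int, a):(floor(Int, b) - 1); init=0.0)
    # Fractional share of the integer below a
    left = pdf(d, floor(Int, a)) * (ceil(a) - a)
    # Fractional share of the integer at or below b
    right = pdf(d, floor(Int, b)) * (b - floor(b))
    return inner + left + right
end

d = Poisson(3.0)
edges = [0, 2, 4, 6, 8, 10]
raw = [interval_mass(d, edges[i], edges[i+1]) for i in 1:length(edges)-1]
normalised = raw ./ sum(raw)     # final normalisation step
```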
Advanced Usage
Handling Unbounded Distributions
For distributions with infinite support, control truncation with quantile bounds:
# Normal distribution - unbounded in both directions
normal_dist = Normal(0, 1)
discrete_normal = discretize(normal_dist, 0.2; min_quantile=0.005, max_quantile=0.995)
# Exponential distribution - unbounded above
exp_dist = Exponential(1.0)
discrete_exp = discretize(exp_dist, 0.1; max_quantile=0.99)
# Result includes infinite tail intervals
support(discrete_exp) # [..., interval(4.5, 5.0), interval(5.0, ∞)]
Custom Interval Structures
Create non-uniform discretisations with custom boundaries:
# Fine resolution near zero, coarser elsewhere
custom_boundaries = [-5.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 5.0]
discrete_custom = discretize(Normal(0, 1), custom_boundaries)
# Results in intervals: (-∞,-5), [-5,-2), [-2,-1), ..., [5,∞)
length(support(discrete_custom)) # 10 intervals (8 from boundaries + 2 infinite tails)
Working with Pre-constructed Intervals
For advanced use cases, you can provide pre-constructed IntervalArithmetic.Interval objects:
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Create custom intervals with specific properties
intervals = [
interval(-2.0, -1.0), # Standard interval
interval(-1.0, 0.0), # Adjacent interval
interval(0.0, 2.0), # Wider interval
interval(2.0, Inf) # Semi-infinite interval
]
# Discretize using these intervals
normal_dist = Normal(0, 1)
discrete_custom = discretize(normal_dist, intervals)
DiscretizeDistributions.discretize — Function
discretize(dist::Distributions.UnivariateDistribution, interval::Real;
    method=:interval, min_quantile=0.001, max_quantile=0.999)
Discretize a univariate distribution into a discrete distribution using fixed intervals.
This function converts a univariate distribution into a discrete one by dividing the distribution's support into intervals of fixed width and computing the probability mass in each interval.
Arguments
- dist::Distributions.UnivariateDistribution: The distribution to discretize (continuous or discrete)
- interval::Real: The width of each discretisation interval
- method::Symbol=:interval: Method for representing the output distribution
  - :interval (default): Return IntervalArithmetic.Interval objects as support
  - :left_aligned: Convert intervals to left-aligned point values
  - :centred: Convert intervals to centered point values
  - :right_aligned: Convert intervals to right-aligned point values
  - :unbiased: Return unbiased point estimates (requires equal interval widths), designed such that the means match; see discretize from the R package actuar
- min_quantile=0.001: Lower quantile bound for unbounded distributions
- max_quantile=0.999: Upper quantile bound for unbounded distributions
- trapezoid_points::Int=10000: Number of points for numerical integration of the mean (when needed)
Returns
DiscreteNonParametric: Discrete distribution with support determined by the method parameter
Details
For bounded distributions, the natural bounds are used. For unbounded distributions, the bounds are determined using the specified quantiles. The probability mass in each interval is computed using the CDF for continuous distributions or a pseudo-CDF for discrete distributions.
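To show what the quantile bounds imply for an unbounded distribution, here is a small sketch using only Distributions.jl. The grid is assumed to start at the lower quantile; how the package actually aligns its grid is not stated here, so that detail is an assumption.

```julia
using Distributions

d = Normal(0, 1)
min_quantile, max_quantile = 0.001, 0.999

lower = quantile(d, min_quantile)    # lower truncation point
upper = quantile(d, max_quantile)    # upper truncation point

width = 0.5
edges = collect(lower:width:upper)   # assumed grid start; package alignment may differ

retained = cdf(d, upper) - cdf(d, lower)   # probability kept between the bounds
println((lower, upper, retained))
```

With the default bounds, 99.8% of the probability lies between the truncation points; the remainder falls in the tails.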
Examples
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Discretize a normal distribution with interval width 0.5
normal_dist = Normal(0, 1)
# Different output methods
discrete_intervals = discretize(normal_dist, 0.5) # Intervals (default)
discrete_left = discretize(normal_dist, 0.5; method=:left_aligned) # Left endpoints
discrete_center = discretize(normal_dist, 0.5; method=:centred) # Midpoints
discrete_right = discretize(normal_dist, 0.5; method=:right_aligned) # Right endpoints
# Compare means (centered method typically closest to original)
println("Original mean: ", mean(normal_dist))
println("Centered discretization mean: ", mean(discrete_center))
# Discretize a discrete distribution
poisson_dist = Poisson(3.0)
discrete_poisson = discretize(poisson_dist, 2; method=:centred)
discretize(dist::Distributions.UnivariateDistribution, interval::AbstractVector; method=:interval)
Discretize a univariate distribution using custom interval boundaries.
This function converts a univariate distribution into a discrete one using user-specified interval boundaries. The resulting distribution represents the probability mass in each interval.
Arguments
- dist::Distributions.UnivariateDistribution: The distribution to discretize
- interval::AbstractVector: Vector of interval boundaries (will be sorted automatically)
- method::Symbol=:interval: Method for representing the output distribution
  - :interval (default): Return IntervalArithmetic.Interval objects as support
  - :left_aligned: Convert intervals to left-aligned point values
  - :centred: Convert intervals to centered point values
  - :right_aligned: Convert intervals to right-aligned point values
  - :unbiased: Return unbiased point estimates (requires equal interval widths), designed such that the means match; see discretize from the R package actuar
- trapezoid_points::Int=10000: Number of points for numerical integration of the mean (when needed)
Returns
DiscreteNonParametric: Discrete distribution with support determined by the method parameter
Details
The input interval vector is automatically sorted and combined with distribution bounds. Probability masses are computed using the CDF for continuous distributions or pseudo-CDF for discrete distributions. The resulting distribution represents probability masses over intervals [a_i, a_{i+1}).
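The sorting and combining step might look roughly like the sketch below, which uses only Distributions.jl. It assumes the distribution bounds are taken from minimum(d) and maximum(d); the package may do this differently.

```julia
using Distributions

d = Normal(5, 2)
boundaries = [6.0, 0.0, 10.0, 2.0, 8.0, 4.0]    # deliberately unsorted

# Add the distribution's own bounds (-Inf and Inf for a Normal), then sort
edges = unique(sort(vcat(minimum(d), boundaries, maximum(d))))

# One mass per interval [a_i, a_{i+1}), including the two infinite tails
masses = diff(cdf.(d, edges))
println(length(masses), " intervals, total mass ", sum(masses))
```

Six boundaries produce five finite intervals plus two infinite tails, and because the tails are included the masses already sum to 1.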
For the :unbiased method with unequal intervals, the function will warn and fall back to :centred.
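A check for equal interval widths could be written as in the sketch below. The helper has_equal_widths and its tolerance are hypothetical; the package's actual check is not documented here.

```julia
# Hypothetical helper: true when all finite interval widths agree
# (infinite boundaries are ignored before comparing widths)
function has_equal_widths(edges; rtol=1e-8)
    widths = diff(filter(isfinite, edges))
    return all(w -> isapprox(w, first(widths); rtol=rtol), widths)
end

equal_case = has_equal_widths([0.0, 2.0, 4.0, 6.0])             # true
unequal_case = has_equal_widths([-5.0, -2.0, -1.0, -0.5, 0.0])  # false

# Mirror the documented behaviour: fall back to :centred when widths differ
chosen_method = unequal_case ? :unbiased : :centred
```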
Examples
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Discretize using custom intervals
normal_dist = Normal(5, 2)
custom_intervals = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
# Different output methods
discrete_intervals = discretize(normal_dist, custom_intervals) # Intervals
discrete_left = discretize(normal_dist, custom_intervals; method=:left_aligned) # Left points
discrete_center = discretize(normal_dist, custom_intervals; method=:centred) # Midpoints
discrete_right = discretize(normal_dist, custom_intervals; method=:right_aligned) # Right points
# Support: intervals like [interval(-∞, 0.0), interval(0.0, 2.0), ..., interval(10.0, ∞)]
# Discrete distribution with custom intervals
poisson_dist = Poisson(3.0)
discrete_poisson = discretize(poisson_dist, [0.5, 2, 4, 6, 8, 10]; method=:centred)
discretize(dist::Distributions.UnivariateDistribution,
    interval::AbstractVector{IntervalArithmetic.Interval{X}}; method=:interval) where X <: Real
Discretize a univariate distribution using pre-constructed interval objects.
This function converts a univariate distribution into a discrete one using user-specified IntervalArithmetic.Interval objects. This is the core discretization method that all other discretize methods ultimately call.
Arguments
- dist::Distributions.UnivariateDistribution: The distribution to discretize
- interval::AbstractVector{IntervalArithmetic.Interval{X}}: Vector of pre-constructed intervals
- method::Symbol=:interval: Method for representing the output distribution
  - :interval (default): Return intervals as support points
  - :left_aligned: Convert intervals to left-aligned point values
  - :centred: Convert intervals to centered point values
  - :right_aligned: Convert intervals to right-aligned point values
  - :unbiased: Return unbiased point estimates (requires equal interval widths), designed such that the means match; see discretize from the R package actuar
- trapezoid_points::Int=10000: Number of points for numerical integration of the mean (when needed)
Returns
DiscreteNonParametric: Discrete distribution with support determined by the method parameter
Details
This method computes probability masses directly using the interval boundaries. For each interval [a, b], the probability is computed as cdf(dist, b) - cdf(dist, a). The resulting probabilities are normalized to sum to 1.
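The effect of the normalisation step is easiest to see when the intervals do not cover the whole support. The sketch below uses plain (a, b) tuples as stand-ins for Interval objects so that it depends only on Distributions.jl; it is an illustration, not the package's code.

```julia
using Distributions

d = Normal(0, 1)
bounds = [(-1.0, 0.0), (0.0, 1.0), (1.0, 2.0)]   # (a, b) pairs standing in for Interval objects

raw = [cdf(d, b) - cdf(d, a) for (a, b) in bounds]   # cdf(dist, b) - cdf(dist, a)
normalised = raw ./ sum(raw)                          # rescale so the masses sum to 1

println("raw total: ", sum(raw), ", normalised total: ", sum(normalised))
```

Here the raw masses cover only the probability between -1 and 2, so normalisation scales each of them up by the same factor.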
Examples
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Create intervals manually
intervals = [interval(-1.0, 0.0), interval(0.0, 1.0), interval(1.0, 2.0)]
# Discretize using these intervals with different methods
normal_dist = Normal(0, 1)
discrete_intervals = discretize(normal_dist, intervals) # Intervals
discrete_centered = discretize(normal_dist, intervals; method=:centred) # Midpoints
discrete_left = discretize(normal_dist, intervals; method=:left_aligned) # Left endpoints
# Each method gives the same probabilities but different support representations
DiscretizeDistributions.left_align_distribution — Function
left_align_distribution(dist::Distributions.DiscreteNonParametric{IntervalArithmetic.Interval{T}, ...})
Convert an interval-based discrete distribution to a left-aligned point-based distribution.
This function takes a discrete distribution with interval support and creates a new distribution where each support point is positioned at the left endpoint (infimum) of the corresponding interval. Infinite intervals are automatically removed before conversion.
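The conversion can be pictured with the sketch below, which again uses (a, b) tuples instead of Interval objects so that it relies only on Distributions.jl. Renormalising the remaining probabilities after dropping infinite intervals is an assumption made here so that the result is a valid distribution; the package may handle this differently.

```julia
using Distributions

bounds = [(-Inf, 0.0), (0.0, 1.0), (1.0, 2.0), (2.0, Inf)]
weights = [0.1, 0.3, 0.4, 0.2]

# Drop intervals with an infinite endpoint
keep = [isfinite(a) && isfinite(b) for (a, b) in bounds]
finite_bounds = bounds[keep]
finite_weights = weights[keep] ./ sum(weights[keep])   # assumption: renormalise

# Left alignment: each support point is the interval's left endpoint
left_points = first.(finite_bounds)
left_dist = DiscreteNonParametric(left_points, finite_weights)
```

Right-aligned and centred versions follow the same pattern, using the right endpoint or the midpoint of each remaining interval.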
Arguments
dist::DiscreteNonParametric{Interval{T}, ...}: Input discrete distribution with interval support
Returns
DiscreteNonParametric{T, ...}: New distribution with left-aligned point support
Examples
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Create an interval-based distribution
intervals = [interval(0.0, 1.0), interval(1.0, 2.0), interval(2.0, 3.0)]
probabilities = [0.3, 0.4, 0.3] # named to avoid clashing with Distributions.probs
interval_dist = DiscreteNonParametric(intervals, probabilities, check_args=false)
# Convert to left-aligned points
left_aligned = left_align_distribution(interval_dist)
# Support becomes [0.0, 1.0, 2.0] (left endpoints of intervals)
DiscretizeDistributions.centred_distribution — Function
centred_distribution(dist::Distributions.DiscreteNonParametric{IntervalArithmetic.Interval{T}, ...})
Convert an interval-based discrete distribution to a centered point-based distribution.
This function takes a discrete distribution with interval support and creates a new distribution where each support point is positioned at the center (midpoint) of the corresponding interval. Infinite intervals are automatically removed before conversion.
Arguments
dist::DiscreteNonParametric{Interval{T}, ...}: Input discrete distribution with interval support
Returns
DiscreteNonParametric{T, ...}: New distribution with centered point support
Examples
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Create an interval-based distribution
intervals = [interval(0.0, 1.0), interval(1.0, 2.0), interval(2.0, 3.0)]
probabilities = [0.3, 0.4, 0.3] # named to avoid clashing with Distributions.probs
interval_dist = DiscreteNonParametric(intervals, probabilities, check_args=false)
# Convert to centered points
centered = centred_distribution(interval_dist)
# Support becomes [0.5, 1.5, 2.5] (midpoints of intervals)
DiscretizeDistributions.right_align_distribution — Function
right_align_distribution(dist::Distributions.DiscreteNonParametric{IntervalArithmetic.Interval{T}, ...})
Convert an interval-based discrete distribution to a right-aligned point-based distribution.
This function takes a discrete distribution with interval support and creates a new distribution where each support point is positioned at the right endpoint (supremum) of the corresponding interval. Infinite intervals are automatically removed before conversion.
Arguments
dist::DiscreteNonParametric{Interval{T}, ...}: Input discrete distribution with interval support
Returns
DiscreteNonParametric{T, ...}: New distribution with right-aligned point support
Examples
using Distributions, DiscretizeDistributions, IntervalArithmetic
# Create an interval-based distribution
intervals = [interval(0.0, 1.0), interval(1.0, 2.0), interval(2.0, 3.0)]
probabilities = [0.3, 0.4, 0.3] # named to avoid clashing with Distributions.probs
interval_dist = DiscreteNonParametric(intervals, probabilities, check_args=false)
# Convert to right-aligned points
right_aligned = right_align_distribution(interval_dist)
# Support becomes [1.0, 2.0, 3.0] (right endpoints of intervals)