The ContinuousBeta transform adapts values from the driver domain to a bounded range, where the relative concentrations are governed by a member of the beta family of distribution curves.
The range of values output by ContinuousBeta.convert() is controlled by two parameters implemented as Java fields: minRange and maxRange. These have the restriction that minRange < maxRange.
The shape of the distribution curve is controlled by two additional parameters: alpha, symbolized α, and beta, symbolized β. Both shape parameters must be greater than zero.
Each ContinuousBeta instance internally maintains a ContinuousDistribution instance which divides the range from zero to unity into trapezoids of width 1/itemCount. The trapezoid height for sample value z is calculated using the formula:
Math.pow(z, alpha - 1) * Math.pow(1 - z, beta - 1)
The convert() method maps a value x in the driver domain from zero to unity into a value v in the application range from minRange to maxRange in two steps.
The first step uses ContinuousDistribution.quantile() to recast the driver value x into an intermediate value z, also between zero and unity.
The second step applies the linear interpolation formula: v = (maxRange-minRange)*z + minRange.
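The two steps can be sketched in a few dozen lines. This is only an illustration under stated assumptions, not the library's code: the class name BetaConvertSketch and its fields are hypothetical, the heights are left unnormalized, the shape parameters are assumed to be at least 1 so that every height is finite, and the quadratic solve inside each trapezoid is one plausible way for a quantile lookup to work. The actual ContinuousDistribution.quantile() may do it differently.

```java
/** Hypothetical stand-in for the two-step conversion; not the library's implementation. */
final class BetaConvertSketch {
    private final double[] height;      // unnormalized curve height at z = i/itemCount
    private final double[] cumulative;  // trapezoid area accumulated to the left of z = i/itemCount
    private final int itemCount;
    private final double minRange;
    private final double maxRange;

    BetaConvertSketch(double alpha, double beta, int itemCount, double minRange, double maxRange) {
        if (alpha < 1 || beta < 1) throw new IllegalArgumentException("sketch assumes alpha, beta >= 1");
        if (itemCount < 1) throw new IllegalArgumentException("itemCount must be positive");
        if (!(minRange < maxRange)) throw new IllegalArgumentException("minRange must be < maxRange");
        this.itemCount = itemCount;
        this.minRange = minRange;
        this.maxRange = maxRange;
        height = new double[itemCount + 1];
        cumulative = new double[itemCount + 1];
        double width = 1.0 / itemCount;
        for (int i = 0; i <= itemCount; i++) {
            double z = (double) i / itemCount;
            // Height formula quoted in the text.
            height[i] = Math.pow(z, alpha - 1) * Math.pow(1 - z, beta - 1);
            if (i > 0) cumulative[i] = cumulative[i - 1] + 0.5 * (height[i - 1] + height[i]) * width;
        }
    }

    /** Step 1: recast driver x in [0,1] as intermediate z in [0,1]. */
    double quantile(double x) {
        double width = 1.0 / itemCount;
        double target = x * cumulative[itemCount];   // area that must lie to the left of z
        for (int i = 0; i < itemCount; i++) {
            if (target <= cumulative[i + 1]) {
                double local = target - cumulative[i];   // area needed inside trapezoid i
                double h0 = height[i];
                double dh = height[i + 1] - h0;
                double t;                                // fraction of the way across trapezoid i
                if (Math.abs(dh) <= 1e-12 * Math.max(h0, height[i + 1])) {
                    t = (h0 > 0) ? local / (width * h0) : 0.0;   // flat top: area grows linearly
                } else {
                    // Solve width * (h0*t + dh*t*t/2) = local for t.
                    double disc = Math.max(0.0, h0 * h0 + 2 * dh * local / width);
                    t = (Math.sqrt(disc) - h0) / dh;
                }
                t = Math.min(1.0, Math.max(0.0, t));
                return (i + t) * width;
            }
        }
        return 1.0;
    }

    /** Step 2: linear interpolation into the application range. */
    double convert(double x) {
        double z = quantile(x);
        return (maxRange - minRange) * z + minRange;
    }

    public static void main(String[] args) {
        BetaConvertSketch sketch = new BetaConvertSketch(5, 2, 200, 0, 1);
        for (double x = 0.0; x <= 1.0; x += 0.25) {
            System.out.printf("x=%.2f  v=%.4f%n", x, sketch.convert(x));
        }
    }
}
```

Keeping the trapezoid table in the unit interval and scaling afterwards means that a change to minRange or maxRange never forces the table to be rebuilt.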
For the beta family of distribution curves, Wikipedia gives a parametric mean (average, symbolized μ) of α / (α+β) and a parametric variance (squared deviation, symbolized σ²) of αβ / ((α+β)²(α+β+1)).
Figure 1 illustrates the influence which ContinuousBeta.convert() exerts over driver sequences when alpha is 5 and beta is 2. This panel was created using the same driver sources used for the ContinuousUniform panel, which provides a basis for comparison.
Figure 1: ContinuousBeta output from three different Driver sources. Each row of graphs provides a time-series graph of samples (left) and a histogram analyzed from the same samples (right). The first row of graphs was generated using the standard random number generator. The second row was generated using the balanced-bit generator. The third row was generated using an ascending sequence of driver values, equally spaced from zero to unity.
The standard-random time-series graph (top row of Figure 1) has the same relative ups and downs as the standard-random time-series graph prepared for ContinuousUniform, but the specific values are squinched up toward the upper range bound. This difference becomes much clearer in the standard-random histogram, where the whitespace separating the vertical v axis from the smallest f(v) value progressively increases as v increases from zero to unity. Notice that while these histogram peaks and valleys are similar to those derived for ContinuousUniform, they are not the same. The fact that values squinch upwards means that range values which fell into the bottommost histogram region in the uniform histogram were spread across the bottom three regions here in the beta histogram. Likewise the range values which fell into the topmost histogram region here were spread across three regions in the uniform histogram.
The balanced-bit time-series (middle row of Figure 1) likewise has the same ups and downs as the balanced-bit time-series graph prepared for ContinuousUniform, with values squinched similarly. Since balanced-bit sequences strive aggressively for uniformity, the histogram peaks and valleys are comparatively restrained.
The time-series graph generated using ascending, equally spaced driver values (bottom row of Figure 1) presents the quantile function for this instance of the continuous beta distribution. The histogram of sample values presents the distribution's probability density function or PDF. With alpha at 5 and beta at 2, the PDF follows the shape v⁴(1-v): it starts at f(v) = 0 when v = 0, rises to a single peak at v = 0.8, and drops back to f(v) = 0 when v = 1. Looking back at the time-series graph, notice how the quantile function rises more steeply where the distribution is rarefied and less steeply where the distribution is concentrated.
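The link between quantile steepness and density can be checked numerically. For alpha = 5 and beta = 2 the normalized density is 30z⁴(1-z), and integrating it gives the closed-form cumulative function 6z⁵ - 5z⁶. The sketch below inverts that cumulative function by bisection, which is a stand-in technique and not the trapezoid method the library uses, and compares the slope of the quantile curve with the reciprocal of the density. The class name Beta52Sketch is hypothetical.

```java
/** Hypothetical closed-form check for alpha = 5, beta = 2; not part of the library. */
final class Beta52Sketch {
    /** Normalized density: 30 z^4 (1 - z). */
    static double pdf(double z) {
        return 30 * Math.pow(z, 4) * (1 - z);
    }

    /** Cumulative function from integrating the density: 6 z^5 - 5 z^6. */
    static double cdf(double z) {
        return 6 * Math.pow(z, 5) - 5 * Math.pow(z, 6);
    }

    /** Quantile found by bisection on the monotone cumulative function. */
    static double quantile(double x) {
        double lo = 0.0;
        double hi = 1.0;
        for (int i = 0; i < 60; i++) {
            double mid = 0.5 * (lo + hi);
            if (cdf(mid) < x) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        double step = 1e-4;
        for (double x : new double[] {0.05, 0.5, 0.95}) {
            double z = quantile(x);
            // Central-difference estimate of the quantile curve's slope at x.
            double slope = (quantile(x + step) - quantile(x - step)) / (2 * step);
            System.out.printf("x=%.2f  z=%.4f  slope=%.3f  1/pdf=%.3f%n", x, z, slope, 1 / pdf(z));
        }
    }
}
```

The two rightmost columns should agree closely, since the slope of a quantile function at x equals one over the density at the corresponding z. That is why the curve climbs fastest through the sparse low end of the range.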
For each graph in Figure 1 the average sample value is plotted as a dashed green line, while the interval between ± one standard deviation around the average is filled in with a lighter green background. For the ideally uniform driver values plotted in the third row of graphs, the average sample value is 0.719 and the standard deviation is 0.159. The interval from 0.719-0.159 to 0.719+0.159 spans 2*0.159 = 0.318, or 32% of the full application range from zero to unity. Since the continuous uniform distribution had 58% of samples within ± one standard deviation of the mean, this suggests that the beta distribution with alpha 5 and beta 2 is squeezing 58% of samples into 32% of the application range, giving a concentration rate of 58/32 = 1.81.
Since the bottommost row in Figure 1 illustrates the most nearly ideal conditions under which a profile can be generated, the numerical average and deviation should closely match the parametric values supplied by Wikipedia. For α = 5 and β = 2, the parametric mean calculates out to μ = 5 / (5+2) = 5/7 = 0.714. The parametric variance (square of deviation) is
σ² = 5×2 / ((5+2)²(5+2+1)) = 10 / (7×7×8) = 10/392 = 0.0255
so the deviation is σ = √0.0255 = 0.160, within rounding of the measured 0.159. Increasing the itemCount from 200 (used for Figure 1) to 500 produces a bottom-row numerical average of 0.716 without noticeably changing the graph.
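The parametric arithmetic is easy to reproduce. The helper below is a hypothetical illustration (the class BetaMomentsSketch is not part of the library) that evaluates the mean and variance formulas for any pair of shape parameters.

```java
/** Hypothetical helper evaluating the parametric mean and variance formulas. */
final class BetaMomentsSketch {
    /** Mean: alpha / (alpha + beta). */
    static double mean(double alpha, double beta) {
        return alpha / (alpha + beta);
    }

    /** Variance: alpha*beta / ((alpha + beta)^2 (alpha + beta + 1)). */
    static double variance(double alpha, double beta) {
        double sum = alpha + beta;
        return alpha * beta / (sum * sum * (sum + 1));
    }

    /** Deviation: square root of the variance. */
    static double deviation(double alpha, double beta) {
        return Math.sqrt(variance(alpha, beta));
    }

    public static void main(String[] args) {
        System.out.printf("mean=%.3f  variance=%.4f  deviation=%.3f%n",
                mean(5, 2), variance(5, 2), deviation(5, 2));
    }
}
```

Calling the same methods with other values of alpha and beta gives the parametric targets against which a measured bottom-row average and deviation can be compared.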
Figures 2 (a) through 2 (d) show how changes in parameter settings affect the distribution curves. Each figure provides two graphs. The upper graph shows the probability density function or PDF. The lower graph shows the cumulative distribution function or CDF.
The series of graphs starts with α = 1 and β = 1. (Wikipedia shows things going a little crazy when these parameters fall below unity.) The graph for α = β = 1 (━ in Figure 2 (a)) resolves to the flat continuous uniform curve with a deviation of σ = 0.289. For α = β = 2 (━ in Figure 2 (b)), the curve becomes a gentle bump anchored at zero at both extremes, with a deviation of σ = 0.224. For α = β = 3 (━ in Figure 2 (c)), the deviation narrows to σ = 0.189 and the curve just begins to flare outward at the extremes. For α = β = 5 (━ in Figure 2 (d)), the deviation narrows further to σ = 0.151 and the regions near the extremes can definitely be described as tails.
For α ≠ β, keep in mind that the parameters are symmetric: the graph for α = B and β = A is the mirror image of the graph for α = A and β = B. The α parameter exerts a suppressive effect on the distribution near zero, while the β parameter exerts a suppressive effect on the distribution near unity. In consequence, the mean μ shifts leftward as β increases relative to α and rightward as α increases relative to β.
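The mirror relationship follows directly from the height formula, since swapping the parameters and replacing z with 1 - z yields the same product. A short sketch, using a hypothetical class BetaMirrorSketch and the unnormalized height formula, makes this concrete:

```java
/** Hypothetical demonstration of the mirror symmetry between (A, B) and (B, A). */
final class BetaMirrorSketch {
    /** Unnormalized curve height at z for the given shape parameters. */
    static double height(double z, double alpha, double beta) {
        return Math.pow(z, alpha - 1) * Math.pow(1 - z, beta - 1);
    }

    /** Parametric mean: alpha / (alpha + beta). */
    static double mean(double alpha, double beta) {
        return alpha / (alpha + beta);
    }

    public static void main(String[] args) {
        double a = 5;
        double b = 2;
        for (int i = 1; i <= 9; i++) {
            double z = i / 10.0;
            // Column 2 uses (a, b) at z; column 3 uses (b, a) at the mirrored position 1 - z.
            System.out.printf("z=%.1f  %.6f  %.6f%n", z, height(z, a, b), height(1 - z, b, a));
        }
        // The two means sit at mirrored positions around 0.5.
        System.out.printf("means: %.3f and %.3f%n", mean(a, b), mean(b, a));
    }
}
```

The second and third columns match row for row, and the two means add up to one.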
ContinuousBeta implementation class.
The type hierarchy for ContinuousBeta is:
TransformBase<T extends Number> extends WriteableEntity implements Transform<T>
ContinuousDistributionTransform extends TransformBase<Double> implements Transform.Continuous
BoundedTransform extends ContinuousDistributionTransform
ContinuousBeta extends BoundedTransform
Class ContinuousDistributionTransform embeds a ContinuousDistribution instance capable of approximating most any continuous distribution as a succession of trapezoids. Each ContinuousDistribution trapezoid item has left, right, origin, and goal fields.
Understand that the succession of trapezoids ranges from zero to unity, not minRange to maxRange. The trick with leveraging ContinuousDistribution instances is that the trapezoids need recalculating every time a parameter changes. Updating one single trapezoid item is not that big a deal, but more typically the number of trapezoids will be 20 or more (my canned Normal distribution uses 200 trapezoids); also, the calculating formulas often include exponents. So it makes sense to abstract the range boundaries out of the distribution and to apply range scaling separately.
The distributing step of conversion happens in ContinuousDistributionTransform, where the convert() method does this:
return getDistribution().quantile(driver);
Range scaling happens in BoundedTransform, where the convert() method does this:
return interpolate(super.convert(driver));
And BoundedTransform.interpolate(factor) does this (ignoring pesky initialization checks):
return (maxRange-minRange)*factor + minRange;
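The layering of the two convert() methods can be mimicked with a pair of stand-in classes. Everything here is a hypothetical simplification: the class names are invented, the embedded distribution is replaced by an arbitrary function from the unit interval to the unit interval, and only the delegation pattern is meant to mirror the snippets quoted above.

```java
/** Hypothetical stand-in for the distributing layer; the real class embeds a ContinuousDistribution. */
class SketchDistributionTransform {
    private final java.util.function.DoubleUnaryOperator quantile;

    SketchDistributionTransform(java.util.function.DoubleUnaryOperator quantile) {
        this.quantile = quantile;
    }

    /** Distributing step: driver in [0,1] becomes intermediate z in [0,1]. */
    double convert(double driver) {
        return quantile.applyAsDouble(driver);
    }
}

/** Hypothetical stand-in for the range-scaling layer. */
class SketchBoundedTransform extends SketchDistributionTransform {
    private final double minRange;
    private final double maxRange;

    SketchBoundedTransform(java.util.function.DoubleUnaryOperator quantile,
                           double minRange, double maxRange) {
        super(quantile);
        if (!(minRange < maxRange)) throw new IllegalArgumentException("minRange must be < maxRange");
        this.minRange = minRange;
        this.maxRange = maxRange;
    }

    /** Linear interpolation from the unit interval into the application range. */
    double interpolate(double factor) {
        return (maxRange - minRange) * factor + minRange;
    }

    /** Range scaling wraps the inherited distributing step. */
    @Override
    double convert(double driver) {
        return interpolate(super.convert(driver));
    }
}
```

A caller might construct `new SketchBoundedTransform(x -> x * x, 10, 20)` and call `convert(0.5)`. The inner function produces 0.25, and interpolation then places the result a quarter of the way from 10 to 20.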
TransformBase maintains a valid field to flag parameter changes. This field starts out false and reverts to false every time ContinuousBeta calls TransformBase.invalidate(). This happens with any change to alpha, beta, or itemCount. Any call to TransformBase.getDistribution() (and ContinuousDistributionTransform.convert() makes such a call) first creates the distribution if it does not already exist, then checks valid. If false, then getDistribution() calls validate(), which is abstract in TransformBase but whose implementation is made concrete by ContinuousBeta. And that particular implementation of validate() makes use of ContinuousDistribution.calculateBeta(alpha, beta, itemCount) to recalculate the succession of trapezoids.
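The invalidate-then-validate pattern amounts to lazy recalculation. The sketch below is a simplified illustration with invented names (LazyTransformSketch, LazyBetaSketch, and the recalculations counter are all hypothetical). A plain array of heights stands in for the ContinuousDistribution, and validate() returns the rebuilt table rather than updating an embedded object, which may differ from how the library does it.

```java
/** Hypothetical base class showing the valid flag and lazy validation. */
abstract class LazyTransformSketch {
    private boolean valid = false;   // starts out false, as described in the text
    private double[] table;          // stand-in for the embedded distribution

    /** Called by subclasses whenever a parameter changes. */
    protected void invalidate() {
        valid = false;
    }

    /** Subclasses rebuild the table from their current parameters. */
    protected abstract double[] validate();

    /** Recalculates only if a parameter changed since the last call. */
    double[] getDistribution() {
        if (!valid) {
            table = validate();
            valid = true;
        }
        return table;
    }
}

/** Hypothetical beta subclass; every setter invalidates the cached table. */
class LazyBetaSketch extends LazyTransformSketch {
    private double alpha = 1;
    private double beta = 1;
    private int itemCount = 200;
    int recalculations = 0;          // instrumentation for the demonstration only

    void setAlpha(double alpha) {
        if (!(alpha > 0)) throw new IllegalArgumentException("alpha must be > 0");
        this.alpha = alpha;
        invalidate();
    }

    void setBeta(double beta) {
        if (!(beta > 0)) throw new IllegalArgumentException("beta must be > 0");
        this.beta = beta;
        invalidate();
    }

    void setItemCount(int itemCount) {
        if (itemCount < 1) throw new IllegalArgumentException("itemCount must be positive");
        this.itemCount = itemCount;
        invalidate();
    }

    @Override
    protected double[] validate() {
        recalculations++;
        double[] heights = new double[itemCount + 1];
        for (int i = 0; i <= itemCount; i++) {
            double z = (double) i / itemCount;
            heights[i] = Math.pow(z, alpha - 1) * Math.pow(1 - z, beta - 1);
        }
        return heights;
    }
}
```

Setting alpha and beta back to back costs only one recalculation, because the table is rebuilt on the next getDistribution() call and not inside each setter.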
© Charles Ames | Page created: 2022-08-29 | Last updated: 2022-08-29 |