Discrete Bernoulli Trial¹

Introduction

The DiscreteTrial transform adapts values from the driver domain to the binary integer range {0, 1}. The weight associated with range value 1 is user-specified. The weight associated with range value 0 is one minus that.

The DiscreteTrial transform models the random binary trial around which Jakob Bernoulli built his urn model. Here range value 1 indicates success and range value 0 indicates failure. The urn is filled with a mixture of N white balls and M black balls. The balls are mixed around, and one is drawn out. If the ball is white, the trial succeeds. If the ball is black, the trial fails. The controlling parameter p = N/(N+M) gives the probability of success, and 1-p = M/(N+M) gives the probability of failure.

The proportion of 1's versus 0's output by DiscreteTrial.convert() is controlled by one parameter implemented as a Java field, weight, a double-precision number ranging from zero (always fails) to unity (always succeeds).
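As a rough illustration of how weight governs the outcome, the sketch below maps a driver value in the range from zero to unity onto 0 or 1 by comparing it against 1 - weight. The class TrialSketch and its method trial() are hypothetical stand-ins, not part of the library, and the handling of a driver value that lands exactly on the boundary is an assumption.

```java
import java.util.Random;

// Hypothetical stand-in for DiscreteTrial.convert(); not the library's code.
class TrialSketch {
    /**
     * Maps a driver value to a Bernoulli outcome.
     * Assumption: drivers below (1 - weight) select 0; all others select 1.
     */
    static int trial(double driver, double weight) {
        if (!(0.0 <= weight && weight <= 1.0))
            throw new IllegalArgumentException("Weight not in range from zero to unity");
        return (driver < 1.0 - weight) ? 0 : 1;
    }

    public static void main(String[] args) {
        Random random = new Random(1L); // any nominally uniform driver source would do
        int count = 10000;
        int successes = 0;
        for (int k = 0; k < count; k++) {
            successes += trial(random.nextDouble(), 0.333);
        }
        // The proportion of 1's should land near the weight.
        System.out.println("Proportion of 1's: " + (double) successes / count);
    }
}
```

With weight at zero the comparison always holds, so every trial fails; with weight at unity it never holds, so every trial succeeds.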

Profile

Figure 1 illustrates the influence that DiscreteTrial.convert() exerts over driver sequences when weight = 0.333. The vertical v axis ranges from 0 to 2; that is, the application range {0, 1} plus the number of outcomes, 2. The horizontal k axis indexes the sample values v_k, which have been obtained from driver values x_k using convert(). Each left-side sample graph presents 200 values; the right-side histogram presents a sidewise bar for each range value.

The source sequences used to create Figure 1 are the same sequences used to create the profile panel for ContinuousUniform. That transform, being both continuous and uniform, passes its driver values through undecorated, so you can view the actual source sequences in that panel. All three source sequences are nominally uniform. The first source is standard randomness from Lehmer. The second source is balanced-bit values from Balance. The third source is an ascending succession produced using DriverSequence.


Figure 1: Panel of DiscreteTrial output from three different Driver sources. Each row of graphs provides a time-series graph of samples (left) and a histogram analyzed from the same samples (right). The first row of graphs was generated using the standard random number generator. The second row was generated using the balanced-bit generator. The third row was generated using an ascending sequence of driver values, equally spaced from zero to unity.

Following the Driver/Transform design, DiscreteTrial delegates the random component of the trial to an external driver, which would traditionally be the standard random number generator wrapped by Lehmer. This is what happens in the standard-random time-series graph presented as the top row of Figure 1. The second row of graphs swaps out randomness for the fair-share principle.

The standard-random time-series graph has the same relative ups and downs as the standard-random time-series graph prepared for DiscreteUniform. The histograms for the three driver sequences all closely conform to the weights prescribed above. Such conformity is hardly guaranteed from standard randomness; however, consider this: Under ideal circumstances (e.g. those of the bottom row of Figure 1), the number of samples required to represent the least-weighted outcome just once can be calculated as the sum of weights (0.667+0.333 = 1) divided by the smallest weight (0.333), which comes to about 3. The 200 sample values of Figure 1 thus provide 200/3 ≈ 67 opportunities to get the distribution right, so it should not be shocking if standard randomness here actually produces the distribution asked for. In truth the calculated average of 0.330 differs from weight = 0.333 by less than 1%. The standard deviation of 0.470 around this average exceeds the average's distance above zero.
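The arithmetic in the preceding paragraph can be checked with a few lines of Java. This is only a sketch: the class TrialStatistics and its methods are hypothetical, and the deviation uses the population formula for a sequence of 0's and 1's, sqrt(m(1-m)) for average m, which is assumed to be how the reported figure was obtained.

```java
// Hypothetical helper for checking the figures quoted in the text.
class TrialStatistics {
    /** Samples needed, under ideal circumstances, for the least-weighted outcome to appear once. */
    static double samplesPerLeastWeighted(double weight) {
        double sum = weight + (1.0 - weight);
        double smallest = Math.min(weight, 1.0 - weight);
        return sum / smallest;
    }

    /** Population standard deviation of a 0/1 sequence whose average is mean. */
    static double deviation(double mean) {
        return Math.sqrt(mean * (1.0 - mean));
    }

    public static void main(String[] args) {
        double span = samplesPerLeastWeighted(0.333);
        System.out.printf("Samples per least-weighted outcome: %.3f%n", span);
        System.out.printf("Opportunities in 200 samples: %.1f%n", 200.0 / span);
        System.out.printf("Deviation for average 0.330: %.3f%n", deviation(0.330));
    }
}
```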

The balanced-bit time-series (middle row of Figure 1) likewise has the same ups and downs as the balanced-bit time-series graph prepared for DiscreteUniform. The calculated average was again 0.330 and the standard deviation again 0.470.

The time-series graph generated using ascending, equally spaced driver values (bottom row of Figure 1) presents the cumulative distribution function or CDF for the distribution described above. This is an irregular ascending step function with just two steps. The horizontal width of each step is proportional to the range value's weight. The rise between steps is one unit. The bottom-row histogram of sample values presents the distribution's probability density function or PDF.
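The two-step shape can be reproduced with a short sweep. The sketch below is an assumption-laden illustration rather than the code behind Figure 1: the class AscendingSweep is hypothetical, the drivers are taken to be k/count for k = 0 to count-1, and the mapping sends drivers below 1 - weight to 0.

```java
// Hypothetical sweep over ascending drivers; not the code that produced Figure 1.
class AscendingSweep {
    /** Assumed mapping: drivers below (1 - weight) select 0; all others select 1. */
    static int trial(double driver, double weight) {
        return (driver < 1.0 - weight) ? 0 : 1;
    }

    /** Tallies outcomes for count equally spaced drivers k/count, k = 0 .. count-1. */
    static int[] histogram(int count, double weight) {
        int[] tallies = new int[2];
        for (int k = 0; k < count; k++) {
            double driver = (double) k / count;
            tallies[trial(driver, weight)]++;
        }
        return tallies;
    }

    public static void main(String[] args) {
        int[] tallies = histogram(200, 0.333);
        // The width of each step, in samples, tracks the weight of its range value.
        System.out.println("Samples at 0: " + tallies[0]);
        System.out.println("Samples at 1: " + tallies[1]);
    }
}
```

Because the drivers ascend, all of the 0's come first and all of the 1's follow, so the two tallies are also the widths of the two steps.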

Coding

/**
 * The {@link DiscreteTrial} class implements a discrete statistical transform based on the notion of a Bernoulli trial.
 * The Bernoulli trial was originally conceived by Jakob Bernoulli.
 * It is a random experiment with two outcomes:  success and failure, along with a probability p of success.
 * For example, to experience a trial with p=7/10 you can fill an urn with seven white balls and three black balls,
 * close your eyes, mix the balls around, and select one ball.
 * If the ball is white, the trial has succeeded.
 * If the ball is black, the trial has failed.
 * The {@link #weight} property of the {@link DiscreteTrial} class stands in for the probability of success p.
 * As a statistical transform, {@link DiscreteTrial} does not itself employ probability or randomness.
 * Instead it responds to an externally generated driver sequence which may or may not be random.
 * {@link DiscreteDistribution#quantile(double)} converts the driver value to an outcome.
 * @author Charles Ames
 */
public class DiscreteTrial extends DiscreteDistributionTransform {
   /**
    * Determines the success rate for a trial.
    */
   private double weight;
   /**
    * Constructor for {@link DiscreteTrial} instances.
    * @param container An entity which contains this transform.
    */
   public DiscreteTrial(WriteableEntity container) {
      super(container);
      this.weight = Double.NaN;
   }
   /**
    * Getter for {@link #weight}.
    * @return The assigned {@link #weight} value.
    * @throws UninitializedException when {@link #weight} has not been initialized.
    */
   public double getWeight() {
      if (Double.isNaN(weight)) throw new UninitializedException("Weight not initialized");
      return weight;
   }
   /**
    * Setter for {@link #weight}.
    * @param weight The intended {@link #weight} value.
    * @return True if {@link #weight} has changed; false otherwise.
    */
   public boolean setWeight(double weight) {
      checkWeight(weight);
      if (this.weight != weight) {
         this.weight = weight;
         invalidate();
         makeDirty();
         return true;
      }
      return false;
   }
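   /**
    * Check if the indicated value is suitable for {@link #weight}.
    * @param weight The indicated value.
    * @throws IllegalArgumentException when the indicated value is NaN or falls outside the range from zero to unity.
    */
   public void checkWeight(double weight) {
      // The negated form also rejects NaN, which would slip past two separate comparisons.
      if (!(0. <= weight && weight <= 1.))
         throw new IllegalArgumentException("Weight not in range from zero to unity");
   }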
   @Override
   protected void validate(DistributionBase<Integer> distribution) {
      ((DiscreteDistribution) distribution).calculateBinomial(1, getWeight());
   }
}
Listing 1: The DiscreteTrial implementation class.

The type hierarchy for DiscreteTrial is:

DiscreteDistributionTransform embeds a DiscreteDistribution which manages the succession of value-weight items.

Each DiscreteTrial instance internally maintains a DiscreteDistribution instance whose succession of items is populated by the call to DiscreteDistribution.calculateBinomial() in method DiscreteTrial.validate(). This call to calculateBinomial() creates two items. The first item has sample value 0 and weight 1.0−DiscreteTrial.weight. The second item has sample value 1 and weight DiscreteTrial.weight.

The distributing step of conversion happens in DiscreteDistributionTransform, where the convert() method does this:

return getDistribution().quantile(driver);

TransformBase maintains a valid field to flag parameter changes. This field starts out false and reverts to false every time DiscreteTrial calls TransformBase.invalidate(), which happens with any change to weight. Any call to TransformBase.getDistribution() (and DiscreteDistributionTransform.convert() makes such a call) first creates the distribution if it does not already exist, then checks valid. If valid is false, then getDistribution() calls validate(), which is abstract to TransformBase but whose implementation is made concrete by DiscreteTrial. That particular implementation of validate() makes use of DiscreteDistribution.calculateBinomial(1, getWeight()) to recalculate the two distribution items.
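The lazy-validation pattern described here can be sketched in a self-contained form. Everything in the sketch is hypothetical: LazyTransform stands in for TransformBase, LazyTrial for DiscreteTrial, a two-element array for the DiscreteDistribution items, and the validations counter exists only to show when recalculation happens. The quantile rule in convert() is likewise an assumption.

```java
// Hypothetical sketch of lazy validation; names are illustrative, not the library's.
class LazyValidationSketch {
    /** Stand-in for TransformBase: owns the valid flag and the cached items. */
    static abstract class LazyTransform {
        private double[] items;        // item weights, indexed by sample value
        private boolean valid = false; // starts out false

        protected void invalidate() { valid = false; }

        protected double[] getDistribution() {
            if (items == null) items = new double[2]; // create on first use
            if (!valid) {
                validate(items);                      // recalculate only when stale
                valid = true;
            }
            return items;
        }

        protected abstract void validate(double[] items);

        /** Assumed quantile rule: drivers below the weight of item 0 select 0. */
        public int convert(double driver) {
            return (driver < getDistribution()[0]) ? 0 : 1;
        }
    }

    /** Stand-in for DiscreteTrial: supplies the concrete validate(). */
    static class LazyTrial extends LazyTransform {
        private double weight = Double.NaN;
        int validations = 0; // illustration only: counts recalculations

        public boolean setWeight(double weight) {
            if (!(0.0 <= weight && weight <= 1.0))
                throw new IllegalArgumentException("Weight not in range from zero to unity");
            if (this.weight != weight) {
                this.weight = weight;
                invalidate();
                return true;
            }
            return false;
        }

        @Override
        protected void validate(double[] items) {
            if (Double.isNaN(weight)) throw new IllegalStateException("Weight not initialized");
            items[0] = 1.0 - weight; // sample value 0
            items[1] = weight;       // sample value 1
            validations++;
        }
    }

    public static void main(String[] args) {
        LazyTrial trial = new LazyTrial();
        trial.setWeight(0.333);
        trial.convert(0.25);
        trial.convert(0.75);
        System.out.println("Validations after two conversions: " + trial.validations);
        trial.setWeight(0.5);
        trial.convert(0.75);
        System.out.println("Validations after changing weight: " + trial.validations);
    }
}
```

The point of the design is that the items are recalculated at most once per parameter change, no matter how many conversions follow.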

Comments

  1. The present text is adapted from my Leonardo Music Journal article from 1991, "A Catalog of Statistical Distributions". The heading is "Bernoulli", p. 58.

© Charles Ames Page created: 2022-08-29 Last updated: 2022-08-29