Discrete Build-It-Yourself Transform1

Introduction

The DiscreteWeighted transform adapts values from the driver domain to a bounded integer range, where the weight associated with each range value is user-specified. The DiscreteWeighted transform was implemented to satisfy a functional role for build-it-yourself distributions when none of the boilerplate transforms will do the job. However there is a specific use case in mind, and that is the procedure Iannis Xenakis used to select timbres for notes in his Stochastic Music Program.

The range of values output by DiscreteWeighted.convert() is controlled by one parameter implemented as a Java field, itemCount, a positive integer.

Each DiscreteWeighted instance internally maintains a DiscreteDistribution instance whose succession of items is supplied either by the package consumer or the end user. The first item's sample value is zero; each following item's sample value field equals its predecessor's value plus one, and the final item's sample value field is itemCount-1. The weight field for any item may never be negative, and not all weights may be zero.

The convert() method maps a value x in the driver domain from zero to unity into a value v in the application range from zero to itemCount-1 by delegating to DiscreteDistribution:

v = DiscreteDistribution.quantile(x);

Profile

Figure 1 illustrates the influence which DiscreteWeighted.convert() exerts over driver sequences. The vertical v axis shows the application range when itemCount = 7, which is the succession of integers from 0 to 6. The horizontal k axis shows the sample values vk which have been obtained from driver values xk using convert(). All three source sequences are nominally uniform. Each left-side sample graph presents 200 values; the right-side histogram presents a sidewise bar for each range value.

This panel was created using the same driver sources used for ContinuousUniform, which earlier panel provides a basis for comparison. The distribution was specified using the text string:

0 3 1 2 0 4 1

This distribution was chosen to illustrate the features of the DiscreteWeighted transform. The text above specifies weights for range values 0 to 6 giving an implict itemCount of 7. Specifying a weight of zero for value 0 and value 4 excludes these values from selection. Decimal fractions do not appear in this text (the proportions are clearer that way), but their use is not forbidden.


Figure 1: Panel of DiscreteWeighted output from three different Driver sources. Each row of graphs provides a time-series graph of samples (left) and a histogram analyzed from the same samples (right). The first row of graphs was generated using the standard random number generator. The second row was generated using the balanced-bit generator. The third row was generated using an ascending sequence of driver values, equally spaced from zero to unity.

The standard-random time-series graph (top row of Figure 1) has the same relative ups and downs as the standard-random time-series graph prepared for DiscreteUniform. The histograms for the three driver sequences all closely conform to the weights prescribed above. Such conformity is hardly guaranteed from standard randomness; however, consider this: Under ideal circumstances (e.g. those of the bottom row of the number of Figure 1), the number of samples required to represent the least-weighted sample just once can be calculated as the sum of weights (0+3+1+2+0+4+1 = 11) divided by the smallest weight (1). The 200 sample values of Figure 1) provide 200/11 = 18 opportunities to get the distribution right, so it should not be shocking if standard randomness here actually produces the distribution asked for.

The balanced-bit time-series (middle row of Figure 1) likewise has the same ups and downs as the balanced-bit time-series graph prepared for DiscreteUniform.

The time-series graph generated using ascending, equally spaced driver values (bottom row of Figure 1) presents the cumulative distribution function or CDF for the custom distribution described above. This is an irregular ascending step function with one step for each positively weighted range value. The horizontal width of the step is proportional to the range value's weight. The height of the step is two units when a zero-weighted value is passed over, and one unit otherwise. The one-unit gap at the bottom happens because sample value 0 has zero weight, and zero weight excludes a value from selection. The bottom-row histogram of sample values presents the distribution's probability density function or PDF. For the present distribution the histogram contains bars whose lengths are proportional to the range value's weight, and no bar at all when the weight is zero.

Summary statistics such as the average sample value and the standard deviation have little relevance for Figure 1, where the average is not situated at the peak of a single discernable hump. I therefore disabled the feature which plots the average as a dashed green line and the standard deviation as an enclosing region.

Coding

/**
 * The {@link DiscreteWeighted} class implements a discrete statistical transform based distribution of
 * values from 0 (inclusive) to <i><b>N</b></i> (exclusive) with non-uniform weights.
 * <p>
 * As a statistical transform, {@link DiscreteWeighted} does not itself employ probability or randomness.
 * Instead it responds to an externally generated driver sequence which may or may not be random.
 * {@link DiscreteDistribution#quantile(double)} converts the driver value to an outcome.
 * </p>
 * <p>
 * For more information including a graph mapping driver values to results and a second graph showing a random population, see <a href="http://www.jstor.org/stable/1513123"><i>A Catalog of Statistical Distributions</i>, 1991</a>.
 * </p>
 * @author Charles Ames
 */
public class DiscreteWeighted extends DiscreteDistributionTransform {
   /**
    * Constructor for {@link DiscreteWeighted} instances.
    * @param container An entity which contains this transform.
    */
   public DiscreteWeighted(WriteableEntity container) {
      super(container);
   }
   /**
    * Set the weights data from an array.
    * @param weights An array of non-negative double-precision numbers.
    * @throws IllegalArgumentException when any weight is negative.
    */
   public void setWeights(double[] weights) {
      super.addWeights(weights);
   }
   /**
    * Set the weights data from an collection.
    * @param weights An array of non-negative double-precision numbers.
    * @throws IllegalArgumentException when any weight is negative.
    */
   public void setWeights(List<Double> weights) {
      super.addWeights(weights);
   }
   /**
    * Set the weights data from an array of {@link ContourFromRealFromRatio} instances.
    * @param weights An array of {@link ContourFromRealFromRatio} instances whose abscissas are
    * always non-negative double-precision numbers.
    * @param ordinate Used to look up a specific weight from a contour.
    * @throws IllegalArgumentException when any weight is negative.
    */
   public void setWeights(ContourFromRealFromRatio[] weights, Ratio ordinate) {
      DiscreteDistribution distribution = getDistribution();
      distribution.initialize();
      int value = 0;
      for (ContourFromRealFromRatio contour : weights) {
         double weight = contour.valueAt(ordinate);
         distribution.addItem(value++, weight);
      }
      distribution.normalize();
   }
   /**
    * Set the weights data from an array of {@link ContourFromRealFromReal} instances.
    * @param weights An array of {@link ContourFromRealFromReal} instances whose abscissas are
    * always non-negative double-precision numbers.
    * @param ordinate Used to look up a specific weight from a contour.
    * @throws IllegalArgumentException when any weight is negative.
    */
   public void setWeights(ContourFromRealFromReal[] weights, Double ordinate) {
      DiscreteDistribution distribution = getDistribution();
      distribution.initialize();
      int value = 0;
      for (ContourFromRealFromReal contour : weights) {
         double weight = contour.valueAt(ordinate);
         distribution.addItem(value++, weight);
      }
      distribution.normalize();
   }
   /**
    * Set the weights data from a text string.
    * @param text A whitespace-delimited list of decimal numbers.
    */
   public void parseWeightsText(String text) {
      try {
         String[] tokens = text.split("\\s+");
         double[] weights = new double[tokens.length];
         for (int index = 0; index < tokens.length; index++) {
            weights[index] = Double.valueOf(tokens[index]);
         }
         setWeights(weights);
      }
      catch (RuntimeException e) {
         throw new IllegalArgumentException("Invalid weight vector [" + text + "]");
      }
   }
}
Listing 1: The DiscreteWeighted implementation class.

The type hierarchy for DiscreteWeighted is:

DiscreteDistributionTransform embeds a DiscreteDistribution which manages the succession of value-weight items.

The all-important convert() method is implemented by DiscreteDistributionTransform. DiscreteWeighted has no valid field to flag parameter changes. The individual weights are in fact the parameters and if you want to change those, you simply make a call to DiscreteWeighted.setWeights(). This method is itself simply a proxy for DiscreteDistributionTransform.addWeights().

The variants of setWeights() that accept ContourFromRealFromRatio arrays and ContourFromRealFromReal arrays are intended to facilitate statistical selection when weights vary with time or with some other parameter (e.g. what Xenakis did).

The parseWeightsText() method makes rough-and-ready use of String.split() to break up a space-delimited string of numbers into single-value text snippets. I am not totally comfortable with the way this method allocates snippets off the heap, keeps them around just long enough to extract a number, then abandons the snippets to garbage collection. If I thought this method were to be frequently used (and I do not), I would seek a parsing algorithm which was less heap-intensive.

Comments

  1. The present text is adapted from my Leonardo Music Journal article from 1991, "A Catalog of Statistical Distributions". The heading is "Discrete Distributions", p. 58.

© Charles Ames Page created: 2022-08-29 Last updated: 2022-08-29