ChartDirector 6.0 (Perl Edition)

Histogram with Bell Curve




The example demonstrates creating a histogram with a bell curve.

A histogram is a chart that plots the distribution of numerical data, typically by plotting the count of objects that fall within each of a set of data ranges. Bars are the most common representation, as a bar can represent the count with its height, and the data range with its position and width.
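As a rough illustration of the counting step, the sketch below sorts a handful of values into slots of equal width using core Perl only. The sample values and the variable names are made up for this illustration, and POSIX::floor is used for brevity (the listing further down defines its own floor function instead).

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical sample data, chosen only for illustration.
my @samples  = (91, 97, 98, 102, 103, 104, 109, 113);
my $slotSize = 5;

# Extend the data range outwards to the nearest slot boundaries.
my ($min, $max) = (sort { $a <=> $b } @samples)[0, -1];
my $minX = floor($min / $slotSize) * $slotSize;
my $maxX = floor($max / $slotSize) * $slotSize + $slotSize;

# One counter per slot; slot k covers [minX + k*slotSize, minX + (k+1)*slotSize).
my $slotCount = ($maxX - $minX) / $slotSize;
my @frequency = (0) x $slotCount;
$frequency[ floor(($_ - $minX) / $slotSize) ]++ for @samples;

print join(", ", @frequency), "\n";   # prints "1, 2, 3, 1, 1"
```

Each count becomes the height of one bar, and the slot boundaries give the bar its position and width.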

One of the most common distributions is the normal distribution, so a normal distribution curve (also known as the bell curve) is often added to the chart.
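For the curve to line up with the bars, it has to be scaled from a probability density to a count. A minimal sketch of that scaling, assuming N samples, slot width w, mean m and standard deviation s: the expected count in a slot centred at x is approximately N * w * pdf(x). The function name expected_count and the input values are hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical inputs for illustration.
my ($n, $slotSize, $mean, $stdDev) = (200, 5, 100, 10);
my $pi = 4 * atan2(1, 1);

# Approximate expected count in a slot centred at $x: N * w * pdf(x),
# where pdf is the normal density with the given mean and standard deviation.
sub expected_count {
    my ($x) = @_;
    my $z = ($x - $mean) / $stdDev;
    return $n * $slotSize * exp(-$z * $z / 2) / ($stdDev * sqrt(2 * $pi));
}

printf "%.2f\n", expected_count($mean);             # prints 39.89
printf "%.2f\n", expected_count($mean + $stdDev);   # prints 24.20
```

The factor N * w / (s * sqrt(2 * pi)) is the peak height of the scaled curve, which is why the curve and the tallest bars end up at a similar height.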

In this example, the histogram is drawn using a bar layer (BarLayer), and the normal distribution curve using a spline layer (SplineLayer). About half of the code computes the data to be passed to the bar layer and spline layer, and the other half creates the chart. The ArrayMath utility class is used to obtain the max, min, mean and standard deviation, which simplifies the computation code.
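For readers without ChartDirector at hand, the mean and standard deviation can also be computed in core Perl. The sketch below uses the population formula (dividing by N); this page does not state whether ArrayMath divides by N or by N - 1, so treat that choice as an assumption. The data values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical data set, chosen so the results are round numbers.
my @data = (2, 4, 4, 4, 5, 5, 7, 9);

# Arithmetic mean.
my $sum = 0;
$sum += $_ for @data;
my $mean = $sum / @data;

# Population standard deviation: square root of the mean squared deviation.
my $sumSq = 0;
$sumSq += ($_ - $mean) ** 2 for @data;
my $stdDev = sqrt($sumSq / @data);

printf "mean = %.1f, stdDev = %.1f\n", $mean, $stdDev;   # prints "mean = 5.0, stdDev = 2.0"
```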

Source Code Listing

[Standalone Version] perldemo\histogram.pl
#!/usr/bin/perl

# In the sample code, the ChartDirector for Perl module is assumed to be in "../lib"
use File::Basename;
use lib (dirname($0)."/../lib") =~ /(.*)/;

use perlchartdir;

#
# This example demonstrates creating a histogram with a bell curve from raw data. About half of the
# code is to sort the raw data into slots and to generate the points on the bell curve. The
# remaining half of the code is the actual charting code.
#

# Generate a random Gaussian distributed data series as the input data for this example.
my $r = new RanSeries(66);
my $samples = $r->getGaussianSeries(200, 100, 10);

#
# Classify the numbers into slots. In this example, the slot width is 5 units.
#
my $slotSize = 5;

# Define our own floor function to avoid using the huge POSIX module just for this function.
sub floor_func { my $ret = int($_[0]); $ret - (($ret > $_[0]) ? 1 : 0); }

# Compute the min and max values, and extend them to the slot boundary.
my $m = new ArrayMath($samples);
my $minX = floor_func($m->min() / $slotSize) * $slotSize;
my $maxX = floor_func($m->max() / $slotSize) * $slotSize + $slotSize;

# We can now determine the number of slots
my $slotCount = int(($maxX - $minX + 0.5) / $slotSize);
my $frequency = [(0) x $slotCount];

# Count the data points contained in each slot
for(my $i = 0; $i < scalar(@$samples); ++$i) {
    my $slotIndex = int(($samples->[$i] - $minX) / $slotSize);
    $frequency->[$slotIndex] = $frequency->[$slotIndex] + 1;
}

#
# Compute Normal Distribution Curve
#

# The mean and standard deviation of the data
my $mean = $m->avg();
my $stdDev = $m->stdDev();

# The normal distribution curve (bell curve) is a standard statistics curve. We need to vertically
# scale it to make it proportional to the frequency count. The constant 6.2832 approximates 2 * pi.
my $scaleFactor = $slotSize * scalar(@$samples) / $stdDev / sqrt(6.2832);

# In this example, we plot the bell curve up to 3 standard deviations.
my $stdDevWidth = 3.0;

# We generate 4 points per standard deviation to be joined with a spline curve.
my $bellCurveResolution = int($stdDevWidth * 4 + 1);
my $bellCurve = [(0) x $bellCurveResolution];
for(my $i = 0; $i < $bellCurveResolution; ++$i) {
    my $z = 2 * $i * $stdDevWidth / ($bellCurveResolution - 1) - $stdDevWidth;
    $bellCurve->[$i] = exp(-$z * $z / 2) * $scaleFactor;
}

#
# At this stage, we have obtained all data and can plot the chart.
#

# Create an XYChart object of size 600 x 360 pixels
my $c = new XYChart(600, 360);

# Set the plotarea at (50, 30) and of size 500 x 300 pixels, with transparent background and border
# and light grey (0xcccccc) horizontal grid lines
$c->setPlotArea(50, 30, 500, 300, $perlchartdir::Transparent, -1, $perlchartdir::Transparent,
    0xcccccc);

# Display the mean and standard deviation on the chart

$c->addTitle(sprintf("Mean = %s, Standard Deviation = %s", $c->formatValue($mean, "{value|1}"),
    $c->formatValue($stdDev, "{value|2}")), "arial.ttf");


# Set the x and y axis label font to 12pt Arial
$c->xAxis()->setLabelStyle("arial.ttf", 12);
$c->yAxis()->setLabelStyle("arial.ttf", 12);

# Set the x and y axis stems to transparent, and the x-axis tick color to grey (0x888888)
$c->xAxis()->setColors($perlchartdir::Transparent, $perlchartdir::TextColor,
    $perlchartdir::TextColor, 0x888888);
$c->yAxis()->setColors($perlchartdir::Transparent);

# Draw the bell curve as a spline layer in red (0xdd0000) with 2-pixel line width
my $bellLayer = $c->addSplineLayer($bellCurve, 0xdd0000);
$bellLayer->setXData2($mean - $stdDevWidth * $stdDev, $mean + $stdDevWidth * $stdDev);
$bellLayer->setLineWidth(2);

# Draw the histogram as bars in blue (0x6699bb) with dark blue (0x336688) border
my $histogramLayer = $c->addBarLayer($frequency, 0x6699bb);
$histogramLayer->setBorderColor(0x336688);
# The centers of the bars span from minX + half_bar_width to maxX - half_bar_width
$histogramLayer->setXData2($minX + $slotSize / 2.0, $maxX - $slotSize / 2.0);
# Configure the bars to touch each other with no gap in between
$histogramLayer->setBarGap($perlchartdir::TouchBar);
# Use rounded corners for decoration
$histogramLayer->setRoundedCorners();

# By default, ChartDirector extends the x-axis scale by 0.5 unit to allow for the bar width. This
# is because a bar plotted at x actually occupies (x +/- half_bar_width), and the bar width is
# normally 1 for a label based x-axis. However, this chart uses a linear x-axis rather than a
# label based one, so we disable the automatic extension and add a dummy layer to extend the
# x-axis scale to cover minX to maxX.
$c->xAxis()->setIndent(0);
$c->addLineLayer2()->setXData($minX, $maxX);

# For the automatic y-axis labels, set the minimum spacing to 40 pixels.
$c->yAxis()->setTickDensity(40);

# Output the chart
$c->makeChart("histogram.png");

[CGI Version] perldemo_cgi\histogram.pl
#!/usr/bin/perl

# In the sample code, the ChartDirector for Perl module is assumed to be in "../lib"
use File::Basename;
use lib (dirname($0)."/../lib") =~ /(.*)/;

use perlchartdir;

#
# This example demonstrates creating a histogram with a bell curve from raw data. About half of the
# code is to sort the raw data into slots and to generate the points on the bell curve. The
# remaining half of the code is the actual charting code.
#

# Generate a random Gaussian distributed data series as the input data for this example.
my $r = new RanSeries(66);
my $samples = $r->getGaussianSeries(200, 100, 10);

#
# Classify the numbers into slots. In this example, the slot width is 5 units.
#
my $slotSize = 5;

# Define our own floor function to avoid using the huge POSIX module just for this function.
sub floor_func { my $ret = int($_[0]); $ret - (($ret > $_[0]) ? 1 : 0); }

# Compute the min and max values, and extend them to the slot boundary.
my $m = new ArrayMath($samples);
my $minX = floor_func($m->min() / $slotSize) * $slotSize;
my $maxX = floor_func($m->max() / $slotSize) * $slotSize + $slotSize;

# We can now determine the number of slots
my $slotCount = int(($maxX - $minX + 0.5) / $slotSize);
my $frequency = [(0) x $slotCount];

# Count the data points contained in each slot
for(my $i = 0; $i < scalar(@$samples); ++$i) {
    my $slotIndex = int(($samples->[$i] - $minX) / $slotSize);
    $frequency->[$slotIndex] = $frequency->[$slotIndex] + 1;
}

#
# Compute Normal Distribution Curve
#

# The mean and standard deviation of the data
my $mean = $m->avg();
my $stdDev = $m->stdDev();

# The normal distribution curve (bell curve) is a standard statistics curve. We need to vertically
# scale it to make it proportional to the frequency count. The constant 6.2832 approximates 2 * pi.
my $scaleFactor = $slotSize * scalar(@$samples) / $stdDev / sqrt(6.2832);

# In this example, we plot the bell curve up to 3 standard deviations.
my $stdDevWidth = 3.0;

# We generate 4 points per standard deviation to be joined with a spline curve.
my $bellCurveResolution = int($stdDevWidth * 4 + 1);
my $bellCurve = [(0) x $bellCurveResolution];
for(my $i = 0; $i < $bellCurveResolution; ++$i) {
    my $z = 2 * $i * $stdDevWidth / ($bellCurveResolution - 1) - $stdDevWidth;
    $bellCurve->[$i] = exp(-$z * $z / 2) * $scaleFactor;
}

#
# At this stage, we have obtained all data and can plot the chart.
#

# Create an XYChart object of size 600 x 360 pixels
my $c = new XYChart(600, 360);

# Set the plotarea at (50, 30) and of size 500 x 300 pixels, with transparent background and border
# and light grey (0xcccccc) horizontal grid lines
$c->setPlotArea(50, 30, 500, 300, $perlchartdir::Transparent, -1, $perlchartdir::Transparent,
    0xcccccc);

# Display the mean and standard deviation on the chart

$c->addTitle(sprintf("Mean = %s, Standard Deviation = %s", $c->formatValue($mean, "{value|1}"),
    $c->formatValue($stdDev, "{value|2}")), "arial.ttf");


# Set the x and y axis label font to 12pt Arial
$c->xAxis()->setLabelStyle("arial.ttf", 12);
$c->yAxis()->setLabelStyle("arial.ttf", 12);

# Set the x and y axis stems to transparent, and the x-axis tick color to grey (0x888888)
$c->xAxis()->setColors($perlchartdir::Transparent, $perlchartdir::TextColor,
    $perlchartdir::TextColor, 0x888888);
$c->yAxis()->setColors($perlchartdir::Transparent);

# Draw the bell curve as a spline layer in red (0xdd0000) with 2-pixel line width
my $bellLayer = $c->addSplineLayer($bellCurve, 0xdd0000);
$bellLayer->setXData2($mean - $stdDevWidth * $stdDev, $mean + $stdDevWidth * $stdDev);
$bellLayer->setLineWidth(2);

# Draw the histogram as bars in blue (0x6699bb) with dark blue (0x336688) border
my $histogramLayer = $c->addBarLayer($frequency, 0x6699bb);
$histogramLayer->setBorderColor(0x336688);
# The centers of the bars span from minX + half_bar_width to maxX - half_bar_width
$histogramLayer->setXData2($minX + $slotSize / 2.0, $maxX - $slotSize / 2.0);
# Configure the bars to touch each other with no gap in between
$histogramLayer->setBarGap($perlchartdir::TouchBar);
# Use rounded corners for decoration
$histogramLayer->setRoundedCorners();

# By default, ChartDirector extends the x-axis scale by 0.5 unit to allow for the bar width. This
# is because a bar plotted at x actually occupies (x +/- half_bar_width), and the bar width is
# normally 1 for a label based x-axis. However, this chart uses a linear x-axis rather than a
# label based one, so we disable the automatic extension and add a dummy layer to extend the
# x-axis scale to cover minX to maxX.
$c->xAxis()->setIndent(0);
$c->addLineLayer2()->setXData($minX, $maxX);

# For the automatic y-axis labels, set the minimum spacing to 40 pixels.
$c->yAxis()->setTickDensity(40);

# Output the chart
binmode(STDOUT);
print "Content-type: image/png\n\n";
print $c->makeChart2($perlchartdir::PNG);