A Small Algol 68 Project, Part 1

In memory of J. Kevin Douglas, a good friend and fellow fan of Algol 68

In the last article in this series, we looked at our first simple Algol 68 program:

1 BEGIN

2 first random(42);

3 REAL sum of randoms := 0.0;

4 WHILE sum of randoms < 10.0 DO

5 REAL r := next random;

6 sum of randoms +:= r;

7 print(("random number = ",r,"; sum = ",sum of randoms,new line))

8 OD

9 END

As we reviewed the program line-by-line to understand its meaning, we learned some Algol 68 terminology to describe what is going on in those nine lines of code.

And, I know I said “In the next article, we’ll dig into a few of the less obvious but still important details that make this all work. Then we can move on to see some more cool things that we can do in Algol 68”, but I’m going to postpone that until we have some more useful code examples developed.

Because in the end, this little program is about as interesting a demo of Algol 68 as a “Hello World”. Let’s do something a bit more useful for now.

I think it’s fair to say that Algol 68 reflects some of the interests of computer scientists of the 1960s. In particular, with its interest in handling arrays and looping, we can see the interest in numerical computation. So why not look at a simple approximation problem through the Algol 68 lens?

Vectors, matrices and least squares estimation

A common problem to solve for anyone handling data in the real world is using least squares estimation to produce a “best fit” of an equation to a set of data. Of course, in 2025, we can use a special purpose language like R or numerical statistics libraries called from general purpose languages like C or Java to solve this kind of problem; but for really simple problems, it’s kind of cool to know how to “do it yourself”.

[pmath size=12]S(f)(t)=a_{0}+sum{n=1}{+infty}{a_{n} cos(n omega t)+b_{n} sin(n omega t)}[/pmath]

Let’s imagine we have a text file in CSV format, with two columns containing x and y values, and we want to fit those values with a straight line so that we minimize the error of approximating the point values with the straight line.

We know the equation of a line is:

y = m · x + b

where m is the slope of the line and b is the y-intercept.

If there are only two points in our CSV file, we know we can fit a line to them exactly. However, if there are three or more points, then we need to choose values for m and b to “best fit” those x and y values. If we rewrite the above equation slightly as:

m · x + b · 1 = y

That is, m is the coefficient of x, b the coefficient of 1 (I know that might sound a bit weird but bear with me here).

And we think of our CSV file as having n+1 lines, looking like:

X-value,Y-value

x₁,y₁

x₂,y₂

x₃,y₃

…

x_n,y_n

then we can see that this leads to a system of linear equations that looks like:

m · x₁ + b · 1 = y₁

m · x₂ + b · 1 = y₂

m · x3 + b · 1 = y₃

…

m · x_n + b · 1 = y_n

which we can write in matrix form as

X c = y

where X is the matrix:

	x₁	1
	x₂	1
	x₃	1
	…
	x_n	1

c is the vector of coefficients we’re solving for:

	m
	c

and y is the vector:

	y₁
	y₂
	y₃
	…
	y_n

You remember this from your high school or undergrad linear algebra course, right? The hard part is done, now I just have to wave my hands a bit and tell you that:

This is what is called an “overdetermined” system of equations;
We can solve this problem in a least squares fashion by premultiplying both sides of the equation by the transpose of the matrix X, which we denote as X^T

That is,

X^T X c = X^T y

What this means is that, by forming the products of X^TX and X^Ty, we will end up with a two-rows by two-columns matrix and a two-row vector respectively, with which we can solve for the two-row vector c.

Good so far? Awesome, let’s plan the tasks we need to handle. I should mention that this excellent article from Oxford University Press will dispel the hand-waving used in my explanation above.

Steps to solve the linear least squares matrix formulation

We need t the following:

A test data set;
Some utilities to let us read CSV files;
The least squares approximation program that reads the CSV file, calculates XTX and XTy and solves for c

We’ll finish up this article by writing a test data set generator.

Test data generator

Recall the equation of a line given previously:

y = m · x + b

Let’s create a set of points that lie along the line that joins x = 0, y = 1 with x = 9, y = 10. This corresponds to a specific line with the equation:

y = x + 1

Furthermore, let’s use the random number generator described in the last article and shown at the top of this article to add a small random offset to the y value in our generated points. With enough points, we intuitively feel that the least squares fit to the points should give values to m and b similar to the above.

Here’s the code:

1 BEGIN

2 first random(42);

3 REAL x1 = 0.0, y1 = 1.0, x2 = 9.0, delta = 1.0 / 64;

4 REAL x := x1, y := y1;

5 print(("X-value", ",", "Y-value", new line));

6 WHILE x <= x2 DO

7 REAL y adj = y + (next random - 0.5) / 8.0;

8 print((fixed(x,7,5), ",", fixed(y adj,7,5), new line));

9 x +:= delta;

10 y +:= delta

11 OD

12 END

We won’t go through this line by line as there isn’t much that is different than in the previous program. However:

In line 3, we see the beginning x and y values and the end x value (we don’t need the end y value as we’re going to just test for x reaching the end), and a delta value, used to increment x and y, of 1/64. These are all constants.

In line 7 we see the y value being adjusted by the “next random” value, which falls between 0 and 1, less -0.5, which then falls between -0.5 and +0.5, and finally divided by 8 so that the random adjustment is never greater than 1/16 or less than -1/16.

We can run this with Algol 68 Genie in a Linux terminal window as:

$ a68g a2.a68

X-value,Y-value

+.00000,+.97022

+.01562,+.95618

+.03125,+1.0211

…

+8.9688,+9.9956

+8.9844,+10.012

+9.0000,+10.057

$

Of course, we’ll want to redirect this to a file such as “test_data.txt”.

In the next article, we’ll work on a few useful utilities to handle CSV files.

InAlgol 68

About those livesys error messages