Jan Krüger's blog

Creative Engineering and randomness

HOWTO: encode a string into a complicated-looking trigonometric function

· Read in about 5 min · (952 words)
Hack Waste of time

Today’s glance at reddit.com yielded a blog posting by a fellow who calls himself “Poromenos” and who recently wasted his day by designing a function made up of sines and cosines that encode the string “Hello world!". “Hey”, I immediately thought, “I can do that too! I’m an expert at wasting my day, after all.” Only I decided to go a step further and write a program that generates this sort of function. I’m lazy, remember?

The theory

The magic behind Poromenos’s function is the Fourier theorem: any “mostly” continuous and periodic function can be expressed as a sum of sines and cosines. I’m not going to bore you with the details; suffice to say that this also works for sampled functions, i.e. discrete series of values.

There’s an algorithm called DFT (Discrete Fourier Transform) that takes a series of N complex sample values and generates a corresponding Fourier series which encodes the various sine/cosine coefficients in N complex output values. In the special case of real input values (which is an extremely common case), you can effectively throw away half of these output values and take the remaining N real/imaginary components, do a bit of magic, and end up with coefficients for a function of the form:

Now, \(f(\frac{2 \pi n}{N})\) returns exactly the \((n+1)\)th character of the original string.

Magic?

Glad you ask. No, it’s not really magic. In fact, I used the trigonometric interpolation polynomial from Wikipedia’s article on the Discrete Fourier transform (remember: it’s science if you copied it from Wikipedia!):

Here, \(X_i\) are the complex values of the Fourier series, and \(e^{\pm it}\) transforms to \(\cos t \pm i \sin t\) according to Euler’s formula. The lone \(\cos\) term in the middle doesn’t show up if \(N\) is odd.

Now, if the Discrete Fourier Transform is fed with real values, it holds that

or, in other words, the right half of the series is equal to the complex conjugates (\((a + i b)^\star = a - i b\)) of the left half in reverse order. Due to that, \(p(t)\) gets a lot simpler after a bit of reduction:

Again, the last term disappears for odd \(N\).

There we are, all magic has now been reduced to more or less tangible math.

The program

I’ve written a small C program that takes as its input an arbitrary string and outputs the above function, modified a bit so that you can use the character index directly, i.e. f(4) gives you the fifth character of the original string.

The generated function will typically have an error of less than 0.1 at each position (which goes away by rounding the values as Poromenos does it in his code). In fact, for the string “Hello world!”, the mean square error is less than 0.0019.

Because this isn’t exactly a serious project, I didn’t spend any time adding stuff like UTF-8 support. You’ll have to resign yourself to using ASCII or any 8-bit-per-character encoding.

Compiling the code requires that you have fftw3 installed (including header files).

You can download the code or look at it in all its syntax-highlighted glory:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <complex.h>
#include <math.h>
#include <fftw3.h>

/*
 * fourierfit.c:
 * Fits a series of sines/cosines onto a series of character values from a
 * string.
 *
 * Copyright (c) 2008, Jan Krueger <jast@heapsort.de>
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */

/*
 * Compile like this:
 * cc -o fourierfit -lm -lfftw3 fourierfit.c
 */

/*
 * Precision of output function (in digits after decimal point).
 * Yes, it's a string. printf much?
 */
#define PRECISION "3"
/* Names of cos/sin functions to use in output */
#define COSNAME "cos"
#define SINNAME "sin"
/* Name of function variable */
#define FNAME "t"

/* Don't change below here etc. */

#define PREC_FORMAT "%."PRECISION"f"
const char *cos_format = " %c "PREC_FORMAT"*"COSNAME"("PREC_FORMAT"*"FNAME
	")";
const char *sin_format = " %c "PREC_FORMAT"*"SINNAME"("PREC_FORMAT"*"FNAME
	")";

void format_term(const char *f, double d, int p, int N)
{
	printf(f, (d >= 0 ? '+' : '-'),
		fabs(d), /* scale of function */
		(double)p * M_PI * 2.0 / (double)N /* scale of parameter */
	);
}

int main(int argc, char **argv)
{
	int N;
	int odd;
	int i;
	double *in;
	fftw_complex *out;
	fftw_plan p;
	if (argc != 2) {
		puts("Syntax: fourierfit <string>");
		exit(1);
	}
	N = strlen(argv[1]);
	odd = N % 2;

	in = fftw_malloc(sizeof (double) * N);
	out = fftw_malloc(sizeof (fftw_complex) * N);

	p = fftw_plan_dft_r2c_1d(N, in, out,
		FFTW_ESTIMATE | FFTW_DESTROY_INPUT);

	/* Fill in array */
	for (i = 0; i < N; i++) {
		in[i] = (double) argv[1][i];
	}

	fftw_execute(p);
	fftw_destroy_plan(p);
	fftw_free(in);

	/* "DC" */
	printf(PREC_FORMAT, creal(out[0])/N);

	for (i = 1; i <= (N-1)/2; i++) {
		format_term(cos_format, 2.0*creal(out[i])/N, i, N);
		format_term(sin_format, -2.0*cimag(out[i])/N, i, N);
	}
	/* Nyquist component */
	if (!odd) {
		format_term(cos_format, creal(out[N/2])/N, N/2, N);
	}
	printf("\n");
	fftw_free(out);
	exit(0);
}

The example

For the sake of completeness, here’s the function for “Hello world!” as generated by the above code:

f(t) = 93.083 - 10.549cos(0.524t) - 0.195sin(0.524t) - 17.167cos(1.047t) + 22.805sin(1.047t) - 5.000cos(1.571t) - 1.833sin(1.571t) + 8.667cos(2.094t) + 19.630sin(2.094t) - 7.951cos(2.618t) - 1.638sin(2.618t) + 10.917cos(3.142t)