A Function Golf Challenge

“Code golf” is the practice of trying to solve a specific problem or implement a specific algorithm in the smallest possible number of characters of a given programming language. A related concept is exploring what can be accomplished with a tweet-length program (for example see here), though “tweet-length” can be a moving target. In the spirit of code golf, I propose the following “function golf” challenge.

What is the simplest function (defined on the real numbers) that is continuous everywhere except on a given subset of the reals? It is not always possible to find such a function - there are subsets of the reals which cannot be represented as the set of discontinuities of a function. So I will restrict to subsets of the reals that are open or closed.

Here “simplest” is not defined in any formal way - feel free to define it as you wish. In general, I consider a function expressed in fewer characters to be simpler than a function expressed in more characters, but I don’t want to unnecessarily sacrifice clarity on the altar of brevity.

More formally, the desired function is \(f: \mathbb{R} \to \mathbb{R}\) such that when given any open or closed set \(A \subseteq \mathbb{R}\), \(f(x)\) is continuous if \(x\notin A\) and discontinuous if \(x \in A\).

I’ll use the notation invented by Iverson and popularized by Knuth: a statement in square brackets is equal to 1 if the statement is true, and 0 if the statement is false. Like Knuth, I will make this “very strongly zero,” so that if a statement \(S\) is false, \([S] \times (\text{expression})\) is 0, even if \((\text{expression})\) is undefined when \(S\) is false. A brief explanation of other notation used is included at the end of this post.

Here is my proposal for the desired function (which I’m certain is far from optimal):

\[ \boxed{ f(x) = [x \in A] \big( [A \text{ is open}]\, [x \in \mathbb{Q}]\, d(x, A^c) + [A \text{ is closed}]\, (1 + [x \in \mathbb{Q}]) \big)} \]

It fits on one line, but does it work? Before I give a proof, I’ll note that it seems as if there should be a version of \(f\) that is written in similar notation but is much shorter - like there is some duality that I am not quite exploiting properly. Please let me know if you find a shorter and/or simpler version and I’ll post it on my blog (with credit to you of course). My email is ab@gmail.com, where a is my first name in lowercase, and b is my last name in lowercase.

Here is a proof, split into three cases: \(A\) is open and not closed, \(A\) is closed and not open, and \(A\) is both open and closed. The only two subsets of \(\mathbb{R}\) that satisfy the last condition are the empty set (\(\emptyset\)) and \(\mathbb{R}\) itself, which implies that in the first two cases \(A\) is nonempty.

Case 1: \(A\) is open and not closed, so \([A \text{ is open}] = 1\) and \([A \text{ is closed}] = 0\). The function then reduces to \[ f(x) = [x \in A]\, [x \in \mathbb{Q}]\, d(x, A^c). \] If \(x \in A\), \(d(x, A^c) \geq \epsilon\) for some \(\epsilon > 0\), since \(A\) is open. Again since \(A\) is open, we can construct a sequence of rationals and a sequence of irrationals, both in \(A\) and converging to \(x\). Applying the function to one of the two sequences will result in a sequence that does not converge to \(f(x)\), so \(f(x)\) is discontinuous at all \(x \in A\).

For \(x \in A^c\), \(f(x) = 0\), and applying the function \(f\) to any sequence converging to \(x\) results in a sequence that converges to 0 (if the sequence is in \(A\) and irrational, it is driven to zero by the \(d(x, A^c)\) term). So \(f(x)\) is continuous for all \(x \in A^c\).

Thus \(D_f = A\) as desired.

Case 2: \(A\) is closed and not open, so \([A \text{ is open}] = 0\) and \([A \text{ is closed}] = 1\). The function then reduces to \[ f(x) = [x \in A] (1 + [x \in \mathbb{Q}]). \] If \(x \in A^c\), \(f(x) = 0\), and since \(A^c\) is open, there exists an \(\epsilon\)-neighborhood \(V_\epsilon(x) \subseteq A^c\). Since \(f(x) = 0\) everywhere in this \(\epsilon\)-neighborhood, \(f(x)\) is continuous at all points \(x \in A^c\).

If \(x \in A\), \(x\) is either in \(A^\circ\) (the interior of \(A\)) or in \(A\setminus A^\circ\) (the boundary of \(A\)). If \(x \in A^\circ\), \(f(x)\) is discontinuous by a similar argument to the one given in Case 1 (since \(A^\circ\) is open). If \(x \in \partial A\), every \(\epsilon\)-neighborhood of \(x\) intersects \(A^c\), so we can construct a sequence \((x_n)\) in \(A^c\) converging to \(x\). Then \(f (x_n) \to 0 \neq f(x) = (1 \text{ or } 2)\). So \(f(x)\) is discontinuous for \(x \in \partial A\).

Since \(A = \partial A \cup A^\circ\), we have \(D_f = A\) as desired.

Case 3: \(A\) is both open and closed. This case can be split into two subcases: \(A = \emptyset\) and \(A = \mathbb{R}\). For both subcases, \([A \text{ is open}] = 1\) and \([A \text{ is closed}] = 1\). If \(A = \emptyset\), \([x \in A] = 0\), so \(f(x) = 0\) everywhere, which is continuous everywhere, so \(D_f = \emptyset = A\).

If \(A = \mathbb{R}\), \([x \in A] = 1\), and \(A^c = \emptyset\), so \(f(x) = 1 + [x \in \mathbb{Q}] (1 + d(x, \emptyset))\). Here is a point where I must engage in a bit of notational overload. The distance \(d(x, \emptyset)\) is technically undefined, and while an argument can be made that it should be \(+\infty\) (which is not a real number), I will arbitrarily define it to be 0. Or, using the Iversonian notation, \(d(x, A) = [A \neq \emptyset] \inf\{|x - a| : a \in A\}\). Then \(f(x) = 1 + [x \in \mathbb{Q}]\), which is 2 on the rationals and 1 on the irrationals, so is discontinuous everywhere. So \(D_f = \mathbb{R} = A\) as desired.

Notation (all pretty standard):

\(\mathbb{Q}\) denotes the rationals. The complement of a set is defined as \(A^c = \{x \in \mathbb{R} : x \notin A\}\). \(D_f \subseteq \mathbb{R}\) is the set of points where a function \(f: \mathbb{R} \to \mathbb{R}\) is discontinuous. The distance between a real number \(x\) and a subset of the reals \(A\) is defined as \(d(x, A) = \inf\{ |x - a| : a \in A\}\). The interior of a set \(A\) is defined as \(A^\circ = \{x \in A : \text{ there exists } V_\epsilon(x) \subseteq A\}\). The boundary of a set \(A\) is defined as the closure of \(A\) minus the interior of \(A\): \(\partial A = \overline{A}\setminus A^\circ\). The \(\epsilon\)-neighborhood of a point \(a\) is defined as \(V_\epsilon (a) = \{ x \in \mathbb{R} : |x - a| < \epsilon \}\).

Landon Lehman
Data Scientist

My research interests include data science, statistics, physics, and applied math.