Most scientific fields order authors by the amount of work each contributed, sometimes with other conventions, such as students being listed first or the grant holder last. Mathematical fields, by contrast, typically order authors alphabetically, since small ideas are often instrumental to a paper, and it can be difficult to rank the importance of each author’s contributions. Computer science goes in either direction, depending on the subfield and the paper. Author ordering thus gives us a way to measure how applied (science) vs. theoretical (math) a conference is.
Following Appel’s approach, I recreated the algorithm to compute the most likely proportion of mathematicians to scientists at each conference, with associated error bars. The code can be found at this Git repo. The data for these conferences was collected from DBLP (copied and pasted, then cleaned up using Emacs macros).
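The heart of the computation is a maximum-likelihood estimate. Here is a minimal sketch of one plausible formulation (my own reconstruction, not necessarily Appel’s exact model): assume a fraction θ of papers alphabetize deliberately, while the rest order by contribution and so come out alphabetical only by chance, with probability 1/k! for a k-author paper.

```python
import math

def log_likelihood(theta, papers):
    """papers: list of (k, is_alpha) pairs, k = number of authors.
    A paper alphabetizes deliberately with probability theta;
    otherwise its k! orderings are assumed equally likely."""
    ll = 0.0
    for k, is_alpha in papers:
        p_alpha = theta + (1 - theta) / math.factorial(k)
        ll += math.log(p_alpha if is_alpha else 1 - p_alpha)
    return ll

def estimate_theta(papers, steps=1000):
    # Simple grid search over [0, 1); single-author papers carry no signal.
    papers = [(k, a) for k, a in papers if k >= 2]
    grid = [i / steps for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, papers))
```

Error bars would then come from the curvature of the likelihood around the maximum; I omit that here.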
The original plot:
My plot, with sloppy manually-added lines:
If you look closely at the data, you may notice that many of my results differ from Appel’s original paper. I had a hard time figuring out why. One possible reason is that surnames can be difficult to discern. My code uses the probablepeople library to guess the surname portion of each name, and when the library can’t figure it out, I have to decide manually, for example by removing whitespace or adding hyphenation in a name. However, I tried different hyphenation and whitespace combinations for the POPL authors without reproducing the original paper’s results. Note that this may still be a factor for other conferences, which I didn’t examine in detail.
My second hypothesis was that the DBLP data differed from the actual author lists that Appel used in 1992. To test this, I needed a paper copy of the proceedings, since the digital copies in the ACM Digital Library included only the individual papers, not the front matter. I could have checked each paper individually for its author list, but I deemed that too much work. Instead, I found a copy of one of the POPL proceedings in the Penn library and requested that the table of contents be scanned and sent to me. A few days later I confirmed that this was the source of the discrepancy, at least for that specific data point. Some authors had switched to publishing under a different surname, and DBLP displayed their most current name, even for older papers.
To keep my work manageable, I didn’t try to solve either issue, so my results will differ from Appel’s. For the first, I did no special handling of surnames, taking the name that probablepeople returned and only intervening when it could not determine a unique surname. For the second, I used the DBLP data as-is.
Next, I ran the algorithm on modern data. The most difficult part was resolving name issues, when probablepeople could not figure out an author’s surname. When this happened, I looked up the author and tried to figure out their surname from their personal webpages. If their surname didn’t really matter, for example when the author list is clearly not alphabetical, I was less careful. This means that even for the same person, I would edit names inconsistently when it didn’t matter for alphabetical ordering. Apologies if you’re one of these authors, or if I messed up your name!
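Once surnames are resolved, checking whether an author list is plausibly alphabetical is the easy part. A sketch of a helper along these lines (the normalization choices here are my own illustration, not necessarily what my repo does):

```python
import unicodedata

def normalize(surname):
    # Strip accents and case so that, e.g., "Gödel" and "godel" compare equally.
    decomposed = unicodedata.normalize("NFKD", surname)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()

def is_alphabetical(surnames):
    # True if each adjacent pair of normalized surnames is in order.
    keys = [normalize(s) for s in surnames]
    return all(a <= b for a, b in zip(keys, keys[1:]))
```

Note that real data still needs judgment calls (compound surnames, particles like “van” or “de”), which is exactly where the manual edits above come in.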
One interesting thing I learned while collecting the data is that many ACM conferences were consistently held in the US early on. This surprised me, since some of the major conferences currently alternate (well, usually) between North American and non-North-American locations. I assumed this had been a tradition from the start, but it has only been the practice for about 20 years. For example, PLDI was in North America for its first 22 iterations. POPL was in the US until the 14th, in Munich, and did not leave the US again until the 24th.
Finally, the results are plotted below. They suggest that POPL has been getting more practical over time, and that POPL is the most theoretical of the PL conferences, both of which I agree with.
The error bars are far smaller nowadays, owing to the lower number of single-author papers and the higher overall number of papers presented each year.
This idea is not new, but originates from an article by Melanie Stefan.
I had very few scheduled responsibilities: only a single meeting each week. The goal was to reduce mood changes tied to the day of the week, like getting no work done on Fridays and feeling disappointed on Sundays. Though studies on this^{1} suggest that there is little support for some of these phenomena, I wanted to see for myself.
The first few weeks were fine; I’d go to work around five days a week and take days off when I felt I needed one. I especially tried to work on weekends, since I was more productive (or at least happier, with music on my speakers) when the office was empty. It was a pretty good setup: I had a lot of flexibility and got a decent amount done.
But then I got a cat. I stayed home way too much to watch her, even though she was still adjusting and wasn’t especially playful. The summer ended with me doing far less than I wanted to. Nevertheless, I’d like to try this experiment again, especially since it seemed fruitful from the first few weeks.
Anyways, here are some photos of my cat Strawberry, the real reason for this post.
Arthur A. Stone, Stefan Schneider, and James K. Harter (2012). Day-of-week mood patterns in the United States: On the existence of ‘Blue Monday’, ‘Thank God it’s Friday’ and weekend effects. The Journal of Positive Psychology, 7(4), 306–314. DOI: 10.1080/17439760.2012.691980
It should not be confused with this coinduction, which may put you to sleep instead.
Inductive (or recursive) definitions are ubiquitous in mathematics, to the point where they are often left implicit. They follow a common pattern, building up a set of objects incrementally: one or more base cases are established first, and then rules are given for building new objects from those at previous levels.
The set of finite strings \(S\) on an alphabet \(\Sigma\) is the set inductively defined by the following rules, in inference rule notation:
\[\frac{}{\epsilon \in S} \qquad \frac{s \in S \quad \sigma \in \Sigma}{\sigma s \in S}\]So \(\epsilon\) (the empty string) is a string, and for any symbol \(\sigma\) in the alphabet, we can prepend that onto another string to yield a string. Only the objects generated from the rules are in \(S\).
Inductive definitions can be thought of as an iterative process: we start with the empty set and keep adding objects according to the definition, until in the limit, we reach a fixed point, when applying the rules no longer adds anything new to the set. We add \(\epsilon\), then the length 1 strings, then the length 2 strings, and so on, until we have the infinite set of strings over \(\Sigma\) of any length in \(\mathbb{N}\).
An inductive definition is thus the smallest set closed forward under its defining rules. That is, \(S\) is the smallest set such that \(\epsilon \in S\) and that if \(s \in S\), then \(\sigma s \in S\) for any \(\sigma \in \Sigma\). We apply the rules from premises to conclusion.
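This iterative process is easy to simulate for finitely many steps. A sketch with a toy two-letter alphabet (my own illustration), where each iteration applies both rules to everything built so far:

```python
def F(X, alphabet=("a", "b")):
    # One application of the rules: the base case (epsilon, written "")
    # plus one prepend step for each symbol and each existing string.
    return {""} | {sigma + s for s in X for sigma in alphabet}

X = set()
for _ in range(3):  # three iterations: epsilon and all strings of length < 3
    X = F(X)
```

After each iteration, strings one symbol longer appear; the fixed point, reached only in the limit, is the full set of finite strings.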
Since coinduction is the dual to induction, let’s try “flipping” the inductive definition. A coinductive definition is the largest set closed backward under its defining rules.
What does this mean? For an inductive definition, we can think of the set as starting from \(\varnothing\) and iteratively adding elements according to the rules. For a coinductive definition, we can think of the set as starting from the set of all possible objects (even infinite ones) and iteratively removing objects that contradict the rules.
If we use the same rules that inductively defined \(S\) above, the coinductively defined set \(S'\) is the largest set such that \(\epsilon \in S'\) and that if \(\sigma s \in S'\), then \(s \in S'\) (and \(\sigma \in \Sigma\)). Here, the backward closure goes from the conclusion to the premises, the opposite of the forward closure. The set of finite strings, \(S\), is included in \(S'\). But \(S'\) also contains some new strings: the infinitely long ones. Consider the string \(s = aaaaaa \dots\), where \(a \in \Sigma\). We cannot construct it from the base case, but it doesn’t lead to a contradiction either: if \(s = aaaaa \dots \in S'\), taking off the first \(a\) results in the same infinite string \(s\), and \(s \in S'\) as desired.
The proof tree for \(s\) is infinite, and looks like the following:
\[\large \frac{a \in \Sigma \quad \frac{ a \in \Sigma \quad \frac{ \cdots }{ aaa \dots \in S' } }{ aaa \dots \in S' }}{ aaa \dots \in S' }\]While objects of inductive definitions require finite derivations, objects of coinductive definitions can have infinite derivations.
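Coinductive objects like \(s = aaa\dots\) are exactly what lazy streams model in programming. A small sketch using Python generators (my own aside, not from the formal development): we can only ever observe finite prefixes, which mirrors how coinductive objects are characterized by observation rather than by finite construction.

```python
from itertools import islice, repeat

def prepend(sigma, stream):
    # The second rule, read as a stream constructor: sigma followed by the rest.
    yield sigma
    yield from stream

s = repeat("a")                  # the infinite string aaaa...
t = prepend("b", repeat("a"))    # the infinite string baaa...
prefix = "".join(islice(t, 4))   # observe a finite prefix of t
```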
For the following, I will skip over some (many) details.
A (monotone) function \(F\) on sets can be thought of as the set of rules for a given (co)inductive definition: \(F(X)\) is the set of conclusions obtained after applying the rules using \(X\) as the set of premises.
Recall that an inductive definition is the least fixed point of a set of rules, and that a coinductive definition is the greatest fixed point. Now here is a specialization of the Knaster–Tarski fixpoint theorem:
Theorem:
The least fixed point of \(F\) is \(\mu F = \bigcap \{ X \mid F(X) \subseteq X \}\).
The greatest fixed point of \(F\) is \(\nu F = \bigcup \{ X \mid X \subseteq F(X) \}\).
\(F(X) \subseteq X\) captures the meaning of the informal “closed forwards” definition from earlier: applying all the rules in the “forwards” direction to premises drawn from \(X\) yields conclusions \(F(X)\) that are already in \(X\).
Dually, \(X \subseteq F(X)\) captures the meaning of “closed backwards”: every element of \(X\) is the conclusion of some rule, applied “backwards”, whose premises are again in \(X\).
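On a finite universe, both fixed points can be computed directly from the formulas in the theorem. A toy illustration (my own, not from the surrounding development): take a single rule with no base case, “\(x\) is in the set whenever its successor mod 3 is.” Inductively this generates nothing; coinductively it admits everything, mirroring the finite-vs-infinite strings above.

```python
from itertools import chain, combinations

U = {0, 1, 2}

def F(X):
    # One rule, no base case: x is a conclusion if its successor (mod 3) is a premise.
    return {x for x in U if (x + 1) % 3 in X}

def subsets(s):
    s = list(s)
    return [set(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

lfp = set(U)   # intersection of all pre-fixed points
gfp = set()    # union of all post-fixed points
for X in subsets(U):
    if F(X) <= X:   # pre-fixed point: closed forwards
        lfp &= X
    if X <= F(X):   # post-fixed point: closed backwards
        gfp |= X
```

Here `lfp` comes out empty while `gfp` is all of `U`, and both are genuine fixed points of `F`.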
Simple corollaries of the fixpoint theorem give us proof principles for inductive and coinductive definitions:
Lemma (Induction Principle):
If \(F(X) \subseteq X\), then \(\mu F \subseteq X\).
Lemma (Coinduction Principle):
If \(X \subseteq F(X)\), then \(X \subseteq \nu F\).
Using the induction principle, we can show that every element of an inductively defined set satisfies some condition, by showing that the condition is preserved by each rule of the definition.
We can derive the more familiar principle of mathematical induction using this. Let \(F(X) = \{ 0 \} \cup \{ 1 + x \mid x \in X \}\). This is the set of rules for the natural numbers. It may be more familiar if I write it as the following:
\[\frac{}{0 \in \mathbb{N}} \qquad \frac{n \in \mathbb{N}}{1 + n \in \mathbb{N}}\]Then to prove some fact about the natural numbers, we just need to show that it is preserved when applying these rules in the forwards direction. For example, we will show that \(1 + 2 + \dots + n = \frac{n(n+1)}{2}\) is true for all natural numbers. Let’s take \(X = \{ n \in \mathbb{N} \mid 1 + 2 + \dots + n = \frac{n(n+1)}{2} \}\). Then we will prove that \(\mu F = \mathbb{N} \subseteq X\). This is exactly the conclusion of the Induction Principle, so we need to show that \(F(X) \subseteq X\).
An element of \(F(X)\) can either be \(0\) (the base case), which we can easily verify is in \(X\), or \(1 + n\) (the inductive case) where \(n \in X\) (the inductive hypothesis). This should look familiar. Some fiddling will show that the second case is true as well, and we are done! \(\Box\)
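The shape of the \(F(X) \subseteq X\) check can be spelled out mechanically. A sketch (my own; the loop only samples finitely many cases, whereas the real proof is symbolic):

```python
def in_X(n):
    # Membership in X: the sum formula holds at n.
    return sum(range(n + 1)) == n * (n + 1) // 2

# Base case: 0 is in X.
assert in_X(0)

# Inductive step, checked on a finite sample: if n is in X, so is 1 + n.
for n in range(100):
    if in_X(n):
        assert in_X(n + 1)
```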
Dually, using the coinduction principle, we can show that an element is in the coinductively defined set.
Using just \(S'\), our only coinductively defined set so far, would not be very interesting, since it would involve only the membership proofs we saw earlier. Let’s make another coinductive definition, this time a relation on elements of \(S'\): let \(F(X) = \{ (\epsilon, \epsilon) \} \cup \{ (\sigma_1 s_1, \sigma_2 s_2) \mid \sigma_1 \le \sigma_2 \land (s_1, s_2) \in X \}\), where \(\le\) is some ordering on the alphabet (the usual one on the English alphabet, for instance). Can you tell what relation this defines? Let’s write down the inference rules:
\[\frac{}{\epsilon \leqslant \epsilon} \qquad \frac{\sigma_1 \le \sigma_2 \qquad s_1 \leqslant s_2}{\sigma_1 s_1 \leqslant \sigma_2 s_2}\]The notation should help: \(\nu F\) is the pointwise ordering on our (possibly infinite) strings, displayed here as \(\leqslant\): each symbol of the first string must be \(\le\) the corresponding symbol of the second.
Now we can prove that some strings are related by this relation. As an example, we will show \(aaaa \dots \leqslant baaaa \dots\). Note that these are infinitely long strings.
Using the coinduction principle, we just need to show that \((aaaa \dots, baaaa \dots)\) is in some set of pairs of strings that is closed backwards under \(F\). Let’s try the singleton set \(X = \{(aaaa \dots, baaaa \dots)\}\) first. Then \(F(X) = \{ (\epsilon, \epsilon) \} \cup \{ (\sigma_1 aaaa \dots, \sigma_2 baaaa \dots) \mid \sigma_1 \le \sigma_2 \}\). But then \(X \not \subseteq F(X)\), since the second string of every pair in \(F(X)\) has a \(b\) as the second symbol.
\(X\) is our “coinductive hypothesis”. Just as in induction we sometimes have to strengthen the inductive hypothesis, here we have to strengthen the coinductive hypothesis by making it bigger.
Recall the “backwards closed” intuition. We want to show that by applying some rule “backwards”, we obtain something still in \(X\). If we start with \((aaaa \dots, baaaa \dots)\), we can only apply the second rule, stripping off the first symbol of each string. \(a \le b\), so that premise is fine, and we just need to show that \((aaaa \dots, aaaa \dots) \in X\) now. It looks like we need to grow \(X\) by adding this new pair to it, strengthening the coinductive hypothesis.
Now \(X = \{ (aaaa \dots, baaaa \dots), (aaaa \dots, aaaa \dots) \}\), and \(F(X) = \{ (\epsilon, \epsilon) \} \cup \\ \{ (\sigma_1 aaaa \dots, \sigma_2 baaaa \dots) \mid \sigma_1 \le \sigma_2 \} \cup \\ \{ (\sigma_1 aaaa \dots, \sigma_2 aaaa \dots) \mid \sigma_1 \le \sigma_2 \}\)
Let’s check that \(X \subseteq F(X)\).
\((aaaa \dots, baaaa \dots) = (a\cdot aaaa \dots, b\cdot aaaa \dots)\), and \(a \le b\).
\((aaaa \dots, aaaa \dots) = (a\cdot aaaa \dots, a\cdot aaaa \dots)\), and \(a \le a\).
And since \((aaaa \dots, baaaa \dots) \in X\), we’re done! \(\Box\)
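This backward-closure check is finite enough to mechanize. A sketch (my own encoding, not part of the formal development) representing eventually-periodic infinite strings as (prefix, cycle) pairs:

```python
def head(s):
    prefix, cycle = s
    return prefix[0] if prefix else cycle[0]

def tail(s):
    # Drop the first symbol; rotate the cycle when the prefix is exhausted.
    prefix, cycle = s
    if prefix:
        return (prefix[1:], cycle)
    if len(cycle) > 1:
        return (cycle[1:] + cycle[0], cycle)
    return ("", cycle)

a_inf = ("", "a")      # aaaa...
b_a_inf = ("b", "a")   # baaa...

def backward_closed(X):
    # Each pair's heads must be ordered and its tails must land back in X.
    return all(head(s) <= head(t) and (tail(s), tail(t)) in X for s, t in X)

first_try = backward_closed({(a_inf, b_a_inf)})                  # singleton fails
second_try = backward_closed({(a_inf, b_a_inf), (a_inf, a_inf)}) # strengthened X works
```

Equality of encodings here is purely syntactic, which is fine for this example but would need care in general.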
Recently I’ve been working on Interaction Trees, a library that provides a coinductive data structure for reasoning about interactive programs in Coq. Coinduction is less convenient than induction in Coq. For example, in the coinductive proof above the “coinductive hypothesis” included exactly the conclusion we were trying to prove. When doing the proof informally, we know we must apply one of the rules backwards and only then can we apply the coinductive hypothesis.
Doing this in a proof assistant like Coq is more involved. “Vanilla” Coq will let you apply the coinductive hypothesis immediately, and then complain that you got it wrong when you try to finish the proof. The paco library solves this problem, but more sophisticated reasoning quickly becomes complicated, which is why I started learning more about the theory behind coinduction.
I find it really intriguing how (relatively) new coinduction is and how useful it has become. There’s been a lot of work recently on areas related to coinduction, and I’m excited to do more work in this area.
I first encountered coinduction in Types and Programming Languages by Benjamin C. Pierce, where it is introduced to discuss the metatheory of recursive types. While I wouldn’t recommend reading it just for coinduction, it serves as an excellent introduction to programming languages and type systems.
Introduction to Bisimulation and Coinduction by Davide Sangiorgi is a very accessible textbook that goes into detail about all of this and more. It cleared up a lot of questions I had about coinduction, and helped me understand it more rigorously.
I decided to start this blog because I rarely write anything longer than a sentence at a time, which seems like a useful thing to practice for a PhD student. I’ve also always felt pretty weak at communicating about research or technical stuff (not to mention just in general). Hopefully this will help me with these things, as well as improve my understanding of the technical material I’ll be writing about.
I plan to write about various technical things I encounter during my research work. These will probably be things related to functional programming and programming language theory.
Let me tell you a bit about the inner workings of the website, which I spent (and will continue to spend) a lot of time on instead of writing posts. The site is static, hosted on GitHub Pages, and generated by Jekyll. I don’t want to handle any complexity related to hosting, so a static website suits me fine.
However, if you look at the bottom of the page, you’ll see something less standard: an ugly, hacked-together comment system. I originally tried Disqus, which was really quite nice and easy to use. You can see an example of it on my blog here.
I preferred something more lightweight though, and also something I controlled entirely. Here’s an example of what my current solution looks like. You can even embed html (what’s sanitization?)!
I’m using Staticman to handle user-generated content. When a comment is submitted, it goes through the Staticman web service, which creates a pull request on my website’s GitHub repo to add the comment as a text file, to be included in the regenerated static site.
I think this is super cool. No databases or anything to deal with! Staticman was pretty nice to use, though the documentation is a little out of date. Originally it ran as a single public instance, and with the number of users it had, it was hitting the rate limit for the GitHub API (see this GitHub issue for details). The developer then updated Staticman to be a GitHub app, so each user gets their own instance and thus their own API quota. However, this change was fairly recent (Dec 2018) and the documentation hadn’t been updated to reflect it, so it took me a few hours to get things working.
Edit Oct 3: Staticman died at some point! The service is open source, so maybe I will host my own instance on Heroku.
Edit Aug 18 2020: Finally hosted my own instance on Heroku.
Edit Dec 21 2022: Heroku shut down their free tier, but I’ve managed to migrate to fly.io’s free tier with basically a single click using their migration tool.
I’ll be working on updating the site a bit more to add a navigation menu (Edit Mar 7: Done!), less ugly comments (Edit Mar 8: Arguably done!), and so on. For my research I’ll be spending most of my time in the next few weeks on coinduction, a very cool proof technique, and I hope to write something introductory about it soon.