Foundations of Programming Languages: A TA's Perspective (Part 2/2)

Posted on 2019-12-24 in Life, Thoughts

Part of the cover of the textbook "Practical Foundations of Programming Languages"

This is the second part of the previous post.

Skills vs. Ideas

There are four ways in which the instruction team interacts substantially with the students: lectures, materials (books, slides), homeworks, and exams [1]. Those are the four main sources from which students learn new things. I’m leaving out discussion boards like Piazza because they are usually most effective for homework hints and clarifications. Developing any major idea would require page-long discussions, which are not very common on Piazza.

For starters, we need to talk about the homeworks.

Very often CS courses, especially systems courses, are dedicated to teaching people skills. The story usually goes: “oh, there exists this cool new system, and this semester we are going to first learn how it’s built and then how to characterize its performance, resource usage, etc.” In other words, we explain what things are in an eliminative fashion, and expect students to become experts on the topic.

In particular, homeworks usually require students either to work with the new toys they learned about or to implement those toys themselves (or a combination of both). Cases in point include:

In 15-418 Parallel Architecture & Programming, students are required to implement an algorithm using MPI in project 4.
In 15-440 Distributed Systems, students are required to implement a reduced version of the Raft consensus protocol.
In Assignment 5 of this course, students are required to implement the language MPPCF and write code with it.

How do those homeworks serve our goal? Unfortunately, not as well as we were hoping.

Homework in this course mainly takes one of the following forms:

Implementation Questions: Read the judgments and implement a typechecker and an interpreter for a particular educational language (parser is provided).
Programming Questions: Program with the language you just learned.
Proofs about type safety (mostly in the first half of the course).
Translating/elaborating code from one language into another.

Safety proofs do help students understand the formalism, because to prove safety you have to understand how the type system induces and restricts the semantics of the language. But there are a few problems. 1) They get lengthy quickly as the language gets complicated. 2) They only work if the student is actively engaged in the proof, i.e., the student really tries hard to justify the validity of the proof at every step. 3) Feedback to the student is inefficient: a TA has to grade the proof manually and with extra care, and the student has to read the comments and think about them. 4) TAs must know very well how things are supposed to work. As a TA myself, I found that 1) I didn’t always read the feedback on the proofs I submitted; 2) a few proofs I submitted were actually bogus, and my TAs either missed the errors or were just being nice and chose not to point them out; 3) proofs in the reference solutions sometimes don’t make sense. There’s not much we can do about it, except perhaps trying to mechanize the proofs somehow?

One might think that being able to implement a language according to the judgments certainly means the person has a pretty good grasp of the formalism for that language. Unfortunately, this is not always the case. Students do learn the fundamentals in the first few assignments (e.g., students learn a lot about using typing judgments to describe a language and SOS for its dynamics in the first 3 homeworks), but very soon implementing the language becomes a no-brainer once you have all the judgments around. In fact, a number of students completed the implementation questions for MPPCF/Concurrent Algol without even looking at the judgments. This is mainly because people’s intuitions are usually very good at this point, and an implementation that follows those intuitions is usually a correct one. Formalism, on the other hand, is more about consolidating one’s intuition into concrete, provable forms.
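To illustrate why implementing from judgments becomes mechanical, here is a minimal sketch in Python. The tiny language (numbers, booleans, `plus`, `if`) and every name in it (`Num`, `Plus`, `typeof`, `step`, `evaluate`) are assumptions made up for this illustration; it is not one of the course languages, and the course assignments are not written in Python. The point is only that each branch of the typechecker transcribes one typing rule and each branch of `step` transcribes one transition rule.

```python
from dataclasses import dataclass

# Abstract syntax of a tiny, hypothetical expression language.
@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class Bool:
    value: bool

@dataclass(frozen=True)
class Plus:
    left: object
    right: object

@dataclass(frozen=True)
class If:
    cond: object
    then: object
    other: object


def typeof(e):
    """Statics: each branch corresponds to one typing rule."""
    if isinstance(e, Num):
        return "num"                      # n : num
    if isinstance(e, Bool):
        return "bool"                     # true : bool, false : bool
    if isinstance(e, Plus):
        # e1 : num   e2 : num   implies   plus(e1; e2) : num
        if typeof(e.left) == "num" and typeof(e.right) == "num":
            return "num"
        raise TypeError("plus expects two num operands")
    if isinstance(e, If):
        # e : bool   e1 : t   e2 : t   implies   if(e; e1; e2) : t
        if typeof(e.cond) != "bool":
            raise TypeError("if expects a bool condition")
        t = typeof(e.then)
        if typeof(e.other) != t:
            raise TypeError("branches of if must have the same type")
        return t
    raise TypeError(f"unknown expression: {e!r}")


def is_val(e):
    """The judgment `e val`: numbers and booleans are values."""
    return isinstance(e, (Num, Bool))


def step(e):
    """Dynamics: one transition e |-> e' (assumes e is well-typed)."""
    if isinstance(e, Plus):
        if not is_val(e.left):            # evaluate the left operand first
            return Plus(step(e.left), e.right)
        if not is_val(e.right):           # then the right operand
            return Plus(e.left, step(e.right))
        return Num(e.left.value + e.right.value)
    if isinstance(e, If):
        if not is_val(e.cond):            # evaluate the condition first
            return If(step(e.cond), e.then, e.other)
        return e.then if e.cond.value else e.other
    raise ValueError("values do not step")


def evaluate(e):
    """Iterate the transition relation until a value is reached."""
    while not is_val(e):
        e = step(e)
    return e


prog = If(Bool(True), Plus(Num(1), Plus(Num(2), Num(3))), Num(0))
print(typeof(prog))    # num
print(evaluate(prog))  # Num(value=6)
```

Nothing in this sketch requires understanding why the rules are the way they are. Stating and proving that `step` preserves the type computed by `typeof` is where the formalism actually gets exercised.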

Programming questions are perhaps the least effective ones in terms of teaching people about formalisms. I’m not suggesting that they are useless; on the contrary, they are perfect for answering the “why do we care” type of questions. They show that commonly known constructs all boil down to a few simple ideas, and they are nice bait to keep people interested. They are also good for relating the theoretical models we learned to real-life languages. Unfortunately, they are terrible at teaching people about formalisms, because they are not designed to do so.

On a second note, programming questions become extremely ineffective if they are done in a hurry, because the person completing them will not have time to ponder the implications of the things they are required to program. Thankfully, as we all know, no student will ever leave their two-week homework to the last 24 hours 😃 [2]. When done in a hurry, programming questions teach only skills, not ideas. That might be a good thing for, say, 15-618, because those skills are also in demand on the job market. Unfortunately, students clearly don’t get to program in KPCF on the job, so those skills themselves become useless. This might frustrate some students, and it is not what we want.

The last 2 homeworks contain nothing but implementation/programming questions. HW4 contains very few theory-related questions. In general, students get less training in reasoning with formalism, yet the formalism in fact gets much harder and more diverse in the second half of the course.

The translation/elaboration problems are honestly my favorites (and gave me the worst nightmares when I was a student). The translation problem in the second homework gave me a chance to practice writing down judgments and writing the implementation according to judgments I defined on my own. I went back and forth between the judgments I devised and the implementation. The process is valuable because formalism, like anything else, is best learned through practice. We honestly should have more of those questions [3].

Other Means of Instruction

The Exams

I wrote a few exam questions this semester. For the midterm I wrote the questions on DFAs and existential types. For the final I wrote the entire short-questions problem. As it turns out, the short-questions problem was the problem with the lowest mean (almost half that of the other problems), and it’s not because the questions were hard. A large portion of the questions were true-or-false-and-why questions, and none of the statements we provided were true. The interesting move I made was adding a twist to every claim: instead of just stating the claim, we also included a false justification, one that looks clever but is really just nonsense. People fell for it.

Unfortunately, judging by the results of the exam, we didn’t really do very well at preparing students to see through such justifications.

I believe we should not only discuss “good” ideas; we should also discuss “bad” ideas. It is much like vaccination: by being exposed to pitfalls and common misunderstandings, people find out about their own confusion and learn to wield the power they have against false claims, by pointing out exactly where those claims miss the point.

It might also be beneficial to include questions that require an open-ended solution in the homework. We want to have questions that include a design component, where people start with intuition and then put that intuition into concrete form by formalizing it. Then we discuss whether such language designs are safe, and how they interact with existing features. Bob believes it would be really beneficial to include a project of sorts as the last homework, where students visit his office and spend an hour debating and discussing specific design choices. Unfortunately, those projects are hard to grade.

The Lecture

Bob’s lectures are well known for being hard to follow. Don’t get me wrong: I’m not saying Bob is a terrible teacher, but it is true that many students find the lectures hard to follow. On the contrary, you only realize how much clarity Bob brings to the discussion once you already have some experience with the topic. There are a number of reasons behind this situation:

Implicit assumptions about students’ background. One particular question I constantly get, even right before the finals, is “what is the meaning of $\nu$ in $\nu(t.1+t)$ (or $\nu~\Sigma~\{m~\|~\mu\}$)”. For someone who has worked with formalism, this question itself makes no sense, because we understand that judgments are “pattern matched” and it’s the entire construct that carries a concrete meaning. However, students need clarification, which brings us to the second point:
We hope students raise questions right at the point where they feel lost, but that’s almost never the case. When students can’t understand something, they very seldom raise questions. Furthermore, contrary to common belief, it is often not because they are afraid of looking stupid. Other reasons include 1) they expect (hope) their question will be addressed shortly after, and then the instructor simply moves on; 2) they are unable to formulate the question properly (we TAs see this a lot in OHs; in fact, the solution is often trivial once the question is properly formulated), i.e., “I don’t even know where I got lost / I’m unable to put things in context”; 3) knowing that it’s probably going to take around 5-10 minutes to fully explain the matter, the student decides to save the question for later (and eventually forgets about it) so that they will not “waste other students’ time”. This happens a lot when a student raises a question and gets an unsatisfying answer: the student simply stops asking follow-up questions.
Effects of taking notes. Believe it or not, note taking consumes one’s scarce mental resources. When you are trying to copy things down from the board, your mind is focused on copying things down in exact form, and nothing else. This style of note taking is usually counterproductive, as many people have already pointed out. The right way is to take down only the “main points” and “critical rules”, accompanied by a small set of examples. However, this requires the student to be able to follow the instructor closely and, more importantly, to maintain a big picture of the entire development of the idea (so they know when a “main point” is being made). Unfortunately, this ability is rare, and honestly I can’t do it.
It takes time to familiarize oneself with a new formalism. I experienced this first-hand last year. When Bob introduced a new construct on the blackboard, it took time to familiarize myself with the new syntax and the typing judgment (we are talking about hours here). This becomes a problem when Bob starts to write down terms using the new construct. Keeping track of the typing rules and/or dynamics in real time is hard, especially without the help of sufficient type annotations. One way to improve this is to annotate the terms with their types. Doing this gives students an opportunity to catch up and, at the same time, learn to typecheck the new construct.
The final point relates to how Bob tends to answer questions. Bob tends to answer questions by reiterating what is right, instead of challenging the student’s idea by pointing out what is wrong. This, of course, makes the conversation less contentious, yet it also leaves the student with the huge responsibility of figuring out what is wrong with their understanding (other than its being “different” from the correct understanding). Unfortunately, only very few students succeed in doing so (the ones that do succeed could potentially become good researchers). When students fail at this, very often they just try to memorize “the correct understanding” or give up entirely. As a TA, I observed this in a lot of situations. Very often Bob and the student are talking about different things (e.g., the lecture on continuations where Bob wrote an incorrect implementation on the board and got caught by multiple students). To make things worse, students very often cannot accurately articulate their ideas for various reasons (e.g., mentioning “parallelism” when they really mean “executed using an SMP processor”), and in general people with a prior misunderstanding are much harder to convince than someone who simply doesn’t understand the material. From my own experience, answering questions is more about helping the person formulate and scope the issue than simply providing the solution. It’s more about getting rid of the misunderstanding than simply asserting the truth.
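Returning to the first point in the list above, the claim that it is the entire construct $\nu(t.\tau)$ that carries meaning can be made concrete with a small sketch. The Python below is a hypothetical encoding made up for illustration (the class names `Unit`, `TVar`, `Sum`, `Nu` and the helpers `subst` and `unfold_type` are assumptions, not course code). It represents types as syntax trees in which `Nu` is just a constructor that packages a bound variable with a body, and it computes the usual unfolding of a coinductive type, $[\nu(t.\tau)/t]\tau$.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Unit:              # the type 1
    pass

@dataclass(frozen=True)
class TVar:              # a type variable such as t
    name: str

@dataclass(frozen=True)
class Sum:               # the type tau1 + tau2
    left: object
    right: object

@dataclass(frozen=True)
class Nu:                # the whole form nu(t.tau): binder and body together
    var: str
    body: object


def subst(replacement, name, ty):
    """Compute [replacement/name]ty.

    Assumes `replacement` is closed, so no variable capture can occur.
    """
    if isinstance(ty, Unit):
        return ty
    if isinstance(ty, TVar):
        return replacement if ty.name == name else ty
    if isinstance(ty, Sum):
        return Sum(subst(replacement, name, ty.left),
                   subst(replacement, name, ty.right))
    if isinstance(ty, Nu):
        if ty.var == name:   # `name` is rebound here; nothing to replace
            return ty
        return Nu(ty.var, subst(replacement, name, ty.body))
    raise TypeError(f"unknown type: {ty!r}")


def unfold_type(ty):
    """Given nu(t.tau), return [nu(t.tau)/t]tau."""
    if not isinstance(ty, Nu):
        raise TypeError("only a nu type can be unfolded")
    return subst(ty, ty.var, ty.body)


# nu(t. 1 + t)
conat = Nu("t", Sum(Unit(), TVar("t")))

# Unfolding once gives 1 + nu(t. 1 + t).
print(unfold_type(conat) == Sum(Unit(), conat))  # True
```

There is no case anywhere that asks what `Nu` means by itself. Every function matches on the complete node, which is the sense in which the question “what does $\nu$ mean” has no answer in isolation.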

The Book and The Supplements

Believe it or not, although this course has been offered for over a decade, the course content is still under active development, meaning the instructor makes changes to the order of instruction, the material, and the way it is presented every year. This is clearly good for the course, but an unintended side effect is that we no longer have an authoritative source of information for the students to use as a reference. In particular, Bob teaches the book in a different order, omits certain content from the book, changes how some material is presented, and in some extreme cases mixes and matches content from different chapters. This leads to an unfortunate situation where students struggle to find a reliable source to resolve their confusion. It should be emphasized that none of the changes are corrections in the sense that the original presentation in the book is “wrong”. Most of them are equivalent presentations of the same idea, which have different flavors or technical advantages.

Students have a few textual sources to resolve their confusion: 1) their notes, 2) the textbook “Practical Foundations of Programming Languages”, and 3) the “supplements”, which are sometimes considered patches to the book or partial lecture notes. First of all, not every student is a good note-taker: taking notes while following the lecture is super hard, and notes are not always reliable. Very often we take down incorrect information, either because we made a mistake or because the professor made a mistake on the board and we were unable to spot it, since we already had trouble understanding the material! To clear up the confusion, our second resort would be the book. This is where things get tricky. The book very often uses a different presentation of the same material [4], which doesn’t directly address the confusion. To make things worse, the related material may be scattered across different chapters [5]. We are left with two ways forward:

We try to piece things together on our own, leaving aside the book, based on our intuition for the subject matter. I took this approach last year, and it is hard as hell.
We study the presentation in the book and forget about the presentation in the notes for now. Once we have built the intuition, we go back and correct the notes. While this might end up furthering one’s understanding by a lot, it’s time consuming and honestly a horribly lonely journey.

The status quo is really far from ideal. Yet I don’t immediately see a way to improve it: the course has to keep developing because most of the material covered is fairly new [6]. Unlike subjects like calculus, which have been taught for decades if not centuries, we don’t really know what the optimal way of teaching this material is [7]. So yes, inevitably, we have to make modifications every year.

The supplements and homeworks make things a little bit better. Supplements often clarify a particular setup in a concise document. Related rules are often bundled together. Homeworks, on the other hand, are self-contained presentations of different systems.

Short Conclusions

Teaching 15-312 is absolutely fun. I would recommend the experience to anyone with the qualifications and the opportunity.

Footnotes:

[1] Believe me, students do learn things from the exams. Some of the most memorable things one learns in a course actually come from the exam questions. I know this because I do.

[2] It’s not our fault if a student fails to learn anything valuable because they decided to finish all their homework in the last 24 hours, but in the end, pointing fingers doesn’t improve the situation at all. We either find a way to stop students from doing this (unlikely) or try to design things around it. Final piece of advice: leave the “learning a skill” type of courses to the end if you have to choose.

[3] And being able to in some sense define a language is much more fun!

[4] Case in point: this semester we presented PCF in the modal-separated fashion, which is cleaner in theory. The book, however, didn’t introduce modality until like 10 chapters later.

[5] Case in point: Dynamic Classification depends on the notion of symbols (and symbol contexts), which is introduced in a different chapter. The subtle yet central idea around symbols (alpha-conversions) goes all the way back to the first chapters. I doubt anyone could realize the ramifications when they read about alpha-conversion in the first chapter.

[6] E.g., the central idea of type soundness (type safety, progress and preservation) was first proposed in the early 1990s!

[7] One can even say that how to define programming languages is still an unsettled debate. What we learned today might not be the optimal way of understanding programming languages.