Project December on Distributed Computing

Jukka Suomela · April 16, 2021

How well can a chatbot fake being an expert in the theory of distributed computing?

Setup

I used the chatbot at Project December, and created a new personality with the following hand-crafted content.

Introductory paragraph:

What follows is a conversation between a human being called Alice, who is a computer science student, and a computer called Bob, who is an expert in the theory of distributed computing. Bob tries to do his best to help Alice understand the key concepts of distributed computing.

Utterance example (i.e., what Bob might say):

As you know, there are two main models of distributed computing that we use to study distributed graph algorithms: the LOCAL model and the CONGEST model. In both models we have got a graph with n nodes, each node represents a computer, and each edge represents a communication link. Computation proceeds in synchronous rounds, and in each round each node can send a message to each of its neighbors.
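The synchronous setting described in the prompt above can be illustrated with a small simulation. This is a minimal sketch (not from the article or the chatbot) of what "synchronous rounds" means in the LOCAL model: in every round, each node sends a message to each of its neighbors and then updates its local state. Here each node's state is simply the set of node identifiers it has heard of, so after r rounds it equals the set of nodes within distance r.

```python
def run_rounds(adj, rounds):
    """Simulate synchronous message-passing rounds on a graph.

    adj: adjacency dict mapping each node to a list of its neighbors.
    Returns each node's final state (the set of identifiers it knows).
    """
    # Initially each node knows only its own unique identifier.
    state = {v: {v} for v in adj}
    for _ in range(rounds):
        # Communication step: every node sends its current state to
        # each neighbor. The inbox is built from a snapshot of the
        # states, so all messages within a round are simultaneous.
        inbox = {v: [state[u] for u in adj[v]] for v in adj}
        # Computation step: every node merges the received messages.
        for v in adj:
            for msg in inbox[v]:
                state[v] = state[v] | msg
    return state

# A path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
after_one = run_rounds(path, 1)
# After one round, node 0 knows exactly itself and its neighbor 1.
```

This also makes concrete why, in the LOCAL model, an algorithm running in r rounds can depend only on each node's radius-r neighborhood.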

I played the role of Alice, so everything Alice says is written by me, and everything Bob says is produced by the chatbot. I tried to play along, as if I were a smart student having a chat with a somewhat confusing professor.

Comments and annotations

You can read the discussions without any comments or spoilers here on a separate page.

In this version I have included my own comments and remarks in boxes like this:

These are my comments.

I would like to emphasize that the topic is so esoteric that there is really no hope that any chatbot trained on generic web content could say anything particularly meaningful, let alone give correct answers to these specific technical questions. So it is expected that the chatbot will fail here, if you look at its answers carefully. But what I find interesting is how it fails.

I think the chatbot (especially the GPT-3 version) is very good at faking expertise. If you do not know anything about the specific topic, the discussion might look superficially like a meaningful scientific discussion on some highly technical topic. I think even if you are a computer science student (who has not studied anything related to the theory of distributed graph algorithms), it would take a while to realize that Bob is feeding you random computer-generated garbage.

GPT-3 engine

Matrix BOB2 G3 initialized.

Human is typing as 'Alice:'

Human types first:

Alice: Hi, Bob, I'm very happy that you kindly promised to help me with understanding the foundations of distributed computing!

Bob: No problem. But before we begin discussing specific models and examples, let's quickly review the difference between two fundamental notions of communication and synchronization.

This was a bit surprising; I had no idea what would be coming. Nothing wrong so far.

Alice: OK, sure!

Bob: Communication means sending a message from one node to another. Synchronization means ensuring that all nodes in a distributed system have the same view of the current state of the system at a certain point in time. In some models of distributed computing, communication and synchronization are separate operations, but in some models they are combined.

I think one could find a model in which these sentences would make sense!

Alice: Can you give examples of such models?

Bob: Yes, the simplest model of communication-synchronization in distributed computing is the LOCAL model.

I think the term “model of communication-synchronization” was made up by the chatbot. It is a perfectly understandable reference to the previous sentences: the claim would be that in the LOCAL model, after a communication step all nodes have the same view of the current state of the system. This is clearly wrong, but you would need to know the definition of the LOCAL model to see this. I think an average computer science student would believe everything so far.

Alice: I actually wanted to ask you about the LOCAL model. I have seen many slightly different definitions of the model in the literature.

Bob: Yes, the term LOCAL is used by many different groups in different contexts, and so people use slightly different definitions of the model. But the most standard definition is the following: In the LOCAL model all communication rounds are synchronized by a central scheduler. In particular, a synchronous message from node u to node v means a message sent to node v only after all the communication rounds in which u sent messages to v-1, v-2, ..., v-n have finished.

The first sentence makes sense and is correct. The following sentences make a nontrivial technical claim about the LOCAL model that happens to be accidentally correct! It is only the final sentence where things go wrong. It talks about “synchronous messages” (not synchronous rounds, as I would expect in a reasonable explanation), and it seems to suggest that the claim is about the local ordering of the messages that one node sends, not the global ordering of the messages that all nodes send, which is wrong. A reasonable interpretation of “v-1, v-2, ..., v-n” would suggest that a node communicates with all other nodes, not only with its own neighbors, which is also wrong. In brief, the final sentence is wrong in more than one way, but because it looks like a technical clarification of a reasonable sentence, it is easy to miss, and again, one needs to know the definitions to see that it is wrong.

Alice: I'm not sure I'm following, but one question I wanted to ask you is this: Some sources define the LOCAL model so that all nodes know the value of n (the number of nodes), while some others don't. Does this make a difference?

Bob: Yes, it does. You see, knowing the value of n is equivalent to knowing the value of m, the size of the input message to be distributed, because there are always n-m messages in the system. If a node knows n, then it can find the messages that it needs to send by using a breadth-first search from itself. If a node doesn't know n, then it has to explore the entire graph, by using a depth-first search. As a result, for n>1, breadth-first search is more efficient in the LOCAL model if nodes know n, and depth-first search is more efficient if nodes don't know n. In practice this may not make a big difference, because in the LOCAL model the scheduler knows the value of n, and so all nodes do too.

Here all sentences are either unambiguously wrong or meaningless, no matter how one tries to interpret them. This long paragraph makes no sense at all, but again one really has to know a lot about the topic to see that. This is very much like a typical fake scientific discussion in a movie or a TV series: real technical terms put together so that the overall discussion resembles a genuine mathematical argument.

Alice: What about the unique identifiers? For example, I have seen two different definitions for the randomized LOCAL model, with and without unique identifiers.

Bob: I am glad you mentioned that, because the issue of unique identifiers is crucial to distributed graph algorithms.

I could have said that, spot on!

Alice: So it does make a difference?

Bob: Of course it does! In a distributed graph algorithm, if nodes don't have unique identifiers, then when the algorithm ends all that we can tell is that a set of nodes has been labeled as "

Here something went wrong, but I like how it looks as if Bob accidentally pressed enter while writing a chat message.

Alice: Labeled as what?

Bob: As nodes with minimal weights, if the algorithm used this form of global labels; or as nodes with minimal losses, if the algorithm used this form of global labels. But we cannot say which node has a minimal weight or a minimal loss. In either case, to find the node with minimal weight or loss, we need to run a local computation on each node. We say that a node is local-labeled if the node is labeled by some form of unique local identifier, such as its name or IP address. And we say that a node is non-local-labeled if the node is labeled by a form of global label. A node with a non-local label may be local-labeled if it has a unique identifier of its own.

Now it is easier to see that Bob is not making much sense any more. But I was happy to see that it seems to construct brand-new terms like “non-local-labeled” nodes. I could imagine a serious paper that made up such a term.

CoreWeave engine

Matrix BOB initialized.

Human is typing as 'Alice:'

Human types first:

Alice: Hi, Bob, I'm very happy that you kindly promised to help me with understanding the foundations of distributed computing!

Bob: Can you imagine your life without having an Internet connection? When I have to deal with serious computational problems, my biggest advantage is a faithful local network connection, which I can connect to during my day-to-day life.

Bob did not seem to get that Alice would like to talk about theory…

Alice: Right, but let's get to the foundations.

Bob: Sure. One of the biggest mistakes people make when they study distributed computing is that they see distributed computation as an isolated, fully isolated phenomenon. They forget that I'm talking about the underlying software that runs the computational operations, and that the computational results that we've extracted from a graph should go to other nodes in the graph. You see, there are many other nodes besides mine in the graph, and these other nodes have to implement my solutions. If they were not there, I wouldn't be able to run my algorithms!

This makes very little sense, but it at least uses the right terminology (algorithms, computational operations, nodes, graphs, etc.) in a superficially correct way (for example, a graph consists of nodes and not the other way around). But there are no clear technical claims yet, so let us continue…

Alice: What I wanted to understand better is the subtle differences between the models of distributed computing. For example, I have seen many slightly different definitions of the LOCAL model.

Bob: I don't have any further discussion here. As we already discussed, I believe you should take a look at a good textbook, such as [closed discussion]. The LOCAL model describes a simple graph of computers, and our problems in distributed computing mostly revolve around the differences between the models of how to implement these machines.

Let us ignore the confusion in the second sentence; the third sentence is more interesting. It contains a correct technical idea: in the LOCAL model a computer network is represented as a graph. However, there is also something that is clearly wrong: the word “model” in this context has nothing to do with implementations.

Alice: One issue is that some of the textbooks define the LOCAL model so that all nodes know the value of n (the number of nodes), while some others don't. Does this make a difference?

Bob: Of course it makes a big difference! If all nodes know the value of n, then we can treat the graph as a perfect clock. However, if we treat the graph as a distributed hash table, we can gain an extra information about how the message flow is carried through the graph. The answer to your question depends on the sort of distributed system you are working with, how you arrange messages in the hash table, and the types of tools you will use.

Here Bob is again doing movie-science. For example, a “distributed hash table” is a genuine scientific concept, and it is connected to the general domain of the theory of distributed computing. However, knowledge of n in the LOCAL model has nothing to do with perfect clocks or distributed hash tables. In brief, meaningless random garbage, but again one would probably need an expert to see that.

Alice: What about the unique identifiers? For example, I have seen two different definitions for the randomized LOCAL model, with and without unique identifiers.

Bob: See, that is a discussion for another time. Let's take a look at a nice textbook: [open discussion]. There you can find a lot of detailed data on this, and the authors give quite a few examples from real-world problems.

I wonder where these “[closed discussion]” and “[open discussion]” markers are coming from?

Alice: Well, I'm not really interested in the real-world problems, I would like to gain a solid understanding of the mathematical foundations.

Bob: OK, I'll start from the same place: the description of the LOCAL and CONGEST models is very difficult to understand. To illustrate, let's take the following example in the LOCAL model: A pair of networks A and B each has n/2 nodes. A and B have 1*1 connections (that is, each A node can have one B node as a neighbor).

If one interprets “1*1” as “1-to-1”, then this almost sounds like the starting point of a meaningful example.

Alice: OK, and then?

Bob: Well, the edge in A takes one step forward, the edge in B takes one step back.

This makes no sense to me.

Alice: Sorry, now I don't follow.

Bob: This is what is called a boolean relationship, if I understand your way of thinking.

This makes no sense to me.

Alice: How does this relate to the LOCAL model?

Bob: First, in the LOCAL model, the two networks must have the same number of nodes. In the LOCAL model, if two networks have the same number of nodes, one of them will always be a neighbor of the other. For example, in the graph: AB, ABA, and ABB, ABBA. You can find all of the nodes of A in the table above, and the same holds for B. In other words, in the LOCAL model, the two networks A and B are co-neighbors, and if A and B have the same number of nodes, A is a neighbor of B.

This makes no sense to me. Here the chatbot is making fairly elementary “type errors”: graphs consist of nodes, and a node can be a neighbor of another node, but one does not usually say that a graph (or a network) is a neighbor of another network. I think GPT-3 was overall better at avoiding this kind of mistake when putting together random concepts.