Playing with Copilot and introducing Hello Evolved

I occasionally dabble in programming languages outside of the main few I work in. Because it may be a few years before I get back to a language, I wanted to create a few short example programs that show the basics of a language and that I could keep together in one place. The traditional “Hello World” is too trivial to be an example, so I started working on a specification for examples that is a little more elaborate. I then realized it would be a good way to introduce a new language to any experienced developer. So here we go: I call it Hello Evolved. It’s a work in progress, but you can see it on GitHub today at https://github.com/jimleonardo/hello_evolved. This little project also let me take GitHub’s Copilot, an AI-driven code assistant, out for a spin. Copilot complements the concept of Hello Evolved nicely by helping an experienced developer who is working in a new language understand that language, but it isn’t even close to being ready to be a virtual programming partner.

Almost every author of a programming language tutorial seems obligated to start off with Hello World. It’s a simple program that outputs “Hello World” and does nothing else. Hello World got started back when fewer people even knew what code was and a simple program usually had many parts. An example that could demonstrate something functioning was needed to explain all the parts of a minimal program. What is simpler than just putting some text on the screen? A Hello World example sheds quite a bit of light on what makes up a minimum program in C:

    #include <stdio.h>
    int main() {
        printf("Hello World\n");
        return 0;
    }

Each of those 5 lines does something different and each is needed to form a functioning C program even though the program only does one thing: put “Hello World” on the screen.

Include the standard input/output library. This is needed to use the printf function.
Declare the entry point of the program. This is needed to tell the operating system where to start executing the program.
Output the string “Hello World” to the screen using the printf function.
Return a value to the operating system. This is needed to tell the operating system that the program has finished executing. 0 means success.
Close the body of the main function with }.

That’s a lot of code to put 11 characters on the screen. Other languages can be far simpler. In Python, the following code will print “Hello World”:

    print("Hello World")

Ruby also keeps it simple:

    puts "Hello World"

There is an output command (print or puts) and a parameter (“Hello World”). Strings of characters appear between double quotes, and the developer doesn’t need a lot of ceremony just to create a simple bit of output. While both are complete programs, they are not terribly useful. They tell the reader how to call a single method, and that’s it.

This approach of introducing only a single line of code that does something means the learner has to sift through multiple pages of documentation just to get to a point where they can do something useful. While a complete novice may need that level of explanation, it’s painful for anyone with any experience at all. So with Hello Evolved, I aimed to create a specification for a program that will do a few small things that can help an experienced developer understand a language they’re looking at for the first time. To accomplish that, the spec needs to be rich enough to demonstrate a few features, but not so exhaustive that it won’t be relevant to more than a few languages. It also has to be simple enough that it can be implemented in one screen of code.

You can read the spec for Hello Evolved in the README.md file on GitHub. I expect more changes will come, but it is more than good enough to start testing out with some examples. The specification says that an example program should:

Welcome the user.
Allow the user to input some text.
Perform a basic validation.
Output the text.
List all the values entered so far.
Allow the user to stop by typing “stop” (yes, that means you can’t use “stop” for one of the values in the list).
Loop back to allowing the user to input some text if they have not typed “stop”.
Thank the user for using the program.
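
To make the list above concrete, here is a minimal Python sketch of a program that follows those steps. It is not the example from the repository. The function names, the messages, and the validation rule (reject blank input) are my own assumptions for illustration, and the real spec in the README may be more specific on any of them.

```python
def validate(text):
    """Basic validation (an assumed rule): reject empty or whitespace-only input."""
    return bool(text.strip())


def run(read=input, write=print):
    """Run the Hello Evolved loop. read and write are injectable so the loop can be tested."""
    entries = []
    write("Welcome to Hello Evolved!")
    while True:
        text = read("Enter some text (or 'stop' to finish): ").strip()
        if text == "stop":
            # "stop" ends the loop, so it can never be stored as a value.
            break
        if not validate(text):
            write("Please enter at least one non-blank character.")
            continue
        entries.append(text)
        write(f"You entered: {text}")
        write("Values so far: " + ", ".join(entries))
    write("Thank you for using Hello Evolved!")
    return entries
```

Calling run() with no arguments starts an interactive session in the terminal. Passing in replacement read and write functions lets you drive the loop from a script, which is handy for checking that a translated version behaves the same way.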

The spec is a little more wordy than that in order to be a little more precise. I didn’t want to exclude less structured languages or special purpose languages, so I avoided things like saying you had to have a class or should use subroutines. I also did not specify anything about internal structure because it could result in code that was not idiomatic for a given language. I’m hoping this is just barely complex enough that there can be multiple good solutions in a given language. For example, I’ve created two C# samples: one using traditional C# and one using C#’s newer top-level statements.

It should only take a few minutes to create an example. I expect most languages will come in between 30 and 45 lines of code. If you are interested in creating an example, please feel free to submit a pull request along with any instructions on how to set up an environment to test your example. If you are interested in seeing an example in a language that is not yet represented and you don’t know how to do it, please feel free to open an issue at https://github.com/jimleonardo/hello_evolved/issues. No promises on when I’ll get to it, but I’ll do my best.

Examples have been created in the following languages:

Python
C# (traditional and top-level statements)
Ruby
Rust
Java
VB.Net

GitHub’s Copilot AI coding assistant really accelerated creation of these examples. I started by creating the Python example, then moved to the C# traditional example. I let Copilot do most of the work of creating the Python and C# code by putting the specification at the top of the file as a code comment and letting Copilot fill in the code. From the C# code, I used GitHub Copilot inside of Visual Studio Code to translate code into the other languages by highlighting the code I wanted translated, selecting the language, and clicking the button. The VS Code UI for Copilot is clunky (example: it translates the code, but there is no way to save it other than copy and paste to a file), but it got the job done.

The results were generally ok for syntax, but there was often nuance that got missed. Most of the nuances resulted in code that wouldn’t compile or run, but some issues resulted in more subtle errors. For example, the compiler errors in the Java code were easy: I just had to tell it to import a few standard classes (ArrayList, Iterator, List). Unfortunately, the logic error was the kind of subtle bug that escapes into production too often when you don’t have enough testing. It was the check for the string of characters “stop”. While C# and Java are very similar, they are very different when it comes to string implementations, and that impacts how you check whether two strings have the same value. For more info on how C# does it, see Jon Skeet’s answer to this Stack Overflow question about comparing strings. The short story is that in C# we can compare strings with the same == operator we use for other simple data types, because C# defines == on strings to compare their values. In Java, we have to use .equals() because Java treats strings as ordinary reference objects and == compares references. That means == only tells you whether the two variables point to the same object in memory. For more on how to compare strings in Java, see this answer in Stack Overflow.

Therefore, the C# code that looked like this:

    if (input == "stop")

needed to become this in Java:

    if (input.equals("stop"))

What is really weird is that Copilot did not figure that difference out when it translated the Hello Evolved code, although it did figure it out as I was typing it above! This kind of inconsistency in Copilot leads me to conclude that we are a long way from calling it ready to be a full-time code assistant.

Even with that issue, for the purpose I had in mind, it was a net win. I haven’t written code for Java in years but thanks to Copilot, I didn’t need to go hunt up a tutorial page to remind myself how to create working Java code. When I tested the code, I caught the bug right away and suspected this problem. However, if it were a more complex program that was being converted or I was new to Java instead of rusty, I probably would have struggled to figure out the problem.
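
If you haven’t run into the Java string comparison problem described above, the same value-versus-identity distinction can be sketched in Python, where == compares values and the is operator compares object identity. This is only an analogy, not the Java behavior itself. The helper names are hypothetical, and whether two equal strings happen to share one object is an implementation detail of the interpreter.

```python
def same_value(a, b):
    """Value comparison, the role .equals() plays for Java strings."""
    return a == b


def same_object(a, b):
    """Identity comparison, the role == plays for Java object references."""
    return a is b


sentinel = "stop"
# Build an equal string at run time, roughly the way user input would arrive.
typed = "".join(["st", "op"])

print(same_value(typed, sentinel))   # True: the characters match
print(same_object(typed, sentinel))  # typically False in CPython: two distinct objects
```

A translated check that uses the identity comparison can pass a quick read of the code and still never match what the user typed, which is the shape of the bug Copilot introduced.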

What about using Copilot to write text? I’m writing this post in VS Code, so I’m seeing (and discarding) prompts from Copilot anytime I pause a few seconds while typing. Short snippets are usually ok, but it often tries to be too smart and ends up with repetitive nonsense like this:

The compiler errors were mostly around the fact that Java is a strongly typed language and I didn’t specify the types of the variables. The logic errors were mostly around the fact that Java is a strongly typed language and I didn’t specify the types of the variables.

It is a code generator, not a text generator, so I’m not going to conclude that it’s a bust just because of that. Maybe a dedicated text AI would be better, but I don’t see much purpose in my life for such a thing yet.

I do not think the current approach to AI will ever get us to a good code assistant. Programming languages change over time and most real world code is less than ideal, so using existing code to train a code assistant means it will never replace the features we get from tools like ReSharper or IntelliJ that alert us to better ways to code. One of the things I love about using ReSharper in Visual Studio is that it tells me about new language features in C# as I code. Training AI on existing code simply can’t help us there.

Where I think we’ll eventually land is a fusion of the two approaches: AI to prompt us as we type, but traditional rule-based tools to help us meet good coding standards, restructure our code, and handle the questions that have clear answers.

There is also a brewing legal storm with AI and copyright. There are a lot of strong opinions out there, but my not-a-lawyer self suspects we’re getting into very new ground with what will constitute infringement. Outside of concerns that Copilot may reproduce code verbatim in a way that may infringe someone’s license, there’s a lot of discussion about whether a model can legally/ethically/morally train on intellectual property (IP) like code, text, or artwork that it doesn’t have rights to reproduce. I think we’re going to see some interesting court cases in the very near future. Morally and ethically, I see training on IP that you don’t have rights to even have a single copy of as an obvious problem. The morals and ethics of training on IP that you legally possess a copy of but don’t have rights to reproduce are less clear to me. I am writing this post based on a lifetime of learning to write by reading what other people have written. My biological neural network (my brain) was trained on text that I largely don’t have rights to reproduce. Is the output from an AI assistant similar? Does accelerating the learning process make a difference? Should we apply different standards when we’re training a computer-based neural network? I’m honestly not sure what the answers are to any of those questions, but I’m sure we should all go watch “Colossus: The Forbin Project” to help build our comfort with the notion of AI running the world.

The biggest lesson I learned from all of this? It is nice to have support in Visual Studio Code for so many languages! I’m a big fan of finding ways to lower the amount of brain I need to dedicate to tools so I can use more brain for the problem solving, so being able to stay in one editor was the biggest win.