Writing clean code with ChatGPT

Author’s note: For ChatGPT tips, click here. The ChatGPT transcript is here.

I love Crafting Interpreters. I first worked through the book at Recurse Center, next at Bradfield and most recently with David Beazley (repos here, here and here) [1].

The book teaches you how to implement a programming language from scratch. I worked through most of the chapters, but what I enjoy re-implementing is the ‘minimal’ set of features to generate Fibonacci numbers. It’s a nice use case, requiring built-in operations, control flow and recursive function calls.

Since the code has gone through a couple of iterations, I was curious to see how ChatGPT could help me improve code quality further.


Source code to output

The scanner (or lexer) converts the source code into tokens (truncated to ease viewing).

> source_code = """\
func fibonacci(n) {
    if (n < 2) {
        return 1;
    }

    return fibonacci(n - 1) + fibonacci(n - 2);
}
fibonacci(9);
"""

> tokens = scanner.scan(source_code)

> for token in tokens: print(token)
Token(token_type=<TokenType.FUNC: 'FUNC'>, value='func', line=1)
Token(token_type=<TokenType.NAME: 'NAME'>, value='fibonacci', line=1)
Token(token_type=<TokenType.PAREN_LEFT: 'PAREN_LEFT'>, value='(', line=1)
Token(token_type=<TokenType.NAME: 'NAME'>, value='n', line=1)
Token(token_type=<TokenType.PAREN_RIGHT: 'PAREN_RIGHT'>, value=')', line=1)
...
Token(token_type=<TokenType.NAME: 'NAME'>, value='fibonacci', line=8)
Token(token_type=<TokenType.PAREN_LEFT: 'PAREN_LEFT'>, value='(', line=8)
Token(token_type=<TokenType.INTEGER: 'INTEGER'>, value='9', line=8)
Token(token_type=<TokenType.PAREN_RIGHT: 'PAREN_RIGHT'>, value=')', line=8)
Token(token_type=<TokenType.SEMICOLON: 'SEMICOLON'>, value=';', line=8)
Token(token_type=<TokenType.EOF: 'EOF'>, value='EOF', line=8)
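The real scanner lives in the repo; as a rough sketch (hypothetical, simplified names and logic, not the repo's actual code), the core scanning loop might look like:

```python
from dataclasses import dataclass
from enum import Enum


# Hypothetical, heavily trimmed token types -- the real scanner supports more.
class TokenType(Enum):
    NAME = "NAME"
    INTEGER = "INTEGER"
    PAREN_LEFT = "PAREN_LEFT"
    PAREN_RIGHT = "PAREN_RIGHT"
    EOF = "EOF"


@dataclass
class Token:
    token_type: TokenType
    value: str
    line: int


LITERALS = {"(": TokenType.PAREN_LEFT, ")": TokenType.PAREN_RIGHT}


def scan(source: str) -> list[Token]:
    tokens: list[Token] = []
    line = 1
    i = 0

    while i < len(source):
        char = source[i]

        if char == "\n":
            line += 1
            i += 1
        elif char.isspace():
            i += 1
        elif char in LITERALS:
            tokens.append(Token(LITERALS[char], char, line))
            i += 1
        elif char.isdigit():
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(Token(TokenType.INTEGER, source[start:i], line))
        elif char.isalpha():
            start = i
            while i < len(source) and source[i].isalnum():
                i += 1
            tokens.append(Token(TokenType.NAME, source[start:i], line))
        else:
            raise ValueError(f"Unexpected character {char!r} at line {line}.")

    tokens.append(Token(TokenType.EOF, "EOF", line))
    return tokens
```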

Next the parser converts the tokens into abstract syntax trees (hereafter, ASTs).

> statements = parser.parse(tokens)

> for statement in statements: print(statement)
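The printed statements are AST nodes. As a hypothetical sketch of what such nodes could look like (dataclass names and shapes are my illustration; the repo's actual classes may differ), `fibonacci(9);` might parse into a call expression wrapped in an expression statement:

```python
from dataclasses import dataclass
from typing import List


# Hypothetical AST node hierarchy, loosely modeled on Crafting Interpreters.
class Expr: ...
class Statem: ...


@dataclass
class Name(Expr):
    text: str


@dataclass
class Integer(Expr):
    value: int


@dataclass
class Call(Expr):
    callee: Expr
    arguments: List[Expr]


@dataclass
class Expression(Statem):
    expression: Expr


# fibonacci(9); as an AST:
ast = Expression(Call(Name("fibonacci"), [Integer(9)]))
```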

Finally the interpreter ‘executes’ the ASTs to generate the desired output.

> results = interpreter.interpret(statements)

> for result in results: print(result)
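As a rough illustration of the ‘executes’ step (hypothetical, heavily simplified code handling only integer arithmetic; the repo's interpreter also handles statements, environments and function calls), a tree-walking evaluator recurses over the AST:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Integer:
    value: int


@dataclass
class Binary:
    left: "Expr"
    operator: str
    right: "Expr"


Expr = Union[Integer, Binary]


def evaluate(expression: Expr) -> int:
    """Recursively evaluate an expression tree to an integer."""
    if isinstance(expression, Integer):
        return expression.value
    if expression.operator == "+":
        return evaluate(expression.left) + evaluate(expression.right)
    if expression.operator == "-":
        return evaluate(expression.left) - evaluate(expression.right)
    raise ValueError(f"Unknown operator {expression.operator!r}.")


print(evaluate(Binary(Integer(9), "-", Integer(2))))  # 7
```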

Project setup

To separate the ChatGPT changes from my own implementation, I copied over the latest version of minimal-lox into a new repo simple-lox. Naturally I got ChatGPT to suggest the name.

The source code is small enough to fit within the maximum tokens allowed, so next I asked ChatGPT to create a source code consolidator that pulls everything into a single file [2].
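I won't reproduce the exact consolidator ChatGPT produced, but a minimal sketch (hypothetical function and marker format) might be:

```python
from pathlib import Path


def consolidate(source_dir: str, output_file: str) -> None:
    """Concatenate every module in source_dir into a single file,
    labelling each section with its filename for easier pasting."""
    parts = []
    for path in sorted(Path(source_dir).glob("*.py")):
        parts.append(f"# --- {path.name} ---\n{path.read_text()}")
    Path(output_file).write_text("\n\n".join(parts))
```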

Let’s create a test runner to ensure the code passes all the tests, and set up GitHub Actions.

OK we’re good to go! We can finally ask ChatGPT how to improve code quality. Since there’s no cost in asking for an egregious number of suggestions, let’s go for broke and sort the recommendations in descending order of impact.

Changes implemented


ChatGPT is excellent at generating documentation, which was the main thing my code lacked. These were docstrings across modules (suggestion #1), functions (#2) and classes (#3). Even if I wasn’t happy with the result, it’s much faster to edit than to write from scratch.

The parser.py module is responsible for converting a sequence of tokens,
provided by the scanner.py module, into an abstract syntax tree (AST). This
AST represents the structure and syntax of the programming language being
interpreted. The parser handles constructs like variable declarations,
function definitions, and control flow statements.

I was surprised ChatGPT didn’t suggest a README.md. In any case, this is easy to generate.


I only had generic exceptions, so ChatGPT suggested specific errors with more descriptive error messages. In other words, changing from Exception to ValueError, TypeError etc., and adding more details like line numbers.

# before
raise Exception(
    f"Expected {token_type}, got {token.token_type}."
)

# after
raise ValueError(
    f"Expected {token_type}, got {token.token_type} at line {token.line}."
)
ChatGPT made a later suggestion to also create custom exceptions, which allows for more specific error handling. Perhaps this would have been a bit cleaner had ChatGPT included both in the same suggestion.
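A sketch of what such a custom exception might look like (hypothetical class name; the actual suggestion may differ):

```python
class ParseError(Exception):
    """Raised when the parser encounters an unexpected token."""

    def __init__(self, message: str, line: int) -> None:
        super().__init__(f"{message} at line {line}.")
        self.line = line


# Callers can now catch parser errors specifically:
try:
    raise ParseError("Expected SEMICOLON, got EOF", 8)
except ParseError as error:
    print(error)  # Expected SEMICOLON, got EOF at line 8.
```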


I had my constants (or globals) in lowercase, so ChatGPT recommended the convention that constants be capitalized.

# before
literals: Dict[str, TokenType] = {
    "{": TokenType.BRACE_LEFT,

# after
LITERALS: Dict[str, TokenType] = {
    "{": TokenType.BRACE_LEFT,

I like this concept; I can see extending the ask toward consistency more broadly (say, in ‘harmonizing’ different styles).

Changes not implemented


ChatGPT is very good at creating tests, which I often rely on elsewhere. However, my test coverage here is already pretty good. It’s hard to implement new language features without breaking existing ones, and tests make it easy to know when you have.

Related to the idea of ‘tests as guardrails’ is the use of types. I like types, so I already had my code passing mypy. It was amusing, however, to see ChatGPT conveniently ignore existing type annotations in making the suggestion to add types.


I struggle with names, and ChatGPT makes naming so much easier. That said, I’m guessing my naming here is OK. ChatGPT proposed I rename tokens to token_list, but on a later prompt proposed I rename token_list back to tokens...


ChatGPT recommended I refactor a number of functions into smaller ones, especially where there’s branching. Since each branch is not that long, I decided to keep the code as is. It’s easier to compare the different sections when you can see them side-by-side.

Changes less expected

Runtime validation

I had assertions in a number of places, mainly to keep mypy happy. For example, if I had a function that returns only specific types of a union type, the ‘narrowing’ of the union type would be confirmed with an assert.

Environment = Dict[str, Union[int, statem.Function, "Environment"]]

value = environment[name]
assert isinstance(value, int) or isinstance(value, statem.Function)

ChatGPT suggested I go even further and implement runtime validation for my functions, this time through explicit type checks.

def execute(statement: Statem, environment: Environment) -> List[Optional[int]]:
    if not isinstance(statement, Statem):
        raise TypeError("'statement' must be an instance of Statem")

    if not isinstance(environment, dict):
        raise TypeError("'environment' must be a dictionary")

I always wanted to try out pydantic and prodded ChatGPT a bit in this direction. ChatGPT suggested creating a data model from BaseModel, but this feels like too much ceremony. I looked around the documentation and was happy to discover the validate_call decorator.

@validate_call
def execute(statement: Statem, environment: Environment) -> List[Optional[int]]:


In the past, I would enforce read-only data structures through static checks. In particular, instead of List and Dict, I would use Sequence and Mapping respectively. ChatGPT suggested this be enforced at runtime with a dataclass decorator to make attributes read-only [3]. Nice!

# before
@dataclass
class Name(Expr):
    text: str

# after
@dataclass(frozen=True)
class Name(Expr):
    text: str

Version upgrades

This didn’t come up in the list of suggestions, but I realized ChatGPT could easily assist with upgrading the code to a later Python version. I have always been impressed by the Go toolchain; now it seems ChatGPT helps fill that gap for Python.


If I haven’t made it clear, I’m very much a fan of ChatGPT.

Less fun aspects such as documentation and tests are practically done for you. Even if you feel the quality is lacking, it’s usually a good starting point. I also like how ChatGPT makes it much easier to improve consistency, whether it be naming or style.

ChatGPT does have a few quirks; here are some workarounds:

▪️ “Please don't make things up, only suggest changes to code that actually exist.”
▪️ “Please provide suggested changes to the maximum output token allowable, and sort in descending order of importance.”
▪️ Create a new chat window when ChatGPT slows down from large amounts of text. To avoid pasting code repeatedly, consider creating a custom GPT version.

[1] David Beazley has ~30 years experience with Python, and there’s something magical about hearing how he had his mind blown working on a parser combinator in Haskell.

As an aside, sometimes people ask me "what can I do to improve my Python skills?". Much to their surprise, I often suggest doing a project in a completely different language or outside of their area of expertise. I think the main benefit of doing this is that you'll often see a completely different way of thinking about a problem that you can bring home to your own projects.

[2] For larger codebases, it’s possible to compress and upload into a custom GPT version like for LangChain here.

[3] The generic type Sequence will raise a mypy error on an append (since it’s read-only), but no error is raised at runtime.

sequence: Sequence[str] = ["a"]
sequence.append("b")
print(sequence)

> mypy sequence.py
sequence.py: error: "Sequence[str]" has no attribute "append"  [attr-defined]

> python sequence.py
['a', 'b']

The dataclass decorator enforcement of read-only occurs both statically and at runtime.

@dataclass(frozen=True)
class Name:
    text: str

name = Name(text="a")
name.text = "b"
> mypy name.py
name.py:6: error: Property "text" defined in "Name" is read-only  [misc]

> python name.py
Traceback (most recent call last):
  File "name.py", line 6, in <module>
    name.text = "b"
  File "<string>", line 2, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field 'text'