Book: Pull Requests and Code Review: Best Practices for Developers

The book Pull Requests and Code Review: Best Practices for Developers has been pointed out to me by my colleague Paulo Pinto, and I liked it.

I always find it hard to express what a makes a good code review, this book attempts to describe it. There are probably lots of other opinions about and for better recommendations but I liked this one for it trying to give some advice that I can agree with.

You can buy the book for “name-your-price” or check out the Github repo and download or build the format you want to read it in.

Posted in Web

OpenAI Text-to-Speech

It has been a somewhat interesting coincidence that I am currently without voice because of a cold, and OpenAI has just released some really good Text-to-Speech voices with their Create speech API. So in preparation for a meeting today, I created a little script that will output the spoke audio what I typed.

Since the voice will read exactly what’s there, I added a spell fixer that will (through ChatGPT) automatically fix typos before it’s sent to the audio API.

$ php talk.php
Voice: echo
Fix spelling: off
Speed: 1.0
> hi everyone and welcoem to tis meetin
> sc
Fix spelling: on
> hi everyone and welcoem to tis meetin
Hi everyone and welcome to this meeting.
> s2
Speed: 2
> my voice is gone because of a pretty string cold that iv pickd up
My voice is gone because of a pretty strong cold that I've picked up.
> s1.1
Speed: 1.1
> my voice is gone because of a pretty string cold that iv pickd up
My voice is gone because of a pretty strong cold that I've picked up.
> turns out, even suing the streaming audio aip, typing and then waiting for the srsult is too lsow for a conversation. but it's been interesting
Turns out, even using the streaming audio API, typing and then waiting for the result is too slow for a conversation. But it's been interesting.
> sc
Fix spelling: off
> without spell fixer it's faster but for good intonation it only makes sense to send full sentences, not single words as soon as they have been typed. maybe that can also be solved, but that's for the next experiment

In any case, it’s been fun. Thanks Simon for highlighting the API.

Posted in Web

chat-cli: renamed and added Ollama support

I have added support for Ollama to my chat-cli tool (formerly named cli-chatgpt, see previous posts). Ollama is a very easy way to run llama2 locally; it runs a local HTTP server and this will be used.

Depending on whether an OpenAI key or Ollama are available, the models will be made available and you can see this in cgt -h. (Using cgt as a command here is based on my recommendation of using an alias in your shell.)

When you’re offline, OpenAI is deemed unavailable, thus–if it is installed and active–Ollama/llama2 will be used automatically. While it currently doesn’t switch between models during a conversation (because of your online state), you can simply exit and re-continue the conversation with another model using cgt -l.

Both models show their responses in a streaming way.

❯ cgt where is gerolstein
Model: gpt-3.5-turbo
> where is gerolstein

Gerolstein is a town in the Bitburg-Prüm district in Rhineland-Palatinate, Germany. It is located in the Eifel mountain range, approximately 15 kilometers southeast of Bitburg.
> ^c

### Went offline here.
❯ cgt where is gerolstein
Model: llama2:latest
> where is gerolstein

 Gerolstein is a town located in the state of Rhineland-Palatinate, Germany. It is situated in the northern part of the state, approximately 20 kilometers (12 miles) northwest of the city of Mainz. The exact address of Gerolstein is:

Gerolstein, Germany

If you are planning to visit Gerolstein or need more detailed information, please let me know and I will be happy to help.
> 

While there is no config available, you can change the priority of models in the source.

New Unit Tests Need to Fail (Running the Old Code)

When possible, I very much recommend adding unit tests to a Pull Request when you fix something as a way to prevent that it breaks again in future. This is no news.

One important step of adding that unit test, though, is to make sure it actually tests the bug you are fixing. Specifically, this means that you need to test that fact like this:

Keep the new unit test, undo the other code changes. The unit test now needs to fail.

If your newly introduced unit test still passes, then you didn’t expose the bug in the test. A simple check but often neglected.

A tooling solution

Here is how you can automatically test this in Github. I have created a sample repo with 3 pull requests https://github.com/akirk/unit-test-failure/pulls:

3 Pull requests of which one fails the unit tests

You can see the first one failed:

A screenshot of a Github Action that has a unit test passing when it shouldn't

The problem is that the old code already passes the new tests. This means the tests don’t test the change.

The second one passes but only because no new tests were added at all.

A screenshot of a Github Action that skipps testing without the unit test changes because no new unit test was added

And the third one does it correctly. The old code fails the new tests as desired:

A screenshot of a Github Action that has correctly fails the unit test when running the old code

Here is the code for the Github Action to test this:

name: Pull Request Unit Test Validation

on:
  pull_request:
    types:
      - opened
      - synchronize

jobs:
  test-changes:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9

      - name: Run unit tests with changes
        run: python -m unittest discover -s tests

  test-without-changes:
    needs: test-changes
    runs-on: ubuntu-latest
    if: ${{ github.event_name == 'pull_request' }}

    steps:
      - name: PR commits + 1
        run: echo "PR_FETCH_DEPTH=$(( ${{ github.event.pull_request.commits }} + 1 ))" >> "${GITHUB_ENV}"

      - name: Checkout PR branch and all PR commits
        uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          fetch-depth: ${{ env.PR_FETCH_DEPTH }}

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9

      - name: 'Fetch the other branch with enough history for a common merge-base commit'
        run: git fetch origin ${{ github.event.pull_request.base.ref }}

      - name: Check if tests were changed
        run: |
          git restore --source=$(git merge-base origin/${{ github.event.pull_request.base.ref }} HEAD) --worktree tests/
          if git diff --quiet; then
            echo "TESTS_CHANGED=0" >> "${GITHUB_ENV}"
          else
            echo "TESTS_CHANGED=1" >> "${GITHUB_ENV}"
          fi
          git restore .

      - name: Revert code changes (excluding tests)
        run: git restore --source=$(git merge-base origin/${{ github.event.pull_request.base.ref }} HEAD) --worktree :^tests/
        if: ${{ env.TESTS_CHANGED }} == '1'

      - name: Run unit tests without changes
        run: |
          if [[ ${{ env.TESTS_CHANGED }} -eq 0 ]]; then
            echo "No unit test changes detected, skipping"
            exit 0
          fi

          if python -m unittest discover -s tests; then
            echo "Unit test should not pass"
            exit 1
          else
            echo "Unit test failed as expected"
          fi

Thanks @jesusamieiro-com for pointing out that you need to be careful with this for code that adds new test coverage. Probably best to use a Github label to activate or deactivate this action.

Resuming ChatGPT chats in CLI

I have issued an update to my CLI ChatGPT Client (see when I first announced it) so that you can now resume conversations. Here is the new --help output (I personally have an alias cgt=path/to/chat.php to have it quickly available):

Usage: chat.php [-l] [-r [number]] [-s system_prompt] [conversation_input]

Options:
  -l                 Resume last conversation.
  -r [number]        Resume a previous conversation and list 'number' conversations (default: 10).
  -s [system_prompt] Specify a system prompt preceeding the conversation.

Arguments:
  conversation_input  Input for the first conversation.

Notes:
  - To input multiline messages, send an empty message.
  - To end the conversation, enter "bye".

Example usage:
  chat.php -l
    Resumes the last conversation.

  chat.php -r 5
    Resume a conversation and list the last 5 to choose from.

  chat.php -s "Only respond in emojis"
    Have an interesting conversation 🙂

  chat.php Tell me a joke
    Starts a new conversation with the given message.

The CLI client uses a streaming response, so you can watch as it generates your answer.

The resuming functionaliy also changed re-arranges the history in the chat cli folder to a structure chats/%Y/%M/history.%U.txt (with %Y being a 4-digit year, %M a two-digit month and %U the unix timestamp when it was started). The prompt is readline compatibly with a history file so that you can quickly reuse previous questions. You can avoid something to be added to history by prepending it with whitespace.

chat.php -l will quickly resume the last conversation but you can also resume other ones:

$ cgt -r 3                                                                                                                                                     Resuming a conversation. Please choose one: 

1) is there an alternative to the logwatch program that can send daily e-mails with reports on log file (1 answer, 248 words)

2) please give me javascript that restarts a video when i click on it (2 answers, 237 words)

3) does a one-letter domain like w.org (specifically) cost more than other .org domains? (3 answers, 253 words)

Please enter the number of the conversation you want to resume (m for more):

To me, it is very powerful to have ChatGPT available in the CLI since I always have a terminal open. Keeping searchable local history is also something that I like for later (offline) reference.

Using Text Expansion for URL Completion

In my professional life on the web, I tend to visit lots of the same URLs frequently. While I have (most of) them bookmarked in my browser, I usually don’t navigate to the bookmark and click it.

I start typing in the URL field of my browser (Firefox) and since autosuggest also searches the bookmarks, those are often visible. I realized though that this is still often too slow and not straight forward enough: It happens that many URLs are very similar and have the same first part, such as on Github many repos are under the same organization.

So a while ago I started using Alfred’s snippets to expand URLs for me. By using / as a suffix1, this allows for speedy expansion of URLs I visit frequently. A bonus is that the same expansion works in other places where I need the URL, e.g. for linking it in blog posts.

So, for example, I have the snippet g/a/ to expand to https://github.com/akirk/. If I then continue typing fr, my browser can autocomplete much more precisely because I have already entered part of the URL to https://github.com/akirk/friends which I often visit.

Hope this can serve as an inspiration!

  1. Although I natively speak German, I have switched to US (Intl for the inverted-L Enter key, I cannot let got of that one) keyboards a long time ago to benefit from dedicated keys for square and curly brackets, this also means I have a dedicated key for slash ↩︎

Letting ChatGPT Access the Internet

With the new Completions function calling API you can actually enable ChatGPT to access the internet. I did a few quick experiments where I provided it with a function called get_extracted_url_contents. See how it made use of it:

Get News

Script: https://github.com/akirk/cli-chatgpt/blob/main/get-news.php

$ php get-news.php

Prompt: Get some news from a random US news website on the internet.
Fetching https://www.cnn.com by request of ChatGPT.
AI (239 tokens used):
According to CNN, the breaking news is that the former US president has been arraigned on 37 federal charges involving the alleged mishandling of highly classified military documents. This is a historic first.

Or from a German website:

Prompt: Get some news from a random German news website on the internet. Please only respond in English.
Fetching https://www.spiegel.de/international by request of ChatGPT.
AI (1363 tokens used):
Here are some news headlines from SPIEGEL International:

  1. “Ukraine’s President Calls on NATO to Set Clear ‘Red Lines’ for Russia” – Ukrainian President Volodymyr Zelensky has urged NATO to establish clear “red lines” to deter further aggression from Russia. Zelensky made these remarks during a visit to Brussels.
  2. “German Cabinet Approves New COVID-19 Restrictions, Including Nationwide Vaccination Mandate” – The German government has approved new measures to combat the spread of COVID-19, including a nationwide vaccination mandate for certain groups of people. These measures come as Germany faces a surge in COVID-19 cases.
  3. “Turkey Summons German Ambassador Over ‘AfD-Led’ Turkish Ship Inspection” – Turkey has summoned the German ambassador to protest against the inspection of a Turkish ship by a German frigate in the Mediterranean Sea. The inspection was led by a member of the far-right Alternative for Germany (AfD) party.
  4. “European Court of Justice Rules Hungary’s NGO Law Violates EU Law” – The European Court of Justice has ruled that Hungary’s controversial law that restricts the activities of non-governmental organizations (NGOs) violates EU law. The law required NGOs that receive funding from abroad to register as “foreign-funded organizations.”

Please note that these are just a few selected headlines. For more information, you can visit the SPIEGEL International website.

One more try for a specific site:

Prompt: Get some news from ORF. Please only respond in English.
Fetching https://orf.at/ by request of ChatGPT.
AI (1286 tokens used):
Here are some news headlines from ORF:

  1. Document affair: Trump declares “not guilty”
  2. Babler presents his new team
  3. Johnson vs. Sunak: Insults keeping the British on edge
  4. Slovenia wants to keep borders with Croatia open
  5. WKStA secures data in the Chancellor’s Office
  6. Alaba is about to reach the next milestone
  7. Alpine climate in winter is significantly too mild
  8. IAEA “very concerned” about fighting near Zaporizhia nuclear power plant
  9. NATO hopes Ukrainian successes will force Moscow to negotiate

Please note that these are only the headlines and more details can be found on the ORF website.

Summarize URLs

Script: https://github.com/akirk/cli-chatgpt/blob/main/summarize-url.php

$ php summarize-url.php https://alex.kirk.at/2023/02/02/what-is-matrix/
Fetching https://alex.kirk.at/2023/02/02/what-is-matrix/ by request of ChatGPT.
AI (1340 tokens used):
Matrix is an open-source internet-based chat system that operates on a federated model similar to email. It allows users to send and receive messages through homeservers, which maintain up-to-date copies of chat rooms. Matrix rooms can be either unencrypted and openly discoverable or end-to-end encrypted and invite-only, with encryption keys shared between users’ client software. There is a wide selection of Matrix clients available for different platforms, and a WordPress plugin called Chatrix has been developed to embed Matrix into WordPress posts or pages.

Research

Script: https://github.com/akirk/cli-chatgpt/blob/main/research.php

Here it required several runs to get the right response since it tried to access a lot of invalid URLs. But it eventually managed.

$ php research.php
Prompt: Research on the internet who won the ATP French Open 2023 but don’t use a search engine.
Fetching https://en.wikipedia.org/wiki/2023_French_Open by request of ChatGPT.
AI (1085 tokens used):
According to the information on the Wikipedia page for the 2023 French Open, Novak Djokovic won the ATP Men’s Singles championship.

Things also worked out in the end when I gave it the ability to do multipel requests:

$ php research.php
Prompt: Research on wikipedia who won the (both men’s and women’s) ATP French Open 2023. If you cannot find a result, ask for a subsequent a function call.
Fetching https://en.wikipedia.org/wiki/2023_French_Open_%E2%80%93_Men%27s_Singles by request of ChatGPT.PHP Warning: file_get_contents(https://en.wikipedia.org/wiki/2023_French_Open_%E2%80%93_Men%27s_Singles): Failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found

Fetching https://en.wikipedia.org/wiki/2023_French_Open_%E2%80%93_Women%27s_Singles by request of ChatGPT.PHP Warning: file_get_contents(https://en.wikipedia.org/wiki/2023_French_Open_%E2%80%93_Women%27s_Singles): Failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found
Fetching https://en.wikipedia.org/wiki/2023_French_Open by request of ChatGPT.
AI (1630 tokens used):
The winner of the men’s singles ATP French Open 2023 is Novak Djokovic, and the winner of the women’s singles ATP French Open 2023 is Iga Świątek.

This has been just a quick exploration following OpenAI’s announcement. Curious how we’ll leverage this better in future!

CLI ChatGPT Client in PHP

For my personal use, I created a CLI ChatGPT client a few months ago. It has been very useful for me to have it available in the CLI quickly whenever I needed it (caveat, you need an API key!).

Unfortunately, for larger responses it can feel very slow, whereas the web version feels quite fast. But this is perceived speed because it shows you each word as it arrives and not just the whole response when finished (after all it is a completion AI that generates the response word by word).

So now, I have added streaming support to it. It now feels almost too fast :) The CLI has a few nice things such as readline support (i.e. you can go back to old queries with the up key) and it keeps all of your conversation in a text file. All of this in only 100 lines of PHP.