  • Building a Content Retrieval App with FastAPI (MRR@3 ≥ 0.8)
    I recently applied for an ML Engineer position, and the company had the audacity to reject me (can you believe it?). Part of the application was a coding challenge that involved building a FastAPI content retrieval application that returns the top-3 blogposts, from a dataset of posts, that most closely match a user's query. This post explains how I built a custom blogpost dataset, generated a synthetic evaluation set, created sparse lexical features and dense semantic embeddings, and served and evaluated my FastAPI app locally. Everything is reproducible, and there are many possibilities for extending this approach or applying it to downstream tasks.
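
    For readers unfamiliar with the metric in the title, here is a minimal sketch of how MRR@3 could be computed. The function name, the post ids and the one-relevant-post-per-query setup are assumptions for illustration, not the code from the post.

```python
def mrr_at_k(ranked_results, relevant_ids, k=3):
    """Mean reciprocal rank over queries, looking only at the top-k results."""
    if len(ranked_results) != len(relevant_ids):
        raise ValueError("need exactly one relevant id per query")
    if not ranked_results:
        return 0.0
    total = 0.0
    for ranking, relevant in zip(ranked_results, relevant_ids):
        # Reciprocal rank is 1/position of the relevant post, or 0 if it
        # does not appear in the top k.
        for rank, doc_id in enumerate(ranking[:k], start=1):
            if doc_id == relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_results)


# Hypothetical post ids: one ranking (top-3) per query.
rankings = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant = ["a", "e", "z"]  # hit at rank 1, hit at rank 2, miss

score = mrr_at_k(rankings, relevant)
print(round(score, 3))  # (1 + 0.5 + 0) / 3 = 0.5
```

    A score of 0.8 or more means the relevant post is usually returned first, with only occasional second- or third-place hits.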
  • Scraping Autotrader Data to See How Much I'm Being Ripped Off
    Selling a used car at the right price is hard! Most of us have no idea how much used cars are worth, so it's easy to get fooled into selling way below market value. Here's how I used Autotrader's UK public adverts to create a dataset with make, model, year, mileage and cost, see where my car stacks up, and get a more precise idea of what it should be worth.
  • Ultra Modernity - The Ossification of the Faustian Culture and What it Means for Nascent Proto-Cultures
    It has been one hundred years since Spengler first enlightened us with his philosophy of history, a biological philosophy of peoples and the future...
  • Transcribing YouTube Videos to English using Whisper and Deep-translator
  • Renaming Blogposts Using ruamel.yaml

    I wanted to rename all my TIL blog posts from [Title] -> [TIL: Title], because this blog might be evolving from a TIL blog to an everything blog (with link posts, personal posts, reflections, etc.). So I did what I usually do when I want to automate something in a way I’ve never done before: I asked ChatGPT. In the process I learned about the ruamel.yaml and io packages…

  • 2024 Review - 40 Questions to Myself

    After reading Xuanwo’s 40 questions to myself (inspired by Steph Ango’s 40 questions to ask yourself each year) I decided to have a go myself. Steph Ango also has 40 questions to ask yourself each decade but I’m not up to that just yet. Here goes…

  • AI for Oncology
    Summary of how the AI for Oncology LAB (NKI, the Netherlands) is planning on using AI to automatically segment SCC biopsies intra-operatively, and how they developed the DIRECT package to recreate high-quality MRI medical images from under-sampled k-space MRI data.
  • Connect to a Remote Server and Setup a Jupyter Notebook with Port Forwarding
    Some simple commands and instructions for connecting to a remote Linux server with SSH, starting a Jupyter notebook there, and then setting up port forwarding so you can edit the notebook on the remote machine from your local browser or VSCode.
  • Training a fastai Image Classification Model to Identify Japanese Food
  • Bash
    A simple Bash script for calling Python scripts that live in a virtual environment from anywhere on the system.
  • Using Python and Bash to Print Daily Crypto Prices
  • How to Add Pinned Posts to a Jekyll Blog
    A bit of HTML and CSS to add pinned posts to the top of the blog, simple.
  • How to Enable Scheduled Redeploys in Github Actions
    This post explains how to set up GitHub Actions so that your Jekyll static website is rebuilt daily, irrespective of push redeploys, ensuring that pushed future-dated posts are added to the live site on the right day automatically.
  • How to Write Simple Unit Tests
    How to write unit tests in Python using unittest, for an absolute beginner.
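
    As a taste of what that looks like, here is a minimal sketch using the standard-library unittest module. The `add` function and the test names are hypothetical, chosen only for illustration.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


class TestAdd(unittest.TestCase):
    def test_adds_integers(self):
        self.assertEqual(add(2, 3), 5)

    def test_rejects_mixed_types(self):
        # Adding an int and a str raises TypeError in Python.
        with self.assertRaises(TypeError):
            add(1, "2")


# Load and run the tests programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

    In a standalone test file it is more common to end with `unittest.main()` under an `if __name__ == "__main__":` guard, or to run `python -m unittest` from the command line.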
  • How to Plot the Kaggle Leaderboard Scores for a Particular Competition
  • Generate Plain-Language System Summaries using Simon Willison's llm Package
    Simon Willison's `llm` package makes it easy to interact with large language models like gpt-4o and o1-preview from the command line, which is especially powerful when you want the model to have immediate access to the output of bash commands, for example to generate system detail summaries.
  • Why is the Closed-Form Solution to Linear Regression Inefficient?
    I realised, to my horror(!), that I don't actually understand the nuts and bolts of linear regression. This post outlines some of what I have learned about how the linear algebra works, and why gradient descent can beat the closed-form solution on large problems, despite being iterative rather than a single computation.
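
    To make the comparison concrete, here is a minimal pure-Python sketch that fits a single-feature regression both ways. The toy data, learning rate and step count are assumptions for illustration. With one feature the closed form is trivially cheap; the cost argument only bites when there are many features and the closed form requires solving a large linear system.

```python
def closed_form(xs, ys):
    """Least-squares slope and intercept for one feature, in one pass."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Minimise mean squared error by repeatedly stepping down the gradient."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_cf, b_cf = closed_form(xs, ys)
w_gd, b_gd = gradient_descent(xs, ys)
print(round(w_cf, 4), round(b_cf, 4))
print(round(w_gd, 4), round(b_gd, 4))
```

    Both approaches should land on (approximately) the same slope and intercept; the difference is in how the work scales as the number of features grows.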
  • How to Use the OpenAI API

    It is very easy to use the OpenAI Python API to have chat conversations with their available models. I wrote some scripts that let the user have a back-and-forth conversation in the terminal with any of the models available through the API. You can also print the entire conversation log, view the logs in a log.txt file, and track the cost of the conversation. Here is an example:

  • Using Python to Automatically Generate Anki Cloze Deletion Cards

    You can use Python to generate a .txt file from a .csv file to auto-generate many Anki cards with the same format, which you can then simply import into Anki. I used the following procedure to Ankify approximately 1000 Dutch vocabulary words.
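
    The general idea can be sketched as follows. The column names, the sample rows and the choice to put the English meaning in the cloze hint are assumptions for illustration, not necessarily the layout used in the post. The `{{c1::answer::hint}}` pattern is Anki's cloze deletion syntax.

```python
import csv
import io

# Hypothetical input: Dutch word, English meaning, example sentence.
csv_text = """dutch,english,sentence
huis,house,Ik woon in een groot huis.
fiets,bicycle,Mijn fiets is gestolen.
"""


def make_cloze_lines(csv_file):
    """Turn each CSV row into one line of cloze-deletion text."""
    lines = []
    for row in csv.DictReader(csv_file):
        cloze = "{{c1::%s::%s}}" % (row["dutch"], row["english"])
        # Replace only the first occurrence of the word in the sentence.
        lines.append(row["sentence"].replace(row["dutch"], cloze, 1))
    return lines


lines = make_cloze_lines(io.StringIO(csv_text))
print("\n".join(lines))
```

    Writing `"\n".join(lines)` to a .txt file gives one note per line, which can then be imported into Anki using the Cloze note type.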

  • What I Learned From a Postdoc Interview Rejection
    Doing bits of coding here and there is not enough; you need to lock in and have the fundamentals at your fingertips.
  • What is the Python Walrus Operator (:=)?

    The Python “walrus operator” is a neat little operator introduced in Python 3.8 that allows you to assign a variable that is immediately available for use on the same line. This makes code more concise and readable (if you understand how the operator works!).
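
    A minimal sketch of the operator in action (the sample string and variable names are made up for illustration):

```python
import re

line = "mileage: 42000 miles"

# Without the walrus operator: assign first, then test.
match = re.search(r"\d+", line)
if match:
    mileage = int(match.group())

# With the walrus operator: assign and test in one expression.
if (m := re.search(r"\d+", line)):
    mileage_walrus = int(m.group())

# It also avoids computing a value twice inside a comprehension.
words = ["a", "walrus", "is", "neat"]
long_lengths = [n for w in words if (n := len(w)) > 2]

print(mileage_walrus, long_lengths)  # 42000 [6, 4]
```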

  • What is Jekyll Anyway?

    Jekyll is a static site generator written in Ruby by the great Tom Preston-Werner (co-founder and former CEO of GitHub, as well as prolific blogger/educator). It takes text content (in the form of Markdown, .md, files) and templates (HTML layouts) and converts them into a complete static website that can be served without a backend or database.

  • Thursday - Overload

    Sometimes the modern world overtakes and overwhelms. Too much, too fast, too intense. Numbs the nerve endings. We must reset every now and then. We must reset constantly.

  • The Beginning of Infinity - David Deutsch
  • A Memory Exercise
    In defence of rote memorisation
  • Learning Something New as a New Dad
    Learning something from scratch is hard enough at the best of times. Even in that privileged time of university, with its long, leisurely days, it was difficult to carve out time between hangovers, procrastination and the occasional lecture to dedicate to focused study. One has the time, and one squanders it. One has no time, and one still squanders it.
  • A New Compass and a New Map
    Where have I been and where am I going?
  • Introductions

    The end is nigh. The beginning is nigh. What does it reveal about your approach to something when it always feels like you are at the beginning, or always in the middle, or always at the end? Perspective, as usual, is everything.