This guide dives into setting up large-scale machine learning experiments with a clean code structure, local testing, and efficient parallelization using tools like Typer, Pydantic, and Argo Workflows. Learn how to move beyond notebooks, structure ML projects for scalability, and log results effectively to accelerate model development.
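As a rough illustration of the setup described above (not the guide's actual code; the `train` command and `ExperimentConfig` names are hypothetical), a Typer CLI can validate its arguments into a Pydantic config before launching a run:

```python
# Minimal sketch of a typed experiment entrypoint (hypothetical names, Pydantic v2).
import typer
from pydantic import BaseModel

app = typer.Typer()


class ExperimentConfig(BaseModel):
    # Illustrative hyperparameters only.
    learning_rate: float = 1e-3
    batch_size: int = 32
    epochs: int = 10


@app.command()
def train(learning_rate: float = 1e-3, batch_size: int = 32, epochs: int = 10) -> None:
    """Validate CLI arguments into a config object, then run (or submit) the experiment."""
    config = ExperimentConfig(
        learning_rate=learning_rate, batch_size=batch_size, epochs=epochs
    )
    typer.echo(f"Running experiment with {config.model_dump()}")
    # ... launch training here, or wrap this same command as an Argo Workflows step.


if __name__ == "__main__":
    app()
```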
Optimization tricks for speeding up transformer inference
Tutorial on fine-tuning LLMs with the HF Transformers library and wandb logging
This study finds NuExtract performs best for structured outputs, with KV caching improving speed and accuracy for larger models despite some hallucinations.
Using Small Language Models with Small Vision Models to generate captions
MLOps: Leveraging ArgoWF and Buildkite to train models
Using KV caching and logit ratios to speed up and control LLM/VLM outputs.
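As a rough sketch of the two ideas in this entry (not the post's actual code), Hugging Face `generate` exposes a KV-cache flag and a logits-processor hook that can bias chosen tokens; the model and token choices below are placeholders:

```python
# Sketch: KV caching plus a simple logit bias during generation (illustrative only).
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)


class BiasTokens(LogitsProcessor):
    """Add a constant bias to chosen token ids, nudging the model toward them."""

    def __init__(self, token_ids, bias=5.0):
        self.token_ids = token_ids
        self.bias = bias

    def __call__(self, input_ids, scores):
        scores[:, self.token_ids] += self.bias
        return scores


inputs = tokenizer("The answer is", return_tensors="pt")
# Bias toward " yes" / " no" style answers (token choice is illustrative).
bias_ids = (
    tokenizer(" yes", add_special_tokens=False)["input_ids"]
    + tokenizer(" no", add_special_tokens=False)["input_ids"]
)
out = model.generate(
    **inputs,
    max_new_tokens=5,
    use_cache=True,  # reuse cached key/value tensors across decoding steps
    logits_processor=LogitsProcessorList([BiasTokens(bias_ids)]),
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```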
Training a DDPM on MNIST
Fine-tuning T5 for Sequence-to-Sequence tasks
Fine-tuning GPT2 for Sequence-to-Sequence tasks
A failed attempt at model compression using student-teacher learning
Python testing for Machine Learning
Implementing Approximate Nearest Neighbours Oh Yeah (ANNOY)
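The post builds a fuller version; as a toy sketch of the core idea (my own function names, not the post's), Annoy-style trees split the space with random hyperplanes and search only the leaf a query lands in:

```python
import numpy as np


def build_tree(points, indices, leaf_size=10, rng=None):
    """Recursively split indices with random hyperplanes (the core trick behind Annoy)."""
    rng = rng if rng is not None else np.random.default_rng()
    if len(indices) <= leaf_size:
        return {"leaf": indices}
    # Pick two random points; split by the hyperplane equidistant from them.
    a, b = points[rng.choice(indices, size=2, replace=False)]
    normal, midpoint = a - b, (a + b) / 2
    side = (points[indices] - midpoint) @ normal > 0
    if side.all() or not side.any():  # degenerate split, stop here
        return {"leaf": indices}
    return {
        "normal": normal,
        "midpoint": midpoint,
        "left": build_tree(points, indices[side], leaf_size, rng),
        "right": build_tree(points, indices[~side], leaf_size, rng),
    }


def query(tree, points, q, k=5):
    """Descend to one leaf and search it exactly (real Annoy uses many trees plus backtracking)."""
    node = tree
    while "leaf" not in node:
        go_left = (q - node["midpoint"]) @ node["normal"] > 0
        node = node["left"] if go_left else node["right"]
    leaf = node["leaf"]
    order = np.argsort(np.linalg.norm(points[leaf] - q, axis=1))
    return leaf[order[:k]]


points = np.random.default_rng(0).normal(size=(1000, 32))
tree = build_tree(points, np.arange(len(points)))
print(query(tree, points, points[0]))  # indices of approximate nearest neighbours
```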
Implementing k-means with cosine distance
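A compact sketch of the cosine-distance variant (often called spherical k-means), under my own naming rather than the post's code: L2-normalize the rows so cosine similarity reduces to a dot product.

```python
import numpy as np


def cosine_kmeans(X, k, n_iter=50, seed=0):
    """Spherical k-means: normalize rows so cosine similarity is a plain dot product."""
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to the centroid with the highest cosine similarity.
        labels = (X @ centroids.T).argmax(axis=1)
        # Recompute each centroid as the normalized mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                c = members.mean(axis=0)
                centroids[j] = c / np.linalg.norm(c)
    return labels, centroids


labels, centroids = cosine_kmeans(np.random.default_rng(1).normal(size=(500, 64)), k=8)
```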
How prefetch_factor did not help when streaming data
Using Encoder-Decoder models in HF to combine vision and text
Timing comparison: tokenizing in the collate function vs. after batching
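A minimal sketch of the two setups being compared (the model name and dummy data are my own illustration, not the post's benchmark):

```python
import time

from torch.utils.data import DataLoader
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative choice
texts = ["a short sentence", "another, somewhat longer example sentence"] * 512


# Option A: tokenize inside the collate function, producing one padded batch per call.
def tokenize_collate(batch):
    return tokenizer(batch, padding=True, truncation=True, return_tensors="pt")


loader_a = DataLoader(texts, batch_size=32, collate_fn=tokenize_collate)

# Option B: the default collate returns a list of strings; tokenize after batching.
loader_b = DataLoader(texts, batch_size=32)

start = time.perf_counter()
for encoded in loader_a:
    pass
print("tokenize in collate:", time.perf_counter() - start)

start = time.perf_counter()
for batch in loader_b:
    encoded = tokenizer(list(batch), padding=True, truncation=True, return_tensors="pt")
print("tokenize after batching:", time.perf_counter() - start)
```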
Fast zero-shot classification of text
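For reference, the stock NLI-based zero-shot pipeline in Hugging Face looks like this; the post is about speeding this kind of setup up, and the model choice here is only illustrative:

```python
from transformers import pipeline

# Baseline zero-shot classifier; model choice is illustrative, not the post's pick.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "The new GPU cut our training time in half.",
    candidate_labels=["hardware", "cooking", "sports"],
)
print(result["labels"][0], result["scores"][0])
```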
Getting image embeddings with no negative samples
Getting image patches for the Vision Transformer (ViT)
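A short sketch of non-overlapping patch extraction for ViT-style models; the tensor shapes and `unfold` approach below are my own illustration, not necessarily the post's method:

```python
import torch

images = torch.randn(8, 3, 224, 224)              # (batch, channels, height, width)
p = 16                                            # patch size
# Cut each image into non-overlapping p x p patches, then flatten each patch.
patches = images.unfold(2, p, p).unfold(3, p, p)  # (8, 3, 14, 14, 16, 16)
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(8, -1, 3 * p * p)
print(patches.shape)                              # torch.Size([8, 196, 768])
```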
PyTorch Collate function tutorial
Creating TabNet from scratch in TensorFlow 2.0
Training OpenAI’s CLIP on Google Colab
Using callbacks to find the optimal learning rate
Extending the standard Focal Loss
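For reference, a short PyTorch sketch of the standard binary focal loss that such an extension starts from (the extension itself is the post's contribution and is not reproduced here):

```python
import torch
import torch.nn.functional as F


def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: down-weight easy examples via the (1 - p_t)^gamma factor."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = targets * p + (1 - targets) * (1 - p)            # probability of the true class
    alpha_t = targets * alpha + (1 - targets) * (1 - alpha)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()


# Quick check on random data.
logits = torch.randn(8)
targets = torch.randint(0, 2, (8,)).float()
print(focal_loss(logits, targets))
```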