van den Berg Analytics

Python, Data, and more

Mar 1, 2025

Yet Another LLM Wrapper

So I vibe-coded an LLM wrapper: a podcast generator. The web app takes in a URL (for example, a Wikipedia page) and returns a podcast based on the...

Mar 1, 2025

Even Faster String matching in Python

TL;DR: String Grouper is now 8 times faster than it was 5 years ago. A few years ago I wrote a post about a method of string matching where we...

Jan 2, 2020

String Grouper

Finding similar strings within large sets of strings is a problem many people run into. In a previous blog post, Super Fast String Matching, I explained a process of finding similar...

Apr 13, 2019

The rise of Newsletter Spam: A journey through my Gmail inbox

In the beginning there was spam. Cheap, unpersonalised, mass-sent junk mail, easily defeated by simple Bayesian filters. Over the years spammers improved and an arms race between spammers and spam...

Oct 14, 2017

Super Fast String Matching in Python

Traditional approaches to string matching, such as the Jaro-Winkler or Levenshtein distance measures, are too slow for large datasets. Using TF-IDF with n-grams as terms to find similar strings transforms...

Aug 29, 2017

1 Day of Citi Bike availability

After moving to New York from the Netherlands, I was relieved to find out that biking in Manhattan is actually pretty doable. It’s not really as common as it is...

Aug 1, 2017

PySpark Dist Explore

PySpark DataFrame Distribution Explorer. I found myself using some half-baked, quickly written functions to do data exploration in PySpark, every time using a similar but modified version of the...

Feb 1, 2017

ikbenwatlater.nl

ikbenwatlater.nl (I’m a bit later dot N L) is a site I created to notify you when your train is delayed or cancelled....

This project is maintained by bergvca
