Using SQL to turn all the buses around

I have a small hobby project over at kollektivkart.arktekk.no that is for visualizing changes in public transit in Norway. For some time I’ve been wanting to do some visualizations on public transit lines. For example, plot the mean delay at each stop used by a line over time. When trying to do some concept work on this, I discovered a puzzle in the data! Many lines go in two opposite directions. Here in Trondheim, Line 3 goes from Loholt to Hallset, but also from Hallset to Loholt. The way I can tell these apart is to look up the direction in the data. Within a line, there can be variations in each direction. Some services might skip some stops, or depending on how you look at it, others visit extra stops. But these are variations on a theme, and it probably makes sense to group them together to preserve our sanity and not get 12 different plots for each line—2 should be plenty! ...

May 28, 2025 · 14 min · 2894 words · Robin Kåveland

Why would I use DuckDB for that?

The past few weeks I’ve been experimenting with DuckDB, and as a consequence I’ve ended up talking about it a lot as well. I’m not going to lie, I really like it! However, experienced programmers will rightly be skeptical to add new technology that overlaps with something that already works great. So why not just use postgres? Well, I really like postgres too, and I think you should consider just using it! But despite both of these technologies being all about tabular data, they’re not really for the same kinds of problems. I think DuckDB is primarily an analysis or ELT tool, and it really excels in this space. postgres can do a lot of the things that DuckDB can do, but not nearly as fast or easily. I wouldn’t want to use DuckDB for a transactional workload, so it’s not going to replace postgres for anything that I use it for. ...

March 2, 2025 · 13 min · 2570 words · Robin Kåveland

Norwegian Wild Salmon Fishing Ban of 2024

For this blog post, I’m trying something different. This is a jupyter notebook that I’m using to study some data, and just dumping my brain out in text. If I can easily export this to a format that works with hugo, this might become a common occurrence. For this one, I’m leaving the code in. There isn’t that much of it, but I think it’s fun to show how much visualization per line of code you can get with seaborn and pandas. ...

June 27, 2024 · 6 min · 1245 words · Robin Kåveland

Friends don't let friends export to CSV

I worked for a few years in the intersection between data science and software engineering. On the whole, it was a really enjoyable time and I’d like to have the chance to do so again at some point. One of the least enjoyable experiences from that time was to deal with big CSV exports. Unfortunately, this file format is still very common in the data science space. It is easy to understand why – it seems to be ubiquitous, present everywhere, it’s human-readable, it’s less verbose than options like JSON and XML, it’s super easy to produce from almost any tool. What’s not to like? ...

March 24, 2024 · 9 min · 1915 words · Robin Kåveland