Topic: Infra
Practical infrastructure for small AI teams: serving, queueing, observability.
3 posts
LLMs Infra
Serving LLMs on a Budget: Notes From a Small Team
Practical patterns for running open-weight models in production when you don't have a research cluster behind you.
• 7 min
Infra LLMs
GPU Pools Without Kubernetes
You probably don't need a scheduler. You need a queue, a health check, and a way to drain a box gracefully.
• 6 min
RAG Infra
Latency Budgets for RAG Applications
If you haven't written the budget down, the model is deciding it for you. Here is the template I use.
• 5 min