Currently streaming · 12.3M events / sec

Jason Fine

Data & AI Wrangler. I build the plumbing that moves billions of rows a day — lakes, streams, and the models that drink from them.

Apache Iceberg
Kafka · Flink
LLM · RAG

I thrive at the intersection of deep technical detail and big-picture thinking — not just figuring out how the system works, but why it should be built that way at all.

I specialize in Data Lakes and Apache Iceberg, and I'm the kind of engineer who genuinely enjoys reading table format specs, gets into heated debates about partitioning strategies, and somehow still remembers why any of it matters.

These days I'm focused on the seam where streaming infrastructure meets applied AI — building the plumbing that moves data from raw events into models that actually work in production.

  • Building on top of open table formats before they were cool (they're cool now)
  • Exploring where LLMs and large-scale data infrastructure actually meet
  • Asking "but what happens at 10× the scale?" in every design review

A day in the pipeline

Raw events at the left, decisions and models at the right. Everything in between is my job.

  • Kafka (ingest)
  • S3 / Iceberg (lake)
  • CDC (postgres)
  • Flink (streaming)
  • dbt / Spark (transform)
  • Feature Store (online)
  • LLM / Embeddings (vector idx)
  • Warehouse (BI)

Let's move some data.

Have an interesting data problem or want to talk table formats, streaming infra, or applied AI? I'd love to hear about it.