August 11-12, 2026
Seoul, South Korea
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit Korea 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is displayed in Korea Standard Time (KST), UTC+9.
Tuesday August 11, 2026 11:00 - 11:30 KST
The GPUs that serve large language models get faster every year, yet users still notice the pause before the first word appears. That pause has a name: Time To First Token (TTFT). In production LLM systems, shaving even a few hundred milliseconds from it can noticeably change how responsive an application feels.

This talk tells the story of where those milliseconds go.

We will walk through the lifecycle of a request in modern LLM serving systems and explore the practical techniques engineers use to reduce TTFT in real deployments. Using examples from open source stacks like vLLM, TensorRT-LLM, and Hugging Face TGI, we will examine four powerful optimization levers: KV cache strategies, speculative decoding, model quantization, and batching policies.

Instead of focusing only on theory, the session highlights the tradeoffs practitioners face. When does speculative decoding actually help? When does batching hurt latency? When does quantization reduce memory pressure enough to speed up the first token?

Attendees will leave with a practical playbook for diagnosing TTFT bottlenecks and choosing the right optimization strategy for their model, infrastructure, and workload.
Speakers
Hrittik Roy

vCluster, Platform Advocate
Hrittik is a Platform Advocate at Loft Labs and a CNCF Ambassador, with expertise in cloud native technologies and open source communities. He has contributed extensively to developer advocacy, technical writing, and community engagement. Hrittik has been a featured speaker at events...
Grand Ballroom 2-3