Fakta
"TensorRT‑LLM Optimization: Quantization, Kernel Fusion, and Throughput Engineering"
Built for experienced ML systems engineers, inference specialists, and GPU performance practitioners, this book is a deep guide to making large language models run faster, cheaper, and more predictably with TensorRT‑LLM. Rather than offering generic acceleration advice, it develops a precise mental model of the TensorRT‑LLM stack so readers can understand where performance is won or lost: in quantization choices, graph compilation, fused kernels, KV-cache policy, and serving scheduler behavior.
The book covers the full optimization path from precision strategy and post-training quantization pipelines to engine build configuration, plugin-enabled fusion, attention specialization, and throughput-oriented serving design. Readers will learn how to choose among FP16, BF16, FP8, INT8, and INT4 in hardware-aware ways; validate deployable quantized artifacts; realize fused execution paths in compiled engines; engineer KV-cache behavior for long-context workloads; and benchmark and profile systems with enough rigor to attribute gains to the right layer.
Structured as an advanced, implementation-minded text, the book emphasizes cross-layer tradeoffs rather than isolated tricks. It assumes solid familiarity with transformer inference, CUDA-era GPU concepts, and production deployment concerns, and rewards readers who want durable optimization judgment instead of version-fragile recipes."
© 2026 NobleTrex Press (E-bog): 6610001219079
Udgivelsesdato
E-bog: 8. maj 2026
Over 1 million titler
Download og nyd titler offline
Eksklusive titler + Mofibo Originals
Børnevenligt miljø (Kids Mode)
Det er nemt at opsige når som helst
For dig som lytter og læser ofte.
129 kr. /måned
Eksklusivt indhold hver uge
Fri lytning til podcasts
Ingen binding
For dig som lytter og læser ubegrænset.
159 kr. /måned
Eksklusivt indhold hver uge
Fri lytning til podcasts
Ingen binding
For dig som ønsker at dele historier med familien.
Fra 179 kr. /måned
Fri lytning til podcasts
Kun 39 kr. pr. ekstra konto
Ingen binding
179 kr. /måned
For dig som vil prøve Mofibo.
89 kr. /måned
Gem op til 100 ubrugte timer
Eksklusivt indhold hver uge
Fri lytning til podcasts
Ingen binding
Har du en rabatkode?
Indtast koden her