AI Portfolio Podcast
The AI Portfolio Podcast showcases experts, companies, and communities that can accelerate your journey of taking machine learning products to market.
If you are a practitioner, investor, or data leader, the show will introduce you to great companies to invest in or join and show you how experts navigate their careers.
My goal is to open doors and expand your sense of what is possible with machine learning. Connect with me, share the show, and let me know how I can add value.
Kyle Kranen: End Points, Optimizing LLMs, GNNs, Foundation Models - AI Portfolio Podcast #011
Get 1000 free inference requests for LLMs on build.nvidia.com
Kyle Kranen is an engineering leader at NVIDIA working at the forefront of deep learning, real-world applications, and production. In this episode, Kyle shares his expertise on optimizing large language models (LLMs) for deployment and explores the complexities of scaling and parallelism.
📲 Kyle Kranen Socials:
LinkedIn: https://www.linkedin.com/in/kyle-kranen/
Twitter: https://x.com/kranenkyle
📲 Mark Moyou, PhD Socials:
LinkedIn: https://www.linkedin.com/in/markmoyou/
Twitter: https://twitter.com/MarkMoyou
📗 Chapters
[00:00] Intro
[01:26] Optimizing LLMs for deployment
[10:23] Economy of Scale (Batch Size)
[13:18] Data Parallelism
[14:30] Kernels on GPUs
[18:48] Hardest part of optimizing
[22:26] Choosing hardware for LLMs
[31:33] Storage and Networking - Analyzing Performance
[32:33] Minimum model size where tensor parallelism gives you an advantage
[35:20] Director-level folks thinking about deploying LLMs
[37:29] Kyle is working on AI foundation models
[40:38] Deploying Models with endpoints
[42:43] Fine-Tuning, Deploying LoRAs
[45:02] SteerLM
[48:09] KV Cache
[51:43] Advice for people deploying reasonable and large-scale LLMs
[58:08] Graph Neural Networks
[01:00:04] GNNs
[01:04:22] Using GPUs to do GNNs
[01:08:25] Starting your GNN journey
[01:12:51] Career Optimization Function
[01:14:46] Solving Hard Problems
[01:16:20] Maintaining Technical Skills
[01:20:53] Deep learning expert
[01:26:00] Rapid Round