Top Inference for Large Language Models Sessions at NVIDIA GTC 2024