As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to...
Advancing Security for Large Language Models with NVIDIA GPUs and Edgeless Systems
Edgeless Systems introduced Continuum AI, the first generative AI (GenAI) framework that keeps prompts encrypted at all times with confidential computing by...
Checkpointing CUDA Applications with CRIU
Checkpoint and restore functionality for CUDA is exposed through a command-line utility called cuda-checkpoint. This utility can be used to transparently...
Phi-3-Medium: Now Available on the NVIDIA API Catalog
Phi-3-Medium accelerates research with logic-rich features in both short (4K) and long (128K) context.
StarCoder2-15B: A Powerful LLM for Code Generation, Summarization, and Documentation
Trained on 600+ programming languages, StarCoder2-15B is now packaged as a NIM inference microservice available for free from the NVIDIA API catalog.