LLM Inference Optimization

NVIDIA Enters Production With Dynamo, the Broadly Adopted Inference Operating System for AI Factories

NVIDIA Dynamo 1.0 provides a production-grade, open source foundation for inference at scale. Dynamo and NVIDIA TensorRT-LLM ...

Keysight Launches AI Inference Emulation Platform to Validate and Optimize AI Infrastructure

New platform validates and optimizes AI inference infrastructure at scale using real-world workload emulation; live ...

Semiconductor Engineering

HW-SW Co-Designed System With 3 Core Optimization Pathways For Long-Context Agentic LLM Inference (Cambridge, ICL)

A new technical paper titled “Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference” was published by researchers at University of Cambridge, Imperial College London ...

WFXG

Nota AI Reduces Memory Usage of Upstage's Solar LLM by 72%, Demonstrating Proprietary Quantization Technology

Nota AI, an AI optimization technology company behind the Nota AI brand, announced that it has developed a next-generation ...

Business Wire

ASC24 Finals Set for April in Shanghai: Focus on Cutting-Edge Large Language Model Inference and Seepage Simulation!

BEIJING--(BUSINESS WIRE)--On January 4th, the inaugural ceremony for the 2024 ASC Student Supercomputer Challenge (ASC24) unfolded in Beijing. With a global interest, ASC24 has garnered the ...

The Register on MSN

Unpacking the deceptively simple science of tokenomics

Inference at scale is much more complex than more GPUs, more tokens, more profits. By now you've probably heard AI ...

Semiconductor Engineering

Detailed Study of Performance Modeling For LLM Implementations At Scale (imec)

A new technical paper titled “System-performance and cost modeling of Large Language Model training and inference” was published by researchers at imec. “Large language models (LLMs), based on ...

XDA Developers on MSN

Local LLMs are powerful, but cloud AI is still better at these 3 things

There are trade-offs when using a local LLM ...
