FP8

NVIDIA Announces New Switches Optimized for Trillion-Parameter GPU Computing and AI Infrastructure

Retrieved on: 
Monday, March 18, 2024

The world’s first networking platforms capable of end-to-end 800Gb/s throughput, NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum™-X800 Ethernet push the boundaries of networking performance for computing and AI workloads.

Key Points: 
  • “NVIDIA Networking is central to the scalability of our AI supercomputing infrastructure,” said Gilad Shainer, senior vice president of Networking at NVIDIA.
  • “NVIDIA X800 switches are end-to-end networking platforms that enable us to achieve trillion-parameter-scale generative AI essential for new AI infrastructures.”
  • Initial adopters of Quantum InfiniBand and Spectrum-X Ethernet include Microsoft Azure and Oracle Cloud Infrastructure.
  • “Behind this transformation is the evolution of data centers into high-performance AI engines with increased demands for networking infrastructure,” said Nidhi Chappell, vice president of AI Infrastructure at Microsoft Azure.
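
To put the 800Gb/s figure in context, here is a back-of-envelope sketch in Python; the model size and per-port assumptions are ours, not from the announcement:

    # Back-of-envelope: what 800 Gb/s of link bandwidth means for a
    # trillion-parameter model stored in FP8 (one byte per parameter).
    # Illustrative assumptions only, not NVIDIA specifications.
    LINK_GBPS = 800                          # per-port line rate, gigabits/s
    link_bytes_per_s = LINK_GBPS / 8 * 1e9   # = 100 GB/s

    params = 1e12                            # one trillion parameters
    fp8_bytes = params * 1                   # FP8 stores one byte per weight

    seconds = fp8_bytes / link_bytes_per_s
    print(f"1T-parameter FP8 model over one 800 Gb/s link: {seconds:.0f} s")
    # -> 10 s; real clusters spread collectives across many links in parallel.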

DDN Delivers Four Terabytes per Second of Accelerated Storage Performance in Groundbreaking NVIDIA Eos AI Supercomputer

Retrieved on: 
Tuesday, March 19, 2024

SAN JOSE, Calif., March 19, 2024 /PRNewswire/ -- (GTC, DDN Booth #816) – DDN®, the global leader in artificial intelligence (AI) and multi-cloud data management solutions, announces its successful implementation of DDN EXAScaler® AI storage (A3I®) in NVIDIA Eos, a TOP10 supercomputer based on NVIDIA DGX SuperPOD.

Key Points: 
  • The result is 12 petabytes of DDN storage delivering four terabytes per second of data speed to serve an impressive 18.4 exaflops of FP8 NVIDIA AI performance (a quick sanity check of these figures follows this list).
  • NVIDIA Eos powers the largest AI workloads, including large language models, recommender systems, and simulations, delivering NVIDIA's latest AI innovations.
  • "NVIDIA DGX SuperPODs like Eos provide a full-stack blueprint for turnkey enterprise AI supercomputing, and DDN A3I delivers high performance, power efficient storage for Eos to deliver the data needed for advanced AI training."
  • DDN storage in NVIDIA Eos highlights what organizations and sovereign AI entities can achieve in data centers and the cloud.
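
A quick sanity check of the Eos figures quoted above, in Python; the arithmetic is ours and purely illustrative:

    storage_bytes = 12e15   # 12 PB of DDN EXAScaler capacity
    read_bw = 4e12          # 4 TB/s aggregate storage throughput
    fp8_flops = 18.4e18     # 18.4 exaflops of FP8 compute

    print(f"Full sweep of 12 PB at 4 TB/s: {storage_bytes / read_bw / 60:.0f} min")
    print(f"Storage bytes per FP8 FLOP: {read_bw / fp8_flops:.2e}")
    # -> ~50 min per full-capacity sweep and ~2.2e-07 B/FLOP; AI training is
    #    compute-dense, so even a small byte-per-FLOP ratio keeps GPUs fed.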

Tachyum Democratizes AI for All with $5,000 Prodigy ATX Platform

Retrieved on: 
Thursday, February 8, 2024

Tachyum®, creator of Prodigy®, the world’s first Universal Processor, today released a white paper that details how its Prodigy ATX Platform will democratize AI for those who may not normally have access to sophisticated AI models.

Key Points: 
  • The Prodigy ATX Platform allows everyone to run cutting-edge AI models for as little as $5,000.
  • Since the Prodigy ATX Platform is intended to leverage pre-trained models and focus on inference, Tachyum reviews the assumptions for the memory footprint required for inference (a generic estimate in the same spirit follows this list).
  • This architecture provides Tachyum with increased yield for 96-core devices, lowering platform costs and helping make the Prodigy ATX Platform even more affordable.
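
The white paper's exact accounting is not reproduced in this summary, so the following is a generic inference memory-footprint estimate under common assumptions (weights dominate; one fp16 KV cache for a single 4K-token session; the layer and width figures are illustrative):

    def inference_footprint_gb(n_params: float, bits_per_weight: int,
                               n_layers: int = 80, d_model: int = 8192,
                               ctx: int = 4096) -> float:
        weights = n_params * bits_per_weight / 8
        # KV cache: K and V per layer, fp16 (2 bytes) per element
        kv_cache = 2 * n_layers * ctx * d_model * 2
        return (weights + kv_cache) / 1e9

    for bits in (16, 8, 4, 2):
        print(f"{bits:>2}-bit weights, 70B params: "
              f"{inference_footprint_gb(70e9, bits):.0f} GB")
    # -> roughly 151, 81, 46, and 28 GB: lower-bit formats shrink the
    #    footprint enough to fit on modest, affordable hardware.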

OSS Ships New Gen 5 AI Edge Compute Accelerator

Retrieved on: 
Wednesday, February 7, 2024

ESCONDIDO, Calif., Feb. 07, 2024 (GLOBE NEWSWIRE) -- One Stop Systems, Inc. (Nasdaq: OSS), a leader in AI Transportable solutions at the edge, has begun shipping its latest Gen 5 4U Pro Accelerator System to a large composable infrastructure provider.

Key Points: 
  • OSS expects shipments of this compute accelerator to the customer to total between $4 million and $6 million over the next three years.
  • For AI workflows at the edge, this latest Gen 5 4U Pro Accelerator delivers twice the interconnect bandwidth of its Gen 4 predecessor.
  • The accelerator also includes upgraded power and cooling to support multiple NVIDIA H100 Tensor Core GPUs with PCIe Gen 5, resulting in 4.8x the AI inference performance using FP8 precision compared to the previous generation.
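
FP8 here refers to 8-bit floating-point formats such as OCP E4M3 (1 sign, 4 exponent, and 3 mantissa bits, with a maximum normal magnitude of 448). Below is a minimal NumPy sketch of per-tensor E4M3-style quantization, ignoring subnormals; the helper is our illustration, not OSS or NVIDIA code:

    import numpy as np

    E4M3_MAX = 448.0  # largest normal magnitude in OCP FP8 E4M3

    def quantize_e4m3(x: np.ndarray):
        """Per-tensor scaled round trip through a simulated E4M3 grid."""
        scale = np.abs(x).max() / E4M3_MAX
        y = np.clip(x / scale, -E4M3_MAX, E4M3_MAX)
        # Round to 3 explicit mantissa bits: split y = m * 2**e with
        # |m| in [0.5, 1), then keep 4 significant binary digits of m.
        m, e = np.frexp(y)
        m = np.round(m * 16) / 16
        return np.ldexp(m, e) * scale, scale

    w = np.random.randn(4, 4).astype(np.float32)
    w_q, s = quantize_e4m3(w)
    print("max abs error:", np.abs(w - w_q).max())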

Tachyum Well Positioned for an Exceptional 2024

Retrieved on: 
Tuesday, January 9, 2024

Increased Business Development – With the final reference chip available, Tachyum will target early adopter markets, including high performance computing and artificial intelligence applications.

Key Points: 
  • Tachyum worked diligently throughout 2023 to ensure that it is positioned on the cusp of success for the coming year.
  • In 2023, Tachyum received a major purchase order from a U.S. company to build a large-scale system based on its 5nm Prodigy Universal Processor chip.
  • “We look forward to fulfilling our commitment to transforming ordinary data centers into Universal Computing Centers in the near future.”

AMD Delivers Leadership Portfolio of Data Center AI Solutions with AMD Instinct MI300 Series

Retrieved on: 
Wednesday, December 6, 2023

SANTA CLARA, Calif., Dec. 06, 2023 (GLOBE NEWSWIRE) -- Today, AMD (NASDAQ: AMD) announced the availability of the AMD Instinct™ MI300X accelerators – with industry-leading memory bandwidth for generative AI and leadership performance for large language model (LLM) training and inferencing – as well as the AMD Instinct™ MI300A accelerated processing unit (APU) – combining the latest AMD CDNA™ 3 architecture and “Zen 4” CPUs to deliver breakthrough performance for HPC and AI workloads.

Key Points: 
  • “AMD Instinct MI300 Series accelerators are designed with our most advanced technologies, delivering leadership performance, and will be in large scale cloud and enterprise deployments,” said Victor Peng, president, AMD.
  • Oracle Cloud Infrastructure plans to add AMD Instinct MI300X-based bare metal instances to the company’s high-performance accelerated computing instances for AI.
  • Dell showcased the Dell PowerEdge XE9680 server featuring eight AMD Instinct MI300 Series accelerators and the new Dell Validated Design for Generative AI with AMD ROCm-powered AI frameworks.
  • Supermicro announced new additions to its H13 generation of accelerated servers powered by 4th Gen AMD EPYC™ CPUs and AMD Instinct MI300 Series accelerators.

NVIDIA Supercharges Hopper, the World’s Leading AI Computing Platform

Retrieved on: 
Monday, November 13, 2023

DENVER, Nov. 13, 2023 (GLOBE NEWSWIRE) -- NVIDIA today announced it has supercharged the world’s leading AI computing platform with the introduction of the NVIDIA HGX™ H200.

Key Points: 
  • Based on NVIDIA Hopper™ architecture, the platform features the NVIDIA H200 Tensor Core GPU with advanced memory to handle massive amounts of data for generative AI and high performance computing workloads.
  • The NVIDIA H200 is the first GPU to offer HBM3e — faster, larger memory to fuel the acceleration of generative AI and large language models, while advancing scientific computing for HPC workloads (a rough decode-throughput sketch follows this list).
  • NVIDIA’s accelerated computing platform is supported by powerful software tools that enable developers and enterprises to build and accelerate production-ready applications from AI to HPC.
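
NVIDIA's published H200 figures are 141GB of HBM3e at 4.8TB/s. For memory-bandwidth-bound LLM decoding, token rate is roughly bandwidth divided by the bytes touched per token; a rough ceiling estimate in Python, where the model size and efficiency factor are our assumptions:

    hbm_bw = 4.8e12          # bytes/s of HBM3e bandwidth (published figure)
    model_bytes = 70e9 * 1   # assumed 70B-parameter model in FP8
    efficiency = 0.6         # assumed achievable fraction of peak bandwidth

    tokens_per_s = hbm_bw * efficiency / model_bytes
    print(f"~{tokens_per_s:.0f} tokens/s upper bound for batch-1 decode")
    # Each generated token streams every weight from HBM at least once, so
    # faster, larger memory translates directly into decode throughput.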

AWS and NVIDIA Announce Strategic Collaboration to Offer New Supercomputing Infrastructure, Software, and Services for Generative AI

Retrieved on: 
Tuesday, November 28, 2023

The NVIDIA GH200 NVL32 multi-node platform connects 32 Grace Hopper Superchips with NVIDIA NVLink and NVSwitch technologies into one instance.

Key Points: 
  • NVIDIA and AWS will collaborate to host NVIDIA DGX Cloud, NVIDIA’s AI-training-as-a-service, on AWS.
  • DGX Cloud on AWS will accelerate training of cutting-edge generative AI and large language models that can reach beyond 1 trillion parameters.
  • NVIDIA DGX Cloud is an AI supercomputing service that gives enterprises fast access to multi-node supercomputing for training the most complex LLM and generative AI models, with integrated NVIDIA AI Enterprise software and direct access to NVIDIA AI experts.

Tachyum Cuts Cost of Large Language Models Up to 100x Bringing them to Mainstream

Retrieved on: 
Tuesday, November 14, 2023

Tachyum® today announced the release of a white paper detailing how to use 4-bit Tachyum AI (TAI) and 2-bit-effective-per-weight (TAI2) formats in large language model (LLM) quantization without accuracy degradation.

Key Points: 
  • Tachyum hardware also enables workable LLMs at 1 bit per weight, with higher degradation than TAI2; its AI scientists are continuing to improve performance and reduce degradation as Tachyum looks to bring the format to the mainstream.
  • Tachyum addresses massive LLMs, whose scale has grown by more than a thousand times over the past few years.
  • In contrast, a $23,000 single-socket Prodigy system with 2TB of DDR5 DRAM could fit and run such big models, bringing them into the mainstream for generative AI applications.
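
The TAI and TAI2 formats are Tachyum-proprietary and not specified in this summary. As a generic illustration of low-bit weight quantization, here is a symmetric 4-bit group-wise scheme in NumPy; all names and parameters are ours:

    import numpy as np

    def quantize_4bit_groupwise(w: np.ndarray, group: int = 64):
        """Symmetric 4-bit group-wise quantization (generic, not Tachyum's TAI)."""
        w = w.reshape(-1, group)
        scale = np.abs(w).max(axis=1, keepdims=True) / 7   # int4 range -7..7
        scale = np.maximum(scale, 1e-12)                   # avoid divide-by-zero
        q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
        return q, scale

    w = np.random.randn(2, 4096).astype(np.float32)
    q, s = quantize_4bit_groupwise(w)
    w_hat = (q * s).astype(np.float32).reshape(w.shape)
    print("max abs quantization error:", np.abs(w_hat - w).max())

Storage here costs 4 bits per weight plus one fp16 scale per 64-weight group, about 4.25 effective bits. At 2 effective bits per weight, even a 4-trillion-parameter model needs only about 1TB for weights, consistent with fitting very large models in the 2TB DDR5 system described above.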

exaBITS Successfully Integrates NVIDIA H100 to Provide High-Performance AI Services to Enterprises

Retrieved on: 
Monday, August 28, 2023

San Francisco, CA, Aug. 28, 2023 (GLOBE NEWSWIRE) -- Decentralized AI computing infrastructure provider exaBITS has announced the successful integration of the NVIDIA H100 Tensor Core GPU, offering AI enterprises on-demand access to the fastest GPU type on the market.

Key Points: 
  • San Francisco, CA, Aug. 28, 2023 (GLOBE NEWSWIRE) -- Decentralized AI computing infrastructure provider exaBITS has announced the successful integration of the NVIDIA H100 Tensor Core GPU, offering AI enterprises on-demand access to the fastest GPU type on the market.
  • This integration marks a significant milestone for exaBITS in driving innovation and accessibility in artificial intelligence and distributed computing, making "Affordable AI" a reality.
  • By introducing the NVIDIA H100 into its cloud computing platform, exaBITS offers unprecedented computational performance, catering to high-speed AI training and inference tasks.
  • This means AI enterprises can harness the robust computing infrastructure of exaBITS to take full advantage of the NVIDIA H100's performance.