

# PCIe Gen5 x16 PCB and Demonstrator System for the Stencil- and Tensor Accelerator (STX) STXDemo – Efficient High Performance Systems Made in Europe

**Contact: jens.krueger@itwm.fraunhofer.de** 

### **1 Project Summary**

The Stencil- and Tensor Accelerator (STX) originates from IP developed within the European Processor Initiative (EPI). It is co-designed for 3D stencil kernels, which are a very common kernel type in many scientific and industrial simulation codes.
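To illustrate what a 3D stencil kernel looks like, here is a minimal pure-Python sketch of a 7-point stencil (a discrete Laplacian). It is a generic textbook example, not STX code: the function names `stencil_7pt` and `make_grid` and the coefficients are assumptions chosen for illustration.

```python
def stencil_7pt(grid, c_center=-6.0, c_neighbor=1.0):
    """Apply a 7-point stencil to the interior points of a 3D grid.

    `grid` is a nested list indexed as grid[i][j][k]. Boundary points of
    the result are left at 0.0. The default coefficients give the
    discrete Laplacian for unit grid spacing (illustrative assumption).
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                # Each output point reads its own value and its six
                # axis-aligned neighbours.
                out[i][j][k] = (
                    c_center * grid[i][j][k]
                    + c_neighbor * (
                        grid[i - 1][j][k] + grid[i + 1][j][k]
                        + grid[i][j - 1][k] + grid[i][j + 1][k]
                        + grid[i][j][k - 1] + grid[i][j][k + 1]
                    )
                )
    return out


def make_grid(n, f):
    """Build an n x n x n grid with values f(i, j, k) (hypothetical helper)."""
    return [[[float(f(i, j, k)) for k in range(n)]
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # For f = i^2 + j^2 + k^2 the discrete Laplacian is 6 at every interior point.
    g = make_grid(4, lambda i, j, k: i * i + j * j + k * k)
    print(stencil_7pt(g)[1][1][1])
```

Each output point needs seven memory reads for only a handful of arithmetic operations, which is why stencil codes are typically limited by memory bandwidth rather than by compute.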

The STXDemo project develops a PCIe Gen5 x16 PCB and assembles all parts to set up a system demonstrator based on 8 STX PCBs for a 19" rack insert. Its energy efficiency and performance will be demonstrated with two important simulation codes: Lattice Quantum Chromodynamics (Lattice QCD) and SU2.

### 2 The STX System

The STX processor package contains a RISC-V-based accelerator chiplet (GF12LPP) and HBM memory. Its main advantages are energy efficiency, performance and total cost of ownership (TCO). The full system architecture is built for highly parallel workloads which are fully offloaded to the HBM memories.

### 4 Simulations

- Design and layout of test structures and their transitions to the signal layers
- Fabrication of test boards with different RF substrates and metallizations
- Measurements of the temperature-dependent line properties of differential PCI signals



PCIe high-frequency test setup on a test PCB at Fraunhofer IZM



### **3** The STX PCB

- 4x STX packages @ ~50 W TDP (each)
- ARM board controller
- Interfaces: PWM signal for external fan control, USB OTG, Ethernet 1000BASE-T (1 Gb/s), 16 SPI signals (1.125 GHz), 2x I2C (thermal monitoring, voltage), UART, JTAG, GPIO, PCIe Gen5 x16 + SMBus
- Memories: eMMC flash memory + microSD card (300 MB/s)
- Power tree (21 power supplies, point-of-load concept)
- Test concept incl. boundary scan
- Design (high-speed) constraints / simulation (16159 rules)
- 2423 components, 10459 connections, 11878 drills (1967 mechanical / 9911 laser, incl. blind and buried vias)
- Dimensions 289 x 68 x 1.55 mm; ML 12; 85 µm line/space
- Passive cooler with heat sink design
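As a rough back-of-the-envelope check of the board figures, the sketch below estimates the STX package power per PCB and for the 8-PCB demonstrator, plus the raw PCIe Gen5 x16 link bandwidth. The package count, TDP and PCB count come from the text. The PCIe 5.0 line rate (32 GT/s per lane) and 128b/130b encoding are taken from the PCIe specification, not from the poster. The function names are hypothetical.

```python
# Figures taken from the poster text
PACKAGES_PER_PCB = 4
PACKAGE_TDP_W = 50.0        # approximate value ("~50 W TDP")
PCBS_PER_SYSTEM = 8

# PCIe 5.0 link parameters (from the PCIe specification, not the poster)
GT_PER_S_PER_LANE = 32.0            # giga-transfers per second per lane
ENCODING_EFFICIENCY = 128 / 130     # 128b/130b line encoding
LANES = 16


def package_power_w(pcbs=1):
    """Sum of STX package TDPs for the given number of PCBs, in watts.

    Excludes the board controller, memories and power-conversion losses.
    """
    return pcbs * PACKAGES_PER_PCB * PACKAGE_TDP_W


def pcie_raw_bandwidth_gbytes_per_s(lanes=LANES):
    """Raw per-direction link bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S_PER_LANE * ENCODING_EFFICIENCY * lanes / 8.0


if __name__ == "__main__":
    print(f"Package power per PCB:  {package_power_w(1):.0f} W")
    print(f"Package power, 8 PCBs:  {package_power_w(PCBS_PER_SYSTEM):.0f} W")
    print(f"PCIe Gen5 x16 raw rate: {pcie_raw_bandwidth_gbytes_per_s():.1f} GB/s per direction")
```

Under these assumptions the packages alone account for about 200 W per PCB and about 1600 W for the full demonstrator, and each board's host link tops out at roughly 63 GB/s per direction before packet overhead.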

Package-specific SI/PI challenges:

1. Huge difference in line dimensions between interposer and PCB (very fine-pitch lines to coarse PCB structures)

2. Long routing distances within lossy interposer traces due to the sparse BGA footprint

3. High inductance in the power supply path



### **5** Application

**Lattice Quantum Chromodynamics (Lattice QCD)** is used to study the strong interactions between quarks and gluons, which are fundamental particles in the Standard Model of particle physics. It is a widely used application that consumes a large share of the resources in HPC centres.

The **SU2** code is primarily used for computational fluid dynamics (CFD) and aerodynamic shape optimization, allowing for simulations of fluid flow and the optimization of aerodynamic designs. It is used for scientific as well as industrial use cases.



https://su2code.github.io/img/hl\_crm\_01.png

#### FUNDED BY









