Invited Talk Abstract: Amit Pandey

Addressing Serviceability throughout device lifecycle with High Speed Access for Test

Amit Pandey
Amazon
Austin, USA

Brendan Tully
Amazon
Austin, USA

Karthikeyan Natarajan
Synopsys
Sunnyvale, California

Abstract

The design size and complexity of modern large-scale SoCs continue to grow exponentially, especially for the cutting-edge chips used in Artificial Intelligence (AI) applications. With the rapid growth in semiconductor complexity and higher expectations for SoC performance and longevity, there is a need to continuously monitor a device throughout its life cycle to maximize performance and identify defects before they impact system operations. A variety of testing methodologies have been used in the industry to diagnose, and potentially repair, failing devices while they are still in the system. Scan vectors, when reused for Silicon Lifecycle Management (SLM), are effective at targeting specific parts of a device and simplify the diagnosis process. These vectors, along with BIST and miscellaneous sensors embedded in the silicon, allow servers in the datacenter to be remotely monitored and repaired before defects can impact functional system operations. However, the volume of such test data tends to be very large, which makes it difficult to apply in-system without a robust, high-speed network access mechanism. The High-Speed IO access mechanism provides ample bandwidth because it uses the native protocol of these interfaces. In this presentation, we will provide an overview of the High-Speed IO access solution, which enables scan vectors to be applied in-field for an Amazon Web Services (AWS) Machine Learning (ML) and AI acceleration system. We will show how an embedded microcontroller sits at the heart of such a system and enables remote monitoring and servicing. We will also present examples of how the high-speed access architecture can be applied to structures such as BIST, JTAG, redundancy/repair, and PVT sensors and monitors to address datacenter serviceability needs.

Keywords

Silicon Lifecycle Management (SLM), Scan, Automatic Test Pattern Generation (ATPG), High-Speed I/O, Functional Protocol, System Level Test (SLT), In-System Test (IST), Built-in Self-Test (BIST), Sensors

Keynote

Corporate Vice President, General Manager, Data Center and AI Product Management, Intel Corporation