Title: Enhancing Computer Serviceability Through Error Telemetry
Speaker: John Holm, Principal Engineer, Intel Corporation
Abstract:
This presentation will focus on how error telemetry can be used to pinpoint faulty components. It will explore the nuances of error telemetry, highlighting that the component reporting an error is not always the component that needs servicing. A failure reported on one component in the system may instead indicate failures in other components, excessive heat, or inadequate power. The presentation will discuss some of the challenges of diagnosing failures and possible future mitigations. By understanding the multifaceted nature of error telemetry, we can improve serviceability, reduce downtime, and enhance the overall reliability of computer systems.
Dr. Zane A. Ball is a Corporate Vice President and General Manager of the Data Center and AI (DCAI) Product Management Group. DCAI Product Management is responsible for end-to-end stewardship of DCAI’s systems, software, CPU, GPU, and custom product line throughout the product lifecycle. Prior to his product management role, Ball was CVP and GM of platform engineering and architecture for Intel’s data center business. Ball has also served as Co-GM of Intel’s foundry effort as a VP in the Technology and Manufacturing Group, and as a VP in the Client Computing Group, where his roles included GM of the desktop client business and GM of global customer engineering.
Ball has a bachelor’s degree, master’s degree, and Ph.D. in electrical engineering, all earned from Rice University. He also holds six patents in high-speed electrical design.