Sanjay Gongalore is the Architecture Lead for the GPU Resiliency and Safety team at NVIDIA Corporation and champions resiliency initiatives across the company. He has over 15 years of experience in architecture and ASIC leadership roles at NVIDIA and Intel. His current focus areas include resiliency modeling, architecting improved hardware resiliency, and end-to-end error handling, including optimal fault attribution and recovery. Sanjay first worked on resiliency in Fibre Channel storage network switches at Brocade, where he helped architect and validate the Enhanced Failure Detection feature. He also led the development of resiliency features on nForce5 – NVIDIA’s first server chipset.
Dr. Zane A. Ball is a Corporate Vice President and General Manager of the Data Center and AI (DCAI) Product Management Group at Intel. DCAI Product Management is responsible for end-to-end stewardship of DCAI’s systems, software, CPU, GPU, and custom product lines throughout the product lifecycle. Prior to his product management role, Ball was CVP and GM of platform engineering and architecture for Intel’s data center business. He has also served as Co-GM of Intel’s foundry effort while a VP in the Technology and Manufacturing Group, and as a VP of the Client Computing Group, where his roles included GM of the desktop client business and GM of global customer engineering.
Ball has a bachelor’s degree, master’s degree, and Ph.D. in electrical engineering, all earned from Rice University. He also holds six patents in high-speed electrical design.