Title: Challenges and OCP Community Progress towards Reliability, Availability, and Serviceability (RAS) with a Special Focus on Artificial Intelligence
Speaker: George Tchaparian, Chief Executive Officer, Open Compute Project (OCP) Foundation.
Abstract:
In an era where Artificial Intelligence (AI) is driving unprecedented computational demands, the need for scalable, efficient, and resilient IT infrastructure has never been more critical. This keynote will address the pressing challenges and significant advancements in Reliability, Availability, and Serviceability (RAS) within data centers, emphasizing the pivotal contributions of the Open Compute Project (OCP) Foundation.
George Tchaparian will provide an overview of the OCP Foundation and of OCP’s initiatives and community-driven efforts to enhance RAS, focusing on the integration of AI and high-performance computing (HPC) workloads. The session will highlight the economic impact of data center downtime, underscoring the importance of robust RAS frameworks to mitigate costs and ensure continuous operations.
Attendees will gain insights into the current state and future roadmap of OCP’s RAS developments, including the implementation of hardware and software co-design, enhanced error handling, and the establishment of standardized RAS metrics and APIs. Tchaparian will also discuss the strategic alliances and collaborative efforts that are driving innovation and adoption of these critical standards.
The keynote will showcase the practical implications of OCP’s work in creating resilient data centers. Participants will be invited to join the movement, collaborate with OCP, and contribute to advancing RAS technologies, ensuring the reliability and efficiency of the next generation of data centers.
Join George Tchaparian as he explores the transformative journey of the OCP community in shaping the future of reliable, available, and serviceable data centers.
Dr. Zane A. Ball is a Corporate Vice President and General Manager of the Data Center and AI (DCAI) Product Management Group at Intel. DCAI Product Management is responsible for end-to-end stewardship of DCAI’s systems, software, CPU, GPU, and custom product lines through the entire product lifecycle. Prior to his product management role, Ball was CVP and GM of platform engineering and architecture for Intel’s data center business. Ball has also served as Co-GM of Intel’s foundry effort as a VP in the Technology and Manufacturing Group, and as a VP in the Client Computing Group, where his roles included GM of the desktop client business and GM of global customer engineering.
Ball has a bachelor’s degree, master’s degree, and Ph.D. in electrical engineering, all earned from Rice University. He also holds six patents in high-speed electrical design.