{"id":261,"date":"2024-06-03T03:21:57","date_gmt":"2024-06-03T03:21:57","guid":{"rendered":"https:\/\/ieee-ras.conferences.computer.org\/2024\/?page_id=261"},"modified":"2024-06-03T03:23:31","modified_gmt":"2024-06-03T03:23:31","slug":"invited_talk_drew_walton_abstract","status":"publish","type":"page","link":"https:\/\/ieee-ras.conferences.computer.org\/2024\/invited_talk_drew_walton_abstract\/","title":{"rendered":"Invited_talk_Drew_Walton_Abstract"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-page\" data-elementor-id=\"261\" class=\"elementor elementor-261\" data-elementor-post-type=\"page\">\n\t\t\t\t<div class=\"elementor-element elementor-element-58a31ed e-flex e-con-boxed e-con e-parent\" data-id=\"58a31ed\" data-element_type=\"container\" data-e-type=\"container\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t\t\t<div class=\"elementor-element elementor-element-a1029ab elementor-widget elementor-widget-text-editor\" data-id=\"a1029ab\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><strong>Title:\u00a0Efforts to Address Fault Management Challenges in OCP<\/strong><\/p><p><strong>Speaker:\u00a0 Drew Walton<\/strong><\/p><p><strong>Abstract:<\/strong><\/p><p>It is too hard for end users to understand and handle hardware errors correctly.\u00a0 We lack a first-principles understanding of the silicon design, we don\u2019t know what many of the errors mean, and we don\u2019t know what data needs to be captured in order to understand what has failed.\u00a0 Modern SoCs have RAS features that can mitigate hardware failures, but these features are too hard to use, and end users lack data, and therefore consensus, on which of these features are most useful.<\/p><p>This presentation will give an overview of key fault management efforts in OCP to address these challenges and provide a glimpse of what fault management will look like in 
future platforms.\u00a0 It will discuss how we are working across the industry to simplify the work needed to log errors, analyze the error logs, and take the appropriate action to mitigate the failure.\u00a0 It will cover the efforts of the overall OCP Hardware Fault Management Team, the Fleet Memory Fault Management Team, and the RAS API Team.\u00a0 It will show how these efforts fit together to meet the challenges end users face, and it will present some initial thoughts on possible next efforts for the fault management community.<\/p><p>\u00a0<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Title:\u00a0Efforts to Address Fault Management Challenges in OCP Speaker:\u00a0 Drew Walton Abstract: It is too hard for end users to understand and handle hardware errors correctly.\u00a0 We lack a first-principles understanding of the silicon design, we don\u2019t know what many of the errors mean, and we don\u2019t know what data needs to be captured in 
[&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"elementor_canvas","meta":{"footnotes":""},"class_list":["post-261","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ieee-ras.conferences.computer.org\/2024\/wp-json\/wp\/v2\/pages\/261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ieee-ras.conferences.computer.org\/2024\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ieee-ras.conferences.computer.org\/2024\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ieee-ras.conferences.computer.org\/2024\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/ieee-ras.conferences.computer.org\/2024\/wp-json\/wp\/v2\/comments?post=261"}],"version-history":[{"count":0,"href":"https:\/\/ieee-ras.conferences.computer.org\/2024\/wp-json\/wp\/v2\/pages\/261\/revisions"}],"wp:attachment":[{"href":"https:\/\/ieee-ras.conferences.computer.org\/2024\/wp-json\/wp\/v2\/media?parent=261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}