The critical conversation on AI safety and risk
- 23 August 2024
by Haythem Abdelkefi
Discover the full AI Governance Day 2024 Report – From Principles to Implementation here.
Panelists:
- Professor Stuart Russell: Professor of Computer Science at the University of California, Berkeley
- Rumman Chowdhury: CEO of Humane Intelligence and USA Science Envoy for Artificial Intelligence
- Lane Dilg: Head of Strategic Partnerships at OpenAI
- Hakim Hacid: Acting Chief Researcher, Technology Innovation Institute (TII)
Moderator:
- Amir Banifatemi: Co-Founder and Director of AI Commons
The panel discussion underscores the complexity and urgency of AI safety and risk management. A multifaceted approach involving rigorous standards, institutional safeguards, and continuous research is essential. As AI technology continues to evolve, the dialogue on safety and risk must remain a priority, ensuring that AI advancements benefit humanity while minimizing potential harms.
Setting the stage: Amir Banifatemi’s opening remarks
Mr. Amir Banifatemi began by highlighting the significance of AI safety in the context of rapid technological advancements. He emphasized the need for trustworthiness, reliability, and scalability in AI systems.
“AI safety is important because it helps us anchor notions of trustworthiness, reliability, and scalability, and as we go forward with launching systems, we need to think about how safety can be put into place. It is not just about talking about safety, it is also understanding how we can put in place AI safety mechanisms, governance, regulation, learning from different industries.” (Amir Banifatemi)
Amir Banifatemi: Co-Founder and Director of AI Commons
Professor Stuart Russell on learning from other industries
Professor Stuart Russell, a leading voice in AI safety, drew parallels between AI and other high-stakes industries such as aviation, pharmaceuticals, and nuclear power. He highlighted the rigorous safety standards in these fields and the need for similar measures in AI.
“With aircraft, there has to be an airworthiness certificate before the airplane can be sold. With medicines, another area that is now safe but did not use to be safe, the medicine has to go through extensive clinical trials before it is allowed to be sold.” (Stuart Russell)
He pointed out the challenges of applying similar safety standards to AI, particularly due to the opaque nature of deep learning and transformer models, which are often seen as “black boxes.”
Professor Russell also warned about the potential consequences of insufficient safety measures, citing the Chernobyl disaster as a stark historical comparison to the potential risks of AI.
“Despite all that effort, we had Chernobyl, and Chernobyl ended up wiping out the global nuclear industry.” (Stuart Russell)
Stuart Russell, Professor of Computer Science at the University of California, Berkeley
Lane Dilg on balancing innovation and safety
Ms. Lane Dilg of OpenAI discussed the organization’s approach to balancing innovation with safety. She emphasized that safety and innovation are inextricably linked and that OpenAI is committed to both.
“We do consider innovation and safety inextricably intertwined […] such that we are never looking at only one of those pieces. A couple of ways in which you will see us doing that work: […] being in spaces and in conversations like this [one]; trying to be sure that we are aware of risks that are being raised by civil society and in governance conversations, [ensuring] we are responsive to those” (Lane Dilg)
She highlighted OpenAI’s iterative deployment strategy, which involves releasing models in stages to gather feedback and ensure preparedness.
Ms. Lane Dilg also mentioned OpenAI’s work on preparedness frameworks and their focus on technical tools and evaluations.
“We are very focused on the technical tools and evaluations that will enable this kind of assessment and this kind of scientific assessment and real judging of capabilities and risks.” (Lane Dilg)
Lane Dilg, Head of Strategic Partnerships at OpenAI speaking alongside Professor Stuart Russell, Professor of Computer Science at the University of California, Berkeley; Rumman Chowdhury, CEO of Humane Intelligence and USA Science Envoy for Artificial Intelligence and Hakim Hacid, Acting Chief Researcher, Technology Innovation Institute (TII)
Rumman Chowdhury on identifying risks systematically
Ms. Rumman Chowdhury, a data scientist and ethicist, provided insights into how organizations can systematically identify and manage risks associated with AI. She stressed the importance of evidence-based approaches and the use of established risk management frameworks.
“Think through the applications [and] use cases and ensure that what you’re doing is evidence-based. We now have a plethora of different risk management frameworks in the US.” (Rumman Chowdhury)
She pointed to frameworks like the NIST Risk Management Framework (RMF) and UNESCO’s guidelines as valuable tools for assessing societal impacts.
Ms. Rumman Chowdhury also highlighted the need for a comprehensive view of AI systems, considering not just the models but also the broader socio-technical context.
“When we think about identifying risks but also thinking through safeguards, don’t just think about the AI models […]. Some of the protections you are making are institutional and regulatory.” (Rumman Chowdhury)
Rumman Chowdhury, CEO of Humane Intelligence and USA Science Envoy for Artificial Intelligence
Acting Chief Researcher Hakim Hacid on human alignment and AI safety
Mr. Hakim Hacid focused on the importance of aligning AI systems with human values. He stressed the need for transparency, control, and verification mechanisms to ensure AI systems are beneficial to humans.
“At the end of the day, if you want to make a system safe, it has to be mapped to some human values, to some expectations. The issue here is that it is difficult to define these human values at the end of the day.” (Hakim Hacid)
He acknowledged the challenges in defining these values and emphasized the importance of continuous control and verification.
Mr. Hakim Hacid also called for patience and collaboration in the pursuit of AI safety.
“We need clearly a lot of work to be done on the safety side, but we need also to be patient and work together to get this safety a little bit more mature.” (Hakim Hacid)
Hakim Hacid: Acting Chief Researcher, Technology Innovation Institute (TII)
Lane Dilg on addressing major safety issues
Ms. Lane Dilg provided specific examples of how OpenAI has addressed major safety issues. OpenAI has been integrating the Coalition for Content Provenance and Authenticity (C2PA) standard to ensure the provenance of digital content.
“That is a standard that we have integrated in our image generation capabilities and that we also have committed to integrating into our video generation capabilities before deployment.” (Lane Dilg)
Ms. Dilg also discussed OpenAI’s response to cyber risks, highlighting the publication of six critical measures for AI security. Additionally, she mentioned the establishment of a Safety and Security Committee within OpenAI to oversee safety measures and ensure accountability.
Lane Dilg, Head of Strategic Partnerships at OpenAI
Rumman Chowdhury on effective regulation
Ms. Rumman Chowdhury addressed the effectiveness of current regulations in mitigating AI risks. She acknowledged the challenges of evaluating AI models, given their probabilistic nature, and called for more robust benchmarks and evaluation methods.
Ms. Rumman Chowdhury highlighted the role of bias bounty programs and red teaming in identifying and mitigating risks, underscoring the importance of independent scrutiny.
“Red teaming is the practice of bringing in external individuals to stress test the negative capabilities of AI models. Again, it’s an inexact science. How many people should be red teaming? How do you know you’re done red teaming? Figuring some of these things out will only happen as we perform more of these tests.” (Rumman Chowdhury)
Rumman Chowdhury: CEO of Humane Intelligence and USA Science Envoy for Artificial Intelligence
Professor Stuart Russell on promising areas of research
Professor Russell emphasized that the private sector needs to ramp up its safety research, since the resources of academia and government are a drop in the bucket by comparison. He stressed the importance of getting right the incentives that we are training AI systems to achieve.
Professor Russell warned that writing down objectives for an AI system completely and correctly is a hopeless task, and that what we are doing with large language models is even worse, because we are simply training them to imitate human beings.
One area of research Professor Russell has been working on is so-called “assistance games,” in which the AI agent is deliberately kept in the dark about the preferences and interests of humans.
“I am cautiously optimistic, but it does feel as if we’re in a race that we shouldn’t have to be in between when we figure out how to control AI systems and when we figure out how to produce AGI” (Stuart Russell)