SXSW Retrospective: Exploring Rewards and Risks of Multimodal AI with MoFo and OpenAI
At the 2024 South by Southwest Conference (SXSW) in Austin, Texas, MoFo Partner Justin Haan and OpenAI’s General Counsel Che Chang hosted a fireside chat about legal issues related to the development of multimodal AI. Their discussion covered a range of topics, including OpenAI’s recent announcement of its Sora generative video technology, AI safety challenges and solutions, emerging use cases, the regulatory landscape, and copyright issues.
Sora. Announced by OpenAI in mid-February 2024, Sora is an AI model that can create realistic and imaginative video scenes from text instructions. At the time of the conference, Sora was not accessible to the general public. Che explained that red-teamers were working with Sora to assess critical areas of harm and risk and to stress-test the model. OpenAI also opened Sora to a number of visual artists, designers, and filmmakers, who are providing feedback on how to make the model most helpful for creative professionals.
The approach to Sora is consistent with OpenAI’s release philosophy. Che explained that OpenAI makes an effort to preview new innovations to the public so that people are not surprised by the power of AI technology on the horizon. The goal is to be thoughtful and methodical and to spend significant time testing models. OpenAI then releases on a small scale and makes iterative changes over time.
Safety. After Sora, the conversation turned to the important and challenging topic of safety. Che walked through OpenAI’s approach to balancing user flexibility with moderation. OpenAI spends a lot of time thinking through the balance between allowing users to leverage AI models as they wish while at the same time restricting malicious users from abusing the platform.
Justin asked about specific concerns and what OpenAI is most focused on when it comes to safety. Here, Che highlighted the focus on deepfakes: AI-generated videos or images made to look real. Misinformation is also a top-of-mind concern, particularly in a year in which more than 50 countries around the world are holding elections, making it a high priority for OpenAI.
In February 2024, OpenAI and other companies developing AI technologies agreed on a framework at the Munich Security Conference to help stop deceptive AI content from interfering with global elections. Signatories, including OpenAI, pledged to work collaboratively on tools to detect and address the online distribution of deceptive AI content, drive educational campaigns, provide transparency, and meet other specific commitments. Such coordination among industry participants, as Che explained, can help to stop bad actors. OpenAI works with others in the industry to share information on specific threats, including state actors.
Also in February 2024, OpenAI announced several instances in which it was able to disrupt malicious state actors affiliated with Iran, North Korea, and Russia who were attempting to abuse its systems for phishing and other unlawful acts. Although red-team assessments have shown that tools like GPT-4 offer only limited, incremental capabilities for malicious cybersecurity tasks beyond what is already achievable with publicly available tools, OpenAI maintains a continued focus on stopping such conduct. It does this not only in collaboration with industry groups, but also through engagement with governments, academia, and civil society.
Another safety topic Justin raised was watermarking and incorporating provenance information into AI-generated content. Justin asked Che for his views on the Coalition for Content Provenance and Authenticity (C2PA). Che explained that OpenAI is interested in techniques that can provide provenance without impacting outputs, user experience, or latency. OpenAI released a classifier for detecting AI-written text but found that current approaches can produce false positives. There is also a philosophical issue that Che flagged: many people use ChatGPT to write better and to improve their grammar or sentence structure. He recalled the story of a user who was not a native English speaker but was able to use ChatGPT to write messages to customers for his landscaping business. This is an example of a real-world benefit that ChatGPT can provide, one that watermarking and provenance requirements might undermine.
Justin echoed the sentiment: AI use cases are often iterative and do not fall cleanly on a spectrum between content generated entirely by AI and writing that AI merely helped revise or polish. These are nuanced issues, both Che and Justin agreed, that will need further consideration as AI technology develops.
Use cases. Che and Justin spent some time workshopping and reviewing potential use cases for OpenAI’s tools. Apart from the landscaping company that Che spoke about, there are many ways people are using OpenAI’s tools to help their businesses, art, and personal pursuits.
The presentation at SXSW, for example, was reviewed by ChatGPT, which suggested that Che and Justin incorporate personal anecdotes to make it more relatable. Some people use ChatGPT for help with wedding speeches. As another example, it can give parents tips on how to encourage their kids to learn new languages.
At OpenAI, the company uses ChatGPT internally; Che explained that the tool has helped the legal department write with less legalese. Che referenced another use case involving Rhode Island’s largest health care system, which ran a surgical consent form through ChatGPT and asked it to make the document understandable to the average person. ChatGPT enabled the system to edit the template, and the form is now far more usable and understandable. Che also spoke about a use case in Kenya, where farmers are leveraging ChatGPT to efficiently receive information that aids their farming, such as how to think about crop yields. Another use case Che discussed relates to Be My Eyes, a service that assists blind and low-vision people: a person can take a picture, send it to a volunteer, and receive a description in return. With ChatGPT’s vision capabilities, the model itself can see and describe what it is looking at.
Che and Justin also spoke about the API-based services that OpenAI offers. Justin recalled that these predated ChatGPT, the consumer-facing chatbot. Che explained that, initially, OpenAI was focused on doing research and built its API to tap into the research models it was creating. Early on, some large companies saw the usefulness of the API and began using it. When OpenAI launched ChatGPT, the thinking was that it would be just another consumer of the API, a conversational version of OpenAI’s existing models. Of course, the launch resonated, usage exploded, and the rest is history.
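As a rough illustration of what it means to be “a consumer of the API”: the plain-language rewriting Che described in the health care example could, in principle, be requested programmatically. The sketch below assumes OpenAI’s published Python SDK; the model name, prompt, and consent-form text are illustrative placeholders, not details from the talk.

```python
# Hypothetical sketch: asking an OpenAI model to rewrite legal text in
# plain language via the API. Assumes the official Python SDK
# (pip install openai) and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder text standing in for a real consent form.
consent_form = "The undersigned hereby consents to the performance of..."

response = client.chat.completions.create(
    model="gpt-4",  # illustrative; pick whichever model fits the use case
    messages=[
        {
            "role": "system",
            "content": "Rewrite legal documents so an average reader can understand them.",
        },
        {"role": "user", "content": consent_form},
    ],
)

print(response.choices[0].message.content)
```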
Regulatory issues. After the discussion of use cases, Che and Justin turned to the regulatory landscape for AI. Che observed that governments (a) want to use AI to benefit their countries and (b) are worried about risks, misinformation, and bias. In terms of benefits, Che has seen many in the context of health care and education systems. The EU, he noted, is looking at AI regulation comprehensively, across the entire field, while other countries sometimes approach regulation from more of a sector or industry perspective.
Che noted that OpenAI’s CEO, Sam Altman, has said that in an ideal world the pursuit of artificial general intelligence (AGI) would be a government project. OpenAI sought government funding early on but was unable to secure it. Regardless, Che noted that OpenAI’s position is that government should be involved in the development of AI; industry should not be the only place where decisions of such magnitude are made. Technology companies do not usually ask for regulation, Che pointed out, yet OpenAI has called for it. Now we are seeing the results of those efforts, including the AI Executive Order in the United States and the AI Act in the EU.
Che emphasized that OpenAI’s goal is not to sell the most pictures or text, but to build AI. This requires the involvement of many groups.
Justin asked how OpenAI approaches red-teaming in light of government concerns. Che explained that OpenAI hires people to try to break its models and works to train the models to reject bad requests and accept good ones. This is a difficult line to draw and requires grappling with hard distinctions. Che explained that red-teaming is a shared task involving other companies as well. The goal is a model flexible enough to respond to user requests while supporting regulatory goals; it will be a constant balance.
Copyright. Justin and Che reviewed copyright issues as well. Che noted that this can be a sensitive topic for creators. OpenAI’s goal is to address concerns from artists and publishers on this topic while finding mutually beneficial ways to work together. Che explained a few of OpenAI’s efforts in this context.
First, OpenAI has developed an opt-out, similar to robots.txt for search engines, that allows people to tag images so that they won’t be used to train models. Another mechanism ensures that image generators don’t create images in the style of living artists. OpenAI works with many artists and has programs to receive feedback from them: artists see previews of new tools and provide advice. Sora has been a good vehicle for advancing this collaborative process and learning.
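For the web-crawling side of this kind of opt-out, OpenAI separately documents a crawler, GPTBot, that honors robots.txt directives. A site that does not want its pages used for training can publish something like the following (a minimal sketch; the blanket Disallow rule is just one option):

```
# robots.txt: tells OpenAI's documented GPTBot crawler not to crawl this site
User-agent: GPTBot
Disallow: /
```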
Che emphasized that, with respect to copyright issues, OpenAI’s position is that it is in the right from a legal perspective. That said, OpenAI spends a lot of time talking with people and organizations about what more it can and should do to help creators and artists.
What’s on the horizon. Justin and Che closed with a discussion of how AI can help people. AI can already assist with tasks like flight reservations, and AI models can connect with document repositories. Che noted that every day he sees how AI can impact people’s lives in a positive way. People with aphantasia, a condition that prevents them from forming mental images, can type out their thoughts and see a picture created by DALL·E. Che shared the story of a 100-year-old man who chats with ChatGPT as a source of entertainment (calling it Chatty). Video, done right, can have a world-changing impact as well. AI can give ordinary people capabilities they might not otherwise have, and Che expects such use cases to continue to evolve.
Justin asked how far out AGI is; Che explained that more fundamental scientific breakthroughs and technological advancements would be required. Justin noted that, in the meantime, flexibility, adaptability, and iteration are hallmarks of recent product introductions. People can ask an AI model to be positive or request that it adopt a more critical perspective. Che noted other customizations that people request, such as having the model remember past chats. The key task is to figure out what people want and try to build it, while balancing safety.
The talk ended with a reflection on the wild year that OpenAI has had and which Hollywood actor might play Che should there be an OpenAI movie. Maybe we will see the answer at next year’s SXSW.