Unpacking DeepSeek—distillation, ethics and national security


Since the Chinese AI startup DeepSeek released its powerful large language model R1, the model has sent ripples through Silicon Valley and the U.S. stock market, sparking widespread discussion and debate.

Ambuj Tewari, professor of statistics at the University of Michigan and a leading expert in artificial intelligence and machine learning, shares his insights on the technical, ethical and market-related aspects of DeepSeek’s breakthrough.

OpenAI has accused DeepSeek of using model distillation to train its own models based on OpenAI’s technology. Can you explain how model distillation typically works, and under what circumstances it might be considered ethical or compliant with AI development best practices?

Model or knowledge distillation typically involves generating responses from a stronger model to train a weaker model so that the weaker model improves. It is a totally normal practice if the stronger model was released under a license that permits such use. But OpenAI’s terms of use for ChatGPT explicitly forbid using their model for purposes such as model distillation.
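To make the mechanics concrete, here is a minimal, hypothetical sketch of this kind of response-based distillation using the Hugging Face transformers library. The model names, prompts and hyperparameters are placeholders, not a description of what DeepSeek or any other lab actually did; the key idea is simply that the stronger model’s generated responses become the weaker model’s training data.

```python
# Illustrative sketch only: sequence-level knowledge distillation, where a
# stronger "teacher" model's generated text becomes training data for a
# weaker "student" model. Model names and prompts are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER_ID = "org/strong-teacher-model"   # hypothetical stronger model
STUDENT_ID = "org/small-student-model"    # hypothetical weaker model

teacher_tok = AutoTokenizer.from_pretrained(TEACHER_ID)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER_ID).eval()

# Step 1: generate responses from the stronger (teacher) model.
prompts = ["Explain why the sky is blue.", "Summarize how photosynthesis works."]
distill_texts = []
with torch.no_grad():
    for prompt in prompts:
        inputs = teacher_tok(prompt, return_tensors="pt")
        output_ids = teacher.generate(**inputs, max_new_tokens=128)
        distill_texts.append(
            teacher_tok.decode(output_ids[0], skip_special_tokens=True)
        )

# Step 2: fine-tune the weaker (student) model on the teacher's outputs
# with an ordinary next-token prediction (cross-entropy) loss.
student_tok = AutoTokenizer.from_pretrained(STUDENT_ID)
student = AutoModelForCausalLM.from_pretrained(STUDENT_ID).train()
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

for text in distill_texts:
    batch = student_tok(text, return_tensors="pt", truncation=True, max_length=512)
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Other variants distill at the level of the teacher’s token probabilities (soft labels) rather than its generated text, but the response-based form sketched above is the one most often discussed for large language models, since it only requires access to the stronger model’s outputs.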

Is it possible that DeepSeek utilized other open-source models, such as Meta Platforms’ LLaMA or Alibaba’s Qwen, for knowledge distillation, rather than relying on OpenAI’s proprietary models?

It is hard to say. Even in the same family of models, say Llama or Qwen, not all models are released with the same license. If the license of a model permits model distillation, then there is nothing illegal or unethical in doing that. In the R1 paper, it is mentioned that the process actually worked in the opposite direction: knowledge was distilled from R1 to LLaMA and Qwen to enhance the reasoning capabilities of the latter models.


What evidence could an AI company provide to demonstrate that its models were developed independently, without relying on proprietary technology from another organization?

Since there is a presumption of innocence in legal matters, the burden of proof will be on OpenAI to show that DeepSeek did in fact violate its terms of service. Because only DeepSeek’s final model is public, and not its training data, the accusation may be hard to prove. And because OpenAI has not yet made its evidence public, it is hard to say how strong a case it has.

Are there industry standards or transparency measures that AI companies could adopt to build trust and demonstrate compliance with ethical AI development?

There are currently few universally accepted standards for how companies should develop AI models. Proponents of open models argue that openness leads to more transparency, but making the model weights open is not the same as making the entire process, from data collection to training, open. There are also concerns about whether the use of copyrighted materials such as books to train AI models constitutes fair use. A prominent example is the lawsuit filed by The New York Times against OpenAI, which highlights the legal and ethical debates surrounding this issue.


There are questions about social biases in the training data affecting a model’s output, as well as concerns about growing energy requirements and their implications for climate change. Most of these issues are being actively debated with little consensus.

Some U.S. officials have expressed concerns that DeepSeek could pose national security risks. What’s your take on this?

It would be deeply concerning if U.S. citizens’ data were stored on DeepSeek’s servers and the Chinese government gained access to it. However, the model weights are open, so the model can be run on servers owned by U.S. companies. In fact, Microsoft has already started hosting DeepSeek’s models.
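Because the weights are openly downloadable, self-hosting the model on infrastructure one controls is straightforward in principle. The sketch below illustrates the idea with the transformers library; the repository name is given only as an example of an openly released DeepSeek checkpoint (a smaller distilled variant, chosen for illustration), and no queries ever leave the local machine.

```python
# Illustrative: loading openly released weights onto hardware you control,
# so prompts and responses never leave your own servers.
# The repo name is an example of an open DeepSeek checkpoint, used for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tok("What is 17 * 24?", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(output_ids[0], skip_special_tokens=True))
```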

Provided by University of Michigan

