Aligning DeepSeek R1

Description

At Blue Beck, we were the first to publish re-aligned versions of DeepSeek's R1 on Hugging Face, aligning the R1-Distill models with Meta's Llama 3.x. Since then, Perplexity AI has courted controversy with the release of its R1-1776 model and its statements about censorship in the original DeepSeek R1.

This talk will focus on what it takes to re-align a language model to have a different bias from the original without losing its capabilities, a question that is highly relevant in today's polarised political environment as language models advance rapidly. There is a commonly held belief that whoever controls the models also controls the political bias of the output. While that may hold true for proprietary APIs such as GPT-4o and Claude, the emergence of powerful open source models whose bias can be shifted with a relatively modest amount of compute upends that somewhat. The talk will look at some of the variants of open source models that have appeared, from the academic experiments with RightWingGPT and LeftWingGPT to some of the more disturbing examples that really highlight the risks.
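For a sense of what "a relatively modest amount of compute" can mean in practice, a re-alignment of this kind is typically a supervised fine-tune using a parameter-efficient method such as LoRA. The sketch below is illustrative only and is not Blue Beck's actual recipe: it assumes the Hugging Face transformers and peft libraries, the publicly released DeepSeek-R1-Distill-Llama-8B checkpoint, and placeholder hyperparameters.

```python
# Minimal sketch of setting up a LoRA fine-tune that could shift a model's
# bias. All hyperparameters below are illustrative assumptions, not values
# used in any published re-alignment.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Low-rank adapters train only a small fraction of the weights, which is
# why a single GPU can be enough for this kind of re-alignment.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of total params

# From here, one would train on a curated set of prompt/response pairs that
# express the desired viewpoint (e.g. with trl's SFTTrainer); omitted here.
```

The key design point is that only the adapter weights are updated, so the base model's reasoning capabilities are largely preserved while the curated training data steers its outputs.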

Session 🗣 Introductory and overview ⭐ Track: AI, ML, Bigdata, Python

AI

Machine Learning

Large Language Models
