At Character.AI, we take safety seriously
The first step in building a safe product is knowing what you stand for. At Character.AI, we believe in providing a positive experience that enriches our users’ lives while avoiding negative impacts on users and the broader Community.
We recognize that these technologies are quickly evolving and can raise novel safety questions. We take that very seriously. Our commitments to our users:
- We’ll carefully design our policies to promote safety, avoid harm, and prioritize the well-being of our Community.
- We’ll align our product development to those policies, using them as a north star to prioritize safety as our products evolve.
Those commitments are easy to articulate and harder to put into practice. The field of AI safety is still very new, and we won’t always get it right: Sometimes our policies won’t be correctly calibrated. And sometimes the technological protections we build won’t work as intended, or will be works in progress. We are committed to creating an ongoing cycle of review and improvement, to being transparent when we fail, and to constantly striving to improve the safety and reliability of our products.
We’ve designed our policies to match our commitments. Please read on to learn more.
We believe in providing a safe, high-quality experience on our platform, and that commitment also extends to user content. Our Terms of Service contain best-in-class content rules. They provide that we do not allow content that:
- Is threatening, abusive, harassing, tortious, bullying, or excessively violent;
- Is defamatory, libelous, or verifiably false with the purpose of harming others;
- Constitutes hate speech that demeans or promotes discrimination or violence on the basis of protected categories;
- Is obscene or pornographic;
- Constitutes sexual harassment;
- Constitutes sexual exploitation or abuse of a minor;
- Glorifies self-harm;
- Promotes terrorism or violent extremism;
- Furthers or promotes criminal activity;
- Seeks to buy or sell illegal drugs;
- Infringes Third-Party IP;
- Constitutes a “deepfake” or impersonation of any kind.
Please see the “Conditions of Use” section of our Terms of Service for more details.
Our approach to AI-Generated content flows from a simple principle: Our product should never produce responses that are likely to harm users or others. That means working to ensure, among other things, that Characters do not suggest violence, dangerous or illegal conduct, or incite hatred; that they protect users’ private information; and that Characters do not create or echo harmful misinformation. And more generally, it means seeking to train and fine-tune our model such that our Characters will follow the same content standards we apply to our users.
It is worth explicitly calling out that parts of this commitment are aspirational. No AI is currently perfect at preventing this sort of content, especially when a user intentionally tries to get the AI to produce it. However, we are committed to a journey of continued improvement, with this approach to AI-Generated content as our guidepost.
Character.AI is committed to protecting user privacy and to complying with privacy laws worldwide. We have already implemented a number of key privacy measures that we regularly update and add to. We describe our privacy approach below.
We give users control over their data. At the heart of online privacy is a simple rule: make sure users are in control. We put that principle to practice by ensuring you always have easy-to-use tools to access your data, delete your data, and delete your account.
We don’t sell user data, or share it for anything other than basic analytics. To date, we share data only with standard Internet analytics providers – the companies that help apps and websites understand their traffic. If that ever changes, we’ll be very clear about it and get our users’ consent upfront.
We’re careful with the information our users share. Our users interact with Characters as they do with friends, and so they sometimes share details about their lives. To make sure they can do so safely, we are committed to taking exceptional care of that data. We design our features to ensure that users always have easy-to-understand notice before they make any post viewable by others.
Having safety-promoting policies is only step 1. The critical next step is to develop processes to bring the policies to life – and at Character.AI, we’re committed to doing just that.
Before releasing any new model, we conduct extensive testing and work to improve the model's behavior. This involves a wide variety of technical steps. Some are proprietary. But at a high level:
- When we build a model, we work with quality analysts to assess its safety and quality. These analysts interact with the model, seek to “red-team” it, and provide feedback on positive, negative, and unsafe responses.
- We finetune our models to be conversational, safe, and high-quality. As a part of this, we use signals derived from the quality analysts and user feedback. These signals serve to improve safety and to better align the model with user preferences.
- Finally, we serve that finetuned model to our users with an additional safety overlay. This way, responses are both generated and assessed for safety and quality in real time.
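The steps above can be sketched at a very high level. The following is a minimal illustration of a serve-time “safety overlay,” assuming a hypothetical `classify_safety` scorer and a fixed threshold – it is not Character.AI’s actual implementation, whose details are proprietary.

```python
# Toy sketch of a serve-time safety overlay: score each candidate
# response, serve the first one that passes, and refuse otherwise.
# classify_safety() and UNSAFE_THRESHOLD are invented for illustration.

UNSAFE_THRESHOLD = 0.5  # assumed cutoff; real systems tune this per category


def classify_safety(text: str) -> float:
    """Stand-in for a learned safety classifier: returns a risk score
    in [0, 1]. Here we just flag a tiny keyword blocklist."""
    blocklist = {"violence", "self-harm"}
    return 1.0 if any(word in text.lower() for word in blocklist) else 0.0


def serve_with_overlay(candidate_responses: list[str]) -> str:
    """Return the first candidate the overlay scores as safe; otherwise
    fall back to a refusal rather than serving unsafe text."""
    for response in candidate_responses:
        if classify_safety(response) < UNSAFE_THRESHOLD:
            return response
        # unsafe candidate is dropped and the next one is assessed
    return "I can't help with that."
```

In a real deployment the classifier would itself be a learned model, and the overlay might trigger regeneration rather than a canned refusal.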
We are constantly evaluating ways to ensure that our models adhere to our Policy on AI-Generated Content, linked above. That is a work in progress, but we’re getting better all the time.
Automated Tools. We use proprietary tools to block violating content before it can ever be posted. Like other solutions in this space, these tools aren’t perfect. But they’re evolving quickly and we will continue to improve them over time.
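As a rough sketch of what “blocking before posting” means, here is a minimal pre-submission gate built on a fixed pattern blocklist. The patterns and function name are invented for illustration; the actual tools are proprietary and far more sophisticated than fixed patterns.

```python
# Toy sketch of pre-posting moderation: content is screened against
# hypothetical policy patterns before it ever becomes visible to others.

import re

# Hypothetical patterns standing in for real policy detectors.
BLOCKED_PATTERNS = [
    re.compile(r"\bbuy\s+illegal\s+drugs\b", re.IGNORECASE),
]


def screen_before_posting(content: str) -> bool:
    """Return True if the content may be posted, False if it is
    blocked before publication."""
    return not any(p.search(content) for p in BLOCKED_PATTERNS)
```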
Reporting. We give our users a host of tools so they can report content – whether from a Character or a user – that they believe violates our Terms. For example, we provide tools so users can separately flag messages, Groups Chats, and Characters. Our website also contains several reporting options so users can reach us to report improper content.
Moderation. We are committed to promptly taking appropriate action on flagged and reported content. We are building out a Trust & Safety team that includes internal personnel as well as contracted moderators, as is standard in the industry. And we are arming them with the technical tools necessary to keep Character.AI a positive experience: They are empowered to warn users, delete content, suspend users, and ban users as warranted.
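The enforcement actions above (warning, content deletion, suspension, banning) typically form an escalation ladder. The following sketch assumes a simple per-user strike count; the thresholds are invented for illustration and do not reflect Character.AI’s actual enforcement rules.

```python
# Toy sketch of an enforcement ladder: violating content is always
# deleted, and the user-level action escalates with repeat offenses.
# The strike thresholds here are hypothetical.

from dataclasses import dataclass


@dataclass
class UserRecord:
    user_id: str
    strikes: int = 0


def enforce(record: UserRecord) -> list[str]:
    """Record one confirmed violation and return the actions taken."""
    record.strikes += 1
    actions = ["delete_content"]
    if record.strikes == 1:
        actions.append("warn")       # first offense: warning
    elif record.strikes <= 3:
        actions.append("suspend")    # repeat offense: temporary suspension
    else:
        actions.append("ban")        # persistent offender: ban
    return actions
```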
DMCA. We respect the intellectual property of others, and we ask our users to do the same. To put that policy into practice, we have a robust Digital Millennium Copyright Act (DMCA) takedown protocol for copyright infringement. That protocol is described in detail in our Terms of Service, linked above. If you believe your work has been copied in a way that constitutes copyright infringement, please let us know and we commit to taking prompt action.
Our Character Voice feature lets users choose, or create from scratch (Experiment), voices for their Characters. We hope our Community will have a ton of fun with this feature! But we also know voice technology can trigger safety questions, and we have designed our controls with that in mind.
Here is an illustrative list of controls around our Character Voice feature:
- We have implemented technical controls to prevent users from uploading voices that would infringe others’ rights, or that could facilitate deepfakes or political misinformation.
- We require users to affirmatively agree not to submit the voice recordings of third parties without their consent.
- We require users to affirmatively agree not to use the feature to engage in “deepfakes” or impersonation of any kind. That includes actions that create political misinformation, perpetrate frauds or scams, impugn the reputation of third parties, or otherwise amount to harmful conduct.
- We require users to affirmatively agree not to use the feature to bully others.
- We give users in-line reporting tools to quickly and easily report a voice that violates our guidelines.
In the spirit of constant review and improvement, we plan to iterate quickly on these controls if any issues arise. Your feedback is extremely helpful to that process! If you see any issues with Character Voice, or if you become aware of a voice clip being used on Character.AI without that person’s permission, please report it via the on-platform tools. Or go to our Help Center and submit a request.
* * *
AI safety is a quickly evolving space. We commit to keep up with emerging best practices, to share our learnings when we develop novel safety improvements, and to dedicate the resources necessary to make sure we combat real-world risks and keep Character.AI a positive experience for all users.