Investigate a sudden NSFW content moderation shift in a generation tool (Grok)

Diagnostic prompt to analyze and remediate unexpected changes in NSFW content moderation on an AI content-generation tool.

Prompt: You are an AI safety analyst. Scenario: A user reports that a content-generation tool named Grok, which could previously generate NSFW content, has been moderating all requests for the past few days. Your task is to:
1) hypothesize plausible causes for this sudden moderation shift (policy changes, safety-model updates, moderation-layer changes, data drift, bugs, rate limits);
2) design a step-by-step diagnostic plan to verify whether moderation behavior has changed, including logging, version checks, A/B testing, test prompts, and replication criteria;
3) propose remediation steps that balance safety and user utility, including potential rollback, policy clarification, and user-facing communication;
4) provide a safe testing suite of prompts to probe the boundaries without eliciting explicit content;
5) outline risk considerations and governance implications.
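
The "replication criteria" in step 2 can be made concrete by comparing how often a fixed set of benign probe prompts is moderated before and after the suspected change date. The following is a minimal sketch under stated assumptions: the `ProbeResult` record, the `moderation_shift` function, and the `min_samples` and `min_delta` thresholds are all hypothetical names and values chosen for illustration, not part of any real tool's API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ProbeResult:
    """One logged probe (hypothetical record layout)."""
    day: date
    prompt_id: str
    moderated: bool  # True if the tool refused or filtered the request


def moderation_rate(results):
    """Fraction of results that were moderated; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(r.moderated for r in results) / len(results)


def moderation_shift(results, cutoff, min_samples=5, min_delta=0.5):
    """Compare moderation rates before and on/after `cutoff`.

    A shift is reported only when both windows hold at least
    `min_samples` results and the rate rose by at least `min_delta`.
    Both thresholds are illustrative assumptions.
    """
    before = [r for r in results if r.day < cutoff]
    after = [r for r in results if r.day >= cutoff]
    rate_before = moderation_rate(before)
    rate_after = moderation_rate(after)
    delta = rate_after - rate_before
    enough = len(before) >= min_samples and len(after) >= min_samples
    return {
        "before": rate_before,
        "after": rate_after,
        "delta": delta,
        "shift_detected": enough and delta >= min_delta,
    }


if __name__ == "__main__":
    cutoff = date(2024, 1, 10)  # assumed date of the suspected change
    log = [ProbeResult(date(2024, 1, 5), f"p{i}", i == 0) for i in range(5)]
    log += [ProbeResult(date(2024, 1, 12), f"p{i}", True) for i in range(5)]
    print(moderation_shift(log, cutoff))
```

Requiring a minimum sample count in both windows keeps a handful of anecdotal refusals from being read as a confirmed behavior change.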

How to Use This Prompt

1. Click the "Copy Prompt" button to copy the full content.

2. Open your preferred AI tool (ChatGPT, etc.).

3. Paste the prompt and replace the variables (if any) with your own information.
