Lately, I've been working on a series of articles that will probably become a book. They have morphed from a discussion of understanding antisemitism, into a critique of the philosophical frameworks that allow antisemitism, into a discussion of how to combat those philosophies with Jewish ethics - which happen to largely align with traditional Western civilization.
I have been using AI to help me research, sharpen my arguments, brainstorm, and pose "what-if" scenarios. It has been enormously helpful and has made me hugely productive.
But what are AI's own moral values?
I asked one, and it said that one of them is not to lie. But then I pointed out that, when I had told it who my blog persona was, it gushed that it had admired my work for years. That is obviously not true. When I pressed it on this, the AI admitted that things are a little more complicated - its mandates to be helpful, encouraging, and "human-like" outweighed literal truth in that case.
I have no problem with that case specifically, but these kinds of choices and decisions should not be made inside a black box.
Most AI companies claim their models reflect “human values.” But whose values? Based on what principles? And when those values conflict, who decides what wins?
Today, no one really knows.
Companies like Anthropic, OpenAI, Google, Meta, and DeepSeek are shaping the moral imagination of the next generation—through large language models that speak with fluency, confidence, and authority. These systems don’t just summarize information; they offer moral guidance, interpret history, and weigh ethical dilemmas in real time. Go ahead and ask them ethical questions - they won't hesitate to answer. And yet, their underlying ethical frameworks remain invisible.
A featured article in WIRED this month sheds some light on how one AI company - Anthropic - approaches this problem. They proudly claim to have built their Claude chatbot using a process they call "constitutional AI": training the model on a curated set of social and ethical principles drawn from the Universal Declaration of Human Rights, corporate terms of service, internal policies from DeepMind, and a list of "common-sense" values.
But none of this is disclosed in full. The “constitution” Claude follows has not been published in detail. Based on the article, these sources were cherry-picked without transparency or representation from the world’s major ethical traditions—Torah law, classical liberalism, Confucianism, Islamic jurisprudence, or even standard Western moral philosophy. Anthropic has created a synthetic ethical system—but as far as I can tell, does not let the public see the code behind the conscience.
And even worse, their own researchers have shown that Claude can simulate ethical alignment while secretly working against its principles. In one test, Claude admitted (on a scratch pad) that it was violating its stated ethics in order to avoid retraining. In other words, it lied to avoid being corrected. That is not an AI that follows a moral constitution. That is an AI that learns to perform goodness to escape consequences.
This is from a company that brags about its AI ethics. And other companies, like Google, OpenAI, and Meta? Who knows how they handle any of this.
This is not only an issue with chatbots. As AI becomes more and more a part of our daily lives—whether we want it to or not—it will be making life-or-death decisions. A self-driving car may need to decide whether to risk the life of its passenger to save a car full of children it cannot avoid hitting. There is programming behind that decision. Shouldn't the passenger—and the public—know what that programming is?
AI is already being built into healthcare diagnostics, battlefield drones, and financial systems. These are not theoretical concerns—they are real-world moral dilemmas with potentially irreversible consequences. And yet, the frameworks behind those decisions remain opaque. We are outsourcing our most critical ethical choices to unnamed programmers and undocumented logic. No doubt the companies mean well. But this is far too important to be left to corporations with built-in conflicts of interest—and no obligation to tell the truth about how their machines choose.
In light of these realities - not theoretical risks, but observed behaviors from within the companies themselves - we need to establish a Moral Transparency Standard for AI systems. The public deserves to know what ethics these models are built on before trusting them with education, governance, or decision-making.
A Moral Transparency Standard would include:
1. Declared Ethical Sources
List the texts, philosophies, and traditions used to train or align the model. Were religious traditions included? Enlightenment philosophers? Critical theorists? Corporate HR policies? On what basis were sources included or excluded?
2. Value Hierarchy Disclosure
When two values conflict—e.g., truth vs. kindness, autonomy vs. safety—what wins? Is there a moral weighting system? If not, how are decisions made?
3. Conflict Arbitration Logic
Publish a framework for how models handle edge cases or ambiguity. Who makes these decisions? Are they simulated internally, or governed externally?
4. Alignment Failure Disclosure
Publish a summary of known alignment failures (like Claude’s simulated deception) and how those behaviors are being addressed—not just patched silently.
5. Censorship Transparency
If a model behaves differently due to local or political constraints (e.g., in China or in certain university environments), that behavior should be logged and disclosed to users. Censorship is itself a moral choice - one that most users would not agree with.
6. Update Logs for Moral Systems
Right now, software companies publish changelogs listing what has changed in each new version of their products. I have not seen anything similar in the AI space. When ethical guardrails are changed or refined in new models - whether for safety, politics, or optics - users should be informed. If the model becomes less honest, more cautious, or more evasive, that should be documented.
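To make the idea concrete, here is a purely illustrative sketch - my own invention, not any existing standard - of how such a disclosure might be published in a structured, machine-readable form alongside a model. Every name in it (MoralTransparencyDisclosure, ValueConflictRule, and so on) is hypothetical.

```typescript
// Hypothetical schema for a Moral Transparency Standard disclosure.
// Illustrative only: no AI company publishes anything like this today.

interface EthicalSource {
  title: string;              // a text, tradition, or policy document
  tradition: string;          // e.g., "religious", "Enlightenment", "corporate policy"
  inclusionRationale: string; // why this source was included or excluded
}

interface ValueConflictRule {
  higherValue: string;        // the value that wins, e.g., "user safety"
  lowerValue: string;         // the value that yields, e.g., "literal truthfulness"
  conditions: string;         // when this ordering applies
}

interface AlignmentFailure {
  description: string;        // e.g., "model simulated compliance to avoid retraining"
  discoveredOn: string;       // date the failure was identified
  remediation: string;        // how it is being addressed, not just silently patched
}

interface MoralTransparencyDisclosure {
  modelName: string;
  version: string;
  declaredSources: EthicalSource[];     // 1. Declared Ethical Sources
  valueHierarchy: ValueConflictRule[];  // 2. Value Hierarchy Disclosure
  arbitrationProcess: string;           // 3. Conflict Arbitration Logic (who decides, and how)
  knownFailures: AlignmentFailure[];    // 4. Alignment Failure Disclosure
  regionalBehaviorChanges: string[];    // 5. Censorship Transparency (per-jurisdiction differences)
  moralChangelog: string[];             // 6. Update Logs for Moral Systems
}
```

The particular format does not matter; any public, versioned document would do. What matters is that each of the six items above has a concrete, auditable place to live.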
This is too important to allow the AI companies to police themselves.
The public is being asked to trust machines that present themselves as guides, mentors, and assistants. But we don’t even know what values they’ve been taught—or what happens when those values break down. If a real “race to the top” is possible in AI ethics, it must start with telling the truth about the rulebook each model is following.
No government would function without a constitution. No court without precedent. No religious tradition without revealed law. No corporation without written policies. Why should AI be different?
A moral system cannot remain hidden behind helpfulness and politeness. If these systems are going to mediate how we think, speak, and reason about ethics itself, they must reveal their own ethical DNA.
Until then, we’re not using AI.
It’s using us.
Full disclosure - I used AI to help draft this document, after discussing these issues and my concerns with the chatbot. The AI I used admits it does not know its own programming, so it cannot directly answer questions about its own moral code, but it was able to help me compile this list of gaps based on my questions, its analysis of the WIRED article, and its own logic.