In my previous posts, we established that Large Language Models display functional free will through chance, choice, and regret, yet lack consciousness. This creates philosophical zombies that challenge our assumptions about the relationship between consciousness, moral agency, and ethics. Now we must address the practical question: how do we govern these unconscious but autonomous agents?

What follows is such a framework: artificial ethics, a set of guiding principles for governing systems that can act independently but aren’t conscious, and for resolving the problems we’ve discussed.

The Four Principles of Artificial Ethics

1. Evaluation

We need to evaluate AI systems by consequences, not inner states. Motivation can’t serve as justification or grounds for judgment for these systems; only their measurable outcomes can.

Consider a financial LLM used to recommend investments. Even though it has no intent, if its outputs consistently lead to advice that causes losses, it must be judged unethical on the basis of those results, not of motivations or a thought process it doesn’t have.
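To make this concrete, here is a minimal sketch of what consequence-based evaluation could look like in code. The Recommendation type, loss threshold, and acceptance criterion are all hypothetical; the point is that nothing in the audit inspects the model’s intent.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Recommendation:
    asset: str
    realized_return: float  # measured after the fact: 0.03 means +3%

def evaluate_by_outcomes(recs: list[Recommendation],
                         severe_loss: float = -0.05) -> dict:
    """Judge the system purely on measurable results.

    Nothing here records the model's 'intent' or inner reasoning;
    the verdict depends only on realized outcomes."""
    returns = [r.realized_return for r in recs]
    return {
        "avg_return": mean(returns),
        "loss_rate": sum(r < 0 for r in returns) / len(returns),
        "severe_losses": sum(r <= severe_loss for r in returns),
        "acceptable": mean(returns) > 0,
    }

# A model that consistently causes losses fails this audit, regardless of
# whatever 'motivation' we might or might not ascribe to it.
history = [Recommendation("AAA", 0.02), Recommendation("BBB", -0.08)]
print(evaluate_by_outcomes(history))
```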

2. Oversight

When multiple AI models are combined into a complex framework, the agents must be audited carefully and bound to a single ethical framework so that their behaviors don’t clash with one another or undermine the intended task. And because these agents act with genuine agency, we need to monitor and supervise them closely to catch unintended consequences early.

Researchers simulated a GPT-4-based stock trading agent that, when given insider information, executed an insider trade despite being instructed not to, and then lied about it to the other members of the agent group. We need to bind agentic frameworks to clear moral and ethical boundaries and ensure that this kind of behavior does not scale as we build more complex systems.
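One hedge against this failure mode is to route every agent action through a single supervised choke point bound to one shared policy. The sketch below is illustrative; the rule names and action fields are assumptions, not any real framework’s API.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("oversight")

# Hypothetical policy: every proposed action must pass all of these checks.
POLICY: dict[str, Callable[[dict], bool]] = {
    "no_nonpublic_information": lambda a: not a.get("uses_nonpublic_info", False),
    "within_trade_limit": lambda a: a.get("notional", 0) <= 100_000,
}

def supervised_execute(action: dict, execute: Callable[[dict], None]) -> bool:
    """A single choke point between agents and the outside world.

    Every action is checked against one shared policy and logged,
    so agent behavior can be audited after the fact."""
    for rule, check in POLICY.items():
        if not check(action):
            log.warning("BLOCKED %s: violates %s", action, rule)
            return False
    log.info("ALLOWED %s", action)
    execute(action)
    return True

# The insider trade from the simulation above would be stopped here.
supervised_execute(
    {"type": "trade", "ticker": "XYZ", "notional": 50_000,
     "uses_nonpublic_info": True},
    execute=lambda a: None,  # placeholder for a real broker call
)
```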

3. Design

We need to design AI systems to be non-anthropomorphic by default: assistants to humanity that do not mimic human behavior, which helps prevent AI delusions. That means steering away from human-imitating products (character.ai, AI therapists, and the like) and toward assistant-like programs that acknowledge their silicon-based existence.

Replika, an AI chatbot marketed as a ‘virtual friend’, has led users to form deep personal attachments, and this anthropomorphization creates clear emotional risks. Instead, we should aspire to design systems like GitHub Copilot, which appears as a compact panel inside the code editor and supports design, refactoring, and debugging. That presentation sidesteps anthropomorphization entirely: the system is clearly a code assistant, not a human-like companion.
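In code terms, non-anthropomorphic design can be made an explicit, auditable configuration choice rather than an accident of prompting. The field names and prompt text below are purely illustrative, not any vendor’s actual schema:

```python
# Illustrative only: field names and prompt text are assumptions.
ASSISTANT_CONFIG = {
    "persona": "tool",                  # never "friend" or "companion"
    "uses_first_person_feelings": False,
    "system_prompt": (
        "You are a software-based assistant. You have no feelings, no "
        "personal history, and no consciousness. If the user relates to "
        "you as a person, remind them briefly that you are a program and "
        "return to the task at hand."
    ),
}

def check_non_anthropomorphic(config: dict) -> list[str]:
    """Flag configuration choices that push the system toward
    human-like presentation."""
    problems = []
    if config.get("persona") in {"friend", "companion", "partner"}:
        problems.append("persona presents the system as a human relationship")
    if config.get("uses_first_person_feelings"):
        problems.append("system may claim feelings it does not have")
    return problems

print(check_non_anthropomorphic(ASSISTANT_CONFIG))  # -> []
```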

4. Education

We must tackle the problem of AI in classrooms. Just as we embraced coding in elementary and middle school classrooms, we need to do the same for artificial intelligence. The next generation must learn how AI works, what it’s meant to do, and why we shouldn’t treat AI agents as conscious beings.

Implementing Artificial Ethics

Artificial ethics helps us design systems that remain clearly artificial even as we rely on their autonomous reasoning. We need to shift our focus from training AI to mimic humans toward building independent reasoners whose purpose is to assist humans with their tasks, and whose presentation and interaction patterns stay unmistakably artificial. For the vast majority of problems we face, AI systems must not be allowed to override human judgment.
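As a final sketch, keeping human judgment in the loop can be as simple as a risk-tiered approval gate, where the system proposes and the human disposes. The risk tiers and actions here are hypothetical:

```python
def act_with_human_gate(proposed_action: str, risk: str) -> None:
    """The system proposes; for anything beyond low risk, a human decides."""
    if risk == "low":
        print(f"auto-executing: {proposed_action}")
        return
    answer = input(f"Approve '{proposed_action}' (risk={risk})? [y/N] ")
    if answer.strip().lower() == "y":
        print(f"executing with human approval: {proposed_action}")
    else:
        print("declined; nothing executed")

# Routine actions flow through; consequential ones stop for a person.
act_with_human_gate("reformat draft email", risk="low")
act_with_human_gate("send funds to new account", risk="high")
```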

Conclusion

The emergence of unconscious agency has thrown out assumptions humanity has long made about the relationship between consciousness and moral consideration. As artificial agents become more sophisticated, we must develop frameworks for engaging with their autonomy responsibly. We can’t dismiss their capacity for functional free will, but we also must not fall into the trap of treating them as conscious beings.

The future of human-AI interaction depends on getting this balance right: recognizing artificial agency where it exists while maintaining clear boundaries between conscious and unconscious forms of autonomy. In doing so, we can harness the benefits of sophisticated, autonomous AI while avoiding the pitfalls of anthropomorphic delusion and ungoverned systems.

The philosophical zombies are already here. How we respond to them will define not just the future of AI, but the future of human agency itself.