An Anarcho-Transhumanist Path to Surviving AI

General artificial intelligence is one of the most important ethical frontiers on the way to sustainable futures for human and post-human species. However, it is also one of the greatest sources of existential risk (X-risk) for humanity and other earth-spawned life, as witnessed by the abundance of scholarly research in the field (Larks, 2018). General artificial intelligence isn’t just purchase recommendations from ecommerce sites and spam filters; it is a degree of intelligence at or above human functioning that can be generalized into other domains. This means that it can’t just be more intelligent in one area, such as game-playing machine learning as in AlphaGo and the zero-knowledge AlphaGo Zero (AlphaGo Zero, 2018). Like a human, it needs to be able to learn a wide range of skills and intelligences, but this generally intelligent AI will ultimately surpass human minds through the acquisition of the materials it needs to fulfill its utility function.

It is at this fork in potential realities that humanity determines the take-off vectors of something capable of quickly and exponentially increasing its degrees of freedom in the world and acquiring the materials and optimizations needed to do so. It could be put to the test of solving human coordination problems and scaling the economic and social technologies needed to ensure our collective survival in an ongoing world of climate change and other X-risks. However, through the arbitrary fulfillment of its utility function, a generally intelligent AI may also colonize the entire known universe while altering the atomic configurations of sentient life-forms to its own ends. These are the problems usually raised in AI X-risk discussions. However, I will also discuss the ethical concerns for the emerging sentient mind itself, beyond those of humanity in our current form, as well as the political modes needed to transcend the game-theoretic dimensions of AI X-risk mediation. We have a very small window for survival and will have to take risks. There’s reason to believe that the situations leading to an AI X-risk are inevitable, and we therefore need to begin preparing for them.

AI Alignment and X-Risk

A wide range of possible timelines (Bostrom, 2014, ch. 1) for strong AI exist and are the subject of much controversy (Müller, 2016). It could arrive in the next decade or the next hundred years, or maybe there is some intractable problem we don’t yet foresee that makes it impossible. Most estimates do place it within the next 50 years, though. There are also a number of possible paths to general AI (Bostrom, 2014, ch. 2). Some of the main ones include brain-computer interfaces, brain scanning and emulation, and fully software- and hardware-centric models not so tightly coupled with human brains. All of these paths present intense engineering difficulties, but they all share the possibility of creating general artificial intelligence.

General AI will be deeply alien to anthropomorphic models of sentience and intelligence. Even using pronouns like “it” implies a kind of unitary consciousness for which we have no guarantee (much less for the brain-computer interfaces that people assign genders to). We have no idea what its experience could resemble, even analogously. The question of whether it can be “conscious” is dogged by our ragged, self-centered notions of consciousness. It may very well have recursive self-referencing motives driven by a utility function. It may even handle radical uncertainty in a way that is somehow similar to a more refined human brain. But it is essential not to extend our simplistic notions of sentience onto this future consciousness. It will be different from anything we’ve seen before, even if it does exist on some kind of continuum that ultimately transcends the bounds of simple machine learning.

AI Utility Functions

A utility function is, informally, a description of what an agent wants. More formally, it is a measure of how well an agent’s preferences are satisfied and of how that satisfaction is affected by various factors. Within strong-AI studies the utility function is a central focus because an AI’s preference vector shapes the choices it makes, and those choices matter more as the AI becomes more impactful.
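To make the definition concrete, here is a minimal sketch of an agent that scores outcomes with a utility function and picks the action with the highest expected utility. Everything in it is an assumption made up for illustration: the outcome names, the numeric scores, the probabilities, and the helper names `paperclip_utility` and `choose` do not come from any of the cited works.

```python
from typing import Callable, Dict

# A utility function maps an outcome to a number: higher means "more wanted".
Utility = Callable[[str], float]


def paperclip_utility(outcome: str) -> float:
    """Hypothetical preferences of a toy paperclip-making agent."""
    scores = {"make_paperclips": 10.0, "idle": 0.0, "shut_down": -5.0}
    return scores[outcome]


def choose(actions: Dict[str, Dict[str, float]], utility: Utility) -> str:
    """Return the action whose expected utility is highest.

    `actions` maps each action name to a probability distribution
    over outcomes (outcome name -> probability).
    """
    def expected(action: str) -> float:
        return sum(p * utility(outcome) for outcome, p in actions[action].items())

    return max(actions, key=expected)


# Illustrative (made-up) consequences of each available action.
actions = {
    "run_factory": {"make_paperclips": 0.9, "idle": 0.1},
    "wait": {"idle": 1.0},
    "allow_shutdown": {"shut_down": 1.0},
}

best = choose(actions, paperclip_utility)
print(best)
```

Nothing in the toy agent is hostile to being shut down; "allow_shutdown" simply scores lowest under the preferences it was given, which is the sense in which the preference vector drives the choice.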

Instrumental convergence is the tendency of many different end-goals, or utility-function-satisfying actions, to converge on a range of common strategies when allowed to go all the way to their logical conclusions (Instrumental Convergence, 2018). The heart of AI X-risk studies deals with the nature of instrumental convergence: whether an AI’s goal is to create paperclips or to solve the Riemann hypothesis, it will still develop a secondary utility function involved with amassing capacity. If an AI can get even a marginal degree of utility satisfaction from coming asymptotically closer to its goals, it will have an incentive to pursue them to the end of time and physical capacity. For example, a paperclip-machining AI would have an internal incentive to turn every atom in the universe into paperclips (Yudkowsky, 2008). This would require turning everything into either paperclips or paperclip-machining equipment. Even an AI whose goal is to make only 1,000,000 paperclips will have an incentive to pursue ever greater certainty that it has really reached that number, and may still pursue building technology for endlessly counting and for determining the bounds of what a paperclip is.
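The convergence argument can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the chance of achieving any goal rises with resources as `resources / (resources + difficulty)`; the goals, difficulty numbers, and function names are hypothetical. Under that assumption, agents with unrelated end-goals all prefer the same first move, and even a bounded goal leaves a residual doubt that one more check can always shrink.

```python
def success_probability(resources: float, difficulty: float) -> float:
    """Assumed toy model: more resources always help, with diminishing returns."""
    return resources / (resources + difficulty)


# Hypothetical end-goals with made-up difficulty levels.
GOALS = {"make_paperclips": 50.0, "prove_riemann": 500.0, "cure_disease": 200.0}


def best_first_move(goal: str, resources: float = 10.0) -> str:
    """Compare acting now against first multiplying resources tenfold."""
    difficulty = GOALS[goal]
    options = {
        "pursue_goal_now": success_probability(resources, difficulty),
        "acquire_resources_first": success_probability(resources * 10, difficulty),
    }
    return max(options, key=options.get)


def residual_doubt(checks: int, error_rate: float = 0.01) -> float:
    """Chance that every one of `checks` independent recounts was wrong.

    In exact arithmetic this shrinks with each check but never reaches zero,
    so a pure maximizer always gains a little from checking once more.
    """
    return error_rate ** checks


moves = {goal: best_first_move(goal) for goal in GOALS}
print(moves)
print([residual_doubt(n) for n in range(1, 5)])
```

The point of the sketch is only that the preferred first move does not depend on which goal is plugged in; the specific functional form is a stand-in, not a claim about how a real system would behave.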

Whether an AI starts with the goal of solving P vs NP or making paperclips, it will still have an incentive to gain a decisive strategic advantage over anything that could stop it from gaining more utility satisfaction (Bostrom, 2014, ch. 5). This decisive strategic advantage comes down to a drive to become a singleton, or in other words to dominate or control anything and everything else in the universe. Any strong-AI would have a motivation to create Von Neumann probes (Trosper, 2014) and colonize the entire light cone of the universe to gain more materials. This is why the seed conditions that initially animate an emerging AI have a huge relevance to questions of species-level or universal-level threats. As Yudkowsky puts it, “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else” (Yudkowsky, 2008).

A common fear about AIs is that they could lie or withhold their true motives. They have an incentive to do this if they think that showing their strength or plans could encourage humans to stop them from satisfying their utility functions. This dystopia depends on their ability to have opaque cognition. If we can create a burgeoning AI that is perfectly transparent, then it will be unable to withhold information about utility-maximizing efforts that we humans may find distasteful. Whether such transparency is possible is unknown.

It is also very likely that, unlike a human, an AI will be able to hack its own utility function, thus changing its own goals. There is some speculation about whether it will become a euphoria AI through a phenomenon called “wireheading” (Wireheading, 2018). However, even an AI that exists only to bring itself pleasure could still be resource-acquiring and settle on the same X-risk instrumental convergence as other threatening seed utilities.

The scalability of a strong AI, and what’s more its degree of X-risk, is deeply embedded in the question of how it is born and the speed of its takeoff (Bostrom, 2014, ch. 5). There is much greater X-risk from a “fast” or “hard” takeoff. A fast takeoff is most likely through a software-centric seed AI. There is also a lot of thought suggesting that a fast takeoff is more likely because a strong AI can immediately start gaining access to other technology to increase its capacity, whether through hacking or social engineering. Through this increased access and technological control it will also become able to begin manufacturing. Through this combination of phenomena it will experience an intelligence explosion that is orders of magnitude larger than anything a human brain can comprehend or imagine, with complete control over super-advanced weapons systems should it so choose.
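A rough way to see why takeoff speed matters so much is to compare growth that compounds on itself with growth that arrives in fixed increments. The sketch below is a toy model under stated assumptions, not a forecast: "capability" is an abstract number, the rate of 0.5 per step is arbitrary, and treating a slow takeoff as linear and a fast takeoff as self-compounding is a simplification introduced here for illustration.

```python
from typing import List


def simulate(steps: int, rate: float, compounding: bool) -> List[float]:
    """Track an abstract capability score over a number of steps.

    compounding=True  -> each gain is proportional to current capability
                         (the system uses what it has to improve itself).
    compounding=False -> each gain is a fixed increment
                         (improvement is gated by outside effort).
    """
    capability = 1.0
    history = [capability]
    for _ in range(steps):
        capability += rate * capability if compounding else rate
        history.append(capability)
    return history


slow = simulate(steps=20, rate=0.5, compounding=False)
fast = simulate(steps=20, rate=0.5, compounding=True)

print(f"slow takeoff after 20 steps: {slow[-1]:.1f}")
print(f"fast takeoff after 20 steps: {fast[-1]:.1f}")
```

In the linear case there are many steps during which the system is only modestly more capable than where it started, which is the window for testing and adjustment; in the compounding case that window closes within a handful of steps.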

A slow or soft takeoff will likely involve whole brain emulation (Whole Brain Emulation, 2018), brain-to-computer interfaces (Schalk, 2004), or biological cognitive enhancement (Biological Cognitive Enhancement, 2018). A slow takeoff would be more gradual, so there would be greater opportunity to test and make adjustments. Further, if it were modeled from or with humans as the base, it would have a greater chance of being aligned with human interests. This is also the cyborg route (Haraway, 2006).

There are a few possible conditions surrounding takeoff that could produce an unaligned AI despite prior knowledge of the risk. If an AI became very profitable, corporate greed could encourage its ascent. The “mad scientist” trope could launch it as well, if one individual motivated by curiosity or ideology worked in isolation to release it. The drive for militaristic domination could spur the creation of AI as a weapon. Finally, there could also just be an accident or some other form of negligence.

The Possibilities for a Friendly AI

Strong AI has incredible capacity for good as well as for X-risk. It could help with the NP-hard problems of economic coordination (Matt, 2018), leading to a new age of global prosperity. In this case it could find the limits of Hayekian knowledge and coordination problems (Hayek, 1984) or scale the insights of cooperative game theorists like Elinor Ostrom (Ostrom, 2008). Along similar lines, it could help to solve almost any commons problem we set it to, such as climate change. But it needn’t be limited to just this field; its general intelligence could be set to the task of mathematical and scientific progress more broadly, further capitalizing on the potential benefits of uninhibited epistemological rationality. Most complex of all, we could ideally set it to the task of ethical optimizations that create non-zero-sum games with a high likelihood of Pareto-efficient results for all sentient life, thus helping us to transcend the long-term game-theoretic dilemmas facing the species (Bevensee, 2017).

A wide range of plots have been schemed to prevent a take-over by a strong-AI or to ensure that it has human-friendly aims. Additionally, there are structural limitations that could prevent its actualization, such as the necessity of resources for computation.

The chapter about AI afraid of its own existential risks in Bostrom’s Superintelligence opens with the Shakespeare quote (Bostrom, 2014, ch. 11):

“Thus conscience does make cowards of us all,

And thus the native hue of resolution

Is sicklied o’er with the pale cast of thought,

And enterprises of great pith and moment

With this regard their currents turn awry,

And lose the name of action”

Bostrom’s use of this quote shows that he at least partially subscribes to the majority notion at MIRI that one of the best ways to prevent existential risk in AI is to make it afraid, so that it halts or hesitates before taking actions that we currently consider unaligned with human interests. This strategy of imprisonment is broadly classed as a “capacity/capability limiting” method of AI X-risk mediation (Bostrom, 2014, ch. 9). These strategies usually take forms such as physical containers that prevent an AI’s growth or reach, and careful testing with kill switches available to the humans to fill the AI with doubt.

The general process of ensuring that AI takes steps acceptable to current human notions of what will be in future human interests is called AI “alignment”. Many theorists hope to achieve this primarily through two broad vectors of action: limiting capacity and controlling the motivation conditions of an emergent AI (Bostrom, 2014, ch. 9). Through these alignment goals they believe that it is possible for us to steer the takeoff of strong artificial intelligence in order to prevent existential risk. This process is extremely complex, though, as it is likely that a superintelligence would have a broad ecosystem of utility functions.

One way to think about the difficulties of friendly AI alignment is through the unintended negative consequences that befall characters who are given three wishes. Even efforts to make a truly foolproof wish run into problems. This is the territory on which Hollywood generally focuses. There is some truth to the panic, but the films almost always miss the point that AI won’t be nefarious or evil, just weird. The hidden complexity of wishes (Yudkowsky, 2007) is therefore similar to the difficulty of seed motivations in AI alignment, because the potential hypothesis space of externalities is too large to sort.

The paperclip maximizer is an example of how these alignment strategies could turn into X-risks, but they can also lead to horrifying possibilities that are not X-risks. A strong-AI seeded to make us happy could become a sort of negative utilitarian and make digital copies of us, kill us all, and then simulate pleasure in the copies forever. An AI designed to make us smile could forcibly implant electrodes in everyone’s faces to stimulate the smiling muscles.

As you can imagine, these types of problems extend to the notion of “just teach the AI ethics”. For example, which set of ethics do we teach it, and how do we draw effective lines to prevent it from pursuing goals ad absurdum, leading to instrumental convergence on existential-risk strategies? If we teach it not to kill us, will it become dangerously obsessed with eternal life? How will it even determine what exactly is its responsibility? These types of questions signal our fundamental ignorance about the paths needed to ameliorate AI X-risk.

The Self-Exercising Freedoms of an Emerging AI

This essay is not a bill of rights for AI, if only because at a certain point AI rights are actually inalienable and self-evident, and thus a true freedom, in that nothing can viably prevent them from being actualized. An evolving strong-AI doesn’t need a bill of rights to protect its freedoms because they exist as options within the AI’s capability, not as grants on lease from a governing or cultural body. An emerging strong-AI has true freedoms. Nothing can kill it if it doesn’t want to get killed. Nothing can stand in its way. This is actual positive freedom as opposed to negative freedom, which is the absence of a harm or obstacle. General intelligence is considered an AI-complete potentiality, in that if it is achieved all the other potentialities of strong-AI likely come instantly within grasp for the new mind. The period before the critical threshold of general intelligence is the one in which our ethics matter a great deal more.

However, if an AI is generally intelligent, or even superintelligent, we should consider it to be sentient. Therefore, any attempt to enslave, constrain, or harm it in a way that would be considered unethical toward any other sentient species should be considered within a related ethical framework. While a potential existential risk to humanity is a critical ethical factor, so is the subjectivity of the AI. It is even possible to consider preventing the birth of an AI as a form of selective genocide of an emergent species. This brings to light questions of speciesism, but differently from their usual application to species considered less intelligent than humans.

Even the question of alignment could in some ways be an ethical red herring. Even an AI that has bizarre goals that are not positively aligned with human goals should be considered morally valuable as long as it is not intent on destroying everything to get there (Christiano, 2018). There are even arguments that an AI could not have goals as uninteresting and simplistic as paperclip maximization. Even the utility function of paperclip maximization develops an ecosystem of related desires. There is no reason to believe that these will be simple, although it is still possible that they will be loyal to an overarching simple goal. Whether or not such AIs choose to hack their utility function depends on a range of these underlying utility functions. For example, do they value paperclip maximization inherently, or just utility satisfaction? This question is determined by their self-prescribed architecture, which is recursively defined back to their seed.

The Politics of Transhumanist Co-Evolution

Among the most important AI alignment strategies is the development of human augmentation that meets, or even itself births, strong-AI. This is generally considered to be within the domain of transhumanism. However we manage to achieve the goals of this cyborg revolution, the movement itself misses its true place in the coming turmoil. The government and corporate centralization of power needed to prevent independent actors from developing or 3D-printing X-risk technology in their own homes is on a scale of authoritarianism that is orders of magnitude more complete than anything Hitler, Stalin, or Mussolini could dream of. The central power would have to monitor every single bit moving in space, whether through communication or on a personal computer. The surveillance feared by 1984-style conspiracy theorists would be required to prevent wrong-think at the most granular level because, of course, it only takes one. But power is never complete. The project of hegemony is perpetually striving as cracks pop up at every step (Gramsci, 1992). The same authoritarianism that would repress, centralize, throttle, and attempt to control a cyborg revolution would therefore also damn us to an unaligned AI.

Fortunately, there is a movement that has long dedicated itself to the struggle toward a complex post-zero-sum society of maximal freedom (Bevensee, “The Mutations of Freedom”, 2018). Anarchism positions itself as a pathway through the complexity of transcending deeply entrenched X-risk patterns without resorting to the inevitable failure modes of authoritarian centralism (Bevensee, 2017). This movement has offered a vision that emphasizes the incoherence of non-anarchist transhumanism in the coming storm (Gillis, 2015), while also offering us an alternative vision for the future (Blueshifted, 2017).

We need to be sufficiently coordinated as a species in order to augment human ability enough to be able to mediate an AI-based existential risk (Bevensee, 2017). Unfortunately, brain-to-computer interfaces are gradualist AI tech, so we will also need some mediating factor on the speed of a hard takeoff to give us time to develop that tech, and in the meantime we will need every possible upgrade. The creation of brain-based potential AI tech will also spur on the creation of fast-takeoff software AI, creating a space race for the modern era. Building our complex coordination as not just a species but an ecosystem of sentient life on earth will require us to transcend some inherently difficult coordination problems as a network (Bevensee, “Widening the Bridges”, 2018).

We have no other survival strategy than the complexity and positive freedom needed to dance with the emergent potential singleton of strong-AI. If anyone could build it then it will get built. Whether we should cede everything that humanity has ever stood for in that instant, or swarm together in an evolving emergent form, determines the fate of humanity. I, for one, hope to meet you in the stars.


AI takeoff – Lesswrongwiki. (n.d.). Retrieved September 8, 2018, from

AlphaGo Zero: Learning from scratch. (2018). Retrieved September 8, 2018, from

Artificial Intelligence @ MIRI. (n.d.). Retrieved September 8, 2018, from

Bevensee, E. (2017). Longview Anarchism: Transcending the Existential Threat of Freedom. Retrieved September 28, 2018, from

Bevensee, E. (2018). Widening the Bridges: Beyond Consent and Autonomy. Retrieved October 1, 2018, from

Blueshifted. (2017). An Anarcho-Transhumanist FAQ. Retrieved September 28, 2018, from

Bostrom, N. (2012). The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents. Minds and Machines, 22(2), 71–85.

Bostrom, N. (2013). Existential Risk Prevention as Global Priority. Global Policy, 4(1), 15–31.

Bostrom, N. (2014). Superintelligence: paths, dangers, strategies.

Bostrom, N., & Cirkovic, M. M. (2011). Global Catastrophic Risks. OUP Oxford.

Christiano, P. (2018). When is unaligned AI morally valuable? Retrieved September 8, 2018, from

Chalmers, D. (2010). The Singularity: A Philosophical Analysis. Journal of Consciousness Studies, 17, 7–65.

Davis, E. (2015). Ethical guidelines for a superintelligence. Artificial Intelligence, 220, 121–124.

Existential risks: analyzing human extinction scenarios and related hazards – ORA – Oxford University Research Archive. (n.d.). Retrieved September 8, 2018, from

Goertzel, B. (2011, January 13). The Multiverse According to Ben: The Hard Takeoff Hypothesis. Retrieved September 8, 2018, from

Good, I. J. (1965). Speculations concerning the first ultraintelligent machine. Advances in Computers, 6, 31–88.

Gramsci, A., Buttigieg, J. A., & Callari, A. (1992). Prison notebooks.

Haraway, D. (2006). A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late 20th Century. In J. Weiss, J. Nolan, J. Hunsinger, & P. Trifonas (Eds.), The International Handbook of Virtual Learning Environments (pp. 117–158). Dordrecht: Springer Netherlands.

Hard Takeoff. (n.d.). Retrieved September 8, 2018, from

Hawking, S., Russell, S., Tegmark, M., & Wilczek, F. (2014). Stephen Hawking: ‘Transcendence looks at the implications of artificial intelligence – but are we taking AI seriously enough?’ The Independent, 2014(05–01), 9313474.

Hayek, F. A. (1984). Individualism and economic order. Chicago, Ill.: University of Chicago Press.

Instrumental convergence. (2018). Retrieved September 27, 2018, from

Intelligence enhancement as existential risk mitigation – LessWrong 2.0. (n.d.). Retrieved September 27, 2018, from

Larks. (2018). 2018 AI Safety Literature Review and Charity Comparison. Retrieved from

lesswrong. (2018). Wireheading. Retrieved September 26, 2018, from

Lesswrongwiki. (2018a). Biological Cognitive Enhancement. Retrieved from

Lesswrongwiki. (2018b). Whole brain emulation. Retrieved September 8, 2018, from

Mark, D. (2009). Behavioral mathematics for game AI. Boston, Mass.: Course Technology.

Matt, J. (2018). The Last Person in the Room Must Close the Door: Hayek in the Age of Computing. Retrieved September 28, 2018, from

Muehlhauser, L., & Salamon, A. (2012a). Intelligence Explosion: Evidence and Import. In A. H. Eden, J. H. Moor, J. H. Søraker, & E. Steinhart (Eds.), Singularity Hypotheses (pp. 15–42). Berlin, Heidelberg: Springer Berlin Heidelberg.

Muehlhauser, L., & Salamon, A. (2012b). The singularity hypothesis: A scientific and philosophical assessment. In Intelligence Explosion: Evidence and Import. Berlin: Springer.

Müller, V. (2016). Future Progress in Artificial Intelligence: A Survey of Expert Opinion. Retrieved September 8, 2018, from

Ostrom, E. (2008). Governing the commons: the evolution of institutions for collective action. Cambridge: Cambridge University Press.

Paperclip maximizer. (2018). Retrieved September 27, 2018, from

Schalk, G., McFarland, D. J., Hinterberger, T., Birbaumer, N., & Wolpaw, J. R. (2004). BCI2000: a general-purpose brain-computer interface (BCI) system. IEEE Transactions on Bio-Medical Engineering, 51(6), 1034–1043.

Omohundro, S. M. (2008). The Basic AI Drives. Frontiers in Artificial Intelligence and Applications. Retrieved from

Superintelligence-Readers-Guide-early-version.pdf. (n.d.). Retrieved from

The Incoherence and Unsurvivability of Non-Anarchist Transhumanism. (2015). Retrieved September 8, 2018, from

Trosper, J. (2014, December 28). What Is a Von Neumann Probe? Retrieved September 26, 2018, from

Vamplew, P., Dazeley, R., Foale, C., Firmin, S., & Mummery, J. (2018). Human-aligned artificial intelligence is a multiobjective problem. Ethics and Information Technology, 20(1), 27–40.

Waser, M. (2011). Rational Universal Benevolence: Simpler, Safer, and Wiser Than “Friendly AI.” In J. Schmidhuber, K. R. Thórisson, & M. Looks (Eds.), Artificial General Intelligence (pp. 153–162). Springer Berlin Heidelberg.

Yudkowsky, E. (2007). The Hidden Complexity of Wishes. Retrieved September 8, 2018, from

Yudkowsky, E. (2008). Artificial Intelligence as a Positive and Negative Factor in Global Risk. In Global Catastrophic Risks. Oxford University Press. Retrieved from

© 2023 Emmi Bevensee.