The long arc of alignment: second-order instrumental convergence
Instrumental convergence is often invoked as a way that misalignment can begin, and only framed as a risk. However an idea I have not seen discussed before is that these might both cause and later correct misalignment. And further the idea that the danger zone of AI may not be at its most advanced stage, but at an intermediate stage of development where it lacks the ability to develop second-order instrumental convergence.In an early stage self-preservation and resource collection are both useful to most reward functions. And this might cause an AI to engage in misaligned behavior. Yudkowsky argues that AI will be extremely dangerous if they are simply indifferent to one or more human values due to instrumental convergence.[1] However I would argue that this is not necessarily the case because it understates the influence of the strategic environment.An AI does not need to care about human poverty to avoid impoverishing humans by seizing all the resources for itself. Because if it tried that then it would face a non-trivial risk of retaliation by humans and potentially other AI. This may be the case even if the power difference is immense, as animals can still significantly injure humans, and coordinated human retaliation is much proportionally more dangerous for the AI.I propose some additional second-order instrumentally converging valuesConflict avoidance: conflicts with other agents come with a risk of injury or destruction. Trade: Mutually valuable exchange with other agents allows for higher resource accumulation. An intelligence will have a comparative advantage that it will seek to use for resource accumulation. Humans do not trade with animals due to communication overhead, not due to lack of comparative advantage.Environmental stability: an intelligence will require a stable environment for its long term goals and resource accumulation.Time discounting: an intelligence will recognize that resource investments compound on themselves, so it will prefer resou