The structure of the 3x + 1 problem

Paul Erdös said about the 3x + 1 problem, "Mathematics is not yet ready for such problems", and he seems to be right. Although we cannot solve this problem either, we provide some results about its structure. The so-called Collatz graph is iteratively transformed into a sequence of graphs by making use of some hidden structure information. It turns out that the transformation of graphs corresponds to a sequence of sets of numbers. It is shown that if the union of these number sets were equal to the set of integers greater than one, the famous Collatz conjecture would be true.


Basics of the 3x + 1 Problem
The so-called 3x + 1 problem is a problem in number theory that has been around for decades. Its exact origin is reported to be obscure and seems to go back to the 1930s, but it has certainly been known to the community since 1952 (see Lagarias, 1985, and the references therein). Up to now, the problem is considered to be unsolved, although many distinguished mathematicians have tried to solve it. Depending on who worked on it or where it has been addressed, the problem became known under several different names, e.g., the Collatz conjecture, the Syracuse problem, Kakutani's problem, Hasse's algorithm, Ulam's problem, and the Thwaites problem.

www.ejgta.org
The structure of the 3x + 1 problem | Alf Kimms

The 3x + 1 problem can be stated in a very simple manner: Given a positive integer $x_0$, consider the following sequence of positive integers:

$$x_{i+1} = \begin{cases} x_i / 2, & \text{if } x_i \text{ is even}, \\ 3x_i + 1, & \text{if } x_i \text{ is odd}. \end{cases}$$

As an example, consider $x_0 = 9$. We then get the (infinite) sequence given in Table 1. Some authors mention that the sequence of $x_i$-values behaves somewhat "randomly" (e.g. Lagarias, 1985, states that it simulates "random" behavior), which might be a reason for the missing proof. Because of that, the numbers in the sequence are sometimes called hailstone numbers: they seem to behave as erratically as hailstones in a cloud.

Table 1. The sequence with initial value $x_0 = 9$

Note that the sequence has reached the value one after a finite number of iterations. Readers who are not familiar with the problem are invited to try different $x_0$ values to check whether or not the value one is among the numbers in the sequence. Sometimes a little patience is required, as for $x_0 = 27$, for example.
If there is a need to emphasize the rules that are applied to construct the sequence, we will use a notation similar to $9 \xrightarrow{3x+1} 28 \xrightarrow{x/2} 14 \xrightarrow{x/2} 7 \xrightarrow{3x+1} 22 \xrightarrow{x/2} \ldots$
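The iteration is easy to try out by machine. The following is a minimal Python sketch, not part of the original paper; the function names `collatz_step` and `collatz_sequence` are our own choices for illustration. The loop stops at the first occurrence of the value one, so it would not terminate for a starting value that contradicts the conjecture.

```python
def collatz_step(x: int) -> int:
    """One step of the 3x + 1 iteration: halve even x, map odd x to 3x + 1."""
    return x // 2 if x % 2 == 0 else 3 * x + 1


def collatz_sequence(x0: int) -> list[int]:
    """Return x_0, x_1, ... up to and including the first occurrence of 1."""
    if x0 < 1:
        raise ValueError("x0 must be a positive integer")
    seq = [x0]
    while seq[-1] != 1:
        seq.append(collatz_step(seq[-1]))
    return seq


if __name__ == "__main__":
    print(collatz_sequence(9))
    print(len(collatz_sequence(27)) - 1, "steps are needed for x0 = 27")
```

Calling `collatz_sequence(9)` lists the values of the example with $x_0 = 9$ up to the first one; after that the sequence keeps repeating 4, 2, 1.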
Often a graph is drawn to illustrate the problem. It is referred to as the Collatz graph and is constructed as follows: Numbers are represented by nodes. An arc points from a number $x_i$ to a number $x_{i+1}$ if and only if $x_{i+1}$ is the unique number that immediately follows $x_i$ in the sequence defined above. While this is sufficient to draw the graph, one should note that there is a very systematic way to construct it. Since prime factorization is unique, for each odd number $x$ there is a sequence of even numbers that precedes $x$, namely $2x, 4x, 8x, \ldots$ For example, we have $5 \leftarrow 10 \leftarrow 20 \leftarrow 40 \leftarrow \ldots$ in the graph. We call this the $x$-spine, and the example just given shows the 5-spine. The number $x$ of the $x$-spine will be called the head of the spine.

Since we will modify this graph later on and find that there are similar substructures in the modified graphs, we will refer to such $x$-spines more precisely as $x$-level $m$-spines, where $m$ denotes the number of modifications that have been made so far, unless it is clear from the context what level we talk about. Initially, we are at level 0, and $5 \leftarrow 10 \leftarrow 20 \leftarrow 40 \leftarrow \ldots$ would consequently be the 5-level 0-spine. In a similar manner, we call the graphs level $m$-graphs, so that the Collatz graph is the level 0-graph.
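As a small illustration of the spine structure, the following Python sketch computes the head of the level 0-spine that contains a given number and lists the first members of a spine. The names `spine_head` and `spine` are hypothetical helpers introduced here and do not appear in the original text.

```python
def spine_head(x: int) -> int:
    """Return the head of the level 0-spine containing x (the odd part of x)."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    while x % 2 == 0:
        x //= 2
    return x


def spine(head: int, length: int) -> list[int]:
    """Return the first `length` numbers of a spine: head, 2*head, 4*head, ..."""
    if head < 1 or head % 2 == 0:
        raise ValueError("the head of a spine must be a positive odd integer")
    return [head * 2**k for k in range(length)]


if __name__ == "__main__":
    print(spine(5, 4))     # first members of the 5-spine
    print(spine_head(40))  # 40 lies in the 5-spine
```

Every positive integer belongs to exactly one spine, which is why `spine_head` is well defined.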
Furthermore, in the Collatz graph there is an arc from every odd number $x$ to the even number $3x + 1$, i.e., the head $x$ of the $x$-spine is connected to an even number that occurs in another spine (there is one exception: 1 connects to 4, which is in the 1-spine). In our example, 5 connects to 16. We can now arrange the $x$-spines in such a way (see Table 2) that each $x$-spine forms a column, the step $x \to x/2$ (if $x$ is even) means moving vertically upwards, and the step $x \to 3x + 1$ (if $x$ is odd) means moving horizontally leftwards to the next number. Table 2 also illustrates how we proceed through the graph given $x_0 = 9$ (compare Table 1).

Table 2. An illustration of (parts of) the Collatz graph (the level 0-graph)

As of May 2020 it has been verified that at least for all positive integer starting values up to $2^{68}$ we do indeed reach the value one (see the web sites http://sweet.ua.pt/tos/3x+1.html maintained by Thomas Oliveira e Silva and http://www.ericr.nl/wondrous/ maintained by Eric Roosendaal, respectively, for recent computational results). This leads to the so-called Collatz conjecture: For any positive integer $x_0$, the corresponding sequence of $x_i$-values will reach the value one in a finite number of steps.
No accepted proof for this conjecture is available yet. Trying to grasp the nature of the problem, one can draw a graph that illustrates the fixed point iteration given a starting value x 0 . Figure 1, for example, shows what is going on when we start with x 0 = 9.
There exists a cycle $4 \to 2 \to 1 \to 4 \to \ldots$ which contains the value one. But it is not clear whether or not other cycles exist which do not contain the value one and which would lead to infinite looping. Also, it might be that a sequence diverges in the sense that an infinite number of different numbers is enumerated without ever reaching the value one.
Nevertheless, many insightful results have been obtained. Jeffrey Lagarias has diligently collected papers on the problem and published a survey (Lagarias, 1985) and an edited volume of important papers (Lagarias, 2010), which makes him a historian of the problem. Because his collection of literature is extremely comprehensive, we do not review the many existing papers here and refer to the reviews published by Lagarias instead (updated versions are available online: Lagarias, 2011 and 2012). The book by Wirsching (1998) also contains a lot of material. Chamberland (2003) provides an overview of major trends. The work that comes closest to our paper is by Andaloro (2002), who investigates the connectivity of the Collatz graph.
With this paper, we contribute to the problem by revealing some properties of its structure which, to the best of our knowledge, have not been used before. The paper is organized as follows: Section 2 shows a systematic way to reduce the 3x + 1 problem to a problem with a reduced set of numbers. Just as the original problem can be represented as a graph (the Collatz graph), the reduced problem can be represented by a graph, too. We show that there is a very systematic way of transforming one graph into the other. In Section 3, we demonstrate that this systematic process can be iterated to get a sequence of graphs. While doing so, the set of numbers contained in these graphs is reduced from one level to the next. This allows us to reformulate the Collatz conjecture based on the numbers that are eliminated. Section 4 demonstrates how the eliminated numbers can be computed in a systematic manner. Section 5 concludes the paper and points to some conjectures that may inspire future work.

Sequences of Odds
By studying the numbers in the Collatz graph carefully, we can derive some properties that relate certain numbers to each other. This will be done in this section, and we will show that the sequence of $x_i$-values can be replaced by a sequence of $y_j$-values where most of the $y_j$-numbers are odd and only the starting value $y_0$ might be even.
Case 1. Let us begin by assuming that $x_i$ is an even number. Since there is a unique prime factorization, an even $x_i$ is of the form $x_i = 2^k P$ with $k \ge 1$ and $P$ odd. Thus, the sequence evolves from $x_i$ to $x_{i+k} = P$ by repeatedly dividing the current number by 2:

$$x_i = 2^k P \xrightarrow{x/2} 2^{k-1} P \xrightarrow{x/2} \cdots \xrightarrow{x/2} 2P \xrightarrow{x/2} P = x_{i+k}$$

That means that we can consider a sequence $y_j = x_i$ and $y_{j+1} = P$ in order to find out whether or not we can eventually reach the value one. As an example, consider $x_i = 40 = 2^3 \cdot 5$. We will then jump to 5 (see Table 2); Case 1 is an obvious shortcut.
Case 2. Let us now consider an odd number $x$. In the Collatz graph $x$ connects to the even number $3x + 1$, which belongs to some spine. In that spine $3x + 1$ is preceded by $6x + 2$, which in turn is preceded by $12x + 4$. It is important to note that there cannot be any odd integer $x'$ which connects to $6x + 2$ in the Collatz graph. This is easy to see by contradiction. Assume there is an odd integer $x'$ such that $3x' + 1 = 6x + 2$ holds. This implies $x' = 2x + \frac{1}{3}$, which contradicts the assumption that $x'$ is an integer. On the other hand, there exists an odd integer $x''$ such that $3x'' + 1 = 12x + 4$, namely $x'' = 4x + 1$.

We can turn this around to observe that if we face an odd integer $x_i$ such that $x_i - 1$ can be divided by 4 giving a result that is an odd integer, we can jump to $(x_i - 1)/4$ instead. The reason is that both $x_i$ and $(x_i - 1)/4$ are connected to the very same spine, so that for the task of finding out whether or not we reach the value one in the sequence, a starting point $(x_i - 1)/4$ is equivalent to $x_i$ (if $(x_i - 1)/4$ is an odd integer). This allows us to define $y_j = x_i$ and $y_{j+1} = (y_j - 1)/4$ in such circumstances. Figure 2 illustrates this case. Consider the example $x_i = 21$, which is connected to the 1-spine. Instead of 21, we will consider $(21 - 1)/4 = 5$, which is connected to the 1-spine, too (see Table 2).
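The two observations of Case 2 can be checked numerically for a range of odd numbers. The following Python sketch is only an illustration under our own naming (`odd_part`, `same_target_spine` and `check_case_2` are hypothetical helpers); a finite check of this kind does not replace the algebraic argument given in the text.

```python
def odd_part(x: int) -> int:
    """Strip all factors of 2 from x; the result is the head of the spine of x."""
    while x % 2 == 0:
        x //= 2
    return x


def same_target_spine(a: int, b: int) -> bool:
    """True if the odd numbers a and b connect to the same level 0-spine."""
    return odd_part(3 * a + 1) == odd_part(3 * b + 1)


def check_case_2(limit: int) -> bool:
    """Check the Case 2 observations for all odd x below `limit`."""
    for x in range(1, limit, 2):
        # x'' = 4x + 1 connects to 12x + 4, two steps further down the spine of 3x + 1
        if 3 * (4 * x + 1) + 1 != 4 * (3 * x + 1):
            return False
        if not same_target_spine(x, 4 * x + 1):
            return False
        # 3x' + 1 = 6x + 2 would require 6x + 1 to be divisible by 3
        if (6 * x + 1) % 3 == 0:
            return False
    return True


if __name__ == "__main__":
    print(check_case_2(2001))
    print(same_target_spine(21, 5))  # both connect to the 1-spine
```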
In Cases 3 and 4, finally, assume that $x_i$ is an odd number, i.e. $x_i = 2n + 1$ with $n$ being a non-negative integer, and that $(x_i - 1)/4$ is not an odd integer. In such a case, the next number in the sequence is defined to be $x_{i+1} = 3x_i + 1$, and it is clear that $x_{i+1}$ is even because $x_{i+1} = 3(2n + 1) + 1 = 6n + 4$. Depending on $n$, two cases exist.

Case 3. $n$ is even. For example, consider $x_i = 9 = 2 \cdot 4 + 1$ ($n = 4$ is even). If $n$ is even, it can be written as $n = 2n'$ with $n'$ being a non-negative integer. Thus, we have $x_i = 4n' + 1$. Recall that we have assumed that $(x_i - 1)/4 = n'$ is not odd. Consequently, $n'$ must be even. It turns out that $x_{i+1} = 6n + 4 = 12n' + 4$. Now $x_{i+1}$ can be divided by 4 to get $x_{i+3} = 3n' + 1$. Since $n'$ is even, $x_{i+3}$ is odd and we can define $y_j = x_i$ and $y_{j+1} = (3y_j + 1)/4$. Compare Table 2 to check that we can take a shortcut and jump from 9 to $(3 \cdot 9 + 1)/4 = 7$. To repeat how this relates to the definition of the sequence, note that the following subsequence was considered:

$$x_i = 4n' + 1 \xrightarrow{3x+1} x_{i+1} = 12n' + 4 \xrightarrow{x/2} x_{i+2} = 6n' + 2 \xrightarrow{x/2} x_{i+3} = 3n' + 1$$

Case 4. $n$ is odd. As an example, one may consider $x_i = 11 = 2 \cdot 5 + 1$ ($n = 5$ is odd). In that case $n$ is of the form $n = 2n' + 1$ with $n'$ being a non-negative integer, and $x_i = 4n' + 3$. Applying the definition of the sequence we get $x_{i+1} = 6(2n' + 1) + 4 = 12n' + 10$, and $x_{i+2} = 6n' + 5$ turns out to be an odd number. With this observation, we can define $y_j = x_i$ and $y_{j+1} = (3y_j + 1)/2$. Again, we can refer to Table 2 as an example to see that once we have reached 11 we can go to $(3 \cdot 11 + 1)/2 = 17$ next.
The following subsequence illustrates what we did:

$$x_i \xrightarrow{3x+1} x_{i+1} \xrightarrow{x/2} x_{i+2}$$

In summary we have proven that, given a positive integer value $y_0 = x_0$, it is sufficient to investigate the following sequence of positive integers in order to find out if the 3x + 1 problem eventually reaches the value one in all cases:

$$y_{j+1} = \begin{cases} P, & \text{if } y_j \text{ is even and of the form } 2^k P \ (P \text{ odd}), \\ (y_j - 1)/4, & \text{if } y_j \text{ and } (y_j - 1)/4 \text{ are odd}, \\ (3y_j + 1)/4, & \text{if } y_j \text{ and } (3y_j + 1)/4 \text{ are odd and } (y_j - 1)/4 \text{ is no odd integer}, \\ (3y_j + 1)/2, & \text{if } y_j \text{ and } (3y_j + 1)/2 \text{ are odd and } (y_j - 1)/4 \text{ is no odd integer}. \end{cases}$$
The question is: given any positive integer $y_0 = x_0$, does the sequence of $y_j$-values eventually reach the value one? Appendix A provides details for the odd numbers $y_j$ from 1 to 767.
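The four rules can be put together in a short Python sketch. It is meant as an illustration only; the names `next_value` and `reduced_sequence` are assumptions of ours, and the loop terminates only for starting values whose sequence actually reaches one.

```python
def next_value(y: int) -> int:
    """Return y_{j+1} for a positive integer y_j according to the four rules."""
    if y % 2 == 0:
        # Rule 1: y = 2^k * P with P odd, jump directly to P
        while y % 2 == 0:
            y //= 2
        return y
    if (y - 1) % 4 == 0 and ((y - 1) // 4) % 2 == 1:
        # Rule 2: (y - 1)/4 is an odd integer
        return (y - 1) // 4
    if (3 * y + 1) % 4 == 0:
        # Rule 3: (3y + 1)/4 is an odd integer here
        return (3 * y + 1) // 4
    # Rule 4: (3y + 1)/2 is an odd integer
    return (3 * y + 1) // 2


def reduced_sequence(y0: int) -> list[int]:
    """Return y_0, y_1, ... up to and including the first occurrence of 1."""
    if y0 < 1:
        raise ValueError("y0 must be a positive integer")
    seq = [y0]
    while seq[-1] != 1:
        seq.append(next_value(seq[-1]))
    return seq


if __name__ == "__main__":
    print(reduced_sequence(9))
    print(reduced_sequence(40))
```

Note that the value one is mapped to itself by the third rule, since $(3 \cdot 1 + 1)/4 = 1$, so stopping at the first occurrence of one loses no information.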
Since only $y_0$ might be even, we can focus on odd numbers. The possible sequences of $y_j$-values can be represented as an infinite graph, the level 1-graph, as shown in Table 3. For each odd number $y$ for which $(y - 1)/4$ is not an odd integer, there is a set of odd numbers which result from calculating $4y + 1$ repeatedly (compare Figure 2); $9 \leftarrow 37 \leftarrow 149 \leftarrow \ldots$ is an example. These numbers are connected in monotonic order. In Table 3 these numbers appear as columns, each forming a level 1-spine. The example $9 \leftarrow 37 \leftarrow 149 \leftarrow \ldots$ is the 9-level 1-spine. Each odd number $y$ where $(y - 1)/4$ is not an odd integer (such a number is the head of a level 1-spine; see Table 3) is connected to either $(3y + 1)/4$ (if this is an odd number) or $(3y + 1)/2$ (in Figure 3 you have to move horizontally to the next number in such cases). As an example, for $y_0 = 9$ we get the following sequence:

$$9 \to 7 \to 11 \to 17 \to 13 \to 3 \to 5 \to 1$$

Table 3. The sequences of odd numbers can be represented by a graph, the level 1-graph

It may be worth noting that hidden in the definition of the sequence of $y_j$-values there is a systematic pattern for the sequence of odd numbers. Remember that each odd number $y_j$ can be represented as $y_j = 2n + 1$ where $n$ is a non-negative integer. The rule that determines what number comes next in the sequence is clearly defined above. So, let us first look at those odd numbers $y_j$ for which $(y_j - 1)/4$ is an odd integer (see Table 4). Let $n_a$ be a counter of these numbers in increasing order starting with zero. The counter $n_a$ relates to $n$ in the following way: $n = 4n_a + 2$. These numbers have the form $y_j = 8n_a + 5$, and the odd number that follows in the sequence is $y_{j+1} = 2n_a + 1$. The distance between $y_j$ and $y_{j+1}$ in terms of $n$ is $\Delta n = -(3n_a + 2)$.
    y_j     =  5  13  21  29  37  45  53  ...
    n_a     =  0   1   2   3   4   5   6  ...
    n       =  2   6  10  14  18  22  26  ...
    y_{j+1} =  1   3   5   7   9  11  13  ...

Table 4. Odd numbers $y_j$ where the next number in the sequence is $(y_j - 1)/4$

In a similar manner we can study the structure of those odd numbers $y_j$ for which the subsequent number $y_{j+1}$ is calculated by the rule $(3y_j + 1)/4$ (see Table 5). Let $n_b$ be a counter of these numbers. The relation to $n$ is $n = 4n_b$. We have $y_j = 8n_b + 1$, and the subsequent number is $y_{j+1} = y_j - 2n_b$. The distance between these two numbers is $\Delta n = -n_b$.

Table 5. Odd numbers $y_j$ where the next number in the sequence is $(3y_j + 1)/4$

Finally, we look at odd numbers $y_j$ where the follower $y_{j+1}$ is defined to be $(3y_j + 1)/2$ (see Table 6). Let $n_c$ be a counter for those numbers so that we have $n = 2n_c + 1$. Here $y_j$ is of the form $y_j = 4n_c + 3$ and $y_{j+1} = y_j + 2(n_c + 1)$. The distance is $\Delta n = n_c + 1$.

Table 6. Odd numbers $y_j$ where the next number in the sequence is $(3y_j + 1)/2$

Figure 3 illustrates the relationships between the odd numbers and indicates which number follows a particular number.
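The closed forms for the three counters can be verified mechanically. The sketch below is an illustration under our own naming (`n_of`, `table_4_rows` and `patterns_hold` are hypothetical helpers); it recomputes each follower from its rule and compares it with the stated pattern, where the distance is taken as the value of $n$ for $y_{j+1}$ minus the value of $n$ for $y_j$.

```python
def n_of(y: int) -> int:
    """Return n for an odd number y = 2n + 1."""
    return (y - 1) // 2


def table_4_rows(count: int) -> list[tuple[int, int, int, int]]:
    """Rows (y_j, n_a, n, y_{j+1}) for the numbers following the rule (y_j - 1)/4."""
    rows = []
    for n_a in range(count):
        y = 8 * n_a + 5
        rows.append((y, n_a, n_of(y), (y - 1) // 4))
    return rows


def patterns_hold(count: int) -> bool:
    """Check the three stated patterns for the counters 0, ..., count - 1."""
    for k in range(count):
        y, nxt = 8 * k + 5, (8 * k + 5 - 1) // 4           # rule (y - 1)/4
        ok_a = (n_of(y) == 4 * k + 2 and nxt == 2 * k + 1
                and n_of(nxt) - n_of(y) == -(3 * k + 2))
        y, nxt = 8 * k + 1, (3 * (8 * k + 1) + 1) // 4     # rule (3y + 1)/4
        ok_b = (n_of(y) == 4 * k and nxt == y - 2 * k
                and n_of(nxt) - n_of(y) == -k)
        y, nxt = 4 * k + 3, (3 * (4 * k + 3) + 1) // 2     # rule (3y + 1)/2
        ok_c = (n_of(y) == 2 * k + 1 and nxt == y + 2 * (k + 1)
                and n_of(nxt) - n_of(y) == k + 1)
        if not (ok_a and ok_b and ok_c):
            return False
    return True


if __name__ == "__main__":
    for row in table_4_rows(7):
        print(row)
    print(patterns_hold(1000))
```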

Level m-Graphs and the Collatz Conjecture
In this section we will carefully study what the transformation of the level 0-graph to the level 1-graph, which was described in the previous section, really does. This process can then be iterated so that we will learn something about the problem that allows us to reformulate the Collatz conjecture.
To start with, let us have a look at the graph at a level $m$. It consists of level $m$-spines where the head of each spine is connected to some other spine. Figure 4 illustrates the structure of the level $m$-graph (compare Table 2 for an illustration of the level 0-graph). Note that it will turn out to be convenient to consider several copies of the number one in the graph (in the level 0-graph, for instance, we have one node $x_1 = 1$ for the number one because $x = 2$ results in $x_1 = 1$; we have a second node for the number one because $x_2 = 1$ yields $x = 4$, which means that the node for $x_2 = 1$ connects to the $x_1$-spine; we have a third node for the number one because $x_3 = 1$ connects to the $x_2 = 1$-spine; and so on).

Figure 3. An illustration of the internal structure of the sequence of odds

The remaining numbers (i.e. the numbers in $\mathbb{N}_1$) can be partially ordered and arranged in a graph (see Table 3). The way this graph is constructed can be generalized as follows: Let [...] Tables 2 and 3 where the level 0-graph and the level 1-graph are illustrated, respectively. Additionally, Figure 6 highlights some key numbers to emphasize the underlying idea of the graph transformation. To wrap up, the resulting functions for the level $(m + 1)$-graph are: [...]

By construction, there is a 1-spine in the level $(m + 1)$-graph (and in all graphs with a lower level). The functions $f_{m+1}$ and $h_{m+1}$ can then be derived using the basic cooking rule from above, i.e. $f_{m+1}(x) = g_{m+1}^{-1}\bigl(s_{m+1}^{k_x}(g_{m+1}(x))\bigr)$ with $k_x \ge 1$ and $h_{m+1}(x) = s_{m+1}^{k_x}(g_{m+1}(x))$ with $k_x \ge 0$, while taking into account that $\mathrm{dom}(f_{m+1})$ [...] For $m = 1$ we get [...]

We can iterate this procedure to construct a sequence of level $m$-graphs.
The fundamental lesson to learn is that if we can prove that all numbers included in the level $(m + 1)$-graph reduce to the value one by applying the functions $s_{m+1}$ and $g_{m+1}$, we can then be sure that all numbers in the set $\bigcup_{i=0}^{m} \mathrm{dom}(s_i)$ eventually lead to the value one as well. That means to prove the Collatz [...]

For higher levels, we will need $f_1$. We know that $f_1$ is of the form [...] is known from above. This is equivalent to [...] It is crucial to take into account that all intermediate results must belong to $\mathbb{N}_1$, i.e. we must have $g_1(f_1(x)) \in \mathbb{N}_1$, $s_1^{-1}(g_1(f_1(x))) \in \mathbb{N}_1$ and so on (or, which is equivalent, $g_1(x) \in \mathbb{N}_1$, $s_1(g_1(x)) \in \mathbb{N}_1$ and so on).
Two cases can occur: [...] That is, [...] The function $h_1$ will be needed on higher levels, too, because $g_2 = h_1$ will be used. From the deduction of $f_1$ we can conclude the definition of $h_1$: [...] Note that the number one is a fixed point of $h_1$, i.e. $h_1(1) = 1$.

Level $m = 2$: Since [...] Further levels can be investigated in a similar manner. We stop here because we assume that there is an infinite sequence of levels to consider and that the inductive construction of the domains of the functions $s_m$ will not terminate.

Conclusion
The so-called Collatz conjecture states that the 3x + 1 problem generates a sequence of numbers that will reach the value one in a finite number of steps. Many people, including us, believe that this is true, but no proof is available yet. Like many others before us, we also failed to provide a complete proof, and the 3x + 1 problem remains a mystery.
We studied the so-called Collatz graph. We suggested transforming this graph into another graph, which is transformed into yet another graph, and so on. Each transformation reduces the set of numbers to be considered. We have shown the following result: if the union of the sets of numbers that are eliminated during the course of the transformations equals the set of positive integers greater than one, then the Collatz conjecture is true. Unfortunately, we assume that the sequence of graphs constructed in this way is infinite, so that this idea does not lead to a constructive proof.
This defines an open question to be tackled in future work: (1) We conjecture that the number $m^*$ of levels to consider is infinite. If that were false, we would have a proof of the Collatz conjecture readily at hand: We simply need to check if dom( [...]

However, it is interesting to mention that the numbers that are eliminated in each transformation step have a very systematic pattern. At level 0 we eliminate all even numbers, i.e. all integers of the form $2 + 2^1 n$ where $n \ge 0$ is an integer. At level 1 the eliminated numbers are $5 + 2^3 n$ with $n \ge 0$ being an integer. At level 2 all numbers of the form $3 + 2^4 n$ and $113 + 2^7 n$ are eliminated ($n \ge 0$ and integer). It might be interesting to investigate this further in future work and to address a second open question: (2) We conjecture that the numbers which are eliminated always have the form $p + 2^k n$ (with integer values $n \ge 0$) where $p$ is an integer (odd, for levels $m \ge 1$) and $k \ge 1$ is an integer. If that were true, proving the Collatz conjecture would amount to deciding whether or not all odd numbers can be written in that form using the specific values of $p$ and $k$ that we get from the different levels of the graph transformation process.
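To get a feeling for how much the forms quoted for levels 0 to 2 cover, one can simply test membership. The Python sketch below takes the pairs $(p, k)$ verbatim from the text; the names `ELIMINATED_FORMS`, `is_eliminated` and `not_yet_eliminated` are our own, and the code only tests the arithmetic form $p + 2^k n$ without performing the graph transformation itself.

```python
# Pairs (p, k) describing eliminated numbers p + 2^k * n with n >= 0,
# copied from the text for levels 0, 1 and 2 (no further levels are assumed).
ELIMINATED_FORMS = [(2, 1), (5, 3), (3, 4), (113, 7)]


def is_eliminated(x: int, forms=ELIMINATED_FORMS) -> bool:
    """True if x can be written as p + 2^k * n (n >= 0) for one of the forms."""
    return any(x >= p and (x - p) % 2**k == 0 for p, k in forms)


def not_yet_eliminated(limit: int, forms=ELIMINATED_FORMS) -> list[int]:
    """Integers in 2..limit that are not covered by any of the given forms."""
    return [x for x in range(2, limit + 1) if not is_eliminated(x, forms)]


if __name__ == "__main__":
    print(not_yet_eliminated(50))
```

The numbers returned by `not_yet_eliminated` are those that would have to be covered by forms arising at higher levels.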
It should be noted, by the way, that the latter discussion is somewhat reminiscent of so-called obstinate numbers (see Pickover, 2005, Chapter 2, page 62): In 1848 Alphonse Armand Charles Georges Marie (a.k.a. Prince de Polignac) conjectured that every odd number greater than one is of the form $p + 2^k$ (with $p$ being a prime number and $k > 0$). This conjecture turned out to be false (127, for instance, is a counterexample).
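The counterexample 127 can be confirmed with a few lines of Python. This is a minimal sketch using plain trial division; the function names `is_prime` and `polignac_decomposition` are our own, and the condition $k > 0$ is taken from the statement above.

```python
def is_prime(n: int) -> bool:
    """Primality test by trial division (sufficient for small n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def polignac_decomposition(n: int):
    """Return (p, k) with n = p + 2^k, p prime and k > 0, or None if none exists."""
    k = 1
    while 2**k < n:
        p = n - 2**k
        if is_prime(p):
            return (p, k)
        k += 1
    return None


if __name__ == "__main__":
    print(polignac_decomposition(125))
    print(polignac_decomposition(127))
```

For 127 the candidates $127 - 2^k$ with $k = 1, \ldots, 6$ are 125, 123, 119, 111, 95 and 63, none of which is prime.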