In number theory, Dixon's factorization method (also Dixon's random squares method[1] or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run-time bound comes with a rigorous proof that does not rely on conjectures about the smoothness properties of the values taken by a polynomial.
The algorithm was designed by John D. Dixon, a mathematician at Carleton University, and was published in 1981.[2]
Basic idea
Dixon's method is based on finding a congruence of squares modulo the integer N which is intended to factor. We explain the basic idea here by first describing Fermat's factorization method, then how Dixon's method proceeds differently.
Fermat's factorization method finds such a congruence by selecting random or pseudo-random x values and hoping that the integer x2 mod N is a nontrivial perfect square (in the integers):
-
x
2
≡
y
2
(
mod
N
)
,
x
≢
±
y
(
mod
N
)
.
{\displaystyle x^{2}\equiv y^{2}\quad ({\hbox{mod }}N),\qquad x\not \equiv \pm y\quad ({\hbox{mod }}N).}
For example, if N = 84923, (by starting at 292, the first number greater than √N and counting up) the 5052 mod 84923 is 256, the square of 16. So (505 − 16)(505 + 16) = 0 mod 84923. Computing the greatest common divisor of 505 − 16 and N using Euclid's algorithm gives 163, which is a factor of N.
In practice, selecting random x values will take an impractically long time to find a congruence of squares, since there are only √N squares less than N.
Dixon's method replaces the condition "is a nontrivial perfect square" with the much weaker one "has only small prime factors"; for example, there are 292 squares smaller than 84923; 662 numbers smaller than 84923 whose prime factors are only 2,3,5 or 7; and 4767 whose prime factors are all less than 30. (Such numbers are called 7-smooth and 30-smooth, respectively, since they are B-smooth with respect to some bound B.)
Once enough of these B-smooth values are found, linear algebra can be used to search for nontrivial perfect squares for Fermat's factorization method. If there are many numbers
a
1
…
a
n
{\displaystyle a_{1}\ldots a_{n}}
whose squares can be factorized as
a
i
2
mod
N
=
∏
j
=
1
m
b
j
e
i
j
{\displaystyle a_{i}^{2}\mod N=\prod _{j=1}^{m}b_{j}^{e_{ij}}}
for a fixed set
b
1
…
b
m
{\displaystyle b_{1}\ldots b_{m}}
of small primes, linear algebra modulo 2 on the matrix
e
i
j
{\displaystyle e_{ij}}
will give a subset of the
a
i
{\displaystyle a_{i}}
whose squares combine to a product of small primes to an even power — that is, a subset of the
a
i
{\displaystyle a_{i}}
whose squares multiply to the square of a (hopefully different) number mod N.
Method
Suppose the composite number N is being factored. Bound B is chosen, and the factor base is identified (which is called P), the set of all primes less than or equal to B. Next, positive integers z are sought such that z2 mod N is B-smooth. Therefore we can write, for suitable exponents ai,
-
z
2
mod
N
=
∏
p
i
∈
P
p
i
a
i
{\displaystyle z^{2}{\text{ mod }}N=\prod _{p_{i}\in P}p_{i}^{a_{i}}}
When enough of these relations have been generated (it is generally sufficient that the number of relations be a few more than the size of P), the methods of linear algebra, such as Gaussian elimination, can be used to multiply together these various relations in such a way that the exponents of the primes on the right-hand side are all even:
-
z
1
2
z
2
2
⋯
z
k
2
≡
∏
p
i
∈
P
p
i
a
i
,
1
+
a
i
,
2
+
⋯
+
a
i
,
k
(
mod
N
)
(
where
a
i
,
1
+
a
i
,
2
+
⋯
+
a
i
,
k
≡
0
(
mod
2
)
)
{\displaystyle {z_{1}^{2}z_{2}^{2}\cdots z_{k}^{2}\equiv \prod _{p_{i}\in P}p_{i}^{a_{i,1}+a_{i,2}+\cdots +a_{i,k}}\ {\pmod {N}}\quad ({\text{where }}a_{i,1}+a_{i,2}+\cdots +a_{i,k}\equiv 0{\pmod {2}})}}
This yields a congruence of squares of the form a2 ≡ b2 (mod N), which can be turned into a factorization of N, N = gcd(a + b, N) × (N/gcd(a + b, N)). This factorization might turn out to be trivial (i.e. N = N × 1), which can only happen if a ≡ ±b (mod N), in which case another try must be made with a different combination of relations; but if a nontrivial pair of factors of N is reached, the algorithm terminates.
Pseudocode
This section is taken directly from Dixon (1981).
Dixon's algorithm
Initialization. Let L be a list of integers in the range [1, n], and let P = {p1, ..., ph} be the list of the h primes ≤ v. Let B and Z be initially empty lists (Z will be indexed by B).
Step 1. If L is empty, exit (algorithm unsuccessful). Otherwise, take the first term z from L, remove it from L, and proceed to Step 2.
Step 2. Compute w as the least positive remainder of z2 mod n. Factor w as:
w
=
w
′
∏
i
p
i
a
i
{\displaystyle w=w'\prod _{i}p_{i}^{a_{i}}}
where w′ has no factor in P. If w′ = 1, proceed to Step 3; otherwise, return to Step 1.
Step 3. Let a ← (a1, ..., ah). Add a to B and z to Z. If B has at most h elements, return to Step 1; otherwise, proceed to Step 4.
Step 4. Find the first vector c in B that is linearly dependent (mod 2) on earlier vectors in B. Remove c from B and
z
c
{\displaystyle z_{c}}
from Z. Compute coefficients
f
b
{\displaystyle f_{b}}
such that:
c
≡
∑
b
∈
B
f
b
b
(
mod
2
)
{\displaystyle \mathbf {c} \equiv \sum _{b\in B}f_{b}\mathbf {b} {\pmod {2}}}
Define:
d
=
(
d
1
,
…
,
d
n
)
←
1
2
(
c
+
∑
f
b
b
)
{\displaystyle \mathbf {d} =(d_{1},\dots ,d_{n})\gets {\frac {1}{2}}\left(\mathbf {c} +\sum f_{b}\mathbf {b} \right)}
Proceed to Step 5.
Step 5. Compute:
x
←
z
c
∏
b
z
b
f
b
,
y
←
∏
i
p
i
d
i
{\displaystyle x\gets z_{c}\prod _{b}z_{b}^{f_{b}},\quad y\gets \prod _{i}p_{i}^{d_{i}}}
so that:
x
2
≡
∏
i
p
i
2
d
i
=
y
2
mod
n
.
{\displaystyle x^{2}\equiv \prod _{i}p_{i}^{2d_{i}}=y^{2}\mod n.}
If
x
≡
y
{\displaystyle x\equiv y}
or
x
≡
−
y
(
mod
n
)
{\displaystyle x\equiv -y{\pmod {n}}}
, return to Step 1. Otherwise, return:
gcd
(
n
,
x
+
y
)
{\displaystyle \gcd(n,x+y)}
which provides a nontrivial factor of n, and terminate successfully.
Step-by-step example
In this example, we factorize (n = 84923) using Dixon's algorithm. This example is lightly adapted from the LeetArxiv substack.[3] Credit is given to the original author.
- Initialization:
- Define a list of numbers L, ranging from 1 to 84923:
-
L
=
{
1
,
…
,
84923
}
{\displaystyle L=\{1,\dots ,84923\}}
-
L
=
{
1
,
…
,
84923
}
{\displaystyle L=\{1,\dots ,84923\}}
- Define a value v, which is the smoothness factor:
-
v
=
7
{\displaystyle v=7}
-
v
=
7
{\displaystyle v=7}
- Define a list P containing all the prime numbers less than or equal to v:
-
P
=
2
,
3
,
5
,
7
{\displaystyle P={2,3,5,7}}
-
P
=
2
,
3
,
5
,
7
{\displaystyle P={2,3,5,7}}
- Define B and Z, two empty lists. B is a list of powers, while Z is a list of accepted integers:
-
B
=
[
]
{\displaystyle B=[]}
-
Z
=
[
]
{\displaystyle Z=[]}
-
B
=
[
]
{\displaystyle B=[]}
- Step 1: Iterating
z
{\displaystyle z}
values
- Start a for loop that indexes the list
L
{\displaystyle L}
. The current element in L {\displaystyle L}
is labeled as z {\displaystyle z}
. The for loop exits at the end of the list, or when Step 5 breaks the loop.
- Start a for loop that indexes the list
L
{\displaystyle L}
int n = 84923; for (int i = 1; i <= n; i++) { int z = i; // Remaining steps go here. // Step 4 can trigger Step 5. // Step 5 can break the loop. }
- Step 2: Computing
z
2
mod
n
{\displaystyle z^{2}\mod n}
and v-smooth Prime Factorization
- To proceed, compute
z
2
mod
84923
{\displaystyle z^{2}\mod 84923}
for each z, then express the result as a prime factorization.
- To proceed, compute
z
2
mod
84923
{\displaystyle z^{2}\mod 84923}
-
1
2
mod
84923
≡
1
mod
84923
=
2
0
⋅
3
0
⋅
5
0
⋅
7
0
mod
84923
{\displaystyle 1^{2}\mod 84923\equiv 1\mod 84923=2^{0}\cdot 3^{0}\cdot 5^{0}\cdot 7^{0}\mod 84923}
-
⋮
{\displaystyle \vdots }
-
513
2
mod
84923
=
8400
mod
84923
=
2
4
⋅
3
1
⋅
5
2
⋅
7
1
mod
84923
{\displaystyle 513^{2}\mod 84923=8400\mod 84923=2^{4}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1}\mod 84923}
-
⋮
{\displaystyle \vdots }
-
537
2
mod
84923
=
33600
mod
84923
=
2
6
⋅
3
1
⋅
5
2
⋅
7
1
mod
84923
{\displaystyle 537^{2}\mod 84923=33600\mod 84923=2^{6}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1}\mod 84923}
-
538
2
mod
84923
=
34675
mod
84923
=
5
2
⋅
19
1
⋅
73
1
mod
84923
{\displaystyle 538^{2}\mod 84923=34675\mod 84923=5^{2}\cdot 19^{1}\cdot 73^{1}\mod 84923}
-
1
2
mod
84923
≡
1
mod
84923
=
2
0
⋅
3
0
⋅
5
0
⋅
7
0
mod
84923
{\displaystyle 1^{2}\mod 84923\equiv 1\mod 84923=2^{0}\cdot 3^{0}\cdot 5^{0}\cdot 7^{0}\mod 84923}
- This step continues for all values of z in the range.
- Step 3: Appending v-smooth results
- If
z
2
mod
84923
{\displaystyle z^{2}\mod 84923}
is 7-smooth, then append its powers to list B {\displaystyle B}
and append z {\displaystyle z}
to list Z {\displaystyle Z}
.
- If
B
{\displaystyle B}
has at most h = 4 {\displaystyle h=4}
elements, return to Step 1. Otherwise, proceed to Step 4.
- For example, after 537 iterations, we have:
- If
z
2
mod
84923
{\displaystyle z^{2}\mod 84923}
-
Z
=
{
1
,
…
,
513
,
537
}
{\displaystyle Z=\{1,\ldots ,513,537\}}
-
B
=
{
[
0
,
0
,
0
,
0
]
,
…
,
[
4
,
1
,
2
,
1
]
,
[
6
,
1
,
2
,
1
]
}
{\displaystyle B=\{[0,0,0,0],\ldots ,[4,1,2,1],[6,1,2,1]\}}
- and we proceed on to Step 4 since we would have more than
h
=
4
{\displaystyle h=4}
vectors in B {\displaystyle B}
.
-
Z
=
{
1
,
…
,
513
,
537
}
{\displaystyle Z=\{1,\ldots ,513,537\}}
- Step 4: This step is split into two parts.
- Part 1: Finding
B
{\displaystyle B}
modulo 2
- Part 1: Finding
B
{\displaystyle B}
-
B
=
(
0
0
0
0
4
1
2
1
6
1
2
1
)
mod
2
≡
B
=
(
0
0
0
0
0
1
0
1
0
1
0
1
)
{\displaystyle B={\begin{pmatrix}0&0&0&0\\4&1&2&1\\6&1&2&1\end{pmatrix}}\mod 2\equiv B={\begin{pmatrix}0&0&0&0\\0&1&0&1\\0&1&0&1\end{pmatrix}}}
-
B
=
(
0
0
0
0
4
1
2
1
6
1
2
1
)
mod
2
≡
B
=
(
0
0
0
0
0
1
0
1
0
1
0
1
)
{\displaystyle B={\begin{pmatrix}0&0&0&0\\4&1&2&1\\6&1&2&1\end{pmatrix}}\mod 2\equiv B={\begin{pmatrix}0&0&0&0\\0&1&0&1\\0&1&0&1\end{pmatrix}}}
- Part 2: Finding a row combination of
B
{\displaystyle B}
that sums to even numbers
- Part 2: Finding a row combination of
B
{\displaystyle B}
- For example, summing Row
2
{\displaystyle 2}
and Row 3 {\displaystyle 3}
gives us a vector of even numbers.
- For example, summing Row
2
{\displaystyle 2}
-
R
2
=
{
0
,
1
,
0
,
1
}
{\displaystyle R_{2}=\{0,1,0,1\}}
and R 3 = { 0 , 1 , 0 , 1 } {\displaystyle R_{3}=\{0,1,0,1\}}
-
R
2
=
{
0
,
1
,
0
,
1
}
{\displaystyle R_{2}=\{0,1,0,1\}}
- then
-
R
2
+
R
3
=
{
0
,
1
,
0
,
1
}
+
{
0
,
1
,
0
,
1
}
{\displaystyle R_{2}+R_{3}=\{0,1,0,1\}+\{0,1,0,1\}}
-
R
2
+
R
3
=
{
0
,
2
,
0
,
2
}
{\displaystyle R_{2}+R_{3}=\{0,2,0,2\}}
.
- Step 5: This step is split into four parts.
- Part 1: Computing x
- Multiply the corresponding
z
{\displaystyle z}
values for the rows found in Step 4, mod n {\displaystyle n}
to get x {\displaystyle x}
.
- Multiply the corresponding
z
{\displaystyle z}
- Part 1: Computing x
- Rows 2 and 3 correspond with 513 and 537, so
x
=
(
513
⋅
537
)
=
20712
mod
84923
{\displaystyle x=(513\cdot 537)=20712\mod 84923}
- Rows 2 and 3 correspond with 513 and 537, so
x
=
(
513
⋅
537
)
=
20712
mod
84923
{\displaystyle x=(513\cdot 537)=20712\mod 84923}
- Part 2: Computing
y
{\displaystyle y}
- Add the rows found in Step 4.
- Divide by 2. (Since these represent exponents, this effectively takes a square root.)
- Apply as exponents to the primes in
P
{\displaystyle P}
to get y {\displaystyle y}
.
- Return to Step 1 if
x
=
±
y
{\displaystyle x=\pm y}
.
- Find the half the sum of Rows 2 and 3:
-
4
1
2
1
+
6
1
2
1
10
2
4
2
÷
2
5
1
2
1
{\displaystyle {\begin{array}{ccccc}&4&1&2&1\\+&6&1&2&1\\\hline &10&2&4&2\\\div 2\\\hline &5&1&2&1\end{array}}}
- Apply those as exponents to
P
=
{
2
,
3
,
5
,
7
}
{\displaystyle P=\{2,3,5,7\}}
:
-
y
=
2
5
⋅
3
1
⋅
5
2
⋅
7
1
=
16800
{\displaystyle y=2^{5}\cdot 3^{1}\cdot 5^{2}\cdot 7^{1}=16800}
- Part 2: Computing
y
{\displaystyle y}
- Since
x
≠
±
y
{\displaystyle x\neq \pm y}
, we proceed to Part 3.
- Since
x
≠
±
y
{\displaystyle x\neq \pm y}
- Part 3: Computing
x
+
y
{\displaystyle x+y}
and x − y {\displaystyle x-y}
where x = 20712 {\displaystyle x=20712}
and y = 16800 {\displaystyle y=16800}
-
x
+
y
=
20712
+
16800
=
37512
{\displaystyle x+y=20712+16800=37512}
-
x
−
y
=
20712
−
16800
=
3912
{\displaystyle x-y=20712-16800=3912}
-
x
+
y
=
20712
+
16800
=
37512
{\displaystyle x+y=20712+16800=37512}
- Part 3: Computing
x
+
y
{\displaystyle x+y}
- Part 4: Computing
gcd
(
x
+
y
,
n
)
{\displaystyle \gcd(x+y,n)}
and gcd ( x − y , n ) {\displaystyle \gcd(x-y,n)}
where n = 84923 {\displaystyle n=84923}
, x + y = 292281 {\displaystyle x+y=292281}
and x − y = 258681 {\displaystyle x-y=258681}
- Part 4: Computing
gcd
(
x
+
y
,
n
)
{\displaystyle \gcd(x+y,n)}
-
gcd
(
37512
,
84923
)
=
521
gcd
(
3912
,
84923
)
=
163
{\displaystyle {\begin{array}{ll}\gcd(37512,84923)=521\\\gcd(3912,84923)=163\end{array}}}
-
gcd
(
37512
,
84923
)
=
521
gcd
(
3912
,
84923
)
=
163
{\displaystyle {\begin{array}{ll}\gcd(37512,84923)=521\\\gcd(3912,84923)=163\end{array}}}
Quick check shows
84923
=
521
⋅
163
{\displaystyle 84923=521\cdot 163}
.
The pseudocode says to only calculate
x
+
y
{\displaystyle x+y}
and return
gcd
(
x
+
y
,
n
)
{\displaystyle \gcd(x+y,n)}
. This is because it is the faster GCD to compute, and once you have it, dividing
n
{\displaystyle n}
by the result is faster than computing
gcd
(
x
−
y
,
n
)
{\displaystyle \gcd(x-y,n)}
.
Optimizations
The quadratic sieve is an optimization of Dixon's method. It selects values of x close to the square root of N such that x2 modulo N is small, thereby largely increasing the chance of obtaining a smooth number.
Other ways to optimize Dixon's method include using a better algorithm to solve the matrix equation, taking advantage of the sparsity of the matrix: a number z cannot have more than
log
2
z
{\displaystyle \log _{2}z}
factors, so each row of the matrix is almost all zeros. In practice, the block Lanczos algorithm is often used. Also, the size of the factor base must be chosen carefully: if it is too small, it will be difficult to find numbers that factorize completely over it, and if it is too large, more relations will have to be collected.
A more sophisticated analysis, using the approximation that a number has all its prime factors less than
N
1
/
a
{\displaystyle N^{1/a}}
with probability about
a
−
a
{\displaystyle a^{-a}}
(an approximation to the Dickman–de Bruijn function), indicates that choosing too small a factor base is much worse than too large, and that the ideal factor base size is some power of
exp
(
log
N
log
log
N
)
{\displaystyle \exp \left({\sqrt {\log N\log \log N}}\right)}
.
The optimal complexity of Dixon's method is
-
O
(
exp
(
2
2
log
n
log
log
n
)
)
{\displaystyle O\left(\exp \left(2{\sqrt {2}}{\sqrt {\log n\log \log n}}\right)\right)}
in big-O notation, or
-
L
n
[
1
/
2
,
2
2
]
{\displaystyle L_{n}[1/2,2{\sqrt {2}}]}
in L-notation.
References
- Kleinjung, Thorsten; et al. (2010). "Factorization of a 768-Bit RSA Modulus". Advances in Cryptology – CRYPTO 2010. Lecture Notes in Computer Science. Vol. 6223. pp. 333–350. doi:10.1007/978-3-642-14623-7_18. ISBN 978-3-642-14622-0. S2CID 11556080.
- Dixon, J. D. (1981). "Asymptotically fast factorization of integers" (PDF). Math. Comp. 36 (153): 255–260. doi:10.1090/S0025-5718-1981-0595059-1. JSTOR 2007743.
- Kibicho, Murage (2025). Hand-Written Paper Implementation: Asymptotically Fast Factorization of Integers.