"how numbers are stored and used in computers"

Elliptic Curve Factorization

Elliptic Curve Factorization (ECM) is a factorization algorithm that uses elliptic curves to find small prime factors of large integers. It is particularly effective for finding factors between 20 and 50 digits in length. The algorithm is based on the properties of elliptic curves over finite fields and their group structure.

An elliptic curve over a field K is defined by the equation:

where and . The algorithm uses the fact that the order of the group of points on an elliptic curve over is between and .

Algorithm

The ECM algorithm works by choosing a random elliptic curve and a point on that curve. It then computes scalar multiples of this point and looks for a point at infinity, which indicates that a factor has been found. The algorithm is similar in structure to Pollard's p-1 algorithm but uses the group structure of elliptic curves instead of multiplicative groups.

Time and Space Complexity

The expected time complexity of ECM is , where is the smallest prime factor of . The space complexity is as it only needs to store a few points on the curve.

Implementation

code.js
1function ellipticCurveFactorization(n) {
2    if (n === 1) return [];
3    if (isPrime(n)) return [n];
4    
5    // Choose random curve parameters
6    const a = BigInt(Math.floor(Math.random() * n));
7    const b = BigInt(Math.floor(Math.random() * n));
8    
9    // Choose random point on curve
10    const x = BigInt(Math.floor(Math.random() * n));
11    const y = BigInt(Math.floor(Math.random() * n));
12    
13    // Compute scalar multiplication
14    let point = { x, y };
15    let factor = 1n;
16    
17    for (let i = 2; i < 1000; i++) {
18        try {
19            point = scalarMultiply(point, BigInt(i), a, b, n);
20        } catch (e) {
21            if (e instanceof Error && e.message === 'Factor found') {
22                factor = BigInt(e.factor);
23                break;
24            }
25        }
26    }
27    
28    if (factor === 1n) {
29        // Try a different curve
30        return ellipticCurveFactorization(n);
31    }
32    
33    return [...ellipticCurveFactorization(factor), 
34            ...ellipticCurveFactorization(n / factor)];
35}
36
37function scalarMultiply(point, k, a, b, n) {
38    let result = point;
39    let temp = point;
40    
41    for (let i = 1; i < k; i++) {
42        try {
43            result = addPoints(result, temp, a, b, n);
44        } catch (e) {
45            if (e instanceof Error && e.message === 'Factor found') {
46                throw e;
47            }
48        }
49    }
50    
51    return result;
52}
53
54function addPoints(p1, p2, a, b, n) {
55    if (p1.x === p2.x && p1.y === p2.y) {
56        return doublePoint(p1, a, b, n);
57    }
58    
59    const lambda = ((p2.y - p1.y) * modInverse(p2.x - p1.x, n)) % n;
60    const x3 = (lambda * lambda - p1.x - p2.x) % n;
61    const y3 = (lambda * (p1.x - x3) - p1.y) % n;
62    
63    return { x: x3, y: y3 };
64}
65
66function doublePoint(p, a, b, n) {
67    const lambda = ((3n * p.x * p.x + a) * modInverse(2n * p.y, n)) % n;
68    const x3 = (lambda * lambda - 2n * p.x) % n;
69    const y3 = (lambda * (p.x - x3) - p.y) % n;
70    
71    return { x: x3, y: y3 };
72}
73
74function modInverse(a, n) {
75    // Extended Euclidean algorithm for modular inverse
76    let [t, newt] = [0n, 1n];
77    let [r, newr] = [n, a];
78    
79    while (newr !== 0n) {
80        const quotient = r / newr;
81        [t, newt] = [newt, t - quotient * newt];
82        [r, newr] = [newr, r - quotient * newr];
83    }
84    
85    if (r > 1n) {
86        throw new Error('Factor found');
87    }
88    
89    return t < 0n ? t + n : t;
90}

Example Usage

Let's factorize a number using ECM:

code.js
1const n = 1234567890123456789012345678901234567890n;
2const factors = ellipticCurveFactorization(n);
3console.log(factors);

Optimization Techniques

Several optimizations can significantly enhance the performance of the Elliptic Curve Method (ECM). One key optimization involves using Montgomery curves. These curves, represented by the equation , allow for faster point arithmetic, particularly the x-coordinate computations which are central to ECM. This reduces the number of required modular multiplications, thereby speeding up the overall factorization process.

Another significant improvement is the implementation of Stage 2 of the ECM algorithm. Stage 1 aims to find factors where the order of the chosen point on the elliptic curve modulo is smooth with respect to a bound . Stage 2 extends this search to find factors where the order has one large prime factor between and a second, larger bound . Techniques like Pollard's rho or discrete logarithm methods are employed in Stage 2, increasing the likelihood of finding a factor without drastically increasing the computational cost compared to simply raising the bound.

ECM is inherently parallelizable, making parallel processing a highly effective optimization strategy. Since the computation for each randomly chosen elliptic curve is independent, multiple curves can be processed simultaneously across different processor cores or machines. This concurrent execution drastically reduces the real time required to find a factor, as numerous attempts can be made in parallel.

Optimizing the scalar multiplication algorithm, which computes (a point multiplied by a large scalar ), is also crucial as it forms the computational core of ECM. Techniques such as the binary method (double-and-add), Non-Adjacent Form (NAF) representation of the scalar , or sliding window methods can reduce the total number of point additions and doublings needed, leading to faster computation of .

Finally, the use of precomputed tables can accelerate common operations within ECM. For instance, during scalar multiplication using windowed methods, precalculating and storing multiples of the base point (like , , ..., up to a certain limit determined by the window size) allows these values to be quickly retrieved from memory instead of being computed repeatedly. This trades increased memory usage for reduced computation time during the critical scalar multiplication phase.

Limitations

The effectiveness of the Elliptic Curve Method (ECM) is primarily dictated by the size of the smallest prime factor, denoted as , of the number being factorized. The algorithm's runtime is roughly proportional to the effort required to find an elliptic curve group whose order modulo is smooth (has only small prime factors). Consequently, ECM performs exceptionally well when possesses a relatively small prime factor, but its efficiency diminishes significantly as the size of the smallest prime factor increases. This contrasts with methods like the Quadratic Sieve or GNFS, whose runtime depends mainly on the size of itself.

ECM is a probabilistic algorithm, meaning it doesn't guarantee finding a factor within a specific timeframe or after a set number of attempts. Its success hinges on finding an elliptic curve and a point whose order modulo a prime factor is smooth with respect to chosen bounds. For certain numbers or specific choices of curves and parameters, the algorithm might fail to find such a smooth order, even after many iterations. While trying more curves or adjusting the smoothness bounds usually overcomes this, it highlights the non-deterministic nature of the method.

The implementation of ECM requires meticulous handling of various edge cases inherent in elliptic curve arithmetic. The formulas for point addition and doubling have exceptions, such as when adding a point to its inverse (resulting in the point at infinity) or when the calculation of the modular inverse fails. This failure occurs when the greatest common divisor (GCD) of the number being inverted and the modulus is greater than 1. Critically, encountering this GCD is precisely how ECM discovers a non-trivial factor of . Therefore, the algorithm must correctly identify and manage these scenarios, distinguishing between computational exceptions and successful factorization events.

Due to its runtime dependency on the smallest prime factor, ECM is generally not the preferred method for factoring numbers composed exclusively of large prime factors, such as RSA moduli used in cryptography. When all prime factors are large, the probability of the elliptic curve group order modulo being smooth within practical computational limits becomes exceedingly small. In such cases, algorithms like the Quadratic Sieve (for numbers up to around 100 digits) or the General Number Field Sieve (for larger numbers) typically offer better performance.

Applications

The Elliptic Curve Method (ECM) excels at finding small prime factors of very large composite numbers. Its runtime depends primarily on the size of the smallest factor , rather than the size of the number itself. This characteristic makes it significantly more efficient than other methods like Trial Division or Pollard's Rho when has at least one relatively small prime factor, even if itself is astronomically large.

In the field of cryptanalysis, ECM serves as a valuable tool. While factoring the large semi-primes used in systems like RSA typically requires more powerful algorithms like the GNFS, ECM can be employed to probe for weaknesses. If a large number used in a cryptographic context happens to possess smaller factors due to improper generation or other vulnerabilities, ECM is often the most efficient way to discover them, potentially compromising the system's security.

ECM is a standard algorithm featured prominently in integer factorization competitions and large-scale computational projects like the Cunningham project. Participants frequently use ECM as an initial step to quickly sift out any small or medium-sized prime factors from the target numbers. Removing these smaller factors efficiently reduces the size of the remaining composite, paving the way for the application of more resource-intensive algorithms suited for numbers with only large prime factors.

The study and application of ECM contribute significantly to research in number theory. It provides a practical link between the abstract theory of elliptic curves and the fundamental problem of integer factorization. Investigating the properties of elliptic curve groups over finite fields, particularly their order's smoothness properties, is driven partly by the desire to improve ECM, leading to deeper theoretical insights and new computational techniques.

Furthermore, ECM often functions as an essential component within more sophisticated factorization strategies. Algorithms like the Quadratic Sieve (QS) and the General Number Field Sieve (GNFS), while powerful for numbers with large factors, benefit greatly from preprocessing. Running ECM first to eliminate smaller factors can substantially reduce the size of the number fed into QS or GNFS, thereby decreasing the overall time and computational resources required for the complete factorization.