repeated squaring

by **Roger|Dyalog** on Sat May 16, 2020 12:51 am

If function f “multiplies” then f⍨ squares; squaring n times raises to power 2*n and is therefore a fast way to exponentiate.

      ×⍨ 3
9
      ×⍨ ×⍨ ×⍨ ×⍨ 3 
43046721
      ×⍨⍣4 ⊢ 3
43046721
      3 * 2*4
43046721

One way to compute the n-th power for an arbitrary n (when n may be not a power of 2), is to find the squares corresponding to 1 bits in the binary representation of n, and then compute the product.

      3*25
847288609443

      2⊥⍣¯1⊢25
1 1 0 0 1
      ⍸ ⊖ 2⊥⍣¯1⊢25
0 3 4
      {×⍨⍣⍵⊢3}¨ 0 3 4
3 6561 43046721

      ×⌿ {×⍨⍣⍵⊢3}¨ 0 3 4
847288609443

The examples show that it would be nice if ⍣ accepts an array right operand. (In that case, the squaring would be done 4 times, rather than 0 times and 3 times and 4 times.) Until that blessed day, the repeated squaring computation can be coded as follows:

      RS← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}
              ⍝ └──────────────────────────┘
              ⍝    ←→    ⍺⍺⍨⍣(⍸b)

      3 ×RS 25
847288609443

      3 +RS 25
75

In the opening sentence above, multiplication is in quotes because the method works for other kinds of multiplication, such as matrix multiplication or ⍵[⍵] (as in the text fitting problem). The last example 3 +RS 25 illustrates that it works for + as “multiplication”, whose “squaring” is doubling and whose “exponentiation” is multiplication.

An Application: Fibonacci Numbers

We illustrate repeated squaring by some computations on the Fibonacci numbers.

      fib← {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}

      fib¨ ⍳10
0 1 1 2 3 5 8 13 21 34

The double recursion is quite inefficient: The number of function calls for fib n is fib n+1, which is exponential in n. More efficient formulations exist:

      fib2← {⊃ +⍀∘⊖⍣⍵ ⍳2}
      fib2¨ ⍳10
0 1 1 2 3 5 8 13 21 34

      fib3← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}
      fib3¨ ⍳10
0 1 1 2 3 5 8 13 21 34

      cmpx 'fib 30' 'fib2 30' 'fib3 30'
  fib 30  → 1.56E0  |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  fib2 30 → 0.00E0  | -100%                               
  fib3 30 → 0.00E0  | -100%                               
      cmpx 'fib2 30' 'fib3 30'
  fib2 30 → 9.75E¯6 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  fib3 30 → 1.81E¯6 | -82% ⎕⎕⎕⎕⎕⎕

Whither repeated squaring? The relationship (fib n)=(fib n-2)+(fib n-1) can be written in matrix form, and:

      ⊢ A←2 2⍴0 1 1 1
0 1
1 1
      A +.× 0 1
1 1
      A +.× A +.× 0 1
1 2
      A +.× A +.× A +.× 0 1
2 3
      A +.× A +.× A +.× A +.× 0 1
3 5
      A +.× A +.× A +.× A +.× A +.× 0 1
5 8
     (A +.× A +.× A +.× A +.× A) +.× 0 1
5 8
      fib 6
8
      ¯1 ¯1↑(A+.×A+.×A+.×A+.×A)
8

fib n for n≥1 is the entry at the lower right corner of the (n-1)-th power of A. The last parenthesized expression, the (A +.× … +.× A), is where repeated squaring can be used.

      fibm← {1≥⍵:⍵ ⋄ ⊃ ¯1 ¯1↑ A +.×RS ⍵-1}
      fibm¨ ⍳10
0 1 1 2 3 5 8 13 21 34

      cmpx 'fib2 50' 'fib3 50' 'fibm 50'
  fib2 50 → 1.58E¯5 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  
  fib3 50 → 1.73E¯6 | -90% ⎕⎕⎕                           
  fibm 50 → 1.70E¯5 |  +7% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The last benchmark shows that the matrix method with repeated squaring is competitive with the other methods. We next show that it is superior to the other methods for large n.

For large n, fib3, based on computations on irrational numbers (5*0.5), is not in the running, at least not without a large amount of work (which we won’t do here). For large n, we first develop functions for extended precision arithmetic:

      carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}
      xp    ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵}              ⍝ plus
      xt    ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵}  ⍝ times

      3 1 4 1 5 9 2 6 5 3 xp 2 7 1 8 2 8
3 1 4 1 8 6 4 4 8 1
      3141592653 + 271828
3141864481

      3 1 4 1 5 9 2 6 5 3 xt 2 7 1 8 2 8
8 5 3 9 7 2 8 4 7 6 7 9 6 8 4
      3141592653 × 271828
853972847679684

“Numbers” are vectors of decimal digits, most significant digit first. xp is extended precision plus and xt is extended precision times. (The FFT-based xtimes in the dfns workspace is faster for large numbers, but we don’t need it to prove the point.) Modifying fib2 and fibm for extended precision computations:

      xfib2← {⊃ xp⍀∘⊖⍣⍵ ,¨⍳2}

      xfib2¨ ⍳10
┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐
│0│1│1│2│3│5│8│1 3│2 1│3 4│
└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘

      ⊢xA←,¨A
┌─┬─┐
│0│1│
├─┼─┤
│1│1│
└─┴─┘
      xfibm← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}

      xfibm¨ ⍳10
┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐
│0│1│1│2│3│5│8│1 3│2 1│3 4│
└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘

Let us compute a Fibonacci number with 200 decimal digits. From fib3, we know that

      r←5*0.5
      (2÷⍨1+r)⍟r×1e200
958.6666692945176

      fib3 958
7.255617127345179E199

Release the hounds!

      (xfib2 ≡ xfibm) 958
1
      t←xfibm 958
      ⍴t
200
      t
7 2 5 5 6 1 7 1 2 7 3 4 4 9 4 5 7 8 3 4 9 5 1 3 … 7 9 1 8 0 7 9

      (54⍴10 1/1 0) \ 4 50 ⍴ ⎕d[t]
7255617127 3449457834 9513639664 8012357907 4308201883
3452265955 2369453901 0959098708 8044945653 6465064308
7041102422 5291077532 1041899600 8519495018 0745825717
5818735941 5023021847 2032795681 1455446059 3757918079

      {⎕fr←1287 ⋄ ¯40 ⍕ r÷⍨(2÷⍨1+r←5*0.5)*958} ⍬    ⍝ sanity check
 7.255617127344945783495136396648071______E199

How effective is repeated squaring (xfibm)?

      cmpx 'xfib2 958' 'xfibm 958'
  xfib2 958 → 1.53E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  xfibm 958 → 4.07E¯3 | -74% ⎕⎕⎕⎕⎕⎕⎕⎕                      

      cmpx 'xfib2 2000' 'xfibm 2000'
  xfib2 2000 → 4.50E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  xfibm 2000 → 6.50E¯3 | -86% ⎕⎕⎕⎕

Collected Definitions

      ⍝
RS    ← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}

fib   ← {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}
fib2  ← {⊃ +⍀∘⊖⍣⍵ ⍳2}
fib3  ← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}

A     ← 2 2⍴0 1 1 1
fibm  ← {1≥⍵:⍵ ⋄ ⊃ ¯1 ¯1↑ A +.×RS ⍵-1}

carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}
xp    ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵}              ⍝ plus
xt    ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵}  ⍝ times
xA    ← ,¨A

xfib2 ← {⊃ xp⍀∘⊖⍣⍵ ,¨⍳2}
xfibm ← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}

The initial portion of this post is translated from the J Wiki Essay Repeated Squaring, 2005-09-28.

by **Roger|Dyalog** on Sun May 17, 2020 1:02 am

Roger|Dyalog wrote:For large n, fib3, based on computations on irrational numbers (5*0.5), is not in the running, at least not without a large amount of work (which we won’t do here).

It turns out that it is not a large amount of work and it was done already in 2005 in the J Wiki Essay Fibonacci Sequence, section Q and Z ring extensions. There, the explanation is rather cryptic, if I do say so myself:

Based on Binet’s formula … operations are done in Q[√5] and Z[√5] with powers computed by repeated squaring.

In this note we will provide a more leisurely derivation.

Binet’s formula states that:

      fib n  ←→  (√5) ÷⍨ ((2÷⍨1+√5)*n) - (2÷⍨1-√5)*n
      fib n  ←→  (2*n) ÷⍨ (√5) ÷⍨ ((1+√5)*n) - (1-√5)*n     ⍝ (*)

We will do computations in the ring extension Z[√5] -- the integers extended with √5. A “number” is an ordered pair of integers (a,b) whose real value (value as a real number) is (a+b×√5). Plus and times for such numbers are done as follows:

      ⍝
rp← +                    ⍝ plus:  (a,b) rp (c,d) ←→ ((a+c),b+d)
rt← (1 5 +.× ×) , +.×∘⊖  ⍝ times: (a,b) rt (c,d) ←→ (((a×c)+5×b×d),(a×d)+b×c)

      3 4 rp 6 ¯2
9 2
      3 4 rt 6 ¯2
¯22 18

(Misalignments in the display are due to defects in the APL Chat Forum software.)

In (*), the key computations are (1+√5)*n and (1-√5)*n or, in the ordered pair representation, 1 1 and 1 ¯1 to the power n. We can use the RS (repeated squaring) operator from the previous post to do the power:

      RS← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}

      1 1 rt RS 13
2134016 954368
      1 ¯1 rt RS 13
2134016 ¯954368

      fib3 ← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}

      fib3 13
233
      (2 × 954368) ÷ 2*13
233

We observe that the first item of the ordered pairs are the same and the second item is the negation of each other. Whence in (*), ((1+√5)*n) - (1-√5)*n ↔ (1 1 rt RS n) - (1 ¯1 rt RS n) ↔ (0 2×1 1 rt RS n) whose real value is √5 × ⊃⌽(0 2×1 1 rt RS n). But this value is divided by √5, so we are left with just ⊃⌽(0 2×1 1 rt RS n), twice the value of the second item of 1 1 rt RS n. No computation involving √5 is required. The final division by 2*n can be avoided by halving at each step. Therefore:

      fib4 ← {0=⍵:0 ⋄ ⊃⌽ 1 1 (÷∘2 rt) RS ⍵}

      fib4¨ ⍳10
0 1 1 2 3 5 8 13 21 34
      fib3¨ ⍳10
0 1 1 2 3 5 8 13 21 34

For exact calculations of fib n for large n using this idea, we work with ordered pairs of extended precision numbers, that is, ordered pairs of vectors of decimal digits, and extend the operations in fib4. (carry, xp, xt, and xfibm are from the previous post, reproduced here.)

      ⍝
carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}
xp    ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵}              ⍝ plus
xt    ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵}  ⍝ times
xA    ← ,¨A
xfibm ← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}

halve ← {carry (⌊⍵÷2)+¯1↓0,5×2|⍵}
qt    ← ((,¨1 5) xp.xt xt¨) , xp.xt∘⊖ 
xfib4 ← {1≥⍵:,⍵ ⋄ ⊃⌽ (,¨1 1) (halve¨ qt) RS ⍵}

      xfib4¨ ⍳10
┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐
│0│1│1│2│3│5│8│1 3│2 1│3 4│
└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘
      (xfib4 ≡ xfibm)¨ ⍳10
1 1 1 1 1 1 1 1 1 1

      ⎕d[xfib4 300]
222232244629420445529739893461909967206666939096499764990979600
      fib3 300
2.2223224462942268E62

xfib4 is faster than xfibm:

      cmpx 'xfibm 958' 'xfib4 958'
  xfibm 958 → 3.87E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  xfib4 958 → 2.94E¯3 | -25% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕    
   
      cmpx 'xfibm 2000' 'xfib4 2000'
  xfibm 2000 → 6.40E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  xfib4 2000 → 3.75E¯3 | -42% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Finally, an aside: fib3 is defined as ⌊0.5+r÷⍨(2÷⍨1+r)*⍵ where r←5*0.5, which replaces the r÷⍨(2÷⍨1-r)*⍵ term in Binet’s formula by ⌊0.5+… . The maneuver derives from that the magnitude of the replaced term is always less than 0.5 (and diminishes exponentially with ⍵):

      12 ¯6 ⍕ r÷⍨(2÷⍨1-r)*4 5⍴⍳20
 4.47214E¯1 ¯2.76393E¯1  1.70820E¯1 ¯1.05573E¯1  6.52476E¯2 
¯4.03252E¯2  2.49224E¯2 ¯1.54029E¯2  9.51949E¯3 ¯5.88337E¯3 
 3.63612E¯3 ¯2.24725E¯3  1.38888E¯3 ¯8.58372E¯4  5.30503E¯4 
¯3.27869E¯4  2.02634E¯4 ¯1.25235E¯4  7.73994E¯5 ¯4.78354E¯5

The tool of thought for

software solutions

repeated squaring

repeated squaring

Re: repeated squaring

Who is online

QUICK LINKS