⍷ follies

APL-related discussions - a stream of APL consciousness.
Not sure where to start a discussion ? Here's the place to be
Forum rules
This forum is for discussing APL-related issues. If you think that the subject is off-topic, then the Chat forum is probably a better place for your thoughts !

⍷ follies

Postby Roger|Dyalog on Tue Feb 16, 2021 11:47 pm

I had occasion to write a more comprehensive QA suite for the ⍷ (find) function. Of course, that requires that an understanding what ⍷ is supposed to do. I soon came to the following examples:

      x  ← 1 4⍴'abab'
x0 ← 0 4⍴'abab'
y ← 4 9⍴'abab'

x ⍷ y
1 0 1 0 1 0 0 0 0
0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 0 0 0
0 1 0 1 0 1 0 0 0

x0 ⍷ y
1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0 0

It is fairly obvious that first result is correct. But is the second result correct? Why or why not?

Definition

I believe the definition of ⍺⍷⍵ posits that a template of shape ⍴⍺ is moved over each valid position in ⍵, and ⍺ is matched against the templated items in ⍵. (A position is invalid if the template there placed would extend beyond the confines of ⍵.) This definition suffices to explain both of the above results:

      ⍳⍴y
┌───┬───┬───┬───┬───┬───┬───┬───┬───┐
0 0│0 1│0 2│0 3│0 4│0 5│0 6│0 7│0 8│
├───┼───┼───┼───┼───┼───┼───┼───┼───┤
1 0│1 1│1 2│1 3│1 4│1 5│1 6│1 7│1 8│
├───┼───┼───┼───┼───┼───┼───┼───┼───┤
2 0│2 1│2 2│2 3│2 4│2 5│2 6│2 7│2 8│
├───┼───┼───┼───┼───┼───┼───┼───┼───┤
3 0│3 1│3 2│3 3│3 4│3 5│3 6│3 7│3 8│
└───┴───┴───┴───┴───┴───┴───┴───┴───┘

x ≡ (⍴x) ↑ 0 1 ↓ y
0
x ≡ (⍴x) ↑ 1 3 ↓ y
1

(Misalignments in the display are due to defects in the APL Chat Forum software.)

For x⍷y, the valid positions are those marked in red above (4 6↑⍳⍴y). For a position i, the templated items are (⍴x)↑i↓y. For example, (x⍷y)[⊂0 1] is 0 because x≡(⍴x)↑0 1↓y is 0, and (x⍷y)[⊂1 3] is 1 because x≡(⍴x)↑1 3↓y is 1. Likewise, for x0⍷y, the valid positions are also those marked in red, and for each such position i the templated items are (⍴x0)↑i↓y. Since x0≡(⍴x0)↑i↓y, the x0⍷y result is correct.

Model

The definition can be codified as follows:

      
ebar←{
r←(≢⍴⍺)⌈≢⍴⍵ ⍝ maximum rank
r>≢⍴⍺:(⍺⍴⍨(⍴⍺),⍨(r-≢⍴⍺)⍴1)∇ ⍵ ⍝ if ⍺ has lesser rank, make it the same rank
(⍴⍺)∨.>r↑(⍴⍵),¯1:(⍴⍵)⍴0 ⍝ return 0s if ⍺ has greater rank or is longer
ww←⍵
(⍴⍵) ↑ ⍺∘{⍺≡(⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
}

Monsters

I proceeded to compare the model against ⍷ itself, and found some disagreements:

      (0 3⍴'a') (⍷ ≡ ebar) 3 4⍴'b'
1
(0 3⍴'a') (⍷ ≡ ebar) 3 4⍴5
0
(0 3⍴2) (⍷ ≡ ebar) 3 4⍴'b'
0
(0 3⍴2) (⍷ ≡ ebar) 3 4⍴5
1

Side-by-side comparisons where the results disagree:

      (0 3⍴'a') ⍷ 3 4⍴5             (0 3⍴'a') ebar 3 4⍴5
1 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0

(0 3⍴2) ⍷ 3 4⍴'b' (0 3⍴2) ebar 3 4⍴'b'
1 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0

So which of ⍷ or ebar is correct? I submit that ⍷ is wrong, because ⍷ and ebar are fundamentally based on ≡, and it is not possible for an array of characters to match an array of numbers.

      (0 3⍴'a') ≡ 0 3⍴5
0
(0 3⍴2) ≡ 0 3⍴'a'
0

How about a more monstrous monster?

      a←0 3⍴⊂⍬
b←1 13⍴(3⍴⊂⍬),3⍴⊂''

a ebar b ⍝ correct
1 1 1 0 0 0 1 1 1 0 0 0 0

a ⍷ b ⍝ incorrect
1 1 1 1 1 1 1 1 1 1 1 0 0

Here, we can not say that either argument is an array of characters or an array of numbers. Instead, the analysis is based on that ⍺ is matched against the templated items in ⍵. The following are the first 6 applications of ≡ in a⍷b:

      (0 3⍴⊂⍬) ≡ 0 3⍴(⊂⍬ ),(⊂⍬ ),⊂⍬
1
(0 3⍴⊂⍬) ≡ 0 3⍴(⊂⍬ ),(⊂⍬ ),⊂''
1
(0 3⍴⊂⍬) ≡ 0 3⍴(⊂⍬ ),(⊂''),⊂''
1
(0 3⍴⊂⍬) ≡ 0 3⍴(⊂''),(⊂''),⊂''
0
(0 3⍴⊂⍬) ≡ 0 3⍴(⊂''),(⊂''),⊂⍬
0
(0 3⍴⊂⍬) ≡ 0 3⍴(⊂''),(⊂⍬ ),⊂⍬
0

A key point about ⍺≡⍵ (in the Dyalog/APL2 style of arrays) is that, for ⍺≡⍵ to be true, the shapes must match, and the items must match or (if there are no items, that is, if the arrays are empty) the prototypes must match.

Missteps

Several missteps were made in the APL model before arriving at the current one. A subtle one was using indexing to do the templating.

      ⍺∘{⍺≡ww[⍵+⍳⍴⍺]}⍤0  ⍝ misstep
⍺∘{⍺≡(⍴⍺)↑⍵↓ww}¨ ⍝ correct

It was a misstep to use indexing to get the templated items, because if ⍺ is empty, ⍵+⍳⍴⍺ loses the information that the template is positioned at ⍵. The "more monstrous monster" brings this flaw to the fore.

Another misstep was one of those "off-by-1" errors. (And, this being APL, several off-by-1 errors can be committed by the one expression.) At one point ⍳1+(⍴⍵)-⍴⍺ was used to compute the valid positions, producing invalid positions if ⍴⍺ had a 0 in any dimension. This was fixed by using ⍳(×⍴⍺)+(⍴⍵)-⍴⍺.
Roger|Dyalog
 
Posts: 236
Joined: Thu Jul 28, 2011 10:53 am

Return to APL Chat

Who is online

Users browsing this forum: No registered users and 1 guest