⊃⍋ and ⊃⍒

General APL language issues

⊃⍋ and ⊃⍒

Postby jayfoad on Tue Dec 10, 2019 2:35 pm

Please speed up ⊃⍋⍵ and ⊃⍒⍵! They are really neat ways to find where the smallest or largest item in a vector is. In Dyalog 17.1 I get:
Code: Select all
      cmpx'⊃⍋a' 'a⍳⌊/a'⊣a←?1e6/0
  ⊃⍋a   → 1.7E¯1 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  a⍳⌊/a → 3.1E¯4 | -100%
      cmpx'⊃⍋a' 'a⍳⌊/a'⊣a←1e6?1e7
  ⊃⍋a   → 2.2E¯2 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  a⍳⌊/a → 1.7E¯4 | -100%

For a motivating example see: https://www.reddit.com/r/adventofcode/comments/e7pkmt/2019_day_8_solutions/faaw6lr/?context=3
Posts: 3
Joined: Wed May 29, 2019 1:51 pm

Re: ⊃⍋ and ⊃⍒

Postby Vince|Dyalog on Wed Dec 11, 2019 1:32 pm

Hi Jay,

Roger has previously logged this as RFE 15962.

I have added your support for this to its notes.


Posts: 329
Joined: Wed Oct 01, 2008 9:39 am

Re: ⊃⍋ and ⊃⍒

Postby Roger|Dyalog on Wed Dec 11, 2019 5:55 pm

Hello Jay!

a. Special code for ⊃⍋ v a⍳⌊/a should be even better than your benchmark indicates, because ⍋ uses exact comparisons whereas ⍳ uses tolerant comparison. (Likewise ⊃⍒ v a⍳⌈/a.)

b. i⊃⍋x and i⊃⍒x are also worthwhile. See for example https://code.jsoftware.com/wiki/Essays/Order_Statistics.
Posts: 205
Joined: Thu Jul 28, 2011 10:53 am

Re: ⊃⍋ and ⊃⍒

Postby Marshall|Dyalog on Wed Dec 11, 2019 7:16 pm

Roger, remember that tolerant comparison with a scalar no longer imposes a penalty relative to intolerant comparison (https://www.dyalog.com/blog/2018/11/tol ... on-part-1/).

However, a⍳⌊/a has the problem that if the minimum is found near the end of a, then it takes over twice as long as if it is near the beginning. This can be mitigated by processing a in blocks (of perhaps 1KB), and recording the index of the block containing the minimum. For each block, find its minimum, and record that block's index if its minimum is smaller than the previous minimum (if we have found the minimum possible value, we can stop early here). After finishing all blocks, search the block with the recorded index to find the exact index of the minimum.

I'm unwilling to implement these special combinations using idioms, which support only one way of writing the combination and can cause problems elsewhere (for example, ⊢⌿⍤2 isn't converted to ⊢⌿ with axis, and misses special code, because ⊢⌿ is an idiom). Since thunks won't be in 18.0, I could implement special code for the Atop forms (⊃⍋) (⊃⍒) ⊃⍤⍋ ⊃⍤⍒. That would at least allow access to this algorithm.
Posts: 4
Joined: Fri Jan 27, 2017 1:34 pm

Re: ⊃⍋ and ⊃⍒

Postby petermsiegel on Wed Dec 11, 2019 8:01 pm

I've always wondered why there wasn't a (default) option for the editor or ⎕FX/⎕FIX process being allowed to make small changes in the user code, in order to make it accessible to idioms or other improvements; other mechanisms could be considered (eg hidden source code, when the actual executed code is different, but equivalent) and I'll bet such equivalence mechanisms are used behind the scenes anyway. Over the years, APLs have sometimes made small changes to programs to reflect the actual internal forms rather than what was entered, e.g. with numbers:

I type: .21E34
APL shows: 2.1E33

(I recall other more jarring editor-induced changes in older IBM mainframe APLs but can't reconstruct them off the top of my head).

I'd expect most users would not mind making a tradeoff between optimizations and literal consistency. Those with pedagogical goals (wanting the format exactly as typed in) could set a flag-- I doubt anyone would use it.

Having idioms that are missed or channeling APLers to write in exactly one -- otherwise equivalent -- way simply to get performance seems contrary to the spirit of APL. Thunks are another path to reaching similar goals.
Posts: 77
Joined: Thu Nov 11, 2010 11:04 pm

Re: ⊃⍋ and ⊃⍒

Postby Adam|Dyalog on Fri Dec 13, 2019 10:27 am

As an aside, you can ask APL to keep your function exactly as typed. You just just need to change the way you create the function. Instead of creating it with
      )ed foo

do it with
      ⎕ED 2⎕FIX,⊂'foo'
User avatar
Posts: 82
Joined: Thu Jun 25, 2015 1:13 pm

Re: ⊃⍋ and ⊃⍒

Postby kai on Mon Dec 16, 2019 9:00 am

Is that really a good idea?

Other programming languages enforce certain styling rules, and with good reasons.

Why should APL go the other way?
User avatar
Posts: 115
Joined: Thu Jun 18, 2009 5:10 pm
Location: Stevenage, UK

Re: ⊃⍋ and ⊃⍒

Postby Phil Last on Mon Dec 16, 2019 10:26 am

In general I can't agree with Kai's point. Something I've always disliked is that switching off AutoFormat doesn't prevent my deliberate coding of m←1e3 being changed to m←1000 or the removal of deliberately inserted blanks after colons. But AutoFormat is preventable so the horse has already bolted from that stable.

What I worry about is that what Adam suggests appears to be a bit of magic where the difference between two functions, even in the same namespace, is programmatically undiscoverable. So a flag is being hidden somewhere and we don't know what repercussions there might be in the future where a developer at Dyalog inadvertently changes something in one case but not the other.

Are functions created with the two methods identical in all other respects. Even Now?
User avatar
Phil Last
Posts: 543
Joined: Thu Jun 18, 2009 6:29 pm

Re: ⊃⍋ and ⊃⍒

Postby Morten|Dyalog on Tue Dec 17, 2019 12:47 pm

Phil Last wrote:What I worry about is that what Adam suggests appears to be a bit of magic where the difference between two functions, even in the same namespace, is programmatically undiscoverable.

So... as usual this sort of inconsistency was caused by accidents of history. If I am remembering this correctly, ⎕FIX was originally invented in order to allow you to create namespaces and classes only. Subsequently, ⎕FIX was extended to support the use of Unicode text files outside the workspace, and in conjunction with this, also extended to support the definition of functions and operators (⎕FIX 'file://blah.aplf').

One of the advantages of ⎕FIX is it retains the source text separately from the tokenised objects in the workspace, which allows us to preserve source code as typed. The downside of this is that the double-byte Unicode source text typically consumes around twice as much space as the (single-byte) tokenised function itself (bringing the total to 3x the original space consumption). For this reason, I would not generally recommend using Adam's trick UNLESS you are also using external source files.

It is currently difficult to determine whether a function has ⎕FIX source available. This is something that we plan to fix (no pun intended) in the near future. There will still be a difference in behaviour between functions which only have tokens and those that have separate source, but you will be able to detect the difference even if the source is only held in the workspace rather than in a file.
User avatar
Posts: 380
Joined: Tue Sep 09, 2008 3:52 pm

Re: ⊃⍋ and ⊃⍒

Postby Phil Last on Tue Dec 17, 2019 11:10 pm

Morten wrote:This is something that we plan to fix (no pun intended) in the near future.
I'm sorry Morten but "planning to fix something in the near future" is what was proposed re. discovering the difference between namespaces created with ⎕NS and those with ⎕FIX - i.e. container and scripted namespaces - without having to set a trap.

And we've been waiting more than a decade for that.
User avatar
Phil Last
Posts: 543
Joined: Thu Jun 18, 2009 6:29 pm


Return to Language

Who is online

Users browsing this forum: No registered users and 1 guest