Dyadic grade up (X ⍋ Y)
Dear All,
I need to sort a word list that contains diacritics (e.g. äéàèùç ...),
using a collation array something like this:
]display tabTri3D[ 1 ; ; ]
┌→──────────────────────────┐
↓ abcdefghijklmnopqrstuvwxyz│
│ ã                         │
│ à                         │
│ á                         │
│ â                         │
│ ä                         │
│ å                         │
└───────────────────────────┘
]display tabTri3D[ 2 ; ; ]
┌→──────────────────────────┐
↓ ABCDEFGHIJKLMNOPQRSTUVWXYZ│
│ Á                         │
│ Â                         │
│ Ã                         │
│ À                         │
│ Ä                         │
│ Å                         │
└───────────────────────────┘
To help you experiment with it:
⎕ucs 'aãàáâäå'
97 227 224 225 226 228 229
⎕ucs 'AÁÂÃÀÄÅ'
65 193 194 195 192 196 197
I am not sure which is better, more efficient, or more elegant: building the array with
Plan1 ,[1] Plan2
or with
tabTri3D←2 7 27⍴''
tabTri3D[ 1 ; ; ] ← Plan1
tabTri3D[ 2 ; ; ] ← Plan2
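A note on the two constructions above, as a minimal sketch that assumes ⎕IO←1 and uses stand-in data for Plan1 and Plan2 (the real planes are not shown in the thread). With ⎕IO←1, catenating with ,[1] joins the two planes along the existing first axis and gives a matrix, whereas laminating with ,[0.5] adds a new first axis and gives the same rank-3 array as the indexed assignments:

```apl
⎕IO ← 1
Plan1 ← 7 27⍴'a'            ⍝ stand-in data, not the real collation planes
Plan2 ← 7 27⍴'A'

m ← Plan1 ,[1] Plan2        ⍝ catenate along the first axis
⍴m                          ⍝ 14 27 (a matrix)

l ← Plan1 ,[0.5] Plan2      ⍝ laminate: creates a new first axis
⍴l                          ⍝ 2 7 27

t ← 2 7 27⍴''               ⍝ the indexed-assignment version
t[1;;] ← Plan1
t[2;;] ← Plan2
l ≡ t                       ⍝ 1: laminate and indexed assignment agree
```

If the rank-3 array is what is wanted, the laminate form is the shorter of the two.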
The local help (F1) on dyadic grade up shows the kind of result I need; I just need to extend it to handle diacritical marks.
Thank you for your help,
Yves
- Yves
- Posts: 39
- Joined: Mon Nov 30, 2015 11:33 am
Re: Dyadic grade up (X ⍋ Y)
I am not sure what you are asking exactly (e.g. does "A" precede "a"?), but there is a recent post on the Dyalog blog on dyadic grade which includes an example involving diacritical marks.
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: Dyadic grade up (X ⍋ Y)
Hi Roger,
Nice to hear from you.
Your link is very interesting; I am studying it.
I will come back with my next question, especially for you :)
Regards,
Yves
- Yves
- Posts: 39
- Joined: Mon Nov 30, 2015 11:33 am
Re: Dyadic grade up (X ⍋ Y)
Dear Roger & All,
For languages that do not use the Latin alphabet, we systematically use Latin letters decorated with diacritical marks. This is transliteration.
First example (exists in Unicode):
ṭ (⎕ucs 7789) is a single code.
But the same glyph can also be written as t followed by ̣ (⎕ucs 116 803).
Same glyph, one or two codes.
Second example (does not exist in Unicode):
The same glyph with an added accent does not exist in Unicode as a single code.
Should I write t with underdot, followed by the accent?
Or t with accent, followed by the underdot?
Or t followed by the accent, followed by the underdot?
Or t followed by the underdot, followed by the accent?
In this case we have two or three codes, and more possible combinations.
How can I put ṭ into the array with all of its combinations, so that every combination has the same weight for ⍋?
Every combination should give one weight for the same glyph, and one preferred combination should be returned, the same one every time for that glyph.
The transliteration of Sanskrit needs 15 letters with various marks.
I hope ⍺⍋⍵ turns out to be as simple for these as it is for letters with ordinary diacritics.
Regards,
Yves
- Yves
- Posts: 39
- Joined: Mon Nov 30, 2015 11:33 am
Re: Dyadic grade up (X ⍋ Y)
You have described an interesting problem. If I understand correctly, the problem you described is not one for dyadic ⍋. One way to solve it is as follows:
0. Identify the "symbols". From your description a symbol can be denoted by multiple characters, for example "t" or ⎕ucs 116 803.
1. Transform each symbol to an integer value, or to a pair of integers if that makes things easier. Beforehand, you can make a table of symbols and their corresponding numeric values. For example,
Symbol          Value
A               97 0
À               97 1
a               97 0
à               97 1
...
t               116 0
⎕ucs 116 803    116 1
...
(The values depend on how you want to order the symbols.)
2. Grade the array of integer values.
Putting it all together:
{⎕io←0 ⋄ ⍋0 2 1⍉Value[Symbol⍳⍪symbolize ⍵;]}
(Based on the "Alternatives" section of the Dyadic Grade blog post.) Of these steps, by far the trickiest will be step 0, the "symbolize" step.
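To make steps 0 to 2 concrete, here is a minimal sketch under several assumptions, not a finished solution. The symbolize shown is a hypothetical, simplified helper that only groups a base character with any following marks from the Combining Diacritical Marks block (⎕ucs 768 to 879); the Symbol and Value tables are tiny illustrative ones; and the grading is arranged per word with ↑ for padding, which differs in form from the one-liner above while keeping its idea of comparing all letter values before any mark values.

```apl
⎕IO ← 0
⍝ Hypothetical helper: split a word into symbols, one per base character
⍝ plus any combining marks (U+0300..U+036F) that follow it.
⍝ Marks at the very start of a word are dropped by ⊂.
symbolize ← {w←,⍵ ⋄ c←⎕UCS w ⋄ (~(c≥768)∧c≤879)⊂w}

⍝ Illustrative table only: one row of Value per entry of Symbol.
⍝ Precomposed and decomposed spellings of a glyph share the same row values.
Symbol ← ,¨ 'a' (⎕UCS 97 772) (⎕UCS 257) 't' (⎕UCS 116 803) (⎕UCS 7789)
Value  ← 6 2⍴ 97 0  97 1  97 1  116 0  116 1  116 1

⍝ Key for one word: row 0 holds the letter values, row 1 the mark values.
⍝ A symbol missing from Symbol causes an INDEX ERROR in this sketch.
key   ← {⍉Value[Symbol⍳symbolize ⍵;]}

⍝ Grade a list of words; ↑ pads shorter keys with zeros
order ← {⍋↑key¨⍵}

words ← 'ta' (⎕UCS 116 803 97) 'at' (⎕UCS 7789 97)
order words        ⍝ 2 0 1 3
```

Both spellings of ṭa receive the same key, so they tie and keep their original relative order. A real symbolize would also have to cope with marks outside that one block and with letters such as the glued h.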
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: Dyadic grade up (X ⍋ Y)
Hi Roger,
You have understood the difficulty well.
To help you, here are all the vowels, in the official order, of the Sanskrit alphabet in transliteration.
chn ← (97) (97 772) (105) (105 772) (117) (117 772)
chn ,← (114 803) (114 803 772) (108 803) (108 803 772) (101) (97 105) (111) (97 117)
chn ,← (109 775) (58)
The parentheses are just group delimiters.
For the t example:
⎕ucs¨ (116 803) (116 803 104) (116 803 769)
┌→─────────────────┐
│ ┌→─┐ ┌→──┐ ┌→──┐ │
│ │ṭ │ │ṭh │ │ṭ́ │ │
│ └──┘ └───┘ └───┘ │
└∊─────────────────┘
This h is not independent: it is glued to the t to indicate a "hard breath". I prefer the third option, where the inflection is indicated by a diacritical mark rather than by a letter.
Confusion and difficulty increase when the letter h itself comes into play.
I suggest looking at https://unicode-table.com/fr/blocks/combining-diacritical-marks/ and, for more, https://unicode-table.com/fr/blocks/combining-diacritical-marks-supplement/.
I will try your suggestions and come back.
Regards,
Yves
- Yves
- Posts: 39
- Joined: Mon Nov 30, 2015 11:33 am
Re: Dyadic grade up (X ⍋ Y)
Good luck.
Dyadic grade ⍺⍋⍵ works on individual characters in ⍵, but you want to compare (for example) 't' vs. ⎕ucs 116 803. Therefore you have to use something other than dyadic grade.
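A small sketch of why this matters, using only the two code sequences mentioned earlier in the thread: the precomposed glyph and the base letter plus combining mark look the same on screen but are different arrays, so a character-by-character comparison cannot treat them as equal.

```apl
pre ← ,⎕UCS 7789        ⍝ precomposed ṭ: one character
dec ← ⎕UCS 116 803      ⍝ 't' followed by the combining dot below: two characters
(≢pre) (≢dec)           ⍝ 1 2
pre ≡ dec               ⍝ 0: they do not match, although they display alike
```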
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: Dyadic grade up (X ⍋ Y)
Bonjour Yves, in case it helps, I have contributed a function to 'normalize' text using .NET on the APL Wiki: https://aplwiki.com/netUpperLowerCase#R ... diacritics)
The goal would be to apply the sorting index of the 'normalized' text to the 'non-normalized' text.
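A minimal sketch of that idea, with one loud assumption: strip below is a hypothetical stand-in for the real normalizer, and it only drops marks from the Combining Diacritical Marks block (⎕ucs 768 to 879). It is not the .NET function from the wiki.

```apl
⎕IO ← 1
⍝ Hypothetical stand-in for a real normalizer: drops combining marks only
strip ← {w←,⍵ ⋄ c←⎕UCS w ⋄ (~(c≥768)∧c≤879)/w}

words  ← ('t',(⎕UCS 803),'b') 'tc' 'ta'
norm   ← strip¨words        ⍝ 'tb' 'tc' 'ta'
order  ← ⍋↑norm             ⍝ grade the normalized text: 3 1 2
sorted ← words[order]       ⍝ apply that order to the original text
```

Words that differ only in their marks tie under this scheme and keep their original relative order.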
Bonne chance,
Pierre Gilbert
- PGilbert
- Posts: 436
- Joined: Sun Dec 13, 2009 8:46 pm
- Location: Montréal, Québec, Canada
Re: Dyadic grade up (X ⍋ Y)
You may be able to do some preprocessing which replaces the appropriate character sequences with single placeholder characters. Then use dyadic ⍋. Look e.g. here.
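As a rough sketch of that preprocessing, assuming ⎕R is used for the replacement and that, when several patterns are given, they are tried in the order listed at each position (so the longer sequence is listed first). The placeholder ⎕ucs 57344 is an arbitrary private-use character chosen for illustration, and the word list and alphabet are made up:

```apl
⎕IO ← 1
dot ← ⎕UCS 803 ⋄ acute ← ⎕UCS 769

⍝ Sequences to replace, longest first so t+dot+acute is tried before t+dot
seqs ← ('t',dot,acute) ('t',dot)
⍝ Placeholders: U+E000 (private use, arbitrary choice) and precomposed ṭ
subs ← ,¨⎕UCS 57344 7789
prep ← seqs ⎕R subs

⍝ Collation order for dyadic grade; the leading blank makes padding sort first
alpha ← ' at',⎕UCS 7789 57344

words ← 'ta' ('t',dot,'a') 'at' ('t',dot,acute,'a')
order ← alpha ⍋ ↑prep¨words      ⍝ 3 1 2 4
words[order]                     ⍝ original words, in collation order
```

The grade is computed on the preprocessed text but applied to the original words, so the stored spellings are left untouched.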
- Adam|Dyalog
- Posts: 134
- Joined: Thu Jun 25, 2015 1:13 pm