Suleiman Alrosan: Generate a unique N letters word sequence (by using relationship between Factoradic base system and Lexicographic Permutation Order of a String)

Lets say that you want to generate a unique 3 letters code sequence to be used as a primary key for some data recodes. Also let assume that you are allowed to use alphanumeric characters. For example:

ABC -> Data Recored 1
CBA -> Data Recored 2
Z1T -> Data Recored 3
R39 -> Data Recored 4
etc ..

One way to do is is to calculate and store all permeations of 3 letters codes that can be produced. to put them in numbers :

Character set (S): 36 characters (A-Z + 0-9 case sensitive)

Word size (N): 3 letters

Number of permutations(NP): S!/(S-N)! => (S-0) * (S-1) .. *(S- N-1) => 36 * 35 * 34 => 42840

For 3 letters code it might be okay to calculate and store all these permutation a head of time. However, this will become harder to accomplish once of code size became larger (12 letter code instead of 3).

Fortunately there is a way to get the value of specific permutation in O(1) time and without having calculate and store all permutation. For that we have to understand the Factoradic base system and relationship to the Lexicographical permutation order (Wikipedia ).

Factoradic base system: s a mixed radix numeral system adapted to numbering permutations. It is also called factorial base, although factorials do not function as base, but as place value of digits. By converting a number less than n! to factorial representation, one obtains a sequence of n digits that can be converted to a permutation of n in a straightforward way.

From number Factoradic base to Decimal Base:

For example 341010_! stands for 3₅4₄1₃0₂1₁0₀, whose value is

= 3×5! + 4×4! + 1×3! + 0×2! + 1×1! + 0×0!
= ((((3×5 + 4)×4 + 1)×3 + 0)×2 + 1)×1 + 0
= 463₁₀.

From number Decimal Base to Factoradic base:

For example, 463₁₀ can be transformed into a factorial representation by these successive divisions until quotient reaches zero:

463 ÷ 1 = 463, remainder 0
463 ÷ 2 = 231, remainder 1
231 ÷ 3 = 77, remainder 0
77 ÷ 4 = 19, remainder 1
19 ÷ 5 = 3, remainder 4
3 ÷ 6 = 0, remainder 3

Lexicographical permutation:

For example, here are the permutations of the string “abcd” in lexicographical order:

0  abcd       6  bacd        12  cabd       18  dabc
1  abdc       7  badc        13  cadb       19  dacb
2  acbd       8  bcad        14  cbad       20  dbac
3  acdb       9  bcda        15  cbda       21  dbca
4  adbc       10  bdac       16  cdab       22  dcab
5  adcb       11  bdca       17  cdba       23  dcba

It turns out that there is natural mapping between the index and permutation in place. Basically, if convert the index value from decimal base into factorial base, then you will get mapping to generate the permutation in place.

For example, with n = 3, such a mapping is

decimal	factorial	permutation
0₁₀	000_!	(0,1,2)
1₁₀	010_!	(0,2,1)
2₁₀	100_!	(1,0,2)
3₁₀	110_!	(1,2,0)
4₁₀	200_!	(2,0,1)
5₁₀	210_!	(2,1,0)

The leftmost factoradic digit 0, 1, or 2 is chosen as the first permutation digit from the ordered list (0,1,2) and is removed from the list. Think of this new list as zero indexed and each successive digit dictates which of the remaining elements is to be chosen. If the second factoradic digit is "0" then the first element of the list is selected for the second permutation digit and is then removed from the list. Similarly if the second factoradic digit is "1", the second is selected and then removed. The final factoradic digit is always "0", and since the list now contains only one element it is selected as the last permutation digit.

For example, here is how the digits in the factoradic 4041000_! (equal to 2982₁₀) pick out the digits in (4,0,6,2,1,3,5), the 2982nd permutation of the numbers 0 through 6.

                                 4041000_! → (4,0,6,2,1,3,5)
factoradic:  4          0                        4        1          0          0        0_!
             |          |                        |        |          |          |        |
    (0,1,2,3,4,5,6) -> (0,1,2,3,5,6) -> (1,2,3,5,6) -> (1,2,3,5) -> (1,3,5) -> (3,5) -> (5)
             |          |                        |        |          |          |        |
permutation:(4,         0,                       6,       2,         1,         3,       5)

Now with this in mind we can implement a small program to return the unique permutation value any index in the lexicographical order.

Suleiman Alrosan

Pages

Sunday, December 10, 2017

Generate a unique N letters word sequence (by using relationship between Factoradic base system and Lexicographic Permutation Order of a String)

No comments:

Post a Comment

Generate a unique N letters word sequence (by using relationship between Factoradic base system and Lexicographic Permutation Order of a String)

Search This Blog