Sunday, December 10, 2017

Generate a unique N letters word sequence (by using relationship between Factoradic base system and Lexicographic Permutation Order of a String)



Lets say that you want to generate a unique 3 letters code sequence to be used as a primary key for some data recodes. Also let assume that you are allowed to use alphanumeric characters. For example:


  • ABC ->  Data Recored 1
  • CBA -> Data Recored 2
  • Z1T  -> Data Recored 3
  • R39  -> Data Recored 4
  • etc ..


One way to do is is to calculate and store all permeations of 3 letters codes that can be produced. to put them in numbers : 

Character set (S): 36 characters (A-Z + 0-9 case sensitive)
Word size (N): 3 letters

Number of permutations(NP):     S!/(S-N)! => (S-0) *  (S-1) .. *(S- N-1)  => 36 * 35 * 34 =>  42840 

For 3 letters code it might be okay to calculate and store all these permutation a head of time. However, this will become harder to accomplish once of code size became larger (12 letter code instead of  3). 


Fortunately there is a way to get the value of  specific permutation  in O(1) time and without having calculate and store all permutation. For that we have to understand the Factoradic base system and relationship to  the Lexicographical permutation order (Wikipedia ).       

   

Factoradic base system: s a mixed radix numeral system adapted to numbering permutationsIt is also called factorial base, although factorials do not function as base, but as place value of digits. By converting a number less than n! to factorial representation, one obtains a sequence of n digits that can be converted to a permutation of n in a straightforward way. 


From number  Factoradic base  to Decimal Base: 


For example  341010! stands for 354413021100, whose value is
  • = 3×5! + 4×4! + 1×3! + 0×2! + 1×1! + 0×0! 
  • = ((((3×5 + 4)×4 + 1)×3 + 0)×2 + 1)×1 + 0
  • =  46310.

From number  Decimal Base to Factoradic base: 

For example, 46310 can be transformed into a factorial representation by these successive divisions until quotient reaches zero:

  • 463 ÷ 1 = 463, remainder 0
  • 463 ÷ 2 = 231, remainder 1
  • 231 ÷ 3 = 77, remainder 0
  • 77 ÷ 4 = 19, remainder 1
  • 19 ÷ 5 = 3, remainder 4
  • 3 ÷ 6 = 0, remainder 3


Lexicographical permutation: 
For example, here are the permutations of the string “abcd” in lexicographical order: 
0  abcd       6  bacd        12  cabd       18  dabc
1  abdc       7  badc        13  cadb       19  dacb
2  acbd       8  bcad        14  cbad       20  dbac
3  acdb       9  bcda        15  cbda       21  dbca
4  adbc       10  bdac       16  cdab       22  dcab
5  adcb       11  bdca       17  cdba       23  dcba



It turns out that there is natural mapping between the index  and permutation in place. Basically,  if convert the index value from decimal base into factorial base, then you will get mapping to generate the permutation in place.

For example, with n = 3, such a mapping is
decimalfactorialpermutation
010000!(0,1,2)
110010!(0,2,1)
210100!(1,0,2)
310110!(1,2,0)
410200!(2,0,1)
510210!(2,1,0)

The leftmost factoradic digit 0, 1, or 2 is chosen as the first permutation digit from the ordered list (0,1,2) and is removed from the list. Think of this new list as zero indexed and each successive digit dictates which of the remaining elements is to be chosen. If the second factoradic digit is "0" then the first element of the list is selected for the second permutation digit and is then removed from the list. Similarly if the second factoradic digit is "1", the second is selected and then removed. The final factoradic digit is always "0", and since the list now contains only one element it is selected as the last permutation digit.

For example, here is how the digits in the factoradic 4041000! (equal to 298210) pick out the digits in (4,0,6,2,1,3,5), the 2982nd permutation of the numbers 0 through 6.
                                 4041000! → (4,0,6,2,1,3,5)
factoradic:  4          0                        4        1          0          0        0!
             |          |                        |        |          |          |        |
    (0,1,2,3,4,5,6) -> (0,1,2,3,5,6) -> (1,2,3,5,6) -> (1,2,3,5) -> (1,3,5) -> (3,5) -> (5)
             |          |                        |        |          |          |        |
permutation:(4,         0,                       6,       2,         1,         3,       5)




Now with this in mind we can implement a small program to return the unique permutation value any index in the lexicographical order.







No comments:

Post a Comment

Generate a unique N letters word sequence (by using relationship between Factoradic base system and Lexicographic Permutation Order of a String)

Lets say that you want to generate a unique 3 letters code sequence to be used as a primary key for some data recodes. Also let assume ...