Abstract:
A binary de Bruijn sequence (dB sequence) of order k is a circular binary string that contains each k-length word exactly once as a substring. It is well known that the number of de Bruijn sequences of order k is doubly exponential in k. Most existing algorithms construct a specific dB sequence, or members of a specific class of dB sequences, representing only a tiny fraction of the complete set. The only algorithms capable of generating all dB sequences are based on finding Euler cycles in de Bruijn graphs.
Here, we present an algorithm for constructing random binary dB sequences which uses the extended Burrows-Wheeler Transform [Mantaci et al., 2007]. Our method is simple to implement (less than 120 lines of C++ code) and can produce random dB sequences of any order.
Even though our algorithm does not output dB sequences uniformly at random, it provably outputs each dB sequence with positive probability. The algorithm runs in linear space and near-linear time in the length of the dB sequence and needs less than one second on a laptop computer for orders up to 23, including outputting the sequence. It can be straightforwardly extended to any constant-size alphabet.
To the best of our knowledge, this is the first practical algorithm for generating random dB sequences which is capable of producing all dB sequences. Apart from its immediate usefulness in contexts where it is desirable to use a dB sequence that cannot be guessed easily, we also demonstrate our algorithm's potential in theoretical studies, giving hitherto unknown estimates of the average discrepancy of binary dB sequences.
The code is available (in C++ and python) at https://github.com/lucaparmigiani/rnd_dbseq.
This work was published in the Proceedings of the 16th Latin American Symposium (LATIN 2024) and is joint work with Luca Parmigiani.
--
Comunicaciones DCC