Previous |  Up |  Next

Article

Title: Implementation of directed acyclic word graph (English)
Author: Balík, Miroslav
Language: English
Journal: Kybernetika
ISSN: 0023-5954
Volume: 38
Issue: 1
Year: 2002
Pages: [91]-103
Summary lang: English
.
Category: math
.
Summary: An effective implementation of a Directed Acyclic Word Graph (DAWG) automaton is shown. A DAWG for a text $T$ is a minimal automaton that accepts all substrings of a text $T$, so it represents a complete index of the text. While all usual implementations of DAWG needed about 30 times larger storage space than was the size of the text, here we show an implementation that decreases this requirement down to four times the size of the text. The method uses a compression of DAWG elements, i. e. vertices, edges and labels. The construction time of this implementation is linear with respect to the size of the text, a search for a specific pattern is done in a linear time with respect to the size of the pattern. This implementation preserves both good properties of the DAWG automaton. (English)
Keyword: directed acyclic word graph automaton
Keyword: string matching
Keyword: data structures
MSC: 05C85
MSC: 68P05
MSC: 68Q45
MSC: 68R10
MSC: 68W05
idZBL: Zbl 1265.68116
idMR: MR1899849
.
Date available: 2009-09-24T19:44:12Z
Last updated: 2015-03-24
Stable URL: http://hdl.handle.net/10338.dmlcz/135448
.
Reference: [1] Adámek J.: Coding.MVŠT XXXI, SNTL, Prague 1989 (in Czech)
Reference: [2] Anderson A., Nilson S.: Efficient implementation of suffix trees.Software–Practice and Expirience 25 (1995), 129–141 10.1002/spe.4380250203
Reference: [3] Balík M.: String Matching in a Text.Diploma Thesis, CTU, Dept. of Computer Science and Engineering, Prague 1998
Reference: [4] Crochemore M., Rytter W.: Text Algorithms.Oxford University Press, New York 1994 Zbl 1078.68151, MR 1307378
Reference: [5] Crochemore M., Vérin R.: Direct construction of compact directed acyclic word graphs.In: CPM97 (A. Apostolico and J. Hein, eds., Lecture Notes in Computer Science 1264), Springer–Verlag, Berlin 1997, pp. 116–129 MR 1608404
Reference: [6] Gonnet G. H., Baeza–Yates R.: Handbook of Algorithms and Data Structures.Pascal and C. Addison–Wesley, Wokingham 1991 Zbl 0719.68001
Reference: [7] Huffman D. A.: A method for construction of minimum redundancy codes.Proc. IRE 40 (1952), 9, 1098–1101
Reference: [8] Irving R. W.: Suffix Binary Search Trees, Technical Report TR-1995-7, Computing Science Department, University of Glasgow 199.
Reference: [9] Kärkkäinen J.: Suffix cactus: A cross between suffix tree and suffix array.In: Proc. 6th Symposium on Combinatorial Pattern Matching, CPM95, 1995, pp. 191–204 MR 1467515
Reference: [10] Kurtz S.: Reducing the Space Requirment of Suffix Trees.Software–Practice and Experience 29 (1999), 13, 1149-1171 10.1002/(SICI)1097-024X(199911)29:13<1149::AID-SPE274>3.0.CO;2-O
Reference: [11] Melichar B.: Approximate string matching by finite automata.In: Computer Analysis of Images and Patterns (Lecture Notes in Computer Science 970), Springer–Verlag, Berlin 1995
Reference: [12] Melichar B.: Fulltext Systems.Publishing House CTU, Prague 1996 (in Czech)
Reference: [13] Melichar B.: Pattern matching and finite automata.In: Proceedings of the Prague Stringology Club Workshop ’97, Prague 1997
.

Files

Files Size Format View
Kybernetika_38-2002-1_6.pdf 1.516Mb application/pdf View/Open
Back to standard record
Partner of
EuDML logo