InfoTech conference

2024 International Conference on Information Technologies

Architecture of a regular expression translator with optimization of intermediate states

Liliya Anatolyevna Demidova
Nikita Andreevich Moroshkin
Department of Corporate Information Systems, MIREA – Russian Technological University, Moscow
Russian Federation
Abstract:
Regular expressions are a powerful tool for finding text in a corpus that matches a given pattern. Many programming languages support them, each with its own implementation. These implementations often differ in syntax and in their underlying mathematical basis, which can cause problems with backward compatibility and with optimization of the expressions themselves. To address this, we propose a universal translator that converts an expression from one syntax to another while minimizing intermediate states to speed up matching. The suggested optimization techniques include naive replacement of fuzzy boundaries, refinement of search patterns, and the search for more efficient expressions using genetic evolution algorithms.
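As a rough illustration of the translate-then-simplify idea, the following Python sketch rewrites a few POSIX bracket classes into ranges that Python's `re` module accepts and then shortens verbose quantifiers. It is not the authors' translator: the function names, the class mapping, and the quantifier rewrite are assumptions chosen for illustration, and plain string replacement is used instead of an abstract syntax tree, so escaped braces or brackets are not handled.

```python
import re

# Hypothetical mapping from POSIX bracket classes to explicit ranges
# that Python's `re` module understands inside a character class.
POSIX_CLASSES = {
    "[:digit:]": "0-9",
    "[:alpha:]": "a-zA-Z",
    "[:alnum:]": "a-zA-Z0-9",
}

# Verbose quantifiers and their exactly equivalent short forms.
QUANTIFIERS = {
    "{1,}": "+",
    "{0,}": "*",
    "{0,1}": "?",
}


def translate_posix_to_python(pattern: str) -> str:
    """Replace each known POSIX class name with an explicit range."""
    for posix_name, py_range in POSIX_CLASSES.items():
        pattern = pattern.replace(posix_name, py_range)
    return pattern


def shorten_quantifiers(pattern: str) -> str:
    """Rewrite verbose quantifiers into their short equivalents."""
    for verbose, short in QUANTIFIERS.items():
        pattern = pattern.replace(verbose, short)
    return pattern


def translate(pattern: str) -> str:
    """Translate first, then simplify (an assumed two-stage pipeline)."""
    return shorten_quantifiers(translate_posix_to_python(pattern))


if __name__ == "__main__":
    source = "[[:alpha:]]{1,}-[[:digit:]]{1,}"
    target = translate(source)
    print(target)
    print(bool(re.fullmatch(target, "InfoTech-2024")))
```

A real translator would parse the expression into a tree before rewriting it, which avoids touching escaped or literal characters that merely look like syntax.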
Key words:
regular expressions
genetic algorithms
finite automata
abstract syntax tree
expression optimization
The full text of the report is included in the IEEE InfoTech-2024 eProceedings
and will be available in the IEEE Xplore digital library:
https://ieeexplore.ieee.org/xpl/conhome/1828024/all-proceedings