Typed Parsing and Unparsing for Untyped Regular Expression Engines
Regular expressions are used for a wide variety of purposes, from web-page input validation to log file crawling. Very often, they are used not only to match strings, but also to extract data from them. Unfortunately, most regular expression engines only return a list of the substrings captured by the regular expression. The data must then be extracted from the matched substrings, validated, and transformed manually into a more structured format.
For richer classes of grammars such as CFGs, these issues can be solved using type-indexed combinators. Most combinator libraries provide a monadic API that tracks the type returned by the parser through easy-to-use combinators. This allows users to transform the input into a custom data structure and perform complex validations as they describe their grammar.
In this paper, we present the Tyre library, which provides type-indexed combinators for regular languages. Our combinators provide type-safe extraction while delegating the task of substring matching to a preexisting regular expression engine. To do this, we use a two-layer approach where the typed layer sits on top of an untyped layer. This technique is also amenable to several extensions, such as routing, unparsing and static generation of the extraction code. We also provide a syntax extension, which recovers the familiar and compact syntax of regular expressions. We implemented this technique in a very concise manner and evaluated its usefulness on two practical examples.
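The two-layer idea described in the abstract can be illustrated with a minimal sketch. This is not Tyre's API (Tyre is an OCaml library); it is a hypothetical Python analogue built on the standard `re` module, and the names `Typed`, `conv`, `prefix`, `seq` and `parse` are all assumptions made for illustration. Each combinator carries an untyped regular expression together with the code that converts the captured substrings into a value and back.

```python
import re
from dataclasses import dataclass
from typing import Callable, Generic, List, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Typed(Generic[A]):
    """Hypothetical typed wrapper: a regex plus conversions to and from A."""
    pattern: str                       # untyped layer: plain regex source
    groups: int                        # number of capture groups in `pattern`
    extract: Callable[[List[str]], A]  # typed layer: captured substrings -> value
    unparse: Callable[[A], str]        # inverse direction: value -> string


def conv(pattern: str, to: Callable[[str], A], of: Callable[[A], str]) -> Typed[A]:
    # Assumes `pattern` contains no capture groups of its own.
    return Typed(f"({pattern})", 1, lambda g: to(g[0]), of)


def prefix(literal: str, t: Typed[A]) -> Typed[A]:
    # Match a fixed string, then `t`; the literal contributes no value.
    return Typed(re.escape(literal) + t.pattern, t.groups,
                 t.extract, lambda v: literal + t.unparse(v))


def seq(a: Typed[A], b: Typed[B]) -> Typed[Tuple[A, B]]:
    # Concatenate the patterns and split the capture list between both parts.
    return Typed(a.pattern + b.pattern, a.groups + b.groups,
                 lambda g: (a.extract(g[:a.groups]), b.extract(g[a.groups:])),
                 lambda v: a.unparse(v[0]) + b.unparse(v[1]))


def parse(t: Typed[A], s: str) -> Optional[A]:
    # Matching is delegated entirely to the existing engine (here, `re`).
    m = re.fullmatch(t.pattern, s)
    return None if m is None else t.extract(list(m.groups()))


integer = conv(r"[0-9]+", int, str)
dim = seq(integer, prefix("x", integer))

print(parse(dim, "1024x768"))    # (1024, 768)
print(dim.unparse((1024, 768)))  # 1024x768
```

In this sketch the regex engine only ever sees the string `([0-9]+)x([0-9]+)`; the pairing of integers is reconstructed afterwards from the flat list of captured groups. Alternation and repetition, which regular languages also require, need more bookkeeping than shown here and are deliberately left out.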
Slides: talk_pepm.pdf (211 KiB)
Mon 14 Jan, 14:00 - 15:30
14:00 | 30m | Talk | Method Name Suggestion with Hierarchical Attention Networks | PEPM | Sihan Xu, Sen Zhang, Weijing Wang, Xinya Cao, Chenkai Guo, Jing Xu (all Nankai University, China)
14:30 | 30m | Talk | Reduction from Branching-Time Property Verification of Higher-Order Programs to HFL Validity Checking | PEPM | Keiichi Watanabe, Takeshi Tsukada, Hiroki Oshikawa, Naoki Kobayashi (all University of Tokyo, Japan)
15:00 | 30m | Talk | Typed Parsing and Unparsing for Untyped Regular Expression Engines | PEPM | Gabriel Radanne (University of Freiburg, Germany)