Position Heaps for Cartesian-tree Matching on Strings and Tries

Nishimoto, Akio; Fujisato, Noriki; Nakashima, Yuto; Inenaga, Shunsuke


arXiv:2106.01595 (cs)

[Submitted on 3 Jun 2021 (v1), last revised 14 Aug 2021 (this version, v2)]


Abstract: Cartesian-tree pattern matching is a recently introduced pattern matching scheme that detects fragments in a sequential data stream whose structure is similar to that of a query pattern. Formally, Cartesian-tree pattern matching seeks all substrings $S'$ of the text string $S$ such that the Cartesian tree of $S'$ and that of a query pattern $P$ coincide. In this paper, we present a new indexing structure for this problem called the Cartesian-tree Position Heap (CPH). Let $n$ be the length of the input text string $S$, $m$ the length of a query pattern $P$, and $\sigma$ the alphabet size. We show that the CPH of $S$, denoted $\mathsf{CPH}(S)$, supports pattern matching queries in $O(m (\sigma + \log (\min\{h, m\})) + occ)$ time with $O(n)$ space, where $h$ is the height of the CPH and $occ$ is the number of pattern occurrences. We show how to build $\mathsf{CPH}(S)$ in $O(n \log \sigma)$ time with $O(n)$ working space. Further, we extend the problem to the case where the text is a labeled tree (i.e., a trie). Given a trie $T$ with $N$ nodes, we show that the CPH of $T$, denoted $\mathsf{CPH}(T)$, supports pattern matching queries on the trie in $O(m (\sigma^2 + \log (\min\{h, m\})) + occ)$ time with $O(N \sigma)$ space. We also show a construction algorithm for $\mathsf{CPH}(T)$ running in $O(N \sigma)$ time and $O(N \sigma)$ working space.
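To make the matching criterion in the abstract concrete, the following is a minimal sketch of a naive baseline that reports every position where a length-$m$ window of the text has the same Cartesian tree as the pattern. It is not the CPH index described in the paper. It assumes the parent-distance encoding, a device commonly used for this problem but not mentioned in the abstract, and it assumes ties are broken so that the leftmost minimum is the root. The names `parent_distance` and `ct_match_naive` are hypothetical and chosen only for illustration.

```python
def parent_distance(seq):
    """Encode seq so that two sequences get equal encodings exactly when
    their Cartesian trees coincide (assumed tie rule: leftmost minimum is root).

    pd[i] = i - j for the nearest j < i with seq[j] <= seq[i], or 0 if none.
    """
    pd = []
    stack = []  # indices of candidates; their values are non-decreasing
    for i, x in enumerate(seq):
        # Discard earlier elements that are strictly larger than x.
        while stack and seq[stack[-1]] > x:
            stack.pop()
        pd.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return pd


def ct_match_naive(text, pattern):
    """Return all 0-based start positions i such that text[i:i+m] and
    pattern have the same Cartesian tree. Runs in O(n * m) time."""
    m = len(pattern)
    if m == 0:
        return []
    target = parent_distance(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if parent_distance(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    # Windows [5, 3, 8] and [4, 1, 6] share the shape of [2, 1, 3]:
    # the minimum sits in the middle with one child on each side.
    print(ct_match_naive([5, 3, 8, 4, 1, 6], [2, 1, 3]))
```

This baseline recomputes the encoding for every window, so it costs $O(nm)$ time per query. An index such as the one the abstract describes is meant to avoid that per-query scan of the text.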

Subjects:	Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:2106.01595 [cs.DS]
	(or arXiv:2106.01595v2 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.2106.01595

Submission history

From: Shunsuke Inenaga
[v1] Thu, 3 Jun 2021 04:53:23 UTC (1,310 KB)
[v2] Sat, 14 Aug 2021 09:10:08 UTC (1,310 KB)
