Thore Husfeldt

Orthogonal Vectors to Perl Regex

leave a comment »

The Orthogonal Vectors problem is given two sets of d-dimensional 01-vectors to decide if there is an orthogonal pair: two vectors, one from each set, with zero dot product. Obviously, the problem can be decided in time O(n2d) by trying all pairs.

Here’s the a complete reduction from the Orthogonal Vectors problem to evaluation of Perl regular expressions.

chomp($_ = <STDIN>);
tr/1 /0|/;
print !!(<STDIN> =~ /$_/)

This is fully functional Perl code. Save the script as and run it. The process expects two lines on standard input, each describing a set of vectors (as 01-sequences, separated by space). The script returns “1” if and only if the two sets contain an orthogonal pair of vectors.

1000 0100 0010 1001
1110 1101 1111 1100^D

This construction shows that matching an n-character string against an n-character regular expression requires quadratic time under the Strong Exponential Time Hypothesis. So, the obvious quadratic-time dynamic programming algorithm for regex matching is in some sense optimal.

As to the Perl code, the first line eats the trailing newline of standard input. The next two lines contain the main part of the reduction, replacing “0” by “\d” (any digit), “1” by “0”, and “ ” by “|” to build a regex. The last line of the code attempts to match the second line from standard input to the regex. Oh, the double exclamation points? They are a perlism to enforce Boolean context, so that the output is “1” (true) exactly if the regex matches. The reduction is so obviously linear that it hurts.

Tell me this isn’t cute.

The core of the construction underlies the celebrated recent results of Backus and Indyk, Bringmann and Künnemann, and Abboud et al. that show similar bounds for various string edit distances. This particular example is just very concrete version of the NFA-acceptance reduction that appears on Karl Bringmann and Sebastian Krinninger’s slides for an MPI course on Complexity Theory of Polynomial-Time Problems (Summer 2016), there attributed to Russell Impagliazzo. I find that reduction very teachable. The step from NFA acceptance to a concrete Perl script is minimal, but makes it even more concrete (executable code!).

  • Amir Abboud, Arturs Backurs, Virginia Vassilevska Williams: Tight Hardness Results for LCS and Other Sequence Similarity Measures. IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pp. 59-78.
  • Arturs Backurs, Piotr Indyk: Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false). Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015. pp. 51-58.
  • Karl Bringmann, Marvin Künnemann: Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping. IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015. pp. 79-97.

Written by thorehusfeldt

October 10, 2016 at 20:28

Posted in Uncategorized

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: