Pattern Language Reference
This appendix presents
an alphabetical list of operators in the TC pattern language.
And
- expr1 and expr2
Matches regions
that match both expressions.
Anywhere After
- anywhere after expr
Matches regions
anywhere after some match to expr.
Anywhere Before
- anywhere before expr
Matches regions
anywhere before some match to expr.
Balanced from-to
- balanced from expr1 to expr2
Matches a nested
set of regions whose start delimiters match expr1 and end delimiters match
expr2.
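The nesting described above can be sketched with a stack. This is only an illustration in Python, not part of the pattern language: it assumes regions are (start, end) character offsets, that the matches to expr1 and expr2 have already been found, and that each resulting region includes its delimiters. The function name balanced is hypothetical.

```python
def balanced(starts, ends):
    """Pair start and end delimiter matches into nested regions.

    starts, ends: lists of (start, end) offsets for the delimiter matches.
    Returns regions running from each start delimiter to its matching end
    delimiter, sorted by start offset.
    """
    # Merge both kinds of delimiter into one stream ordered by position.
    events = [(s, e, "open") for s, e in starts]
    events += [(s, e, "close") for s, e in ends]
    events.sort()

    stack, result = [], []
    for s, e, kind in events:
        if kind == "open":
            stack.append(s)
        elif stack:  # assumption: unmatched end delimiters are skipped
            result.append((stack.pop(), e))
    return sorted(result)
```

For the text "(a(b)c)", with start delimiters at (0, 1) and (2, 3) and end delimiters at (4, 5) and (6, 7), the sketch returns the outer region (0, 7) and the nested region (2, 5).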
Case Sensitive
- case sensitive expr
Forces literal
and regular expression matches inside expr to be case-sensitive.
- not case sensitive expr
Forces literal
and regular expression matches inside expr to be case-insensitive
(the default).
Contains
- contains expr
Matches regions
that contain at least one match to expr.
Either-Or
- either expr1 or expr2
Matches regions
that match either expr1 or expr2. The keyword either is optional.
End of
- end of expr
Matches the end
points of regions matching expr. Always returns a set of
zero-length regions.
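A zero-length region is one whose start and end coincide. A minimal Python sketch, assuming regions are (start, end) character offsets; the function names end_of and start_of are hypothetical and only mirror the operator names.

```python
def end_of(regions):
    """Return the zero-length region at the end point of each region."""
    # A set removes duplicates when several regions end at the same point.
    return sorted({(end, end) for _, end in regions})


def start_of(regions):
    """Return the zero-length region at the start point of each region."""
    return sorted({(start, start) for start, _ in regions})
```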
Ends
- ends expr
Matches regions
that end at the same point as expr. Ignores background regions
around the end point.
Equals
- equals expr
Matches regions
that match expr. Synonyms include = and
equal to. Ignores background regions around both start point and end point.
Flatten
- flatten expr
Flattens the
regions that match expr, by combining nested and
overlapping regions into a single region.
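One way to picture flattening is as an interval merge. The Python sketch below is an illustration only: it assumes regions are half-open (start, end) character offsets and treats two regions as overlapping only when they share at least one character, so regions that merely touch stay separate. The function name flatten is hypothetical.

```python
def flatten(regions):
    """Combine nested and overlapping regions; touching regions stay separate."""
    merged = []
    for start, end in sorted(regions):
        # Strict comparison: a region starting exactly where the previous
        # one ends is adjacent, not overlapping, and is kept apart.
        if merged and start < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With the input [(0, 5), (2, 3), (4, 8), (8, 10)], the sketch returns [(0, 8), (8, 10)].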
From-To
- from expr1 to expr2
Matches a flat
region set consisting of regions that start with a match to expr1 and end with the next
match to expr2.
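The pairing of each start match with the next end match can be sketched in Python. The sketch makes several assumptions that the description above does not state: regions are (start, end) character offsets, the "next" match to expr2 is the first one beginning at or after the end of the expr1 match, and the result is kept flat by skipping start matches that fall inside a region already produced. The function name from_to is hypothetical.

```python
def from_to(starts, ends):
    """Pair each start match with the next end match, producing a flat set."""
    ends = sorted(ends)
    result = []
    last_end = -1
    for s_start, s_end in sorted(starts):
        if s_start < last_end:
            continue  # lies inside the region just produced; skip to stay flat
        # First end match that begins at or after the end of this start match.
        nxt = next((e for e in ends if e[0] >= s_end), None)
        if nxt is None:
            break  # no later end match, so no further regions
        result.append((s_start, nxt[1]))
        last_end = nxt[1]
    return result
```

With start matches [(0, 2), (3, 5), (10, 12)] and end matches [(6, 7), (15, 16)], the sketch returns [(0, 7), (10, 16)].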
Identifier
- identifier
Matches the regions
matched by the pattern bound to identifier in the pattern library.
Ignoring
- expr1 ignoring expr2
Sets the background
set to the regions matching expr2, then evaluates expr1 and returns its matches.
In
- in expr
Matches regions
that lie in some match to expr.
Is
- identifier is expr
Assigns the pattern
expr to identifier in the pattern library,
and returns the matches to expr.
Just After
- just after expr
Matches regions
that lie after and adjacent to some match to expr. Ignores background regions
when testing for adjacency.
Just Before
- just before expr
Matches regions
that lie before and adjacent to some match to expr. Ignores background regions
when testing for adjacency.
Literal
- "string"
- 'string'
Matches regions
consisting of the literal characters string. Either single or double
quotes may be used to delimit the string.
Melt
- melt expr
Melts the regions
that match expr by combining nested, overlapping,
or adjacent regions into a single region.
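Melting can be pictured as an interval merge that also joins regions that touch. The Python sketch below is an illustration only: it assumes regions are half-open (start, end) character offsets and treats two regions as adjacent when one ends exactly where the next begins, without considering background regions. The function name melt is hypothetical.

```python
def melt(regions):
    """Combine nested, overlapping, or adjacent regions into single regions."""
    merged = []
    for start, end in sorted(regions):
        # Non-strict comparison: a region starting exactly where the previous
        # one ends counts as adjacent and is merged into it.
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With the input [(0, 5), (2, 3), (4, 8), (8, 10), (12, 14)], the sketch returns [(0, 10), (12, 14)].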
Nonzero
- nonzero expr
Matches regions
that match expr and contain at least one
character.
Not
- expr1 not expr2
Matches regions
matching expr1 that do not match expr2.
Nth
- nth expr
- nth expr1 in expr2
- nth expr1 before expr2
- nth expr1 after expr2
The first form
matches the nth region in the document
that matches expr. The other forms match
the nth match to expr1 that lies in, before,
or after each match to expr2.
The "nth" part can
be written in a variety of ways:
- 1st, 2nd, 3rd, 4th, ...
- first, second, third, ..., tenth
- last, 2nd from last, 3rd from last, ...
- second from last, third from last, ...
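The selection rule can be sketched in Python. This is an illustration under stated assumptions: regions are (start, end) character offsets ordered by position, counting is 1-based, and "last" is treated as "1st from last". The function names nth and nth_in and the from_last parameter are hypothetical.

```python
def nth(matches, n, from_last=False):
    """Return the nth match (1-based) in document order, or None if absent."""
    ordered = sorted(matches)
    if n < 1 or n > len(ordered):
        return None
    return ordered[-n] if from_last else ordered[n - 1]


def nth_in(matches, containers, n, from_last=False):
    """For each container region, pick the nth match lying inside it."""
    result = []
    for c_start, c_end in sorted(containers):
        inside = [m for m in matches if c_start <= m[0] and m[1] <= c_end]
        hit = nth(inside, n, from_last)
        if hit is not None:  # containers with fewer than n matches yield nothing
            result.append(hit)
    return result
```

Here nth(matches, 2, from_last=True) corresponds to "2nd from last expr", and nth_in(matches, containers, 2) corresponds to "2nd expr1 in expr2".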
Or
- expr1 or expr2
Matches regions
that match either expr1 or expr2.
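The And, Not, and Or operators behave like set operations on region sets. A minimal Python sketch, assuming regions are (start, end) character offsets and that two regions are the same match only when both offsets are equal; the sample region sets are made up for illustration.

```python
a = {(0, 3), (4, 9), (10, 12)}   # hypothetical matches to expr1
b = {(4, 9), (20, 25)}           # hypothetical matches to expr2

both = a & b      # expr1 and expr2: regions that match both expressions
either = a | b    # expr1 or expr2: regions that match either expression
only_a = a - b    # expr1 not expr2: matches to expr1 that do not match expr2
```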
Overlaps
- overlaps expr
Matches regions
that overlap some region matching expr.
Overlaps End Of
- overlaps end of expr
Matches regions
that overlap the end point of some region matching expr.
Overlaps Start Of
- overlaps start of expr
Matches regions
that overlap the start point of some region matching expr.
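The three overlap tests can be sketched as predicates in Python. The sketch assumes regions are half-open (start, end) character offsets, that two regions overlap when they share at least one character, and that a region overlaps a point when the point lies inside it or on one of its boundaries. The boundary behavior is an assumption, not something this reference states. All function names, and the candidates argument standing for the regions being tested, are hypothetical.

```python
def overlaps(a, b):
    """True if regions a and b share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def overlaps_point(a, point):
    """True if the point lies inside region a or on one of its boundaries."""
    return a[0] <= point <= a[1]


def overlaps_start_of(candidates, matches):
    """Candidates that overlap the start point of some match."""
    return [c for c in candidates
            if any(overlaps_point(c, m[0]) for m in matches)]


def overlaps_end_of(candidates, matches):
    """Candidates that overlap the end point of some match."""
    return [c for c in candidates
            if any(overlaps_point(c, m[1]) for m in matches)]
```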
Prefix
- prefix identifier expr
Changes the current
namespace to identifier for the scope of expr.
Regular Expression
- /regexp/
Matches regions
that match the regular expression regexp.
Start of
- start of expr
Matches the start
points of regions matching expr. Always returns a set of
zero-length regions.
Starts
- starts expr
Matches regions
that start at the same point as expr. Ignores background regions
around the start point.
Then
- expr1 then expr2
Matches regions
that are the concatenation of a region matching expr1 with a region matching
expr2 that lies after and adjacent
to it. Ignores background regions when determining whether expr1 and expr2 are adjacent.
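The adjacency test with background regions can be sketched in Python. This is an illustration under stated assumptions: regions are (start, end) character offsets, the background is given as a flat (non-overlapping) list of regions such as runs of whitespace, and two regions count as adjacent when the gap between them is empty or is exactly covered by background regions laid end to end. The function name then and the background parameter are hypothetical.

```python
def then(left, right, background=()):
    """Concatenate each left region with each right region adjacent to it."""

    def gap_is_background(a, b):
        # Walk from a toward b, stepping over background regions that start
        # exactly at the current position and do not run past b.
        pos = a
        for b_start, b_end in sorted(background):
            if b_start == pos and b_end <= b:
                pos = b_end
        return pos == b  # an empty gap (a == b) is trivially adjacent

    return sorted(
        (l_start, r_end)
        for l_start, l_end in left
        for r_start, r_end in right
        if l_end <= r_start and gap_is_background(l_end, r_start)
    )
```

For example, then([(0, 3)], [(3, 6), (4, 7), (8, 9)], background=[(3, 4)]) returns [(0, 6), (0, 7)]: the first right region touches the left region directly, the second is separated from it only by background, and the third is too far away.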
Trim
- expr1 trim expr2
Matches regions
that match expr1 with an overlapping match
to expr2 removed from the start
or end point.
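Trimming can be sketched as moving a region's boundaries inward. The Python sketch below is one possible reading of the description, not a statement of the actual implementation: it assumes regions are half-open (start, end) character offsets and that a match to expr2 is removed only when it covers the region's start or end point. The function name trim is hypothetical.

```python
def trim(regions, trims):
    """Remove trim matches that overlap the start or end point of each region."""
    result = []
    for start, end in regions:
        for t_start, t_end in trims:
            if t_start <= start < t_end:  # trim match covers the start point
                start = min(t_end, end)
            if t_start < end <= t_end:    # trim match covers the end point
                end = max(t_start, start)
        result.append((start, end))
    return result
```

With regions [(0, 10)] and trim matches [(0, 2), (8, 10)], the sketch returns [(2, 8)]. A trim match that lies strictly in the middle of a region leaves it unchanged.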
View
- view source expr
- view rendered expr
Forces literals
and regular expressions inside expr to be matched against
the HTML source or the rendered view of a web page. Has no effect on a plain
text document.
Send comments or questions to Rob Miller (rcm@lcs.mit.edu).
Copyright © 2003 Massachusetts Institute of Technology. All Rights
Reserved.