Package org.jsoup.parser
Class TokenQueue
java.lang.Object
org.jsoup.parser.TokenQueue
A character reader with helpers focused on parsing CSS selectors. Used internally by jsoup; the API is subject to change.
-
Constructor Summary
Constructor | Description
TokenQueue(String data) | Create a new TokenQueue.
-
Method Summary
Modifier and Type | Method | Description
void | addFirst(String seq) | Deprecated. Will be removed in 1.21.1.
void | advance() | Drops the next character off the queue.
String | chompBalanced(char open, char close) | Pulls a balanced string off the queue.
String | chompTo(String seq) | Deprecated. Will be removed in 1.21.1.
String | chompToIgnoreCase(String seq) | Deprecated. Will be removed in 1.21.1.
char | consume() | Consume one character off the queue.
void | consume(String seq) | Consumes the supplied sequence off the queue, case-insensitively.
String | consumeCssIdentifier() | Consume a CSS identifier (ID or class) off the queue.
String | consumeElementSelector() | Consume a CSS element selector (tag name, but | instead of : for namespaces (or *| for wildcard namespace), to not conflict with :pseudo selects).
String | consumeTo(String seq) | Pulls a string off the queue, up to but exclusive of the match sequence, or to the queue running out.
String | consumeToAny(String... seq) | Consumes to the first sequence provided, or to the end of the queue.
String | consumeToIgnoreCase(String seq) | Deprecated.
boolean | consumeWhitespace() | Pulls the next run of whitespace characters off the queue.
String | consumeWord() | Deprecated. Will be removed in 1.21.1.
static String | escapeCssIdentifier(String in) | Given a CSS identifier (such as a tag, ID, or class), escape any CSS special characters that would otherwise not be valid in a selector.
boolean | isEmpty() | Is the queue empty?
boolean | matchChomp(char c) | If the queue matches the supplied (case-sensitive) character, consume it off the queue.
boolean | matchChomp(String seq) | If the queue case-insensitively matches the supplied string, consume it off the queue.
boolean | matches(char c) | Tests if the next character on the queue matches the character, case-sensitively.
boolean | matches(String seq) | Tests if the next characters on the queue match the sequence, case-insensitively.
boolean | matchesAny(char... seq) | Tests if the next characters match any of the sequences, case-sensitively.
boolean | matchesAny(String... seq) | Deprecated. Will be removed in 1.21.1.
boolean | matchesWhitespace() | Tests if the queue starts with a whitespace character.
boolean | matchesWord() | Tests if the queue matches a tag word character (letter or digit).
String | remainder() | Consume and return whatever is left on the queue.
String | toString() |
static String | unescape(String in) | Unescape a \ escaped string.
-
Constructor Details
-
TokenQueue
Create a new TokenQueue.
- Parameters:
data - string of data to back the queue.
-
-
Method Details
-
isEmpty
Is the queue empty?
- Returns:
true if no data is left in the queue.
-
consume
Consume one character off the queue.
- Returns:
the first character on the queue.
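
The basic operations (isEmpty, consume, advance) can be pictured as a string plus a read position. The following is a minimal sketch under that assumption; `CursorQueueSketch` and its fields are hypothetical names chosen for illustration and are not taken from jsoup's source.

```java
/** Hypothetical illustration of a string-backed queue with a read cursor. */
final class CursorQueueSketch {
    private final String data; // backing data, never modified
    private int pos = 0;       // index of the next unread character

    CursorQueueSketch(String data) {
        this.data = data;
    }

    /** True once every character has been read. */
    boolean isEmpty() {
        return pos >= data.length();
    }

    /** Returns the next character and moves the cursor past it.
     *  In this sketch, calling it on an empty queue throws
     *  StringIndexOutOfBoundsException. */
    char consume() {
        return data.charAt(pos++);
    }

    /** Skips the next character without returning it. */
    void advance() {
        if (!isEmpty()) pos++;
    }
}
```

With this model, reading never copies or mutates the backing string; only the cursor moves.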
-
advance
Drops the next character off the queue.
-
addFirst
Deprecated. Will be removed in 1.21.1. Internal method, no longer supported.
-
matches
Tests if the next characters on the queue match the sequence, case-insensitively.
- Parameters:
seq - String to check the queue for.
- Returns:
true if the next characters match.
-
matches
Tests if the next character on the queue matches the character, case-sensitively.
-
matchesAny
Deprecated. Will be removed in 1.21.1.
-
matchesAny
Tests if the next characters match any of the sequences, case-sensitively.
- Parameters:
seq - list of chars to case-sensitively check for
- Returns:
true if any matched, false if none did
-
matchChomp
If the queue case-insensitively matches the supplied string, consume it off the queue.
- Parameters:
seq - String to search for, and if found, remove from the queue.
- Returns:
true if found and removed, false if not found.
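
The test-then-consume pattern described for matches and matchChomp can be sketched with a cursor over a string. This is an illustrative sketch only: `MatchChompSketch` and `unread()` are hypothetical names, and the use of `String.regionMatches` for the case-insensitive comparison is an assumption, not a statement about jsoup's internals.

```java
/** Hypothetical sketch of case-insensitive match and match-then-consume. */
final class MatchChompSketch {
    private final String data;
    private int pos = 0; // index of the next unread character

    MatchChompSketch(String data) {
        this.data = data;
    }

    /** Tests the unread characters against seq, ignoring case; consumes nothing. */
    boolean matches(String seq) {
        return data.regionMatches(true, pos, seq, 0, seq.length());
    }

    /** If the unread characters start with seq (ignoring case), skips past it. */
    boolean matchChomp(String seq) {
        if (!matches(seq)) return false;
        pos += seq.length();
        return true;
    }

    /** Helper for inspection: the unread text, without consuming it. */
    String unread() {
        return data.substring(pos);
    }
}
```

A failed matchChomp leaves the cursor where it was, so several alternatives can be tried in turn.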
-
matchChomp
If the queue matches the supplied (case-sensitive) character, consume it off the queue.
-
matchesWhitespace
Tests if the queue starts with a whitespace character.
- Returns:
true if the queue starts with whitespace
-
matchesWord
Tests if the queue matches a tag word character (letter or digit).
- Returns:
true if the queue matches a word character
-
consume
Consumes the supplied sequence off the queue, case-insensitively. If the queue does not start with the supplied sequence, an illegal state exception is thrown, so check with matches() before calling.
- Parameters:
seq - sequence to remove from the head of the queue.
-
consumeTo
Pulls a string off the queue, up to but exclusive of the match sequence, or to the queue running out.
- Parameters:
seq - String to end on (and not include in return, but leave on queue). Case-sensitive.
- Returns:
The matched data consumed from the queue.
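
The "up to but exclusive of" behaviour can be sketched as a case-sensitive search from the current position. `ConsumeToSketch` and `unread()` are hypothetical names for illustration; the sketch assumes a plain `indexOf` search.

```java
/** Hypothetical sketch of consuming up to, but not including, a terminator. */
final class ConsumeToSketch {
    private final String data;
    private int pos = 0; // index of the next unread character

    ConsumeToSketch(String data) {
        this.data = data;
    }

    /** Returns the text before seq and leaves seq unread.
     *  If seq never occurs, everything that is left is returned. */
    String consumeTo(String seq) {
        int offset = data.indexOf(seq, pos); // case-sensitive search
        int end = (offset == -1) ? data.length() : offset;
        String consumed = data.substring(pos, end);
        pos = end;
        return consumed;
    }

    /** Helper for inspection: the unread text, without consuming it. */
    String unread() {
        return data.substring(pos);
    }
}
```

For the input "one two", `consumeTo(" ")` in this sketch returns "one" and leaves " two" unread.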
-
consumeToIgnoreCase
Deprecated.
-
consumeToAny
Consumes to the first sequence provided, or to the end of the queue. Leaves the terminator on the queue.
- Parameters:
seq - any number of terminators to consume to. Case-insensitive.
- Returns:
consumed string
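
Consuming up to the first of several terminators can be sketched as a character-by-character scan that stops as soon as any terminator matches at the cursor. `ConsumeToAnySketch` and `unread()` are hypothetical names, and the case-insensitive comparison follows the parameter description above rather than any inspection of jsoup's source.

```java
/** Hypothetical sketch of consuming up to the first of several terminators. */
final class ConsumeToAnySketch {
    private final String data;
    private int pos = 0; // index of the next unread character

    ConsumeToAnySketch(String data) {
        this.data = data;
    }

    /** True if any terminator matches at the cursor, ignoring case. */
    private boolean matchesAny(String... seqs) {
        for (String seq : seqs) {
            if (data.regionMatches(true, pos, seq, 0, seq.length())) return true;
        }
        return false;
    }

    /** Advances until a terminator matches or the data runs out;
     *  the terminator itself is left unread. */
    String consumeToAny(String... seqs) {
        int start = pos;
        while (pos < data.length() && !matchesAny(seqs)) pos++;
        return data.substring(start, pos);
    }

    /** Helper for inspection: the unread text, without consuming it. */
    String unread() {
        return data.substring(pos);
    }
}
```

For the input "div > p" and terminators ">", "+", "~", this sketch returns "div " and leaves "> p" unread.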
-
chompTo
Deprecated. Will be removed in 1.21.1.
Pulls a string off the queue (like consumeTo), and then pulls off the matched string (but does not return it). If the queue runs out of characters before finding the seq, will return as much as it can (and the queue will then be empty, i.e. isEmpty() == true).
- Parameters:
seq - String to match up to, and not include in return, and to pull off queue. Case-sensitive.
- Returns:
Data matched from the queue.
-
chompToIgnoreCase
Deprecated. Will be removed in 1.21.1.
-
chompBalanced
Pulls a balanced string off the queue. E.g. if the queue is "(one (two) three) four", calling with open '(' and close ')' will return "one (two) three" and leave " four" on the queue. Unbalanced openers and closers can be quoted (with ' or ") or escaped (with \). Those escapes are left in the returned string, which is suitable for regexes (where the escape must be preserved) but unsuitable for contains-text strings; use unescape for that.
- Parameters:
open - opener
close - closer
- Returns:
data matched from the queue
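
The balancing rule can be sketched with a depth counter that ignores quoted and backslash-escaped characters. This is a simplified illustration, not jsoup's implementation: `BalancedSketch` is a hypothetical name, the method works on a plain string instead of a queue, it returns the matched text and the leftover text as a two-element array, and it assumes the opener and closer are distinct characters that are not quote marks.

```java
/** Hypothetical sketch of pulling a balanced run out of a string. */
final class BalancedSketch {

    /**
     * Finds the first unquoted, unescaped opener and its matching closer.
     * Returns { text between them, text after the closer }.
     * Escapes and quotes are kept verbatim in the returned text.
     */
    static String[] chompBalanced(String input, char open, char close) {
        int depth = 0;   // how many openers are currently unclosed
        int start = -1;  // index just after the outermost opener
        char quote = 0;  // active quote character, or 0 when not quoted
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\') {   // escaped: skip the backslash and the next char
                i++;
                continue;
            }
            if (quote != 0) {  // inside quotes: only the closing quote matters
                if (c == quote) quote = 0;
                continue;
            }
            if (c == '\'' || c == '"') {
                quote = c;
                continue;
            }
            if (c == open) {
                if (depth++ == 0) start = i + 1;
            } else if (c == close && depth > 0) {
                if (--depth == 0) {
                    return new String[] { input.substring(start, i), input.substring(i + 1) };
                }
            }
        }
        throw new IllegalArgumentException("Unbalanced input: " + input);
    }
}
```

Under these assumptions, `BalancedSketch.chompBalanced("(one (two) three) four", '(', ')')` yields "one (two) three" as the match and " four" as the leftover, mirroring the example in the description.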
-
unescape
Unescape a \ escaped string.
- Parameters:
in - backslash-escaped string
- Returns:
unescaped string
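
Backslash unescaping can be sketched as a single pass that drops each escaping backslash and keeps the character after it. `UnescapeSketch` is a hypothetical name, and the handling of edge cases (a doubled backslash becomes one backslash; a trailing lone backslash is dropped) is an assumption of this sketch.

```java
/** Hypothetical sketch of removing backslash escapes from a string. */
final class UnescapeSketch {

    static String unescape(String in) {
        StringBuilder out = new StringBuilder(in.length());
        boolean escaped = false; // true when the previous char was an escaping backslash
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == '\\' && !escaped) {
                escaped = true;  // drop the backslash, keep whatever follows
            } else {
                out.append(c);
                escaped = false;
            }
        }
        return out.toString();
    }
}
```

This pairs with the balanced-string description above: text such as `a \) b` keeps its escape while being matched, and unescaping turns it into `a ) b`.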
-
escapeCssIdentifier
Given a CSS identifier (such as a tag, ID, or class), escape any CSS special characters that would otherwise not be valid in a selector.
-
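
To make the idea of escaping a CSS identifier concrete, the sketch below follows the "serialize an identifier" rules from the CSSOM specification. Whether escapeCssIdentifier follows exactly these rules is an assumption; `CssEscapeSketch` and `escapeIdentifier` are hypothetical names used only for illustration.

```java
/** Hypothetical sketch of CSS identifier escaping, after the CSSOM rules. */
final class CssEscapeSketch {

    static String escapeIdentifier(String in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            boolean digit = c >= '0' && c <= '9';
            if (c == 0) {
                out.append('\uFFFD'); // NULL becomes the replacement character
            } else if (c <= 0x1F || c == 0x7F
                    || (digit && i == 0)
                    || (digit && i == 1 && in.charAt(0) == '-')) {
                // control characters and leading digits: hex code point plus a space
                out.append('\\').append(Integer.toHexString(c)).append(' ');
            } else if (c == '-' && i == 0 && in.length() == 1) {
                out.append("\\-");    // a lone hyphen must be escaped
            } else if (c >= 0x80 || c == '-' || c == '_' || digit
                    || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
                out.append(c);        // safe identifier characters pass through
            } else {
                out.append('\\').append(c); // everything else gets a backslash
            }
        }
        return out.toString();
    }
}
```

Under these rules the identifier `1st` becomes `\31 st` and `a.b` becomes `a\.b`, so the result can be placed after `#` or `.` in a selector.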
consumeWhitespace
Pulls the next run of whitespace characters off the queue.
- Returns:
whether any whitespace was consumed
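
Consuming a run of whitespace and reporting whether anything was consumed can be sketched as a loop over the cursor. `WhitespaceSketch` and `unread()` are hypothetical names, and `Character.isWhitespace` is an assumption standing in for whatever whitespace definition the real method uses.

```java
/** Hypothetical sketch of skipping a run of whitespace. */
final class WhitespaceSketch {
    private final String data;
    private int pos = 0; // index of the next unread character

    WhitespaceSketch(String data) {
        this.data = data;
    }

    /** Skips consecutive whitespace; returns true if at least one char was skipped. */
    boolean consumeWhitespace() {
        boolean seen = false;
        while (pos < data.length() && Character.isWhitespace(data.charAt(pos))) {
            pos++;
            seen = true;
        }
        return seen;
    }

    /** Helper for inspection: the unread text, without consuming it. */
    String unread() {
        return data.substring(pos);
    }
}
```

The boolean result lets a caller distinguish a descendant combinator (whitespace present) from adjacent tokens without a separate lookahead.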
-
consumeWord
Deprecated. Will be removed in 1.21.1.
Retrieves the next run of word type (letter or digit) off the queue.
- Returns:
String of word characters from the queue, or an empty string if none.
-
consumeElementSelector
Consume a CSS element selector (tag name, but | instead of : for namespaces (or *| for wildcard namespace), to not conflict with :pseudo selects).
- Returns:
tag name
-
consumeCssIdentifier
Consume a CSS identifier (ID or class) off the queue.
Note: For backwards compatibility this method supports improperly formatted CSS identifiers, e.g. 1 instead of \31.
- Returns:
The unescaped identifier.
- Throws:
IllegalArgumentException - if an invalid escape sequence was found. Afterward, the state of the TokenQueue is undefined.
-
remainder
Consume and return whatever is left on the queue.
- Returns:
remainder of the queue.
-
toString
-