The "test-html-tokenize" command:
Tokenize an HTML file. Return the offset, length, and text of each token, one token per line. Omit whitespace tokens.
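
To illustrate the kind of output described above, here is a minimal sketch in Python. It is not the tokenizer behind the command. The token pattern, the function names (tokenize_html, format_tokens), and the "offset length text" line layout are all assumptions made for illustration, and offsets are counted in characters of the input string rather than bytes.

```python
import re

# Hypothetical token pattern (an assumption, not the command's real rules):
# a comment, a markup tag, a stray "<", or a run of text up to the next "<".
TOKEN_RE = re.compile(r"<!--.*?-->|<[^>]*>|<|[^<]+", re.DOTALL)


def tokenize_html(html):
    """Return a list of (offset, length, text) for each non-whitespace token."""
    tokens = []
    for match in TOKEN_RE.finditer(html):
        text = match.group(0)
        if text.strip() == "":
            continue  # omit tokens that consist only of whitespace
        tokens.append((match.start(), len(text), text))
    return tokens


def format_tokens(html):
    """Render one token per line; repr() keeps embedded newlines on one line."""
    return "\n".join(
        f"{offset} {length} {text!r}"
        for offset, length, text in tokenize_html(html)
    )


if __name__ == "__main__":
    print(format_tokens("<p>Hi</p>\n <b>x</b>"))
```

The whitespace between "</p>" and "<b>" in the sample input is skipped, so the reported offsets jump from the end of one token past the gap to the start of the next.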