net.sf.parcinj
Class SimpleLexer

java.lang.Object
  extended by net.sf.parcinj.SimpleLexer
All Implemented Interfaces:
Lexer

public class SimpleLexer
extends java.lang.Object
implements Lexer

A simple lexer that handles keywords, delimiters, and quoted text. It skips the specified ignored delimiter characters, such as whitespace.

The quote character that starts and ends quoted text can be specified, as can the escape character used to escape a character inside quoted text. A Token for quoted text is of type QuotedText.
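
The following is a minimal, self-contained sketch of how quoted text with an escape character might be read. It does not use or reproduce the SimpleLexer implementation. The class QuotedTextDemo and the method readQuoted are hypothetical names. Several behaviors are assumptions not stated in this documentation: the escape character makes the next character literal, the surrounding quotes are not part of the returned content, and unterminated quoted text is rejected with an exception.

```java
final class QuotedTextDemo {

    /**
     * Reads the quoted text whose opening quote is at index start and
     * returns its content with escapes resolved (hypothetical helper,
     * not part of net.sf.parcinj).
     */
    static String readQuoted(String text, int start, char escape, char quote) {
        if (start >= text.length() || text.charAt(start) != quote) {
            throw new IllegalArgumentException("No opening quote at index " + start);
        }
        StringBuilder content = new StringBuilder();
        int i = start + 1;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == escape && i + 1 < text.length()) {
                // Assumption: the escaped character is taken literally.
                content.append(text.charAt(i + 1));
                i += 2;
            } else if (c == quote) {
                // Closing quote reached: the quoted text is complete.
                return content.toString();
            } else {
                content.append(c);
                i++;
            }
        }
        // Assumption: a missing closing quote is treated as an error.
        throw new IllegalArgumentException("Unterminated quoted text");
    }

    public static void main(String[] args) {
        // Input text is: 'it\'s' rest
        System.out.println(readQuoted("'it\\'s' rest", 0, '\\', '\'')); // prints it's
    }
}
```

Here the escape character is a backslash and the quote character is a single quote, matching the roles of the escape and quote constructor parameters.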

Keywords and delimiters are specific strings. They are defined by TerminalSymbolType instances. The TokenType of recognized tokens will be the corresponding TerminalSymbolType.

A keyword is recognized only if it is both preceded and followed by a delimiter string, an ignored delimiter character, or quoted text. For example, the keyword for is not recognized inside form.
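
The keyword rule can be illustrated with a simplified, self-contained sketch. It is not the SimpleLexer implementation, and KeywordBoundaryDemo and classify are hypothetical names. To stay short, it uses only ignored delimiter characters as boundaries and leaves out delimiter strings and quoted text. It also treats the start and end of the input as boundaries, which is an assumption this documentation does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class KeywordBoundaryDemo {

    /**
     * Splits text at ignored characters and labels each chunk as a keyword
     * only when the whole chunk equals a keyword (hypothetical helper).
     */
    static List<String> classify(String text, String ignored, Set<String> keywords) {
        List<String> result = new ArrayList<>();
        StringBuilder chunk = new StringBuilder();
        for (int i = 0; i <= text.length(); i++) {
            // Assumption: the end of the input closes the current chunk.
            boolean atEnd = i == text.length();
            if (atEnd || ignored.indexOf(text.charAt(i)) >= 0) {
                if (chunk.length() > 0) {
                    String s = chunk.toString();
                    result.add((keywords.contains(s) ? "KEYWORD:" : "NORMAL:") + s);
                    chunk.setLength(0);
                }
            } else {
                chunk.append(text.charAt(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "for" stands alone, but "form" merely contains it.
        System.out.println(classify("for form", " \t\r\n", Set.of("for")));
        // prints [KEYWORD:for, NORMAL:form]
    }
}
```

Comparing whole chunks rather than searching for substrings is what keeps for from being recognized inside form.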

Delimiter strings split the text into pieces. If several delimiters match at the same position, the longest one is taken. Example: if ++ and + are delimiters, the text i+++j yields the tokens i, ++, +, and j.
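
The longest-match rule for the i+++j example can be sketched as follows. This is a standalone illustration under stated assumptions, not the SimpleLexer implementation: LongestMatchDemo and tokenize are hypothetical names, and keywords, ignored characters, and quoted text are left out.

```java
import java.util.ArrayList;
import java.util.List;

final class LongestMatchDemo {

    /**
     * Splits text at the given delimiters, preferring the longest delimiter
     * that matches at each position (hypothetical helper).
     */
    static List<String> tokenize(String text, String... delimiters) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            // Find the longest delimiter starting at position i, if any.
            String best = null;
            for (String d : delimiters) {
                if (!d.isEmpty() && text.startsWith(d, i)
                        && (best == null || d.length() > best.length())) {
                    best = d;
                }
            }
            if (best == null) {
                // No delimiter here: the character belongs to a normal token.
                current.append(text.charAt(i));
                i++;
            } else {
                // Flush the pending normal token, then emit the delimiter.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                tokens.add(best);
                i += best.length();
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("i+++j", "+", "++")); // prints [i, ++, +, j]
    }
}
```

The order in which the delimiters are passed does not matter in this sketch, because every candidate is compared by length before one is chosen.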

Tokens that are neither keywords, delimiters, nor quoted text are of type NormalToken.


Constructor Summary
SimpleLexer(char escape, char quote, java.lang.String ignoredDelimitingCharacters, TerminalSymbolType... terminalSymbols)
          Creates an instance for the specified escape character, quote character, ignored delimiting characters, and terminal symbols.
SimpleLexer(char escape, char quote, TerminalSymbolType... terminalSymbols)
          Creates an instance for the specified escape character, quote character, and terminal symbols.
SimpleLexer(TerminalSymbolType... terminalSymbols)
          Creates an instance for the specified terminal symbols and no quoting.
 
Method Summary
 TokenIterator createTokenIterator(java.io.Reader reader)
          Creates a token iterator for the specified reader.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleLexer

public SimpleLexer(TerminalSymbolType... terminalSymbols)
Creates an instance for the specified terminal symbols and no quoting. Ignored delimiting characters are ' ', TAB, CR, and LF.


SimpleLexer

public SimpleLexer(char escape,
                   char quote,
                   TerminalSymbolType... terminalSymbols)
Creates an instance for the specified escape character, quote character, and terminal symbols. Ignored delimiting characters are ' ', TAB, CR, and LF.


SimpleLexer

public SimpleLexer(char escape,
                   char quote,
                   java.lang.String ignoredDelimitingCharacters,
                   TerminalSymbolType... terminalSymbols)
Creates an instance for the specified escape character, quote character, ignored delimiting characters, and terminal symbols.

Method Detail

createTokenIterator

public TokenIterator createTokenIterator(java.io.Reader reader)
Description copied from interface: Lexer
Creates a token iterator for the specified reader.

Specified by:
createTokenIterator in interface Lexer