Pcom is a Parser COMbinator library for Java. It is based on parsing expression grammars (PEG) and has a small, simple API.
Download and install Pcom, then write your code.
To install manually, download the jar files for pcom, flist and tuple and add them to your environment. (You must also download and install "Apache commons lang".)
If you use Maven, add the repository and dependency elements below to your pom.xml. (flist and "Apache commons lang" are downloaded automatically.)
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    ...
    <repositories>
        <repository>
            <id>jp.nhiguchi</id>
            <name>jp.nhiguchi</name>
            <url>http://www.nhiguchi.jp/mvnrepos/</url>
        </repository>
    </repositories>
    ...
    <dependencies>
        <dependency>
            <groupId>jp.nhiguchi.tuple</groupId>
            <artifactId>tuple</artifactId>
            <version>0.0.1</version>
            <type>jar</type>
        </dependency>
        <dependency>
            <groupId>jp.nhiguchi.pcom</groupId>
            <artifactId>pcom</artifactId>
            <version>1.0.0a3</version>
            <type>jar</type>
        </dependency>
    </dependencies>
    ...
</project>
Javadoc API documentation is available.
"Parser" and "ParseResult" are the key classes.
"Parsers" provides factory methods for building "Parser" instances.
import jp.nhiguchi.pcom.*;
import static jp.nhiguchi.pcom.Parsers.*;

public class YourClass
{
    public static void main(String[] args)
    {
        Parser<String> pa = string("a");
        Parser<String> pA = string("A");
        Parser<String> p
            = concat(rep1(or(pa, pA)));
        ParseResult<String> pr = p.parse("aAaaArest");
        System.out.println(pr);
    }
}
You can build a parser directly from a PEG parsing expression. (The expression may contain terminal symbols only. Nonterminals are not allowed.)
import jp.nhiguchi.pcom.*;
import static jp.nhiguchi.pcom.Parsers.*;

public class YourClass
{
    public static void main(String[] args)
    {
        /* C-like comments. */
        Parser<String> p = expr("'/*' (!'*/' .)* '*/'");
        ParseResult<String> pr = p.parse("/* comment */");
        System.out.println(pr);
    }
}
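To make the expression '/*' (!'*/' .)* '*/' concrete, the following is a minimal hand-written matcher in plain Java that walks through the same three PEG steps. It does not use Pcom and says nothing about how expr() is implemented; the names CommentMatcherSketch and matchComment are hypothetical, chosen for illustration only.

```java
final class CommentMatcherSketch {
    /**
     * Returns the index just past a C-like comment that starts at pos,
     * or -1 if no comment matches there. (Hypothetical helper, not Pcom API.)
     */
    static int matchComment(String input, int pos) {
        // '/*' : literal opening delimiter
        if (!input.startsWith("/*", pos)) {
            return -1;
        }
        int i = pos + 2;
        // (!'*/' .)* : while the not-predicate succeeds (no "*/" ahead)
        // and a character is available, consume exactly one character.
        while (i < input.length() && !input.startsWith("*/", i)) {
            i++;
        }
        // '*/' : literal closing delimiter
        if (!input.startsWith("*/", i)) {
            return -1;
        }
        return i + 2;
    }

    public static void main(String[] args) {
        String src = "/* comment */rest";
        int end = matchComment(src, 0);
        System.out.println(src.substring(0, end));
        System.out.println(matchComment("/* unterminated", 0));
    }
}
```

The not-predicate consumes nothing by itself; it only decides whether the following "." is allowed to consume one more character, which is why the loop stops exactly at the closing delimiter.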
A recursive parser must be built indirectly. (If you try to build one directly, the factory method calls itself without end and the stack overflows.)
The easy way to build a recursive parser is to use RecursionMark.
Another way is to write a lazy parser by hand.
import jp.nhiguchi.pcom.*;
import static jp.nhiguchi.pcom.Parsers.*;

public class YourClass
{
    // Calls itself while the parser is still being built,
    // so the factory method never returns.
    private static Parser<String> badParser()
    {
        return concat(seq(
            string("["),
            or(badParser(), string("a")),
            string("]")));
    }

    // Recursion through RecursionMark.
    private static Parser<String> goodParser1()
    {
        RecursionMark<String> m = new RecursionMark<String>();
        return mark(m, concat(seq(
            string("["),
            or(recur(m), string("a")),
            string("]"))));
    }

    // Recursion through a hand-written lazy parser:
    // goodParser2() is called again only when q.parse() runs.
    private static Parser<String> goodParser2()
    {
        Parser<String> q = new Parser<String>() {
            public ParseResult<String> parse(String str) {
                return goodParser2().parse(str);
            }};
        return concat(seq(
            string("["),
            or(q, string("a")),
            string("]")));
    }

    public static void main(String[] args)
    {
        // Stack overflow
        //Parser<String> bp = badParser();
        Parser<String> gp1 = goodParser1();
        Parser<String> gp2 = goodParser2();
        ParseResult<String> pr1 = gp1.parse("[[[a]]]");
        System.out.println(pr1);
        ParseResult<String> pr2 = gp2.parse("[[[a]]]");
        System.out.println(pr2);
    }
}
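The difference between the failing and the working definitions is only a question of when the recursive call happens. The sketch below reproduces the idea in plain Java without Pcom, assuming a deliberately tiny parser type that returns the index after a match or -1. Every name in it (P, literal, seq, or, lazy, nested) is hypothetical and unrelated to Pcom's actual classes.

```java
import java.util.function.Supplier;

final class LazyRecursionSketch {
    /** Hypothetical minimal parser: returns the index after a match, or -1. */
    interface P {
        int parse(String s, int pos);
    }

    static P literal(String lit) {
        return (s, pos) -> s.startsWith(lit, pos) ? pos + lit.length() : -1;
    }

    /** Runs the parsers one after another; fails as soon as one fails. */
    static P seq(P... ps) {
        return (s, pos) -> {
            int i = pos;
            for (P p : ps) {
                i = p.parse(s, i);
                if (i < 0) {
                    return -1;
                }
            }
            return i;
        };
    }

    /** Ordered choice: tries b only if a fails at the same position. */
    static P or(P a, P b) {
        return (s, pos) -> {
            int r = a.parse(s, pos);
            return r >= 0 ? r : b.parse(s, pos);
        };
    }

    /** Defers building the inner parser until parse time. */
    static P lazy(Supplier<P> supplier) {
        return (s, pos) -> supplier.get().parse(s, pos);
    }

    /** Nested <- '[' (Nested / 'a') ']' */
    static P nested() {
        // Calling nested() directly here would recurse without end while
        // the parser is being constructed; lazy() postpones the call until
        // input is actually being parsed, where each level consumes a '['.
        return seq(literal("["),
                   or(lazy(LazyRecursionSketch::nested), literal("a")),
                   literal("]"));
    }

    public static void main(String[] args) {
        System.out.println(nested().parse("[[[a]]]", 0));
        System.out.println(nested().parse("[[a]", 0));
    }
}
```

Because every recursive step first has to match "[", the recursion depth at parse time is bounded by the input, which is what makes the deferred form safe.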
Pcom does not support left-recursive parsers. (A left-recursive parser re-enters itself without consuming any input, so a stack overflow inevitably occurs in its parse() method.)
Use Parsers.rep() or Parsers.rep1() instead of a left-recursive definition.
import jp.nhiguchi.pcom.*;
import static jp.nhiguchi.pcom.Parsers.*;

public class YourClass
{
    /**
     * ListOfA <- ListOfA / 'A'
     * (left recursive: do not use)
     */
    private static Parser<String> listOfA()
    {
        RecursionMark<String> m = new RecursionMark<String>();
        return mark(m, or(recur(m), string("A")));
    }

    public static void main(String[] args)
    {
        // Repetition instead of left recursion.
        Parser<String> p = concat(rep(string("A")));
        ParseResult<String> pr = p.parse("AAAAA");
        System.out.println(pr);

        Parser<String> q = listOfA();
        // Stack overflow
        //ParseResult<String> qr = q.parse("AAAAA");
    }
}
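A common reason to reach for left recursion is a left-associative operator, for example Sum <- Sum '-' Num / Num. The sketch below shows the usual rewrite, Num ('-' Num)*, evaluated with a left fold so that associativity is preserved. It is plain Java written for illustration, not Pcom code, and the names LeftRecursionSketch, evalSubtractions and number are hypothetical.

```java
final class LeftRecursionSketch {
    /**
     * Evaluates input of the form Num ('-' Num)*, the iterative replacement
     * for the left-recursive rule Sum <- Sum '-' Num / Num.
     * Returns null if the input does not match completely.
     */
    static Integer evalSubtractions(String input) {
        int[] pos = {0};
        Integer acc = number(input, pos);
        if (acc == null) {
            return null;
        }
        // ('-' Num)* : loop instead of recursing on the left.
        while (pos[0] < input.length() && input.charAt(pos[0]) == '-') {
            pos[0]++;
            Integer rhs = number(input, pos);
            if (rhs == null) {
                return null;
            }
            acc = acc - rhs; // folding from the left keeps '-' left associative
        }
        return pos[0] == input.length() ? acc : null;
    }

    /** Num <- [0-9]+ ; advances pos[0] past the digits on success. */
    private static Integer number(String input, int[] pos) {
        int start = pos[0];
        while (pos[0] < input.length()
                && input.charAt(pos[0]) >= '0' && input.charAt(pos[0]) <= '9') {
            pos[0]++;
        }
        return pos[0] == start ? null : Integer.valueOf(input.substring(start, pos[0]));
    }

    public static void main(String[] args) {
        System.out.println(evalSubtractions("10-3-2")); // (10 - 3) - 2
        System.out.println(evalSubtractions("1-"));
    }
}
```

With a combinator library the same shape would typically be a repetition followed by a fold over the collected operands, rather than an explicit loop.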
Parsers.map() applies a given function to a parsed value. A typical use of map is data conversion. Data conversion with map can also be used to construct an AST.
import jp.nhiguchi.pcom.*;
import static jp.nhiguchi.pcom.Parsers.*;

public class YourClass
{
    /**
     * Data conversion: String -> Integer
     */
    public static void main(String[] args)
    {
        Map1<String, Integer> toInt
            = new Map1<String, Integer>() {
                public Integer map(String v) {
                    return Integer.parseInt(v);
                }};
        Parser<Integer> p = map(toInt, expr("[0-9]+"));
        ParseResult<Integer> pr = p.parse("12345");
        System.out.println(pr);
        // Integer.valueOf replaces the deprecated Integer constructor.
        System.out.println(Integer.valueOf(12345).equals(pr.value()));
    }
}
import jp.nhiguchi.pcom.*;
import static jp.nhiguchi.pcom.Parsers.*;

public class YourClass
{
    private static class Person
    {
        private final String fName;
        private final Integer fAge;

        private Person(String name, Integer age)
        {
            fName = name;
            fAge = age;
        }
    }

    /**
     * Construct AST
     */
    public static void main(String[] args)
    {
        Map1<String, Integer> toInt
            = new Map1<String, Integer>() {
                public Integer map(String v) {
                    return Integer.parseInt(v);
                }};
        Map2<String, Integer, Person> toPerson
            = new Map2<String, Integer, Person>() {
                public Person map(String name, Integer age) {
                    return new Person(name, age);
                }};
        Parser<String> name
            = followedBy(expr("[a-zA-Z .]+"), string(":"));
        Parser<Integer> age = map(toInt, expr("[0-9]+"));
        Parser<Person> p = map(toPerson, name, age);
        ParseResult<Person> pr = p.parse("Naoshi HIGUCHI:34");
        System.out.println(pr);
        System.out.println(pr.value().fName);
        System.out.println(pr.value().fAge);
    }
}
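The idea behind map can be shown without Pcom: wrap a parser so that, on success, a function is applied to the parsed value while the consumed input stays the same. The following plain-Java sketch assumes a hypothetical parser type P<T> and result type Result<T>; these are simplified stand-ins and not Pcom's Parser and ParseResult.

```java
import java.util.function.Function;

final class MapSketch {
    /** Hypothetical result: the parsed value plus the unconsumed input. */
    static final class Result<T> {
        final T value;
        final String rest;

        Result(T value, String rest) {
            this.value = value;
            this.rest = rest;
        }
    }

    /** Hypothetical parser: returns null on failure. */
    interface P<T> {
        Result<T> parse(String input);
    }

    /** Digits <- [0-9]+ */
    static P<String> digits() {
        return input -> {
            int i = 0;
            while (i < input.length()
                    && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                i++;
            }
            if (i == 0) {
                return null;
            }
            return new Result<String>(input.substring(0, i), input.substring(i));
        };
    }

    /** Applies f to the value of a successful parse; failures pass through. */
    static <A, B> P<B> map(Function<A, B> f, P<A> p) {
        return input -> {
            Result<A> r = p.parse(input);
            if (r == null) {
                return null;
            }
            return new Result<B>(f.apply(r.value), r.rest);
        };
    }

    /** Data conversion: String -> Integer */
    static P<Integer> intParser() {
        Function<String, Integer> toInt = s -> Integer.parseInt(s);
        return map(toInt, digits());
    }

    public static void main(String[] args) {
        Result<Integer> r = intParser().parse("12345rest");
        System.out.println(r.value + 1);
        System.out.println(r.rest);
    }
}
```

Since map only touches the value, mapped parsers compose like any other parser, which is what makes it usable for building AST nodes from smaller parsed pieces.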
Pcom provides a substitute for an operator-precedence parser.
import jp.nhiguchi.pcom.*;
import static jp.nhiguchi.pcom.Parsers.*;
import jp.nhiguchi.pcom.opp.*;

public class YourClass
{
    private static Parser<String> symbol(String str)
    {
        Parser<String> ws = expr("[ \t]*");
        return trim(ws, string(str));
    }

    /**
     * Simple calculator
     */
    public static void main(String[] args)
    {
        Map1<String, Integer> toInt
            = new Map1<String, Integer>() {
                public Integer map(String v) {
                    return Integer.parseInt(v);
                }};
        Binary<Integer> add
            = new Binary<Integer>() {
                public Integer map(Integer v1, Integer v2) {
                    return v1 + v2;
                }};
        Binary<Integer> subtract
            = new Binary<Integer>() {
                public Integer map(Integer v1, Integer v2) {
                    return v1 - v2;
                }};
        Binary<Integer> multiply
            = new Binary<Integer>() {
                public Integer map(Integer v1, Integer v2) {
                    return v1 * v2;
                }};
        Binary<Integer> divide
            = new Binary<Integer>() {
                public Integer map(Integer v1, Integer v2) {
                    return v1 / v2;
                }};
        Operator<Integer> opAdd
            = Operator.infixL(100, add, symbol("+"));
        Operator<Integer> opSubtract
            = Operator.infixL(100, subtract, symbol("-"));
        Operator<Integer> opMultiply
            = Operator.infixL(200, multiply, symbol("*"));
        Operator<Integer> opDivide
            = Operator.infixL(200, divide, symbol("/"));
        OppBuilder b = new OppBuilder()
            .setOperandParser(map(toInt, expr("[0-9]+")))
            .add(opAdd)
            .add(opSubtract)
            .add(opMultiply)
            .add(opDivide)
            .addParentheses(symbol("("), symbol(")"))
            .addParentheses(symbol("{"), symbol("}"))
            .addParentheses(symbol("["), symbol("]"));
        Parser<Integer> p = b.toParser();
        ParseResult<Integer> pr = p.parse("[2 * {(2 + 3 * 2) / 2}] - 1");
        System.out.println(pr);
    }
}
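For readers who want to see what the precedence numbers 100 and 200 mean in practice, here is a small stand-alone evaluator in plain Java. It uses precedence climbing, a standard technique that may well differ from what OppBuilder does internally, and handles the same four left-associative operators and three bracket pairs. The class PrecedenceClimbingSketch and all of its methods are hypothetical and are not part of Pcom.

```java
final class PrecedenceClimbingSketch {
    private final String src;
    private int pos = 0;

    private PrecedenceClimbingSketch(String src) {
        this.src = src;
    }

    /** Evaluates a whole expression; throws on malformed input. */
    static int eval(String src) {
        PrecedenceClimbingSketch p = new PrecedenceClimbingSketch(src);
        int v = p.expression(0);
        p.skipSpaces();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return v;
    }

    /** Precedence of an infix operator, or -1 if c is not an operator. */
    private static int precedence(char c) {
        switch (c) {
            case '+': case '-': return 100;
            case '*': case '/': return 200;
            default: return -1;
        }
    }

    /** Parses an operand followed by operators of precedence >= minPrec. */
    private int expression(int minPrec) {
        int lhs = operand();
        while (true) {
            skipSpaces();
            if (pos >= src.length()) {
                return lhs;
            }
            char op = src.charAt(pos);
            int prec = precedence(op);
            if (prec < minPrec) {
                return lhs; // weaker operator or not an operator at all
            }
            pos++;
            // Left associative: the right-hand side may only contain
            // operators that bind strictly tighter than op.
            int rhs = expression(prec + 1);
            lhs = apply(op, lhs, rhs);
        }
    }

    /** Operand <- '(' Expr ')' / '[' Expr ']' / '{' Expr '}' / [0-9]+ */
    private int operand() {
        skipSpaces();
        if (pos >= src.length()) {
            throw new IllegalArgumentException("operand expected at end of input");
        }
        int bracket = "([{".indexOf(src.charAt(pos));
        if (bracket >= 0) {
            pos++;
            int v = expression(0);
            skipSpaces();
            char close = ")]}".charAt(bracket);
            if (pos >= src.length() || src.charAt(pos) != close) {
                throw new IllegalArgumentException("'" + close + "' expected at " + pos);
            }
            pos++;
            return v;
        }
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("operand expected at " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    private static int apply(char op, int a, int b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return a / b; // only '/' remains
        }
    }

    private void skipSpaces() {
        while (pos < src.length() && (src.charAt(pos) == ' ' || src.charAt(pos) == '\t')) {
            pos++;
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("[2 * {(2 + 3 * 2) / 2}] - 1"));
    }
}
```

A higher number binds tighter, so 3 * 2 is grouped before the addition, and passing prec + 1 for the right-hand side is what makes 10 - 3 - 2 evaluate as (10 - 3) - 2.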