Haskell > Scala > (Java 7 Functional Java) > Java

5 09 2008

Write a parser that accepts command line arguments.
For each command line argument, print out true or false,
depending on whether or not that value parses.

For a value to parse, it must consist entirely of these
characters: '(', ')', '[', ']'. Any other character renders
the parse invalid and so prints “false”.

The value must also be “balanced”. That is, for each opening
parenthesis or bracket, it is matched by a closing parenthesis
or bracket accordingly. The different bracket types must not
intersperse.

Here are some examples of successfully parsing values that
result in printing “true”:

""
"()"
"[]"
"([])"
"[()]"
"[]()"
"[][[([])]]"

Here are some examples of values that fail the parsing and
result in printing “false”:

"("
")"
"["
"]"
"]["
")("
"( )"
"([)"
"[)]"
"([)]"
"({})"
"[())]"

Using Haskell (without Parsec):

import Prelude hiding (foldr, any)
import Control.Monad
import Data.Foldable
import System

tokens = [('(', ')'), ('[', ']')]

parse x = any null (foldr (\c f -> (\s -> (const (c:s)) `fmap`
            flip lookup ((uncurry (flip (,))) `fmap` tokens)  c `mplus`
              (flip lookup tokens c >>= (\v ->
                do h:t <- return s
                   guard (v == h)
                   return t))) =<>= print . (parse `fmap`)

Using Scala (without scala.util.parsing):

object Parser {
  val tokens = List(('(', ')'), ('[', ']'))

  def parse(s: String) =
    s.foldRight[Option[String]](Some(""))((a, b) => b flatMap
      (ss => {
        val s = ss.toList
        (lookup(tokens map { case (x, y) => (y, x) }, a)
          map (t => (a::s)) orElse
        (lookup(tokens, a) flatMap (v => s match {
          case Nil => None
          case h :: t => if(v == h) Some(t) else None
        }))) map (_.mkString)
      })) exists (_.isEmpty)

  def main(args: Array[String]) {
    args foreach (parse _ andThen println)
  }

  // The Scala API needs hand-holding
  def lookup[A, B](x: List[(A, B)], a: A) =
    x.find { case (aa, _) => a == aa } map (_._2)
}

Using Java 1.5 and Functional Java:

import fj.F;
import fj.F2;
import fj.Function;
import static fj.P.p;
import fj.P1;
import fj.P2;
import fj.data.List;
import static fj.data.List.asString;
import static fj.data.List.fromString;
import static fj.data.List.list;
import static fj.data.List.lookup;
import fj.data.Option;
import static fj.data.Option.some;
import static fj.function.Strings.isEmpty;
import static fj.pre.Equal.charEqual;
import static fj.pre.Show.booleanShow;

final class Parser {
  @SuppressWarnings("unchecked")
  static final List<p2> tokens =
    list(p('(', ')'), p('[', ']'));

  static boolean parse(final String s) {
    return fromString(s).foldRight(
      new F2<character, Option, Option>() {
      public Option f(final Character c, final Option b) {
        return b.bind(new F<string, Option>() {
          public Option f(String s) {
            final List ss = fromString(s);
            return lookup(charEqual,
              tokens.map(P2.swap_()), c)
                .map(Function.<character,
                  List>constant(ss.cons(c)))
                .orElse(lookup(charEqual, tokens, c)
                .bind(new F<character, Option<list>>() {
                  public Option<list> f(final Character v) {
                    return ss.isEmpty() || v != ss.head() ?
                        Option.<list>none() :
                        some(ss.tail());
                  }
                })).map(asString());
          }
        });
      }
    }, some("")).exists(isEmpty);
  }

  public static void main(final String[] args) {
    for(final String s : args) {
      booleanShow.println(parse(s));
    }
  }
}

Using Java BGGA and Functional Java:

import fj.Function;
import static fj.P.p;
import fj.P1;
import fj.P2;
import fj.data.List;
import static fj.data.List.asString;
import static fj.data.List.fromString;
import static fj.data.List.list;
import static fj.data.List.lookup;
import fj.data.Option;
import static fj.data.Option.some;
import static fj.function.Strings.isEmpty;
import static fj.pre.Equal.charEqual;
import static fj.pre.Show.booleanShow;

final class ParserBGGA {
  @SuppressWarnings("unchecked")
  static final List<p2> tokens =
    list(p('(', ')'), p('[', ']'));

  static boolean parse(final String s) {
    return fromString(s).foldRight({ char c, Option b =>
      b.bind({ String ss =>
        final List s = fromString(ss);
          lookup(charEqual,
            tokens.map(P2.swap_()), c)
          .map(Function.<character, List>constant(s.cons(c)))
          .orElse(lookup(charEqual, tokens, c)
          .bind({ char v =>
            s.isEmpty() || v != s.head() ?
              Option.<list>none() :
              some(s.tail())})).map(asString())})}
      , some("")).exists(isEmpty);
  }

  public static void main(final String[] args) {
    for(final String s : args) {
      booleanShow.println(parse(s));
    }
  }
}

A submission by a Java programmer who volunteered:

import java.util.ArrayList;

public class IsBalanced {

  private Stack stack = new Stack();

  public boolean isBalanced(String expression) {

    boolean isBalanced = true;

    for (int i = 0; i < expression.length(); i++) {
      char nextChar = expression.charAt(i);

      if (nextChar == '(') {
        stack.push('(');
      }
      else if (nextChar == '[') {
        stack.push('[');
      } else {
        if (!stack.isEmpty()) {
          char stackChar = (Character) stack.peek();
          if (nextChar == ')') {
            if (stackChar == '(') {
              stack.pop();
            } else {
              return false;
            }

          } else if (nextChar == ']') {
            if (stackChar == '[') {
              stack.pop();
            } else {
               return false;
            }
          }
        } else {
          return false;
        }
      }
    }

    if (!stack.isEmpty()) {
      isBalanced = false;
    }
    return isBalanced;

  }

  public static void main(String[] args) {
    System.out.println(args[0]);
    System.out.println((new IsBalanced()).isBalanced(args[0]));
  }

  private class Stack {

    private ArrayList stack = new ArrayList();

    public void push(char c) {
      stack.add(c);

    }

    public void pop() {
      stack.remove(stack.size()-1);
    }

    public Object peek() {
      return stack.get(stack.size()-1);
    }

    public boolean isEmpty() {
      return stack.size() == 0;
    }
  }
}

Forget the syntax for a moment and consider the compositional properties of each of these programs. For example, what if I were to add another pair of brackets such as '{' and '}'? Are there sub-programs that exist independently and can be extracted from the larger program? How does that affect my ability to reason (and therefore, read) the code?

What properties about the code make it easy to read?


Actions

Information

58 responses

5 09 2008
augustss

I’m sorry, but that Haskell code is unreadable. Here’s an alternative:

import Control.Monad
import Text.ParserCombinators.Parsec
import System

pArg =
     do char '('; pArg; char ')'
 <|> do char '['; pArg; char ']'
 <|> return ' '

testArg = either (const False) (const True) . parse (pArg >> eof) ""

main = getArgs >>= print . fmap testArg
5 09 2008
huu

valid ::= ‘(‘ valid ‘)’ | ‘[' valid ']‘ | valid valid | e

5 09 2008
bearophile

Python version, you can add a new pair of brackets adding them to the parens tuple:
def parse(s, parens=(“[]“,”()”)):
__ return (not s or
__ any((s[0]==o and parse(s[1:-1]) and s[-1]==c) for o,c in parens)
__ or any(parse(s[:i]) and parse(s[i:]) for i in xrange(1, len(s))))

5 09 2008
JN

object Parser {
def tokenMap = Map(‘(‘ -> ‘)’, ‘[' -> ']‘)

def parseUntil(l : List[Char], end : Char) : List[Char] =
if (l.head == end) l.tail else parseUntil(parseUntil(l.tail, tokenMap(l.head)), end)

def parse(l : List[Char]) : List[Char] =
if (l.isEmpty) l else parse(parseUntil(l.tail, tokenMap(l.head)))

def main(args : Array[String]) = args map (_.toList) foreach parse
}

5 09 2008
bearophile

A bit better:
def parse(s, parens=(“[]“, “()”)):
__ return (not s or
___ (parse(s[1:-1]) and any(s[0]==o and s[-1]==c for o,c in parens))
___ or any(parse(s[:i]) and parse(s[i:]) for i in xrange(1, len(s))))

5 09 2008
JN

Or in an imperative style:

object Parser {
  def tokenMap = Map('(' -> ')', '[' -> ']')

  def parsePair(l : List[Char]) : List[Char] = {
    var l2 = l.tail

    while (l2.head != tokenMap(l.head))
      l2 = parsePair(l2)

    l2.tail
  }

  def parse(l : List[Char]) = {
    var l2 = l

    while (!l2.isEmpty)
      l2 = parsePair(l2)

    l2
  }

  def main(args : Array[String]) = args map (_.toList) foreach parse
}
5 09 2008
Stephen Gregory

I agree Java isn’t usually the most elegant for the sake of comparison it could be done like this.


import java.util.Stack;

public class IsBalanced {

private static final boolean matches(char stack, char next){
switch (stack){
case '(': return next ==')';
case '[': return next ==']';
}
return false;
}

public boolean isBalanced(String expression) {
Stack stack = new Stack();

for(char nextChar: expression.toCharArray()){
if (nextChar == '('|| nextChar == '[') {
stack.push(nextChar);
} else if (!stack.isEmpty()&&matches(stack.peek(), nextChar)) {
stack.pop();
} else {
return false;
}
}

return stack.isEmpty();
}

public static void main(String[] args) {
System.out.println(args[0]);
System.out.println((new IsBalanced()).isBalanced(args[0]));
}

}

5 09 2008
walter chang

even shorter, in scala:

import util.parsing.combinator.syntactical._

object BracketParenParsers extends StandardTokenParsers {

def main(args: Array[String]) {
args.foreach(parse)
}

lexical.delimiters += (“[", "]“, “(“, “)”)

def pairs: Parser[Any] = ( “[" ~ pairs ~ "]” | “(” ~ pairs ~ “)” )*

def parse(input: String) {
phrase(pairs)(new lexical.Scanner(input)) match {
case Success(_, _) => println(“true”)
case _ => println(“false”)
}
}
}

6 09 2008
olt

IMHO they are all very hard to read. The only readable is the last Java one, but that one is not extensible.

So, python FTW!
http://paste.pocoo.org/show/84473/

6 09 2008
Foo

Man, there’s some crappy Java there.

import java.util.*;
import java.io.*;

public class Parse {
public static ArrayList in = new ArrayList(Arrays.asList(new Object[] { ‘(‘, ‘[' }));
public static ArrayList out = new ArrayList(Arrays.asList(new Object[] { ‘)’, ‘]’ }));

public static boolean isBalanced(char[] vals) {
char[] old = new char[vals.length];
int j = -1;
for(char val : vals) {
int n = out.indexOf(val);
if (j >= 0 && n >= 0 && old[j] == in.get(n)) –j;
else old[++j] = val;
}
return j == -1;
}

public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
while(scan.hasNextLine())
System.out.println(isBalanced(scan.nextLine().toCharArray()));
}
}

6 09 2008
Evgeniy

public class IsBalanced {
private final String[] pairs = {“\\(\\)”, “\\[\\]“, “\\{\\}”};
public boolean isBalanced(String subj) {
int len;
while (true) {
len = subj.length();
for (int i = 0; i < pairs.length; i++) {
subj = subj.replaceAll(pairs[i], “”);
}
if (len == subj.length())
break;
}
return subj.length() == 0;
}
public final static void main(String[] args) {
IsBalanced me = new IsBalanced();
System.out.println(“Subject string is” + (me.isBalanced(args[0]) ? “” : ” NOT”) + ” balanced”);
}
}

6 09 2008
lamby

/(?(\((?>(?&r))*\))|(\[(?>(?&r))*\]))*/ under PCRE.

6 09 2008
Marcio

I’m sorry, but what is the point? Your Haskell code is UNREADABLE! It’s like russian or chinese!

6 09 2008
cschneid

<?php

foreach (array_splice($argv, 1) as $arg)
{
$tokens = array(“()”, “[]“);
while (($new = str_replace($tokens, array(), $arg)) != $arg)
$arg = $new;
echo $arg === “” ? “true\n” : “false\n”;
}

6 09 2008
lamby

Prolog anyone?

balanced(X) :- balanced(X, []).
balanced([], []) :- true.
balanced([H|T], [H|Ts]) :- balanced(T, Ts).
balanced(['['|T], S) :- balanced(T, [']‘|S]).
balanced(['('|T], S) :- balanced(T, [')'|S]).

6 09 2008
michael

It depends on the programmer

List in = Arrays.asList(new Character[] { '(', '[' });
List out = Arrays.asList(new Character[] { ')', ']' });

boolean isBalanced(char[] characters) {
  Stack stack = new Stack();

  for(char character : characters) {
    if(in.contains(character)) {
      stack.push(character);
    } else if(out.contains(character) && !stack.isEmpty() && in.indexOf(stack.peek()) == out.indexOf(character)) {
      stack.pop();
    } else {
      return false;
    }
  }

 return stack.isEmpty();
}
6 09 2008
Bob

huu has it right. CFG >> all other languages for this problem.

I wish people would stop making up trivial problems and using them to “prove” their language is best, but in this case the Haskell code offered is so amusingly obscure it refutes your thesis, so I guess it’s ok.

6 09 2008
Stefan Wagner

struggeling again with the wordpress-syntax – maybe it’s fixable, and maybe a permanent link is possible.

public class PairMatcher
{
	public String [] pairs;
	public PairMatcher (String [] pairs)
	{
		this.pairs = pairs;
	}

	/** reduce as long as it has an effect */
	public boolean wellformed (String expression)
	{
		String old;
		do
		{
			old = expression;
			expression = reduce (expression);
		} while (old.length () > (expression.length ()));

		return expression.length () == 0;
	}

  	/**	replace a pair in the kern "([()])" => "([])"  	*/
	public String reduce (String expression)
	{
		for (String pair : pairs)
		{
			if (expression.contains (pair))
				return expression.replace (pair, "");
		}
		return expression;
	}

	public static void main (String[] args)
	{
		PairMatcher pm = new PairMatcher (new String [] {"[]", "()"});
		for (String test : args)
		{
			System.out.println (test + "\t" + pm.wellformed (test));
		}
	}
}

and maybe I missed the topic of the lesson. :)

6 09 2008
Stefan Wagner

Mhm. Talking about wordpress:
The <pre>

foo
    bar
         code
    rab
oof

</pre>
works quiet nice, but not in the preview. :)
And I would like to use those nice colorizing.
I tried {{syntax:java}} and [code:java]] String foo="foo"; [[/code] without success...

6 09 2008
Spencer Tipping

If you’re willing to use ASCII values, then you can make the code algorithmically very simple by using the two least-significant bits as markers. The second-least significant bit is 1 for closers and 0 for openers, and the least-significant bit is 1 for square brackets and 0 for parens.

(How do you get the code to indent?)

import java.util.LinkedList;

public class match {
  private static boolean chars_valid (char[] cs) {
    for (char c : cs)
      if ("()[]".indexOf (c) == -1)
        return false;
    return true;
  }

  private static boolean match_valid (char[] cs) {
    LinkedList<Character> state = new LinkedList<Character> ();
    for (char c : cs)
      if ((c & 0x2) == 0)
        state.addFirst (c);
      else if ((state.size () == 0) || (((state.remove () ^ c) & 0x1) != 0))
        return false;
    return state.size () == 0;
  }

  public static void main (String[] args) {
    for (String a : args)
      System.out.println (
        chars_valid (a.toCharArray ()) &&
        match_valid (a.replace ('(', '0').replace (')', '2')
                      .replace ('[', '1').replace (']', '3').toCharArray ()));
  }
}

This might be more elegant in C.

Spencer

6 09 2008
cschneid

PHP:

foreach (array_splice($argv, 1) as $arg)
{
	$tokens = array("()", "[]");
	while (($new = str_replace($tokens, array(), $arg)) != $arg)
		$arg = $new;
	echo $arg === "" ? "true\n" : "false\n";
}
6 09 2008
Tony Morris

So er thanks for all the comments everybody. There are no doubt more elegant solutions to this problem (heck, the existence of Parsec – QED). However, the “thesis” (Bob) is to observe compositional aspects of the programs.

6 09 2008
Java Programmer

What you write tells lot about you. Looking at your coding, at best it describes you as armature programmer. Do you really use f, s and j as variable names? May be when you are the only one to read your code (or never read again at all)

If you write code like this in any organization you wouldn’t last to see a single full Friday in office

6 09 2008
Will

Yuck. What is this? Why hasn’t someone chimed in with the proper state machine solution? You guys need some Knuth in your life. Desperately so.

6 09 2008
Tony Morris

Who else would submit the charge of amateurism (armaturism?) for variable names, but a self-proclaimed “Java Programmer”. tehe, how amusing :)

Will, the proper solution is to use a parser such as those submitted (e.g. monadic parsers as in the case of augustss and scala.util.parsing). I think the point of this post was lost and I should have made it clearer.

I rushed it in the end, since it had been sitting around for a while. Oh well, next time.

6 09 2008
Will

Sweating the compositional aspects at this point won’t gain you an inch, as you haven’t completed your analysis of the problem at hand and grokked its true nature. None of the solutions are conducive to clear thought. The language politics are of no consequence. Everyone loses.

6 09 2008
Will

I agree that using a prepackaged parser makes sense in general. Sorry if I’m coming across like a pathological douche; I know you’re trying to make some weird point and it’s flying over my head.

6 09 2008
jb

Will, by the time you have contemplated the true nature of the stream, the water will have flowed on, and that stream will no longer be in front of you.

6 09 2008
Indefinite Articles » Balanced Tokens in Many Languages

[...] Tony Morris blogs about his solution to the problem of validating ‘balanced pairs’ of to… [...]

6 09 2008
DavidLG

A cute Scala solution using pattern matching:

object Parser {
  def main(args: Array[String]) {
    def takeBalanced(s: List[Char]): Option[List[Char]] = s match {
      case Nil => Some(Nil);
      case ('(' :: ss) => takeBalanced(ss) flatMap {
        case (')' :: sss) => takeBalanced(sss); case _ => None }
      case ('[' :: ss) => takeBalanced(ss) flatMap {
        case (']' :: sss) => takeBalanced(sss); case _ => None }
      case ss => Some(ss);
    }
    for (s <- args) println(takeBalanced(s.toList) == Some(Nil));
  }
}

Btw, this to me is the “functional programming” solution, the one that uses recursion and immutable structures instead of iteration and a mutable stack. I don’t even have a name for the technique where there are “maps” and “folds” on every line, except maybe “unreadable”…

7 09 2008
Foo

Tony, your “thesis” is that Haskell is better than Java. It’s proudly announced in the title of your blog entry. the compositional bit is just tucked in at the very end of a long chunk of code. In your example you post an almost unreadable bit of Haskell, which happens to be shorter than an incompetent bit of Java writing. As I showed, it’s trivial to replace this with Java code which is rather better than your Haskell code from a readability and maintainability standpoint, and is almost as short. (Note that your blogmachine strips out double-minuses in comments, so what should have been “decrement j” got turned into “minus j”; in addition to the stripping out of formatting, bah, is this a coding blog or what?)

I love Haskell. But as it turns out in for this problem, decent Java is far easier to read, maintain, and write, than decent Haskell. Perhaps that’s not what you wanted to say.

7 09 2008
James Iry

Will, this language is not regular so it can’t be recognized by any (finite) state machine. If it could be recognized by a state machine everybody would have written a regular expression and moved on.

It can, however, be recognized by a pushdown automaton. Many of the solutions offered are based on that – some use an explicit stack and some use the call stack. The ones based on parsers treat it as a CFG which is more powerful than necessary but cleanly solves the problem.

7 09 2008
Spencer Tipping

This comes across very nicely in C as a purely-functional program. When properly commented, the code should be easy enough to read:

#include <stdio.h>

int type  (char c) { return c == '(' || c == ')'; }
int open  (char c) { return c == '(' || c == '['; }
int valid (char c) { return c == '(' || c == ')' || c == '[' || c == ']'; }

char *matches (char current, char *rest) {
  // If the current character opens a group, then check if the first character
  // of the rest closes it. If so, then jump past the group. Otherwise, enter
  // one level.

  // Base case. If we get a null string, then return null.
  if (! rest) return NULL;

  // Base case. If we hit the end of the string, then the current character must
  // be null as well to indicate that we've closed everything.
  if (*rest == '') return (current == '') ? rest : NULL;

  // Check. If the current character is invalid, then reject.
  if (! valid (*rest)) return NULL;

  // Base case. If we open and immediately close, then return the character
  // after the closing paren.
  if (open (current) && ! open (*rest) && type (current) == type (*rest))
    return rest + 1;

  // Inductive case. If we open and do not immediately close, then step into the
  // parens. We must be prepared to come back out when that happens, though, so
  // pass the current character into the next run and jump past the closing
  // paren.
  if ((open (current) || current == '') && open (*rest))
    return matches (current, matches (*rest, rest + 1));

  // Base case. If nothing else works, then fail.
  return NULL;
}

int main (int argc, char **argv) {
  int i = 0;
  for (i = 1; i < argc; i++)
    printf ("%s\n", matches ('', argv[i]) ? "true" : "false");
  return 0;
}

This is more straightforward than trying to use imperative programming, since you get a stack for free.

Spencer

7 09 2008
Kevin

@Foo: “decent Haskell” would use Parsec.

7 09 2008
Tony Morris

Thank you Kevin, I’m not sure why this needs stating. Can anyone talk about the compositional aspects of both programs already? Why can it all be abstracted to a monadic parser for example? Does the language matter in answering this question?

This has really got out of hand and I didn’t expect it.

8 09 2008
Neal Gafter

Will: this problem can’t be solved with a state machine. This can be proven using the pumping lemma for regular languages. That is why the proposed solutions use push-down automata.

8 09 2008
Steve Cooper

If we’re interested in composability, I’d definitely want something more general than any of the listed solutions. (I’ll use what I think are haskell type annotations, but my haskell’s not so hot, so excuse any mistakes. I do C# by day.)

First pass, a little more flexible, would be to have a function which accepts a set of brace pairs, with a signature like;

[string]->string->bool.

Or in C#;

bool Parse(string input, IEnumerable bracePairs)

which you partially-apply like;

var tokens = new [] { “[]“, “{}” };
Predicate bracketParser = args => Parse(args, tokens);

so you can pass your list of open-close characters in. That would extend it to use any number of bracket pairs.

More generally, though, why not extract the string data type entirely? The underlying idea, I think, is that we want to pair open and close operations on a sequence. So paren-matching is connected to these predicates (C#);

char => char == ‘(‘
char => char == ‘)’

But who cares about characters specifically? We can get to a general matcher, which will match paired open/close predicates in any sequence. I think the haskell type looks like [(a->bool, a->bool)]->a->bool. (That first type should be ‘list of pairs of predicates of a’. In C#

bool Parse(
IEnumerable
inputSequence,
IEnumerable<Pair<Predicate
, Predicate> openClosePredicates)

I’ve written a C# version which I think is fairly general. It’s on another machine, but I’ll post it if you’re interested.

Basic idea, though, is to use a stack to match open and close predicates. So;

Create a stack of close-predicates. These are, say, the close-brcakets we’re waiting for,

Loop through the sequence.

If you match an open-predicate to the current item (say, char => char == ‘(‘) you push the paired close-predicate (say, char => char == ‘)’) onto the stack.

If you can match the close-predicate at the top of the stack, great — pop it off, we’ve got paired brackets.

Otherwise, you’ve got a mismatched sequence and return false.

At the end, you return true iff you’ve got an empty stack.

Anyway, this allows you to write general open/close parsers. If you wanted to write a parser for a VB-like language, you could pass in paired functions like

statement => Regex.IsMatch(statement, “if .* then”)
statement => statement == “end if”

working on a sequence of lines of code.

Floating around in the back of my head, half-baked, is the idea that maybe you could combine or pass parsers to each other somehow; something like;

vbParser = Parser.Compose(IfBlockParser, ForLoopParser, WhileLoopParser)

Hmm…

8 09 2008
Steve Cooper

Oops. The code that starts that link should look like

bool Parse<A>(
IEnumerable<A> inputSequence,
IEnumerable<Pair<Predicate<A>, Predicate<A>> openClosePredicates)

9 09 2008
Bruce Chapman

java – Emphasising readability over efficiency

public class BracketParser {

    public static void main(String[] args) {
        for(String arg : args) {
            System.out.println(parse(arg));
        }
    }

    public static boolean parse(String arg) {
       while(arg.length() !=0 && ( arg.contains("()") || arg.contains("[]"))) {
            arg = arg.replace("[]", "").replace("()", "");
        }
        return arg.length() == 0;
    }
}
10 09 2008
aes

Spencer Tipping beat me to it with a purely functional C solution – ah well. Let’s compete in brevity then.

A C solution:

#include <stdio.h>

char matching_parens[256] = { ['('] = ')', ['['] = ']' };

const unsigned char* look_for(unsigned char expected, const unsigned char* str)
{
    return (str == 0) ? 0
        :  (*str == expected) ? *str == 0 ? str : str + 1
        :  (matching_parens[*str]) ?
           look_for(expected, look_for(matching_parens[*str], str + 1))
        :  0;
}

int main(int argc, const char* argv[]) {
    while (++argv, --argc) {
        const unsigned char* result = look_for(0, *((unsigned char**) argv));
        printf((result != 0 && *result == 0) ? "true\n" : "false\n");
    }
    return 0;
}
10 09 2008
aes

Trying to write a cleaner and better composed version of the above no less than triples the line count from 20 to 60. But at least it’s neatly split into three ‘modules’:

Bracket configuration. Single point of configuration of valid opening and closing brackets in the initializer of char matching_brackets[]. Public interface:

is_opening_bracket(char c)
get_matching_closing_bracket(char c))

The parsing module. Public interface:

check(const char* str)

The main program. Public interface:

main(int argc, const char* argv[])

#include <stdio.h>
#include <assert.h>

/* Section 1: configuration of brackets and accessors to it. */

/* Mapping from opening brackets to closing brackets. */
char matching_brackets[256] = {
    ['('] = ')',
    ['['] = ']'
};

int is_opening_bracket(char c)
{
    return matching_brackets[(unsigned char) c] != '';
}

char get_matching_closing_bracket(char c)
{
    assert(is_opening_bracket(c));
    return matching_brackets[(unsigned char) c];
}

/* Section 2: the parsing function itself. */

/* Parse a sequence of valid start/end brackets followed by
 * 'expected'.*/
const char* parse(char expected, const char* str)
{
    return
            /* In case of NULL input, return NULL. */
            (str == NULL) ? NULL
            /* In case of succeeded match, return the rest of the string;
             * as a special case, if 'expected' is 0, return pointer to
             * it to signal success. */
        :   (*str == expected) ? *str == 0 ? str : str + 1
            /* In case of opening bracket, parse until its matching
             * closing bracket and then continue from there */
        :   is_opening_bracket(*str) ?
            parse(expected, parse(get_matching_closing_bracket(*str), str + 1))
            /* In case of anything else, return NULL */
        :   NULL;
}

/* Is str a valid sequence of paired brackets? */
int check(const char* str)
{
    const char* result = parse(0, str);
    return (result != 0 && *result == 0);
}

/* Section 3: the main program. */

/* Walk through all parameters and show parse results. */
int main(int argc, const char* argv[])
{
    while (++argv, --argc) {
        printf(check(*argv) ? "true\n" : "false\n");
    }
    return 0;
}

Unlike some might say, it would seem that brevity is occasionally the enemy of maintainability. Or is it just a daydream by us Blub programmers?

10 09 2008
Arnaud Bailly

Hello,
Thank you for the problem. I proposed it for yesterday’s dojo in Paris and we have had a very interesting session. We came up with a simple solution based on replaceAll(), like one proposal among the preceding comments, except of course we did it TDD :-) An interesting point IMHO is that all of the people around the table (about 10) did manage to help making progress towards the solution, as they were not impaired by the syntax (java) nor the environment (Eclipse), beyond the first 10 minutes setup, although most of them were not coding in java daily. I regret to say we did not achieve the same “effectiveness” with Haskell, which seems to be quite hard to grasp for most people.

Anyway, thanks again for this nice setting. Could you be a bit more clear about the composability issues you were aiming at ?

21 09 2008
Paul King

Groovy version of Bruce’s solution:

def parse(s) { def t = s.replace(“[]“, “”).replace(“()”, “”); t == s ? t : parse(t) }
args.each { println parse(it) == “” }

A little less imperative:

brackets = ["[]“, “()”]
def parse2(s) { def t = brackets.inject(s){ a, b -> a.replace(b,”") }; t == s ? t : parse2(t) }

25 09 2008
Steven Shaw

Here’s my top-down parser version in Java.

public class SimpleBraceParser {

  private int index = 0;
  private String s;

  public SimpleBraceParser(String s) { this.s = s; }

  private void skip(char c) {
    ++index;
  }

  private char currentChar() {
    if (index < s.length()) return s.charAt(index);
    else return ''; // Evil "sentinel" value
  }

  private void eat(char c) {
    if (currentChar() == c) {
      ++index;
      return;
    }
    throw new RuntimeException("parse error at " + index + ", expecting '" + c + "'");
  }

  public void parseRoundBrace() {
    skip('(');
    parseExpr();
    eat(')');
  }

  public void parseSquareBracket() {
    skip('[');
    parseExpr();
    eat(']');
  }

  public boolean parseExpr() {
    if (currentChar() == '(') {
      parseRoundBrace();
      return true;
    } else if (currentChar() == '[') {
      parseSquareBracket();
      return true;
    } else return false;
  }

  public void parseExprs() {
    while (parseExpr()) {
    }
  }

  public void parse() {
    parseExprs();
    if (currentChar() != '')
      throw new RuntimeException("not terminated correctly at " + index);
  }

  static boolean reportErrors = false;

  public static void main(String[] args) {
    for (String arg : args) {
      SimpleBraceParser p = new SimpleBraceParser(arg);
      try {
        p.parse();
        System.out.println("true");
      } catch (Exception e) {
        if (reportErrors) e.printStackTrace(System.err);
        System.out.println("false");
      }
    }
  }

}

Yep, it’s pretty long! I like top-down parsers as they are pretty intuitive. I’d use Antlr for any more serious parser work.

I really like the look at the Haskell version with Parsec. I tested your Haskell, Scala and Java versions along with my Java version and the Parsec version (amongst others). Your Haskell version and Parsec are very fast – Parsec is faster :) . I only used /usr/bin/time to test so the JVM startup time could have hampered the Java solutions. My recursive Java version and your Scala version require stack-size increases (on my large inputs).

With the Scala version performance extremely poorly on large inputs. Pretty sure it doesn’t matter if the inputs are heavily nested or not. Maybe Scala isn’t the best language for functional style code. Are there any ways to improve the performance and stay functional?

5 10 2008
Krzysztof Kościuszkiewicz

I have to admit that your Haskell solution is far from being clear, and that doesn’t help you with geting your main point across.

I’m posting a shorter and less dense solution to your problem below. Hope that helps :)


import Control.Monad
import System.Environment

tokens = [('[',']'), ('(',')')]

parse [] = Just []
parse (c:cs) = case lookup c tokens of
Nothing -> return (c:cs)
Just end -> parse cs >>= consume end >>= parse

consume c [] = Nothing
consume c (x:xs) = guard (c == x) >> return xs

main = getArgs >>= mapM (print . maybe False null . parse)

11 10 2008
Tony Morris

Actually, the fact that you can post a neater solution was exactly the point, but that was lost a long time ago. The neater solution can be derived from the less neat solution by replacing lower-order expressions with higher-order expressions and kapoof! Let there be Parsec.

15 05 2009
Igor K.

I’d argue this Perl implementation is cleaner and (gasp!) more readable than most of the above:

while(s’(\(\)|\[\])”g){}

…or, the whole program:

for(@ARGV){
while(s’(\(\)|\[\])”g){}
print $_ ? “false\n”:”true\n”
}

Why is it cleaner? Because it directly explains the algorithm: for each string on the command line, substitute ‘()’ or ‘[]‘ with ” while you can, and then print the “false” if the string is non-empty, “true” otherwise.

There’s nothing else in the program that distracts reader from the programmer’s intent. That’s why I think it’s a cleaner and more readable implementation.

10 06 2009
John Elroy Demsford
import java.util.*;

public class Parser extends HashMap {

	public Parser (String pairs) {
		for (int i = 0; i < pairs.length ();
			put (pairs.charAt (i++), pairs.charAt (i++)));
	}

	public boolean parse (String s) {
		Character ch = 0;
		Stack  stack = new Stack ();
		for (int i = 0; i < s.length (); i++) {
			if ((ch = get(s.charAt (i))) == null) {
				if (stack.isEmpty () || !stack.pop ().equals (s.charAt(i)))
					return false;
			}
			else stack.push (ch);
		}
		return stack.isEmpty();
	}

	public static void main (String [] args) {
		Parser p = new Parser ("[]()");
		for (String s : args)
			System.out.println (p.parse (s));
	}

}

This makes it easy to generalize to whatever symbol-pairs you want. . .

11 06 2009
John Elroy Demsford

oops–angle brackets apparently got parsed as HTML–the class declaration should be extending HashMap (open angle bracket)Character, Character (close angle bracket)

3 07 2009
Mark Volkmann

I’ll jump in with a Clojure solution. Note that character literals start with a backslash. For example, a left paren is \(.

(defn- match [prev-char next-char]
(condp = prev-char
\( (= next-char \))
\[ (= next-char \])
false))

(defn- helper [s stack]
(if (empty? s)
(empty? stack)
(let [c (first s)
top (first stack)
stack (if (match top c) (next stack) (cons c stack))]
(helper (next s) stack))))

(defn balanced? [s] (helper s ‘()))

(doseq [arg *command-line-args*]
(println (balanced? arg)))

3 07 2009
Mark Volkmann

Oops! Lost the indentation in my Clojure code. That certainly doesn’t make it easy to read!

3 07 2009
Mark Volkmann

The indentation of my Clojure solution is preserved here.
http://pastie.org/532388

3 07 2009
Mark Volkmann

Better Clojure code, including unit tests.
http://pastie.org/532433

3 07 2009
Laurent Petit

Hi,

Here is a clojure version. It is intended as being agood example of idiomatic clojure code, but not idiomatic to the point that only a clojure guru could understand. So it’s no golf, no cheating, but rather idiomatic clojure I would write in my day job.

Works with clojure 1.0 without any additional library.

;; file: challenge.clj
;; invoke from command line:
;; java -cp clojure.jar /path/to/challenge.clj “()” “((([[]])))” … …

(ns challenge)

(def push conj)
(def closing-counterpart { \( \), \[ \] })
(defn matching-pair? [left right] (= right (closing-counterpart left)))

(defn consume-one [stack c]
(if (matching-pair? (peek stack) c)
(pop stack)
(push stack c)))

(defn balanced? [s] (empty? (reduce consume-one [] s)))

(defn main [] (apply print (map balanced? *command-line-args*)))

(when *command-line-args* (main))

Regards,


Laurent

3 07 2009
Laurent Petit

OK, here is also a better formatted version :

http://paste.lisp.org/+1RZW

Regards,


Laurent

10 07 2009
Alex Stangl

Short and simple Haskell version:

pairs = [('[',']'), ('(',')')]

parse = match []
match (s:ss) (c:cs) | c == s = match ss cs
match ss (c:cs) =
  case lookup c pairs of
    Nothing -> False
    Just s  -> match (s:ss) cs
match ss [] = null ss
12 07 2009
Simple parser « Musemantic

[...] 07 2009 A recent thread on the St. Louis Lambda Lounge mailing list resurrected an old thread on Tony Morris’ blog, writing a parser for a simple grammar (balanced () and []) in different languages. The original [...]

6 11 2009
EBFE

Prolog FTW ? ;)

parse --> "(",parse,")".
parse --> "[",parse,"]".
parse -->[].

Ask:

141 ?- parse("[([)]]",[]).
false.

142 ?- parse("[()]",[]).
true 

Leave a comment