Prev:
BlocksInJavaIntro
Next:
BlocksInJavaCompositors
Top:
BlocksInJava
Implementing the AST Package
Basic Concepts...
I split the class tree of the
ast package into three basic clusters:
- Predicates that return a boolean to indicate if some logical expression is true or false
- Functions that evaluate some calculation and return an java.lang.Object representing its result
- Procedures that run and have no result -- i.e. return void
Each cluster is composed of three root interfaces:
- A No argument version
- A Unary version -- one argument
- A Binary version -- two arguments
Each root is a Java Interface that extends
Serializable so that our expressions can be persisted. The three clusters each with their three interfaces result in the following root interfaces for the
ast package:
- The Predicates
- interface Predicate
- interface UnaryPredicate
- interface BinaryPredicate
- The Functions
- interface Function
- interface UnaryFunction
- interface BinaryFunction
- The Procedures
- interface Procedure
- interface UnaryProcedure
- interface BinaryProcedure
Each
interface has only a single
member.
Predicates have the method
is, Functions have the method
eval, and Procedures have the method
run (i.e. like
java.lang.Runnable. It is mainly due to Java's schizophrenic type-system that we give each cluster of interfaces a different
action word. However, it method name clearly indicates the difference between the interface and the interfaces in another cluster.
To ease the use of these classes in a way that is similar to Smalltalk blocks, I provide a default implementation of each interface into a class called
ast.Block. The block class can still be used as the base for
AnonymousInner closures, it will just be easier to type and simpler to use. The user need only remember one instead of nine class names. Consider the example we opened this page with:
Object item = names.detect( new UnaryPredicate() {
public boolean is( Object arg ) {
return arg.equals( "some_name" ); } } );
Using the helper class
Block, we can write the same code like so:
Object item = names.detect( new Block() {
public boolean is( Object arg ) {
return arg.equals( "some_name" ); } } );
Now consider the following three examples which show the versatility of the
Block:
1.
Block as
UnaryPredicate
Person joe = (Person)
names.detect( new Block() {
public boolean is( Object each ) {
return ((Person)each).isName( "Joe" ); } } );
-
- BTW, let me say here how much I despise Java's type system!!!
2.
Block as
BinaryFunction
String sOnePerLine = (String)
collection.inject( new String(), new Block() {
public Object eval( Object val, Object each ) {
return val.toString() + each.toString() + '\n'; } } );
3.
Block as
UnaryProcedure
final Mutable.Integer counter = new Mutable.Integer();
collection.enum( new Block() {
public void run( Object each ) {
++counter.value; } } );
Before we get into the
Block class, we need to define the foundational interfaces it simplifies. The basic package name I will use is
ast for
abstract syntax tree. The full package name will be
com.tripwire.rdifalco.ast.
Implementation of Basic Interfaces
If you recall, we have three clusters, each with three basic interfaces. I call these the AST
foundational interfaces. The pattern is well defined and consistent - each interface has a single member. The member takes zero, one, or two arguments as Objects and returns a boolean, an Object, or void. Each should be placed into its own source file:
Cluster: Predicate
#package com.tripwire.rdifalco.ast;
public interface Predicate
extends java.io.Serializable
{
boolean is();
}
#package com.tripwire.rdifalco.ast;
public interface UnaryPredicate
extends java.io.Serializable
{
boolean is( Object arg );
}
#package com.tripwire.rdifalco.ast;
public interface BinaryPredicate
extends java.io.Serializable
{
boolean is( Object arg1, Object arg2 );
}
Cluster: Function
#package com.tripwire.rdifalco.ast;
public interface Function
extends java.io.Serializable
{
Object eval();
}
#package com.tripwire.rdifalco.ast;
public interface UnaryFunction
extends java.io.Serializable
{
Object eval( Object arg );
}
#package com.tripwire.rdifalco.ast;
public interface BinaryFunction
extends java.io.Serializable
{
Object eval( Object arg1, Object arg2 );
}
Cluster: Procedure
#package com.tripwire.rdifalco.ast;
public interface Procedure
extends java.io.Serializable,
java.lang.Runnable // NOTE:
{
void run();
}
#package com.tripwire.rdifalco.ast;
public interface UnaryProcedure
extends java.io.Serializable
{
void run( Object arg );
}
#package com.tripwire.rdifalco.ast;
public interface BinaryProcedure
extends java.io.Serializable
{
void run( Object arg1, Object arg2 );
}
I see a lot of
ConceptualIntegrity here. The interfaces are simple and consistent. This is essentially all we need to build everything else. Note that the
Procedure class extends
java.lang.Runnable. This allows us to use our
ast abstractions anywhere, even in those places where java requires an
Runnable instance, such as with the
java.lang.Thread class.
Without going any further, we can start using these interfaces for anonymous inners and
InternalizeExternalIterators. For example, consider the following internalization of the java.util.Iterator:
public class Iterate
{
static public
Object detect(
List list,
UnaryPredicate aBlock )
{
Iterator at = list.iterator();
while ( at.hasNext() )
{
Object each = at.next();
if ( aBlock.is( each ) )
return each;
}
return null; // indicate none-detected
}
static public
void enum(
final List list,
final UnaryProcedure aBlock )
{
detect( list, new UnaryPredicate() {
public boolean is( Object each ) {
aBlock.run( each );
return false; } } );
}
[...]
}
And so on. You can do this for all the basic
InternalIterator types -
#inject,
#select,
#reject,
#collect, and so on. We can start using this package now just with its basic interfaces. For example::
File file = (File)Iterate.detect( aFiles,
new UnaryPredicate() {
public boolean is( Object each ) {
((File)each).getName().equals( "temp.dat" ); } } );
Defining a Concrete and Universal Java Block
With the block class, I can use
detect without having to remember the name of the nine interface types. One can simply type
Block and remember
is for
boolean,
eval for
java.lang.Object, and
run for
void. The
Universal Block class is incredibly convenient and straightforward. It essentially provides a default implementation for all of the interfaces we just defined. Besides providing a mapping from multiple interfaces to a single class, the
Block implementation of the
Function interfaces will forward to the
Predicate interfaces and then convert its
boolean result to
java.lang.Boolean (i.e. an Object). The
Predicate and
Procedure interfaces, if not overridden, will call the
Function interfaces and either convert or discard the return value appropriately.
#package com.tripwire.rdifalco.ast;
public class Block
implements Procedure, UnaryProcedure, BinaryProcedure,
Function, UnaryFunction, BinaryFunction,
Predicate, UnaryPredicate, BinaryPredicate
{
//..The Procedures
/* [JavaDoc Header]
* Default implementation of a <code>Procedure</code> interface. This
* implementation calls the <code>Function</code> interface member
* <code>eval</code> and discards its return value.
*/
public void run() {
eval();
}
public void run( Object arg1 ) {
eval( arg1 );
}
public void run( Object arg1, Object arg2 ) {
eval( arg1, arg2 );
}
//..The Functions
/* [JavaDoc Header]
* Default implementation of a <code>Function</code> interface. This
* implementation calls the <code>Predicate</code> interface member
* <code>is</code> and returns its <code>boolean</code> value as an
* instance of <code>java.lang.Boolean</code>.
*/
public Object eval() {
return is() ? Boolean.TRUE : Boolean.FALSE;
}
public Object eval( Object arg1 ) {
return is( arg1 ) ? Boolean.TRUE : Boolean.FALSE;
}
public Object eval( Object arg1, Object arg2 ) {
return is( arg1, arg2 ) ? Boolean.TRUE : Boolean.FALSE;
}
//..The Predicates
/* [JavaDoc Header]
* Default implementation of a <code>Predicate</code> interface.
* This implementation calls the <code>eval</code> member of the
* <code>Function</code>, casts its result as a
* <code>java.lang.Boolean</code>, and returns the result of
* calling <code>booleanValue</code>.
*/
// NOTE: No error checking!!
public boolean is() {
return ((Boolean)eval()).booleanValue();
}
public boolean is( Object arg1 ) {
return ((Boolean)eval( arg1 )).booleanValue();
}
public boolean is( Object arg1, Object arg2 ) {
return ((Boolean)eval( arg1, arg2 )).booleanValue();
}
}
A Brief Excursion on Iterators
I want to stop here to say that all of my
abstract data types and foundational structures use
InternalIterator functions like Smalltalk rather than the
ExternalIterator objects talked about in
DesignPatterns, used by STL, and supplied with the Java Collection Classes. Furthermore, I have done performance tests on all of these and the
InternalIterators win every time over the
java.util collections. Sometimes they are 4 or 5 times faster than their
ExternalIterator equivalents.
The reason for this is pretty straightforward. When creating an
ExternalIterator class, iteration is decoupled from the data structure - it is
not part of the collection. As a result, it must use the same public interface everyone else does. Even if you avoid this by exploiting package visibility and inner-classes, it is still an unbounded operation with little control. You cannot control how many times next is called and as such, each access calls a member of the collection object that must check the range of the specified index. Furthermore, it is difficult to implement an
ExternalIterator for a distributed system or to control synchronization. For more on this, see
Iterators: Signs of Weakness in Object-Oriented Languages at
ftp://ftp.netcom.com/pub/hb/hbaker/Iterator.html.
Using iterator
methods as Smalltalk and Lisp do is
much more efficient. Since the iterator can
share the implementation of the data structure, the author has complete control over the iteration! For example, the iterator function for an array class need not check the validity of its indices:
public void forEach( UnaryProcedure aBlock )
{
int end = m_nCount;
for ( int at = 0; at != end; ++at )
aBlock.eval( m_contents[ at ] );
}
Simple. Since the function
forEach is bounded, we need not perform a boundary check on each access. Now consider its usage:
list.forEach( new Block() {
public void run( Object each ) {
Dbg.trace( "item: " + each ); } } );
Contrast this to the following use of the Iterator
DesignPattern:
Iterator iter = list.iterator();
while ( iter.hasNext() )
Dbg.trace( "item: " + iter.next() );
We can zero out on the construction of the
Iterator object since we must also create a
Block instance with the internal version. However, every operation on Iterator must map to the
list object's class and every iteration of the loop must check the current value of the Iterator against the boundaries of the collection represented by
list.
My big question is
why didn't Java use iterator functions with code blocks rather than an ExternalIterator class? My guess is that because of its syntactical similarity to C++, it is easier for programmers to relate Java to C++ than to Smalltalk or Lisp. As a result, it was more natural for the designer of the Java Collections to take a more traditional C++ view of the world and support procedural over Object-Oriented iteration. Even though procedural iteration is not as optimal as fully-encapsulated O-O iteration, it either just didn't occur to them, or they did not have the courage to take the Java/C++ community in a new and different direction. Anyway, check out that paper I provided a link to earlier in this section. If you can get through some mighty ugly Lisp code, there are some interesting points to find.
If you are interested, here are the iterators I support. If I get enough requests, I will also publish my
adt (abstract data types) package. I use the traditional Smalltalk enumerators with a couple of additions and variations to accommodate Java's
feel. I'm just going to list the enumerators here and leave off the collection members:
interface Items // a bunch of items (unordered collection)
{
the enumerators
//* eval block for each
void enum( UnaryProcedure aBlock );
//* increment count for each true
int count( UnaryPredicate aBlock );
//* remove item for each block that answer true
void remove( UnaryPredicate aBlock );
//* detect first for block answers true
Object detect( UnaryPredicate aBlock );
//* inject value into block with each item
Object inject( Object value, BinaryFunction aBlock );
//* answer new collection for each true
Items select( UnaryPredicate aBlock );
//* answer new collection for each false
Items reject( UnaryPredicate aBlock );
//* answer new collection of non-null results
Items collect( UnaryFunction aBlock );
}
While it's a simple collection hierarchy, I have another interface for items that can be sequenced. This interface adds another bunch of its own enumerators in addition to those defined by
Item:
interface ItemSequence extends Items
{
enumerators
//* apply block in reverse order
void enumBack( UnaryProcedure aBlock );
//* apply pairs to block.enum( at( n ), at( n + 1 ) )
void enumPairs( BinaryProcedure aBlock );
//* detect last item for block.is( eachItem )
Object detectLast( UnaryPredicate aBlock );
//* detect first Index for block.is( eachItem )
Index atFirst( BinaryPredicate aBlock );
//* detect last Index for block.is( eachItem )
Index atLast( BinaryPredicate aBlock );
binary blocks for index,item pairs
//* Like enum but for index and item
void withIndexEnum( BinaryProcedure aBlock );
//* detect Index for block.is( eachIndex, eachItem )
Index withIndexDetect( BinaryPredicate aBlock );
//* detect last Index for block.is( eachIndex, eachItem )
Index withIndexDetectLast( BinaryPredicate aBlock );
}
That's the basic set of enumerators (i.e.
InternalIterators) I use. All of these work for most anything I would ever want to do in an abstract data type whether that be a List, Array, Sorted List, Set, Bag, Dictionary, whatever. I haven't had an instance for a long time where I felt I needed to code a
for...next or
iter.next() while loop by hand.
RobertDiFalco
Prev:
BlocksInJavaIntro
Next:
BlocksInJavaCompositors
Top:
BlocksInJava