Bug Database
Bug ID: 6431636
Votes 2
Synopsis (coll) New methods for handling iterable sequences in Collection framework
Category java:classes_util
Reported Against
Release Fixed
State 5-Cause Known, request for enhancement
Priority: 4-Low
Related Bugs 6463989
Submit Date 30-MAY-2006
Description
A DESCRIPTION OF THE REQUEST :
Proposal
=======

I propose adding to interface 'java.util.Collection'
the following methods for handling iterable sequences:

  a: Methods receiving instances of interface 'java.util.Iterable':

	boolean addAll(Iterable<? extends T> iterable)
	boolean containsAll(Iterable<? extends T> iterable)
	boolean removeAll(Iterable<? extends T> iterable)
	boolean retainAll(Iterable<? extends T> iterable)
	
  b: Methods receiving instances of interface 'java.util.Iterator':
	
	boolean addAll(Iterator<? extends T> iterator)
	boolean containsAll(Iterator<? extends T> iterator)
	boolean removeAll(Iterator<? extends T> iterator)
	boolean retainAll(Iterator<? extends T> iterator)

In particular, the interface 'java.util.List' should receive
the following additional methods:

  a: Methods receiving instances of interface 'java.util.Iterable':

    boolean addAll(int index, Iterable<? extends T> iterable)

  b: Methods receiving instances of interface 'java.util.Iterator':

   	boolean addAll(int index, Iterator<? extends T> iterator)

For the concrete collections within the Collections framework
(LinkedList, HashSet, etc.) I further suggest adding the
following constructors:

  a: Constructors receiving instances of interface 'java.util.Iterable':

    public ConcreteCollection(Iterable<? extends T> iterable)
    
  b: Constructors receiving instances of interface 'java.util.Iterator':
    
    public ConcreteCollection(Iterator<? extends T> iterator)

The semantics of those methods/constructors would be analogous to the
semantics of the corresponding existing methods/constructors,
which receive a 'Collection<? extends T>' as their argument:
Instead of a collection, which itself is an iterable sequence
(Collection extends Iterable), the methods would receive
an iterable sequence given as an 'Iterable' or represented by an iterator.

Implementation
============

The implementation of the new methods/constructors
will be possible in a rather straightforward form.
As an example, here is code for 'LinkedList'.

  public LinkedList(Iterable<? extends T> iterable) {
    this(iterable.iterator());
  }

  public LinkedList(Iterator<? extends T> iterator) {
    super();
    this.addAll(iterator);
  }

  public boolean addAll(Iterable<? extends T> iterable) {
    return this.addAll(iterable.iterator());
  }

  public boolean addAll(Iterator<? extends T> iterator) {
    // report whether the list changed, as the existing addAll does
    boolean changed = false;
    while (iterator.hasNext()) {
        changed |= this.add(iterator.next());
    }
    return changed;
  }
  
It should be mentioned that the last of those methods can be implemented
in a more efficient way by directly manipulating the internal
representation of the list. See the implementation of
'java.util.LinkedList.addAll(int, Collection<? extends T>)'!

After adding the proposed methods, one can change the
implementation of the 'Collection<...>' versions of the methods
to become simple delegates to the 'Iterable<...>' versions. This
would avoid duplication of similar code. For instance:

  public boolean addAll(Collection<? extends T> c) {
    return this.addAll( (Iterable<? extends T>) c );
  }


JUSTIFICATION :
Reasoning
=========

a: Reasoning for the 'Iterable<...>' methods/constructors
-----------------------------------------------------------

There is some need for creating new collections from existing
collections of objects, or for merging two existing collections.
In Java 1.5 (and in the upcoming 1.6, too), we only have a constructor
for creating e.g. a new List from a j.u.Collection.
Until 1.4 this was already a slight problem, in that there were already
other ways to represent collections of objects: Not just by "official"
Java CollectionS, but also by IteratorS. But with the introduction
of the Iterable interface in Java 1.5, together with the
addition of the for-each loop, the definition of iterators for
user defined classes will presumably become a common task in Java.
So this should also be recognized in the definition of the standard APIs.
Currently I have to define utility functions for this purpose in all
my projects. While this is very easy to do (see section "Implementation"),
I feel this is not the right thing to do -- it should be there out of the box.

b: Reasoning for the 'Iterator<...>' methods/constructors
-----------------------------------------------------------

While discussing these new methods in the Java forum,
most of the discussion was on the question whether to add
the 'Iterator<...>' methods or not.
They would indeed not be necessary if every class
for which an iterator exists implemented
the 'Iterable' interface. This makes sense for newly defined classes,
but there are several existing classes which do not currently
implement 'Iterable', and some of those classes won't be
able to do this later on.

For example, the "Jena Semantic Web" framework, hosted on
<jena.sourceforge.net>, has a key concept called "Model",
which actually represents an RDF graph. You can query such a Model
for RDF triples by the model's 'list' methods, which will return you
an 'Iterator' representing the result set of the query.
Because Jena is a pre-J1.5 development, the Model interface
does not extend the 'Iterable' interface. And because Jena is the base
for a lot of existing code, this won't change that fast.

On the other hand, one could argue that it is not necessary
to support old code, when creating new features within a
new framework. I admit to have some sympathy with this thought,
so I would like to leave it open to further discussion, whether
to add the 'Iterator<...>' methods or not.

Compatibility
==========

This new feature would break no existing code, because it only adds
new methods and constructors to the existing framework. Besides that,
it would be a rather conservative feature:
The 'Iterable<...>' methods are just an extension
to the existing 'Collection<...>' methods,
and the 'Iterator<...>' methods are _pragmatically_ analogous
to the 'Iterable<...>' methods.

While it seems somewhat more restricted to think of a _sequence_ of objects
instead of a _collection_ of objects, this is de facto not a restriction.
The Javadoc specification of 'Collection.addAll' does not say anything
about the order in which the collection's entries are read, so in this
case, every order is welcome. And whenever, as for 'List.addAll',
an order is defined, it is defined in terms of the iterator returned by the
'iterator()' method, which is itself declared by the 'Iterable' interface.
So again no problem. Another point to consider is the 'Set' interface:
Here each entry of a sequence must be inserted only once, and only if it is
not already present. But this is also the definition for the
current 'Collection<...>' version of 'addAll', so the 'Iterable<...>'
and 'Iterator<...>' versions can be handled in the same way
with no problems.
 
It should be mentioned that there will be no clashes between
the current 'Collection<...>' methods and the new 'Iterable<...>' methods;
both versions of those methods can coexist.
When given an argument whose static type is 'Collection', the 'Collection<...>'
version of the method will be chosen by the compiler, so there will be
no risk of a hidden change in semantics after recompiling against
the new interface, and no loss of performance either.
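
The overload-resolution claim can be checked with a small sketch. The class 'OverloadDemo' and its two static methods are hypothetical stand-ins for the existing and the proposed 'addAll' overloads; they only report which overload the compiler picked. The sketch also shows that the choice depends on the static type of the argument, not on its runtime class.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical class (not part of the JDK): each overload only reports
// which signature the compiler selected for the call.
final class OverloadDemo {

    // Stand-in for the existing 'Collection<...>' version.
    static String addAll(Collection<? extends String> c) {
        return "Collection";
    }

    // Stand-in for the proposed 'Iterable<...>' version.
    static String addAll(Iterable<? extends String> iterable) {
        return "Iterable";
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        Iterable<String> view = list; // same object, static type Iterable

        // Static type List: the more specific Collection overload is chosen.
        System.out.println(addAll(list)); // prints "Collection"
        // Static type Iterable: only the Iterable overload is applicable.
        System.out.println(addAll(view)); // prints "Iterable"
    }
}
```

Because the second call takes the 'Iterable<...>' path although the object is a list, an implementation that wants the optimized path in that case would still need a runtime type check.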
Posted Date : 2006-05-30 11:21:18.0

Posted Date : 2006-06-15 02:57:36.0
Work Around
N/A
Evaluation
It might have been better to use the more general signatures
suggested by the submitter, but interfaces can never be changed compatibly;
you have to get them right in the first release; and it's not worth adding
yet another interface like Collection.

Having methods that take Collections instead of Iterables does allow
optimizations based on calling size().

An ambitious designer could create a new Collections framework with 
"everything done just right" outside of the JDK.  That might find
a home in the next Java-like language library, or if it was sufficiently
popular, could be incorporated into the JDK itself someday.
But beware; creating a significantly better Collections Framework is very hard.

Probably ... will not fix.
Posted Date : 2006-05-30 14:19:58.0

In response to the submitter's SDN comment re compatibility:
Everyone has their own idea of how compatible different releases
of the JDK must be, but in the JDK at the very least we try to ensure
that every spec-compliant program will continue to work FOREVER,
so no new methods in interfaces.
Posted Date : 2006-06-15 02:57:36.0

Perhaps we could envisage adding a bridge method like this in Collections:
    public static <T> Collection<T> iteratorToCollection(Iterator<? extends T> iter)
(maybe there should be some ? magic in the generic parameters).  This would exhaust the iterator by calling its next() method repeatedly, and add each element found to the returned Collection, which would be readonly.  This would address cases where you need to addAll from an Iterator, though not in the most efficient manner possible.

The method could also work by lazy evaluation, i.e. the returned Collection could be one on which you are only allowed to call iterator(), and that only once.  This would work with typical implementations of addAll and the like, which match that constraint.  But it would be dangerous in general because effectively it would be imposing this new constraint on abstract methods like Collection.addAll.  So I don't think it would be a good idea.

There is no need for a corresponding Iterable method because you can just call iteratorToCollection(iterable.iterator()).
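
A minimal sketch of the eager variant described above, assuming a hypothetical holder class 'IteratorBridge' (no such method exists in 'java.util.Collections'). It drains the iterator once and returns a read-only snapshot; the choice of an 'ArrayList' as backing store is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class; the JDK does not provide iteratorToCollection.
final class IteratorBridge {

    // Eager variant: exhausts the iterator by calling next() repeatedly
    // and wraps the collected elements in an unmodifiable view.
    static <T> Collection<T> iteratorToCollection(Iterator<? extends T> iter) {
        List<T> elements = new ArrayList<T>();
        while (iter.hasNext()) {
            elements.add(iter.next());
        }
        return Collections.unmodifiableCollection(elements);
    }

    public static void main(String[] args) {
        Iterator<String> source = Arrays.asList("a", "b", "c").iterator();
        List<String> target = new ArrayList<String>();

        // addAll from an Iterator, by way of the bridge
        target.addAll(iteratorToCollection(source));
        System.out.println(target); // prints [a, b, c]
    }
}
```

The extra copy is the inefficiency mentioned above: every element is stored once in the intermediate list before it reaches the target.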

I have to say that I have not found the absence of this functionality at all bothersome, so I don't think that absence justifies more new methods than this or a more efficient solution.  Others may have a different experience.
Posted Date : 2006-06-15 08:25:27.0
Comments
  

Submitted On 12-JUN-2006
m_schnei
> Having methods that take Collections instead of Iterables does allow
> optimizations based on calling size().

I do not want to _replace_ the methods taking a Collection. As I 
already pointed out in the RFE, it is perfectly possible for them 
to coexist with the new methods, and so you can put all kinds of 
optimized handling into them. But even in case of replacement,
you could do a simple runtime type check (by means of instanceof)
to handle CollectionS specially.
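
The runtime type check could look roughly like the following sketch. 'IterableAppend' is a hypothetical helper, not JDK code; it assumes an 'ArrayList' target and falls back to element-by-element insertion when the source is not a 'Collection'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper illustrating the instanceof check; not part of the JDK.
final class IterableAppend {

    static <T> boolean addAll(ArrayList<T> target, Iterable<? extends T> source) {
        if (source instanceof Collection) {
            // The size is known here, so the existing bulk method can do
            // its size()-based optimization.
            Collection<? extends T> c = (Collection<? extends T>) source;
            return target.addAll(c);
        }
        // Generic path: nothing is known about the length of the sequence.
        boolean changed = false;
        for (T element : source) {
            changed |= target.add(element);
        }
        return changed;
    }

    public static void main(String[] args) {
        ArrayList<Integer> target = new ArrayList<Integer>();
        // An Iterable that is deliberately not a Collection.
        Iterable<Integer> plain = () -> Arrays.asList(3, 4).iterator();

        addAll(target, Arrays.asList(1, 2)); // takes the Collection path
        addAll(target, plain);               // takes the generic path
        System.out.println(target);          // prints [1, 2, 3, 4]
    }
}
```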


Submitted On 12-JUN-2006
m_schnei
> It might have been better to use the more general signatures
> suggested by the submitter, but interfaces can never be changed compatibly

This argument has several facets, some of them have already been
discussed in the forum.

First, adding _Constructors_ to the concrete classes will not have any
impact on existing code, neither on code using the classes nor on 
code defining custom collections. So adding at least the constructors
won't be a problem.

Second, by putting reasonable default implementations for the new methods 
into the definition helper classes ('AbstractList', etc.), 
implementations of custom collections would not be hit by the change,
if they extend those helpers. To do so is recommended practice, anyway! 
Of course, if a custom class wants to cope with the new methods, it can override 
them for optimized handling at any time later. It is very easy for the new 
methods to have a reasonable default implementation, as I already showed in the RFE.    

But now I see one really dangerous point, indeed: In Java, it is not always
possible to extend a class, because multiple inheritance is not
allowed, so sometimes one _must_ implement the interface instead of
extending the helper class. One such case would be the application of
the class adapter design pattern (Gamma et al., "Design Patterns", p.139).
For example, someone has some class A which already extends
another class B with an interface similar to that of 'Collection'. 
To get this class into Java's Collection framework, he would add an 
'implements Collection' to the definition of class A. In such a case, 
his code would break after adding new methods to the Collection interface.

So, while I do not agree with you that one has to get interfaces right
in the first place (interfaces will always change, because external
circumstances change; there is no such thing as "everything done just right"
in software development, and there will always come a time when
one has to break compatibility to keep a language or software alive),
here is my

======================
PROPOSED CHANGE TO RFE
====================== 

   * No change to the RFE with regard to the _constructors_: They should
     still go into the concrete implementing classes of Collection
     (consider adding them also to the abstract helper classes). 

   * Do _not_ add the proposed _methods_ to the Collection interface. 

   * Add implementations of the proposed _methods_ to the concrete implementing classes 
     of Collection and also to the abstract helper classes.

   * Add corresponding static methods to class 'j.u.Collections'. These are:

      * static<T> boolean addAll(Collection<? super T> c, Iterable<? extends T> iterable)
      * static<T> boolean containsAll(Collection<? super T> c, Iterable<? extends T> iterable)
      * static<T> boolean removeAll(Collection<? super T> c, Iterable<? extends T> iterable)
      * static<T> boolean retainAll(Collection<? super T> c, Iterable<? extends T> iterable)
 
      * static<T> boolean addAll(Collection<? super T> c, Iterator<? extends T> iterator)
      * static<T> boolean containsAll(Collection<? super T> c, Iterator<? extends T> iterator)
      * static<T> boolean removeAll(Collection<? super T> c, Iterator<? extends T> iterator)
      * static<T> boolean retainAll(Collection<? super T> c, Iterator<? extends T> iterator)

     These static methods can internally check for the concrete type of the given collection
     and delegate to the corresponding method. For unrecognized types (custom implementations) there 
     will be a default implementation analogous to the implementation I suggested in the original RFE.

After some years of transition, one can then reconsider adding the method
declarations directly to the interfaces themselves.
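
The default implementations for the 'Iterator<...>' variants might be sketched as follows. 'IteratorCollections' is a hypothetical stand-in for the proposed additions to 'j.u.Collections', and only the generic fallback is shown, without the dispatch on concrete types. Buffering the iterator into a 'HashSet' for 'removeAll' and 'retainAll' is an assumption of this sketch; it relies on the elements having consistent 'equals' and 'hashCode'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical utility class; none of these methods exist in java.util.Collections.
final class IteratorCollections {

    static <T> boolean addAll(Collection<? super T> c, Iterator<? extends T> iterator) {
        boolean changed = false;
        while (iterator.hasNext()) {
            changed |= c.add(iterator.next());
        }
        return changed;
    }

    static <T> boolean containsAll(Collection<? super T> c, Iterator<? extends T> iterator) {
        while (iterator.hasNext()) {
            if (!c.contains(iterator.next())) {
                return false;
            }
        }
        return true;
    }

    static <T> boolean removeAll(Collection<? super T> c, Iterator<? extends T> iterator) {
        // Buffer the sequence, then reuse the existing bulk operation so that
        // every occurrence of a listed element is removed.
        Set<T> toRemove = new HashSet<T>();
        addAll(toRemove, iterator);
        return c.removeAll(toRemove);
    }

    static <T> boolean retainAll(Collection<? super T> c, Iterator<? extends T> iterator) {
        // retainAll needs the whole sequence before it can decide what to drop.
        Set<T> toKeep = new HashSet<T>();
        addAll(toKeep, iterator);
        return c.retainAll(toKeep);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "a", "c"));
        removeAll(list, Arrays.asList("a").iterator());
        System.out.println(list); // prints [b, c]
    }
}
```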

Michael Schneider


Submitted On 16-APR-2009
It seems like everyone's now agreed that java.util.Collection should not be modified.  But I still think it's very worthwhile and safe to at least modify all the relevant JDK classes (including the AbstractList, etc.) to have addAll(Iterator) and new ConcreteCollection(Iterator).  Neither of these should break any code, and as stated instanceof can be used to prefer the Collection constructor if desired.

In addition, a Collections.iteratorToCollection would be valuable.  The generated Collection can be lazily initialized in a clever way.  Or, of course, it could just do something simple like make a java.util.LinkedList.  But unlike Collections.list(Enumeration<T> e) (which returns ArrayList<T>) the declared return type should be as general as possible, so Collection in this case.

But do consider looking at this bug, as I've noticed this gap almost since I started using the Collections framework.


