Response from Philip Milne follows:
> The Encoder class is intended to be a base class for different
> Encoder implementations but is mostly designed as if it should never
> be subclassed and implemented as if XMLEncoder was the only possible
> subclass and if it were a singleton design.
This is a little harsh, the Encoder can in fact be trivially subclassed
to produce a textual (Java-like) output. Granted however, some package
private methods make it very difficult to override some of its features.
Where the problems are a question of exposing methods that are currently
hidden this will probably happen, though it is too late for changes
that are not backward compatible. Its a shame we didn't have you on the
expert group when you could have raised these issues while the design was
taking place last year.
Anyway, to answer your points:
> Most important points are:
> - Encoder should be abstract
I'm actually not sure this is better, but its too late for
this anyway as this would break people.
> - Encoder should provide abstract methods for the implementors
> e.g., a method which is called after writeStatement has
> processed it and has created the nessessary expressions
> - The logic of maintaining a statement list and value to expression
> mapping as well as looking up which expressions are really needed
> by the written statements should be provided by the Encoder not
> implemented in XMLEncoder and re-implemented by all other Encoder
> 90% of the XMLEncoder code is maintaining these informations instead
> of dealing with XML encoding what the class actually is for
Well, although I wouldn't claim the design is perfect, there is
some method to this madness. The Encoder does actually have all the
state it needs to emit a stream of expressions and statements which
will produce the object graph. The XMLEncoder is performing an inherently
more complicated task. It is the structure of the XML and the fact that
expressions are nested in lexical blocks with all operations on an
instance done in one place that requires the extra data structures.
In particular, deciding whether am object needs an explicit ID is
quite difficult to calculate from the graph and requires access to
the all the state which is found in the XMLEncoder.
> - getPersistenceDelegate and setPersistenceDelegate are instance
> methods but have static effect. These should be either
> static methods
> or have an effect per instance
Yes, this has been reported as a problem a few times already.
There is a fundamental mismatch here, between the idioms which
the Encoder/Decoder take from the IO package and from the rest
of the beans specification. Basically, a the IO package introduces
the idea of streams which have state - and so we attached the
delegates that we use for controlling persistence to the stream so
that we would be able to support different configurations of encoders
at the same time at some point in the future. Had we made them static
this would never have been possible.
The beans spec and, in particular, the Introspector attach BeanInfo
to classes and this is an inherently static idea. After worrying about
this for some time I made the methods instance methods because
that left us greatest room for expansion in the future and allowed
them to be subclassed. As you have rightly pointed out they currently
affect static data structures in the MetaData class and this is
very confusing. The structures could be made instance specific though
there is always going to be a mismatch between this and BeanInfo
which is, and always will be, a static concept.
> - getValue(Expression) should be protected not package-private
> - clear() should be protected not package-private
Yes this is a good point though once methods are exposed they
can never be deleted; even deprecating methods is pretty difficult
to do in organisations which create public libraries. Given this,
I erred on the side of caution with the API and kept everything hidden
that could be hidden on the grounds that we could expose things if
people asked us to and not the other way around. You're right that
these methods should be made protected at some point at that was
always the intention.
> - XMLEncoder should not call any package-private methods.
> Then it would better match it's role of a reference
> implementation of an Encoder and the design flaws would
> be detected earlier
The XMLEncoder is not really intended to be a reference implementation
for the Encoder. The real purpose of the Encoder is to provide an
API which the persistence delegates can call - thereby isolating
the persistence delegates from the encoding. That it can be subclassed
to make other encoders is (or would be with the changes you suggest)
a useful thing but it is not the main reason for the class. FYI: There
were thoughts of making this class an interface for exactly the reason
above but interfaces cannot be extended over time - and that is the main
reason it is a class. Its API is important, not its implementation here.
That said, it does actually work if you instantiate it and implement the
write() methods using toString(). There's actually very little
(a couple of pages I think) in the Encoder implementation and so a
start-from-scratch implementation might be your best bet in the short
> - The constructor of XMLEncoder has global side-effects.
Yes, though, once again, writing BeanInfo for a class has a global
> BTW the called static method setCaching(boolean) is replacing the
> by a new HashMap every time it is set to true, even if it was true
Yes, though this happens rarely enough that it did not show up
at all in our performance tests.
> - The flush() method of XMLEncoder has global side-effects
True, in the NameGenerator. This should be fixed though it only
affects the toString() method which is there for debugging
> - The toString() method of Statement has global side-effects
> The name assigned to the target of the statement is stored
> in a global
> map. This map is cleared in the flush() method of XMLEncoder. This
> - calling flush() on a XMLEncoder causes other Encoders to output
> inconsistent names if Statement.toString() is used.
Good point. I must admit I hadn't paid enough attention to the scenario
in which multiple streams are being used to write the same object graph
to different sinks. I think all of this can be fixed by making the
NameGenerator and co. instance specific thispart of the implementation
is all private.
> - if other Encoder implementations are using the toString()
> method of
> Statement the target objects are never garbage-collected
You mean because the flush method is package private? Yes, this
is also true.
> - There might be more side-effects
> It seems as if the whole code should be better completly rewritten.
As I say there is very little to the Encoder, most of the work is
in the XMLEncoder and that is there to support a format that you
obviously don't need. I would just do what you say, subclass the
Encoder override all the public methods and replace the two pages
or so of code there with your own implementation.
Its early days for this API and you are the first person I know
of to be trying to produce an entirely new format. If you'd
like to work with Mark to help get all of the features you need
exposed, so that others people would be able to use the API to
produce other formats more easily, then let him know. I expect, if
he has time, he'll try to get them in for FCS.
Many of the issues raised about multi threaded access to the static caches maintained in NameGenerator and Statement has been resolved as part of the fix for 4880633.