SOAP-RP: Potential Disaster In Progress



"It's deja vu all over again."  -- Yogi Berra

Okay, spent a good bit of time today mulling over the SOAP-RP spec.  While it's
thoughtful, well written, thorough, and clearly the product of good
intentions...  I've come to the conclusion that it's potentially a
disaster-in-progress.  The issue is with the <via> element.  Here's the basic
problem...  Perhaps y'all remember this nightmare from the bad old days:

 ns-mx!hobbes.physics.uiowa.edu!zaphod.mps.ohio-state.edu!mips!spool.mu.edu!agate!dog.ee.lbl.gov!ucbvax!emunix.emich.edu!grover.

Well, there was a *reason* for that.  Pre-DNS, our e-mail systems were stitched
together together as a series of point-to-point links.  We really didn't have
the tools to do it any better.  (There's still some grotesque hackery in the
interface between e.g. sendmail and DNS, but for the most part things ARE much
better.)  The bad old way of doing things resulted in a hard to manage, rigid,
brittle environment.  It was hard to set up, harder to maintain, damn near
impossible to use in many nontrivial cases, and almost completely impossible to
troubleshoot.  Thinking back to my network admin days in the mid/late 80s, this
whole concept of explicit delivery paths was the source of tremendous pain and
expense.

SOAP-RP revolves around just such a proposal:  the notion that the sender of a
SOAP message may specify a series of intermediates playing various roles between
itself and the ultimate receiver.  Now, clearly, the analogy is imperfect;
intermediates in the above e-mail example add no value, while clearly SOAP-RP
intermediates may.  The purpose of defining intermediates along a SOAP-RP path
is presumably to model a kind of multiparty communication, something roughly
equivalent to (say) a UNIX pipeline.  The idea is that intermediates could
perhaps have particular semantics, producing particular side effects along the
way, etc.  And that's a fine motivation, on the surface.  But the analogy breaks
down pretty quickly:  while UNIX pipelines are synchronous, SOAP-RP messages are
not.  Given that, it's a bad idea to make the failure semantics of a multiparty
communication non-local.  In other words, a SOAP-RP message delivered along a
path can fail without the sender ever finding out, if I read the spec
correctly.  By way of bad analogy, consider the following:

Ex. 1:  proc1 | proc2 | proc3 | proc4
Ex. 2:  proc1 > p1;  proc2 <p1 >p2;  proc3 <p2 >p3;  proc4 <p3

Now, on the surface, these are essentially equivalent.  And given synchronous /
local semantics, they are equivalent in result.  Distributed systems and local
systems, however, have fundamentally different semantics, particularly in terms
of failure modes.  If it were possible for the "|" operator to fail without the
shell becoming aware, the two examples would not be equivalent at all;  rather,
Ex. 2 would be highly preferred.  By having the shell act as a kind of
"coordinator" for a sequential set of single party communications, the detection
and resolution of failure is kept local, which for any sort of mission critical
application is preferable.  Throw in the fact that each procN above, in a
SOAP-RP world, is likely in a different administrative domain, and you've got
massive coordination costs and compounded failure modes.
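
To make the difference concrete, here is a rough Python sketch of the two
styles.  It is only a model of the argument, not anything taken from the
SOAP-RP spec:  the names coordinate, forward_along_path, StageFailure, and the
three toy stages are all made up for illustration, and each "stage" is just a
local function standing in for one point-to-point send.  The path-style
function assumes the worst case I describe above, where a hop that fails simply
drops the message.

```python
from typing import Callable, List

# Hypothetical stand-in for one point-to-point send: takes a payload,
# returns the receiver's reply, and raises if the send fails.
Stage = Callable[[str], str]


class StageFailure(Exception):
    """Raised by the coordinator, naming the stage that failed."""

    def __init__(self, index: int, cause: Exception) -> None:
        super().__init__(f"stage {index} failed: {cause}")
        self.index = index
        self.cause = cause


def coordinate(stages: List[Stage], payload: str) -> str:
    """Ex. 2 style: the sender drives every hop itself, one 2-party
    exchange at a time, so any failure surfaces locally."""
    for index, stage in enumerate(stages):
        try:
            payload = stage(payload)
        except Exception as exc:
            raise StageFailure(index, exc) from exc
    return payload


def forward_along_path(stages: List[Stage], payload: str) -> None:
    """Path style (assumed worst case): each hop hands the message to the
    next one; a hop that fails drops it and the sender is never told."""
    for stage in stages:
        try:
            payload = stage(payload)
        except Exception:
            return  # the failure stays "in the cloud"


# Toy stages, purely illustrative.
def upper(text: str) -> str:
    return text.upper()


def broken(text: str) -> str:
    raise ConnectionError("intermediary down")


def exclaim(text: str) -> str:
    return text + "!"


if __name__ == "__main__":
    print(coordinate([upper, exclaim], "hi"))
    try:
        coordinate([upper, broken, exclaim], "hi")
    except StageFailure as failure:
        print("sender saw:", failure)
    # Returns normally even though the message never reached exclaim.
    forward_along_path([upper, broken, exclaim], "hi")
```

With the coordinator, the sender knows exactly which hop broke and can retry or
give up.  With the forwarded path, the call returns as if nothing were wrong.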

The intent of <via> is one or both of the following:  (1) an attempt to provide an
abstract description of how to chain multiple, sequential point-to-point message
sends, as a kind of pseudo-pipeline, OR (2) it's truly a (mistaken) attempt to
describe a communication pattern.  (2) is bad;  it's unnecessary,
administratively tedious, costly, and necessarily unreliable.  (1) is fine;
however, the current formulation of SOAP-RP confuses the situation, apparently
assuming (2) is necessary to accomplish (1).  IMO there should be separation of
concerns.  Say, a SOAP-PIPELINE spec that gives a declarative dialect for
chaining together multiple, sequential, async, point-to-point SOAP message sends
/ possibly receipts, with local failure semantics, separate from a
SOAP-ADDRESSING spec for how to identify senders and receivers, separate from a
SOAP-ROUTING spec that defines how to locate and deliver messages to particular
receivers.
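
As a very rough sketch of what that separation might look like, here is some
Python.  None of this comes from any existing spec:  Address, Pipeline, Router,
make_table_router, plan, and the example.invalid endpoints are all hypothetical
names picked for illustration.  The point is only that the pipeline names *who*
gets the message, and in what order, while the router alone decides *how* to
reach them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Address:
    """Addressing: identifies a receiver by logical name, nothing more."""
    name: str


# Routing: maps an address to a concrete endpoint.  The sender never
# sees or specifies the hops in between.
Router = Callable[[Address], str]


@dataclass(frozen=True)
class Pipeline:
    """Pipeline: a declarative, ordered list of receivers.  It carries
    no delivery paths at all."""
    steps: Tuple[Address, ...]


def make_table_router(table: Dict[str, str]) -> Router:
    """A trivial table-driven router (a stand-in for a real routing fabric)."""
    def route(address: Address) -> str:
        return table[address.name]
    return route


def plan(pipeline: Pipeline, router: Router) -> List[Tuple[str, str]]:
    """Resolve each step to an endpoint, one 2-party send per step."""
    return [(step.name, router(step)) for step in pipeline.steps]


if __name__ == "__main__":
    pipeline = Pipeline(steps=(Address("orders"), Address("billing")))
    router = make_table_router({
        "orders": "http://a.example.invalid/soap",
        "billing": "http://b.example.invalid/soap",
    })
    for name, endpoint in plan(pipeline, router):
        print(name, "->", endpoint)
```

Swapping in a different routing table changes the endpoints without touching
the pipeline definition, which is the separation I'm arguing for.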

Folks, it's abundantly clear from our earlier experiences with sender defined
e-mail delivery paths that the notion of explicit paths for their own sake is a
broken concept.  The determination of routing path for a store-and-forward
message is something best left to a true *routing* fabric.  I'm not suggesting
that such a fabric need live at the lowest levels of the stack;  indeed
(ulterior motive) I've spent much of the last several months pondering
application-level routing protocols for SOAP messages, built around Plaxton
meshes.  What I *am* arguing for is that we not confuse the notions of
identifying to: and from: with the lower level notion of how we get from A to
B;  further, my contention is that asynchronous communication is inherently
2-party, and that any N-party communication should, in async-land, be structured
as a sequential series of 2-party communications, performance arguments to the
contrary notwithstanding.

It's further clear that tossing failure detection and recovery into the cloud is
a bad idea for any form of mission critical system.  Imagine the problems with
non-local failure semantics in an end-to-end fulfillment pipeline that spans
multiple enterprises...

I don't mean to rain on Henrik's / Satish's parade.  This is otherwise a great
piece of work, and pretty close to a perfect example of how to write a great
spec.  I just think one of the fundamental premises is either wrong or slightly
confused.  Unfortunately, that translates to the potential for extreme, costly
fallout.  Giving ourselves <via> is unnecessary, dangerous, and costly.
Suggestion:  kill it.

Comments?


jb

"Those who do not learn from history are doomed to repeat it."  -- George
Santayana