Chip Salzenberg wrote:
> XML is coming on strong, and it makes sense for us to take a stab at a
> facility to ease metadata manipulation. If it turns out not to help,
> or to be too much work to implement in the first place, then we'll give
> up on that approach and look for alternatives.

XML is only one way to represent metadata. I'd like to see filters
that can handle #..\n, //..\n, /*..*/, %..\n, etc. as comments, rather
than insisting that only <!--..--> be treated as comments.
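
A rough sketch of what such a filter could look like in today's Perl, with one regex per comment style. The table %comment_style, its key names, and strip_comments() are made up for illustration and are not part of any existing module. It is also naive: it does not know about string literals, so a '#' inside quotes is treated as a comment start.

```perl
use strict;
use warnings;

# Hypothetical table: one pattern per comment style (names are made up).
my %comment_style = (
    hash    => qr{#[^\n]*},        # #..  up to end of line
    slashes => qr{//[^\n]*},       # //.. up to end of line
    c_block => qr{/\*.*?\*/}s,     # /*..*/ possibly spanning lines
    percent => qr{%[^\n]*},        # %..  up to end of line
    sgml    => qr{<!--.*?-->}s,    # <!--..--> possibly spanning lines
);

# Return a copy of $text with every comment of the named styles removed.
# Line-comment patterns stop before the newline, so line structure survives.
sub strip_comments {
    my ($text, @styles) = @_;
    for my $style (@styles) {
        my $re = $comment_style{$style}
            or die "unknown comment style '$style'";
        $text =~ s/$re//g;
    }
    return $text;
}

print strip_comments("perl is <!--really-->great", 'sgml'), "\n";
```

The caller picks the styles per stream, so nothing in the filter itself is tied to XML.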
IIRC, there was talk of embedding expat into the perl core once the
XML geeks had played with it enough to figure out what 'making XML
easy to parse in perl' means.
Is this discussion going on with the expectation that the REengine
_needs_ to understand angle brackets? Possibly by bypassing
Clark Cooper's XML::Parser classes? Embedding expat into the REengine?
It's looking to me like there should be an REengine that has:
1) a liberal interpretation of a zero-width metadata stream
2) an easy way for a programmer to define what's zero-width and
   meaningful vs. zero-width and ignored. (blocks to avoid, tags to
   avoid, [don't] skip comments, etc.)
3) an easy way to match text within zero-width metadata blocks
4) a set of RE hooks for modules to use to define what metadata is.
But, most importantly, metadata is defined outside the REengine.
That way, the programmer can define an XML-structured stream with
comments ignored, a PostScript stream with comments ignored, etc.
Each match is tweakable with some metadata def'n.
    my $xml = XML::Parser->new;
    $xml->ignore_comments(1);           # hypothetical method
    my $text = "perl is <!--really-->great";
    $text =~ m|perl is great|p=$xml;    # hypothetical syntax; p = prototype
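
Until something like p= exists, the effect can be approximated with plain regexes. The sketch below treats comments as zero-width by allowing any run of them between the characters of the literal being matched. match_ignoring() is a made-up helper name, and this only handles literal strings, not general patterns.

```perl
use strict;
use warnings;

# Match $literal in $text while letting any run of $skip (the ignored
# metadata) sit between two adjacent characters of the literal.
sub match_ignoring {
    my ($text, $literal, $skip) = @_;
    my $gap     = "(?:$skip)*";
    my $pattern = join $gap, map { quotemeta($_) } split(//, $literal);
    return $text =~ /$pattern/ ? 1 : 0;
}

my $comment = qr/<!--.*?-->/s;
my $text    = "perl is <!--really-->great";

print "matched\n" if match_ignoring($text, "perl is great", $comment);
```

Because the match runs against the original string, offsets such as @- and @+ still refer to the unstripped text, which a strip-then-match approach would lose.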
(Hmmm...can prototypes be additive?
Ignore comments + ignore bold/italic + confine to bodytext?)
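
One way additive prototypes might behave, sketched with plain regexes: each "ignore" rule is a pattern, and adding rules means taking their alternation. combine_skips() and match_skipping() are made-up names, and "confine to bodytext" is not attempted here.

```perl
use strict;
use warnings;

# Fold several "ignore this" patterns into a single alternation.
sub combine_skips {
    my @skips = @_;
    my $alt = join '|', @skips;
    return qr/(?:$alt)/;
}

# Match $literal in $text, allowing any run of the ignored patterns
# between adjacent characters of the literal.
sub match_skipping {
    my ($text, $literal, @skips) = @_;
    my $skip    = combine_skips(@skips);
    my $pattern = join "$skip*", map { quotemeta($_) } split(//, $literal);
    return $text =~ /$pattern/ ? 1 : 0;
}

my $comments    = qr/<!--.*?-->/s;    # ignore comments
my $bold_italic = qr{</?[bi]>}i;      # ignore <b>, </b>, <i>, </i>

my $text = "perl <b>is</b> <!--really--><i>great</i>";

print "comments only: ", match_skipping($text, "perl is great", $comments), "\n";
print "comments + b/i: ", match_skipping($text, "perl is great", $comments, $bold_italic), "\n";
```

With only the comment rule the tags block the match; adding the bold/italic rule lets it through.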
-- Adam.