Ilya Zakharevich wrote:
|| Chip Salzenberg writes:
|| > Start with a string, part of which has the "bold" attribute --
|| > something that might be written in HTML as "<b>hello</b> there". When
|| > working code-based, when extracting the 'll' (perhaps with substr), I
|| > would have to be aware of the state (bold) at the point of extraction
|| > so I could know to extract "<b>ll</b>".
||
|| This is a very interesting question: how to cut-and-paste a piece of
|| enhanced text. The current solution of EText Tk-widget is to cut "ll"
|| out of "<b>hello</b> there". However, if you extract /llo/, you will
|| get "<b>llo</b>". In other words: "tags" (hints which apply to
|| substrings of the text) are extracted only if the boundary of the tag
|| is hit.
An interesting question indeed.
Whether an attribute should still apply to a piece cut out of the
middle depends on the sort of attribute. If you extract "ll" out of
<URL>http://perl.com/foo/ll</URL>, it is certainly not appropriate to
retain the URL attribute. XML has many attributes that imply that the
data has a specific structure, and for those it only makes sense to
retain the attribute when *both* boundaries are included. For
something like <b>, though, it makes more sense to retain the
attribute even if the data comes from the middle of the range, since
that attribute applies individually to each component.
Even there you will often not want the attributes carried along,
depending upon your purpose in copying. If you copy a filename from
one place into a command to execute, you don't really want to retain
the bold attribute. With an out-of-band mechanism it hardly matters
whether the attribute gets copied, but an attribute kept in-band
might be a nuisance.
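To make the distinction concrete, here is a minimal sketch in Python.
It assumes an out-of-band representation of plain text plus a list of
(start, end, name) ranges. The extract() function, the STRUCTURAL set
and the with_attrs flag are all hypothetical names made up for
illustration, not anything EText or Perl provides.

```python
# Hypothetical out-of-band representation: plain text plus a list of
# (start, end, name) attribute ranges, with end exclusive.
# Which attributes count as "structural" is an assumption made here.
STRUCTURAL = {"URL"}   # only meaningful when the whole range is copied


def extract(text, attrs, start, end, with_attrs=True):
    """Return (substring, attribute ranges) for text[start:end]."""
    piece = text[start:end]
    if not with_attrs:
        # The caller only wants the characters (e.g. a filename to run).
        return piece, []
    kept = []
    for a_start, a_end, name in attrs:
        if a_end <= start or a_start >= end:
            continue  # attribute does not overlap the extracted piece
        if name in STRUCTURAL:
            # Structural attribute: keep it only if both of its
            # boundaries fall inside the extracted piece.
            if start <= a_start and a_end <= end:
                kept.append((a_start - start, a_end - start, name))
        else:
            # Per-character attribute: clip it to the extracted piece.
            lo = max(a_start, start) - start
            hi = min(a_end, end) - start
            kept.append((lo, hi, name))
    return piece, kept
```

With this, extract("hello there", [(0, 5, "b")], 2, 4) gives "ll" still
marked bold, while cutting "ll" from the end of the URL example drops
the URL attribute, and with_attrs=False ignores attributes entirely.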
|| > In contrast, working frame-based, I only need walk the attribute tree,
|| > find the attributes that apply to the given characters, and copy them.
|| > That's O(log N) or so -- certainly better than O(N). More
|| > significantly, it requires *no* knowledge of metadata semantics.
||
|| Same for inline data. There is absolutely no difference between
|| semantic of having metadata inline or separate. We need more shallow
|| arguments than the semantic ones.
<HTML> ... 100k bytes later ... <b>Hello</b> ... </HTML>
Retaining enclosing inline attributes does require more effort, since
you have to scan back for every tag still open at that point, unless
you've built an out-of-line wrapping to collect their meaning.
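A rough Python sketch of that extra effort, assuming a deliberately
simplified tag syntax (<name> and </name> only, properly nested, no
attributes or comments). The function names open_tags_at() and
to_out_of_line() are hypothetical, chosen just for this example.

```python
import re

# Simplified tag pattern (an assumption): <name> or </name> only.
TAG = re.compile(r"<(/?)([A-Za-z]+)>")


def open_tags_at(markup, pos):
    """Names of the tags still open at offset pos of inline markup.

    Everything before pos has to be scanned, so the cost grows with
    the distance from the start of the document.
    """
    stack = []
    for m in TAG.finditer(markup, 0, pos):
        closing, name = m.groups()
        if closing:
            if stack and stack[-1] == name:
                stack.pop()
        else:
            stack.append(name)
    return stack


def to_out_of_line(markup):
    """One pass over inline markup, giving (plain text, ranges).

    Each range is (start, end, name) in plain-text offsets, so later
    lookups no longer need to rescan the markup.
    """
    parts, ranges, stack = [], [], []
    last = length = 0
    for m in TAG.finditer(markup):
        chunk = markup[last:m.start()]
        parts.append(chunk)
        length += len(chunk)
        last = m.end()
        closing, name = m.groups()
        if closing:
            if stack and stack[-1][0] == name:
                open_name, open_at = stack.pop()
                ranges.append((open_at, length, open_name))
        else:
            stack.append((name, length))
    parts.append(markup[last:])
    return "".join(parts), ranges
```

For the example above, finding out that "Hello" sits inside both <HTML>
and <b> means walking the 100k bytes in front of it, whereas
to_out_of_line() pays that cost once and leaves ranges that can be
searched directly.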
--
objects: | John Macdonald
Think of them as data with an attitude. | jmm(_at_)elegant(_dot_)com