On 29/10/18 21:52, ian(_dot_)proudfoot(_at_)itp-x(_dot_)co(_dot_)uk wrote:
> Going a little off topic, but the concept is relatively simple. Many
> writers don't make the best use of their word processors. Maybe lists
> are manually indented with bullets inserted from a character palette.
> Titles may be 'Normal' text with character overrides for font size and
> weight.

Indeed they are. This was the starting-point in my session at the XML
Summerschool on how to deal with Word documents as a source for XML in
publishing.

> Careful analysis of many documents showed that there are between
> eight and ten properties that have the most effect on the output for
> character styles and paragraph styles. This is presented as an
> override code in a format that is very compact but also possible for
> anyone to understand. The combination of any correctly defined style
> name plus its override code gives us a key that can be used for
> mapping to elements in the output.

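To make the idea of a "style name plus override code" key concrete, here is a minimal sketch in Python. It is not the method described above: the property names, the one-letter codes, the `+` separator, and the output element names are all invented for illustration, and a real implementation would read the properties from the Word document rather than from hand-written dictionaries.

```python
# Hypothetical sketch: the properties, code letters, and element names
# below are assumptions made for illustration, not taken from the post.

# Properties checked for overrides, each with an assumed one-letter code.
OVERRIDE_CODES = {
    "bold": "B",
    "italic": "I",
    "underline": "U",
    "font_size": "S",
    "indent": "N",
    "alignment": "A",
    "bullet_char": "L",
    "numbered": "O",
}


def override_code(style_defaults, actual):
    """Return the letters of every property that differs from the style."""
    letters = []
    for prop, letter in OVERRIDE_CODES.items():
        default = style_defaults.get(prop)
        # A property missing from `actual` is treated as not overridden.
        if actual.get(prop, default) != default:
            letters.append(letter)
    return "".join(letters)


def mapping_key(style_name, style_defaults, actual):
    """Combine the style name and its override code into a lookup key."""
    code = override_code(style_defaults, actual)
    return f"{style_name}+{code}" if code else style_name


# Assumed mapping from keys to output elements.
ELEMENT_MAP = {
    "Normal": "para",
    "Normal+BS": "title",      # 'Normal' made bold and resized by hand
    "Normal+NL": "listitem",   # 'Normal' indented with a manual bullet
    "Heading 1": "title",
}


def element_for(style_name, style_defaults, actual):
    """Look up the output element, or 'unmapped' if the key is unknown."""
    key = mapping_key(style_name, style_defaults, actual)
    return ELEMENT_MAP.get(key, "unmapped")


NORMAL = {"bold": False, "font_size": 11, "indent": 0, "bullet_char": None}

fake_title = {"bold": True, "font_size": 16}
manual_bullet = {"indent": 18, "bullet_char": "*"}

print(mapping_key("Normal", NORMAL, fake_title), element_for("Normal", NORMAL, fake_title))
print(mapping_key("Normal", NORMAL, manual_bullet), element_for("Normal", NORMAL, manual_bullet))
```

Keeping the key as a plain string means the mapping table can live outside the code (in a spreadsheet or an XML lookup file, say) and be maintained by someone who does not program. Combinations that nobody anticipated come back as `unmapped`, so they can be reviewed rather than silently guessed at.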
Very ingenious. Have you published this? I'd be interested to compare
it to the dozen or so areas I investigated when looking at the use of
editing software for structured documents
(https://cora.ucc.ie/handle/10468/1690).

> This works well when there is some inherent logic to the implied
> structure of the source document. Less so when no regard has been
> given to sensible style use.

That's the key, of course. Getting authors to adhere to stylesheets
has been a lost cause for many years (in most cases; there are a few
exceptions). In effect, as Wendell and Tommie put it, "the author
sees it as *his own job* to invent the schema".¹
///Peter
--
¹ Piez, W., & Usdin, T. (2007). 'Separating Mapping from Coding in
Transformation Tasks'. XML Conference. Boston, MA: IdeAlliance.