Here is a strawman syntax for new quoted strings. Unescaped text must
be valid UTF-8, while arbitrary (non-UTF-8) octets can still be
written using an escape sequence:
new-quoted-string = "~" DQUOTE new-quoted-text DQUOTE
new-quoted-text = *(utf8-quoted-safe / quoted-special /
                  quoted-non-utf8)
not-qspecial = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B / %x5D-7F /
               UTF8-2 / UTF8-3 / UTF8-4
               ; a single Unicode character other than NUL,
               ; CR, LF, double-quote, or backslash,
               ; represented in UTF-8
old-quoted-string = DQUOTE quoted-text DQUOTE
quoted-non-utf8 = "\" "x" 2HEXDIG
                  ; represents a single hex-encoded octet
quoted-string = old-quoted-string / new-quoted-string
                ; "new-quoted-string" is not allowed as
                ; a parameter to the "require" action.
                ; "new-quoted-string" is only allowed after
                ; 'require "utf8-strings";'
utf8-quoted-safe = CRLF / not-qspecial
                   ; either a CRLF pair, OR a single UTF-8
                   ; encoded character other than NUL, CR, LF,
                   ; double-quote, or backslash
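
To make the grammar above concrete, here is a rough Python sketch of
how a parser might validate and decode a new-quoted-string. It is only
an illustration under assumptions of mine: the function name
decode_new_quoted_string is made up, each "\x" escape is taken to carry
exactly two hex digits (one octet), quoted-special is taken from
RFC 5228 (a backslash followed by a double-quote or a backslash),
Python's strict UTF-8 decoder stands in for the UTF8-2 / UTF8-3 /
UTF8-4 productions of RFC 3629, and the 'require "utf8-strings";'
check is not modeled at all.

```python
import re

# One element of new-quoted-text; alternatives are tried left to right.
_TOKEN = re.compile(
    rb'\\[xX]([0-9A-Fa-f]{2})'   # quoted-non-utf8 (assumption: exactly 2 hex digits)
    rb'|\\(["\\])'               # quoted-special, as defined in RFC 5228
    rb'|(\r\n)'                  # CRLF pair
    rb'|([^\x00\r\n"\\]+)'       # run of literal octets; UTF-8 is checked below
)


def decode_new_quoted_string(src: bytes) -> bytes:
    """Hypothetical helper: return the octets denoted by a new-quoted-string.

    Raises ValueError (UnicodeDecodeError is a subclass) on malformed input.
    """
    if len(src) < 3 or not src.startswith(b'~"') or not src.endswith(b'"'):
        raise ValueError('expected ~"..."')
    body = src[2:-1]
    out = bytearray()
    pos = 0
    while pos < len(body):
        m = _TOKEN.match(body, pos)
        if m is None:
            # NUL, bare CR, bare LF, unescaped DQUOTE, or an unknown escape
            raise ValueError(f"invalid input at offset {pos + 2}")
        hex_digits, special, crlf, literal = m.groups()
        if hex_digits is not None:
            out.append(int(hex_digits, 16))   # may produce a non-UTF-8 octet
        elif special is not None:
            out += special
        elif crlf is not None:
            out += crlf
        else:
            # The run is delimited by ASCII octets only, so no multi-octet
            # character is split; strict decoding rejects invalid UTF-8.
            literal.decode("utf-8")
            out += literal
        pos = m.end()
    return bytes(out)


print(decode_new_quoted_string(b'~"a\\xFFb"'))   # b'a\xffb'
```

With this reading, a raw 0xFF octet inside the quotes is rejected,
while the same octet written as "\xFF" is accepted and decoded.
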
As quoted strings can already contain multiple lines, I don't think we
need to introduce a new variant of the "multi-line" string. Please let
me know if you disagree.