Should semantic markup encode predictable orthographic transformations?
Posted: Wednesday, 29 January 2025 @ 13:22 EST
The subject line is generic but this is a question about capitalization in English. Suppose you have the following T·E·I:
Code:
<p>
This is a sample sentence that <addName>Lady</addName> is using to demonstrate
<name>T·E·I</name> functionality for her blog.<milestone unit="sentence"/>
This is a second sentence, mentioning <forename>æsc</forename>.<milestone
unit="sentence"/>
</p>
The capitalization isn’t really necessary here, since the <⸺name> and <milestone unit="sentence"/> provide enough information to derive the capitalization programmatically. The special case of æsc’s name could be represented with rend="lowercase". So this could just as easily be encoded as:
Code:
<p>
this is a sample sentence that <addName>lady</addName> is using to demonstrate
<name>t·e·i</name> functionality for her blog.<milestone unit="sentence"/>
this is a second sentence, mentioning <forename rend="lowercase">æsc</forename
>.<milestone unit="sentence"/>
</p>
Does anyone have thoughts as to which of these is preferable? I’m feeling like the latter is probably better if strict adherence to the original source isn’t a requirement, but I’m open to competing opinions.
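As a rough check that the lowercase encoding really does carry enough information, here is a minimal Python sketch that rebuilds the capitalized text from it using only the standard library. It is not part of any T·E·I tooling, and the function names (capitalize_words, derive_capitalization, and so on) are made up for illustration. The rules are assumptions read off the sample: any direct child whose element name ends in "name" is capitalized letter-group by letter-group unless it carries rend="lowercase", and the first letter after the start of the paragraph or after a <milestone unit="sentence"/> is capitalized.

```python
import xml.etree.ElementTree as ET

# The lowercase encoding from the post, used here as test input.
SAMPLE = """<p>
this is a sample sentence that <addName>lady</addName> is using to demonstrate
<name>t·e·i</name> functionality for her blog.<milestone unit="sentence"/>
this is a second sentence, mentioning <forename rend="lowercase">æsc</forename
>.<milestone unit="sentence"/>
</p>"""


def local_name(elem):
    """Element name without any '{namespace}' prefix."""
    return elem.tag.rsplit("}", 1)[-1]


def capitalize_words(text):
    """Uppercase every letter that does not directly follow another letter."""
    out = []
    prev_alpha = False
    for ch in text:
        out.append(ch.upper() if ch.isalpha() and not prev_alpha else ch)
        prev_alpha = ch.isalpha()
    return "".join(out)


def capitalize_first_letter(text):
    """Uppercase the first letter in text, skipping leading whitespace etc."""
    for i, ch in enumerate(text):
        if ch.isalpha():
            return text[:i] + ch.upper() + text[i + 1:]
    return text


def derive_capitalization(p):
    """Rebuild capitalized text from a <p> whose direct children are marked up.

    Assumption: names and milestones are direct children of <p>; nested
    markup is flattened to plain text without further processing.
    """
    parts = []
    sentence_start = True  # a paragraph opens a sentence

    def emit(text, is_name=False):
        nonlocal sentence_start
        if not text:
            return
        if sentence_start and any(c.isalpha() for c in text):
            # Names keep whatever casing their own rule produced, so a
            # rend="lowercase" name stays lowercase even at sentence start.
            if not is_name:
                text = capitalize_first_letter(text)
            sentence_start = False
        parts.append(text)

    emit(p.text)
    for child in p:
        tag = local_name(child)
        if tag == "milestone" and child.get("unit") == "sentence":
            sentence_start = True
        elif tag.lower().endswith("name"):
            name = child.text or ""
            if child.get("rend") != "lowercase":
                name = capitalize_words(name)
            emit(name, is_name=True)
        else:
            emit("".join(child.itertext()))
        emit(child.tail)
    return "".join(parts)


if __name__ == "__main__":
    text = derive_capitalization(ET.fromstring(SAMPLE))
    print(" ".join(text.split()))  # collapse line breaks for display
```

Run on the sample, this gives back the capitalization of the first encoding. It would not cope with names that mix cases internally (particles such as "van" or "de", say), which is one argument for keeping the source capitalization when it cannot be reduced to a rule.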