SAM is an abstract markup language, otherwise known as a meta-language. This means that it is intended, like XML, for defining your own markup languages. Unlike XML, however, SAM predefines a basic set of structures. These predefined structures provide a simple, clear syntax for common text structures, but they don't cover everything you might need. The following recipes are suggestions on how to handle some common markup problems in SAM. Remember, though, that they involve defining markup for yourself and require you to handle that markup in the application layer.
SAM does not have any predefined markup for footnotes, but it does have support for IDs and citations, which provide a reference mechanism for footnotes. The suggested recipe for footnotes is to create a footnote block with an ID, and use a citation to reference that ID:
This paragraph requires a footnote.[*1]

footnote:(*1)
    This is a footnote.
The footnote structure can be anywhere in the document, but it makes sense to place it immediately after the paragraph in which the reference occurs. You can move it to the bottom of the page or the end of the chapter or book in the publishing process.
Note that the footnote structure is not a child of the paragraph but a sibling at the same level of indentation. Paragraphs cannot have block children.
There is no support for SAM markup in the body of a codeblock. This works for code, since it requires no escaping, regardless of language. It does not work so well for terminal sessions if you want to mark up the prompt, input, and output. If you need this kind of terminal session support, the suggested recipe is to create a terminal session block and use fields for the sequence of prompt, input, and response. This reads quite clearly and you can easily use it to construct a formatted terminal session illustration when publishing.
terminal:
    prompt: $
    input: dir
    response: Empty directory.
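As a rough illustration of the publishing step, the sketch below renders such a block as a plain-text session. It assumes the block serializes to XML with one element per field, named after the fields in the recipe; that XML shape and the function name `render_terminal` are assumptions made for this example, not something SAM defines.

```python
import xml.etree.ElementTree as ET

# Assumed XML serialization of the terminal block above; the element
# names simply mirror the block and field names used in the recipe.
TERMINAL_XML = """
<terminal>
  <prompt>$</prompt>
  <input>dir</input>
  <response>Empty directory.</response>
</terminal>
""".strip()


def render_terminal(xml_text):
    """Render prompt/input/response fields as a plain-text session."""
    root = ET.fromstring(xml_text)
    lines = []
    prompt = ""
    for field in root:
        text = (field.text or "").strip()
        if field.tag == "prompt":
            prompt = text  # remember the prompt for the next input line
        elif field.tag == "input":
            lines.append(f"{prompt} {text}".strip())
        elif field.tag == "response":
            lines.append(text)
    return "\n".join(lines)


print(render_terminal(TERMINAL_XML))
```

The same loop could emit styled HTML spans instead of plain text, since each field keeps its role (prompt, input, or response) all the way to the output.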
Sometimes you have semantic lists. That is, the list and its items are of a specific type. For example:
filmography:
    film: Rio Bravo
    film: The Shootist
If you don't want to have to repeat the item name over and over, you can use a recordset with a single field.
filmography:: film
    Rio Bravo
    The Shootist
Notice that while these two constructions capture the same data and the same semantics, they are not the same structure, and are therefore not interchangeable. You have to decide which one you want in your markup language and provide processing for it accordingly.
Also note that the XML representation of the two forms is different. The first outputs this:
<filmography>
    <film>Rio Bravo</film>
    <film>The Shootist</film>
</filmography>
The second outputs this:
<filmography>
    <row>
        <film>Rio Bravo</film>
    </row>
    <row>
        <film>The Shootist</film>
    </row>
</filmography>
Of course, it is easy enough to transform the second output into the first in post-processing.
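A minimal sketch of that post-processing step, using Python's standard `xml.etree.ElementTree`. The function name `unwrap_rows` is invented for this example, and the input is written without whitespace between elements to keep the output easy to compare.

```python
import xml.etree.ElementTree as ET


def unwrap_rows(xml_text):
    """Replace each <row> child of the root with the row's own children."""
    root = ET.fromstring(xml_text)
    for row in list(root.findall("row")):
        position = list(root).index(row)
        root.remove(row)
        # Re-insert the row's children where the row used to be.
        for offset, child in enumerate(list(row)):
            root.insert(position + offset, child)
    return ET.tostring(root, encoding="unicode")


recordset_xml = (
    "<filmography>"
    "<row><film>Rio Bravo</film></row>"
    "<row><film>The Shootist</film></row>"
    "</filmography>"
)
print(unwrap_rows(recordset_xml))
```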
SAM does not have predefined support for superscripts and subscripts. If you need them, you need to define a tagging language that provides them. You would do this with annotations:
H{2}(sub)SO{4}(sub)
You can't apply attributes to a paragraph. If you want to make paragraphs conditional, support fragments in your tagging language and put the conditions on the fragments.
~~~(?novice)
    Be very careful and ask for help if you need it.

Push the big red button and run.
SAM allows you to omit the annotations on a phrase after you have annotated it the first time in a file. When the parser sees a phrase with no annotation or citation, it looks back through the file, finds the last instance of that phrase, and copies its annotation.
This is a time saver. If you have a set of phrases that are commonly annotated the same way across a number of documents, you can create a lookup file in which those phrases are annotated and import it into each document. Then you will not need to annotate any of those phrases in the document. Just mark them up as phrases using curly braces and the parser will fill in the annotations.
You need to make sure that the contents of the lookup file do not become part of the published document. To do this, simply create a structure in the lookup file to hold the list of annotations and suppress that structure in the application layer.
dont-publish-this:
    {Enter}(key)
    {the Duke}(actor "John Wayne" (SAG))
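One way the application layer might suppress such a structure is sketched below, working on the XML output with `xml.etree.ElementTree`. It assumes the block is serialized as an element with the same name as the block; the sample document and the helper name `strip_elements` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def strip_elements(root, tag):
    """Remove every element named `tag` anywhere below `root`."""
    # Materialize the traversal first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return root


# Hypothetical parser output: the lookup block followed by real content.
doc = ET.fromstring(
    "<doc>"
    "<dont-publish-this><p>Enter</p></dont-publish-this>"
    "<p>Press Enter.</p>"
    "</doc>"
)
strip_elements(doc, "dont-publish-this")
print(ET.tostring(doc, encoding="unicode"))
```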
SAM provides a simple labeled list format that allows only one paragraph attached to a label. For anything more complex than this, construct a labeled list structured with blocks and fields.
ll:
    li:
        label: Item label
        item:
            The text of the item, including:

            * paragraphs
            * lists
            * etc
You can provide single sourcing support via attribute chaining. That is, you can provide both an HTTP link and a subject annotation. But if you use the trick of importing a file of annotation definitions and relying on annotation lookup to add them to the content, you can also substitute in different annotation lookup sets for different media (or different anything else). This will get easier when we add catalog support.
There are a number of ways to implement index markers in SAM. The first is to avoid the use of explicit index markers altogether and to generate an index based on semantic annotation.
{The Duke}(actor "John Wayne") plays an ex-Union colonel.
In this case you could generate an index entry for John Wayne, and a categorized entry for Actor:John Wayne, from this annotation.
Note that you don't have to generate an index entry for every instance of an annotated phrase. You can choose to generate an entry only for the first occurrence of an annotated phrase in a section or chapter, for instance.
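The sketch below shows one way an application layer might derive such entries while keeping only the first occurrence per section. The input shape, a list of `(section, annotation type, name)` tuples, is an assumption for this example; how you extract those values from the parser output depends on your own processing.

```python
def index_entries(annotations):
    """Build (section, entry) pairs, one per distinct entry per section.

    `annotations` is an iterable of (section, annotation_type, name)
    tuples in document order (a hypothetical intermediate format).
    """
    seen = set()
    entries = []
    for section, ann_type, name in annotations:
        # One plain entry and one categorized entry per annotation.
        for entry in (name, f"{ann_type.capitalize()}:{name}"):
            key = (section, entry)
            if key not in seen:
                seen.add(key)
                entries.append(key)
    return entries


annotations = [
    ("ch1", "actor", "John Wayne"),
    ("ch1", "actor", "John Wayne"),  # repeat in the same section: skipped
    ("ch2", "actor", "John Wayne"),
]
for section, entry in index_entries(annotations):
    print(section, entry)
```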
To implement explicit index markers, you can use annotations with a type index:
{The Duke}(index "John Wayne; Actor:John Wayne") plays an ex-Union colonel.
To implement index markers that span a passage, you can use a field on a block:
bio: John Wayne
    index: John Wayne; Actor:John Wayne

    John Wayne was an actor known for his roles in westerns.
To implement index markers that span an arbitrary passage within a block, you can use a field on a fragment:
~~~
    index: John Wayne; Actor:John Wayne

    Jimmy Stewart also made a number of films with John Wayne.
    ...
Remember that these are suggestions as to how you might implement index markers in a tagging language based on SAM. SAM does not provide explicit support for index markers.
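If you adopt the entry syntax used in these examples, with semicolons between entries and a colon between a category and a term, the application layer needs to split the value itself. A minimal sketch, assuming exactly that convention (the function name is invented here):

```python
def parse_index_value(value):
    """Split 'A; Category:B' into [('A',), ('Category', 'B')]."""
    entries = []
    for raw in value.split(";"):
        raw = raw.strip()
        if not raw:
            continue  # tolerate a trailing semicolon
        # A colon separates the levels of a categorized entry.
        entries.append(tuple(part.strip() for part in raw.split(":")))
    return entries


print(parse_index_value("John Wayne; Actor:John Wayne"))
```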
There is no way to insert markup into a regular codeblock. If you want to do callouts in code, you can use lines for the code and citations for the callouts. (The presumptive semantics for citations is that what they produce is based on the thing they refer to.)
print("Hello World")
print ("Goodbye, cruel World")
Look it up in {Wikipedia}[*wikipedia]
link:(*wikipedia) http://wikipedia.org
SAM grids do not support spanning rows and columns in a table. However, you could provide for spanning rows and columns in grids at the application layer if you wanted to. This allows you to adopt a syntax for spanning rows and columns that works for your markup language.
One possible way to do this is to indicate spanning rows and columns as follows. The example here is based on https://tex.stackexchange.com/questions/368176/i-badly-need-to-generate-the-following-table which shows the intended layout.
+++
    Paper Title | Performance Measure ||| Image Type
    _ | Evaluation Metric | Proposed Method | traditional method | _
    | DC | 0.0019 | 0.0021 |
    | JS | 0.9975 | 0.9916 |
    | DSC | 0.9987 | 0.9958 |
Here the column spanning is indicated by stacking all of the vertical bars that separate the columns together at the end of the spanned column, while the row spanning is indicated by placing a single underscore character in the row to be spanned.
This will result in XML output where the cells to be column-spanned will contain an empty string and those to be row-spanned will contain a single underscore. The application layer processing can then create the column and row spans accordingly.
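A sketch of how the application layer might resolve these markers is given below. It works on rows that have already been read into lists of cell strings, assumes every row has the same number of cells, and treats an empty cell as a continuation of the cell to its left. The function name and the output format are assumptions for this example. Under this convention a genuinely empty cell that follows a filled one (as in the body rows above) would also be read as a span, so a real implementation would need a rule to tell the two apart.

```python
def resolve_spans(rows):
    """Turn a grid of cell strings into cells with colspan/rowspan.

    Returns a list of rows; each row is a list of dicts with the keys
    'text', 'colspan', and 'rowspan'. Cells absorbed by a span are omitted.
    """
    # owner[r][c] is the output cell that covers grid position (r, c).
    owner = [[None] * len(row) for row in rows]
    out = []
    for r, row in enumerate(rows):
        out_row = []
        for c, text in enumerate(row):
            if text == "_" and r > 0:
                cell = owner[r - 1][c]      # extend the cell above
                cell["rowspan"] += 1
            elif text == "" and c > 0:
                cell = owner[r][c - 1]      # extend the cell to the left
                cell["colspan"] += 1
            else:
                cell = {"text": text, "colspan": 1, "rowspan": 1}
                out_row.append(cell)
            owner[r][c] = cell
        out.append(out_row)
    return out


header = [
    ["Paper Title", "Performance Measure", "", "", "Image Type"],
    ["_", "Evaluation Metric", "Proposed Method", "traditional method", "_"],
]
for row in resolve_spans(header):
    print([(c["text"], c["colspan"], c["rowspan"]) for c in row])
```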
You could also do a complex table in fully explicit markup something like this:
table:
    header:
        row:
            cell: Paper Title
            cell: Performance Measure
            cell: #hspan
            cell: #hspan
            cell: Image Type
        row:
            cell: #vspan
            cell: Evaluation Metric
            cell: Proposed Method
            cell: traditional method
            cell: #vspan
    body:
        row:
            cell: #empty
            cell: DC
            cell: 0.0019
            cell: 0.0021
            cell: #empty
        row:
            cell: #empty
            cell: JS
            cell: 0.9975
            cell: 0.9916
            cell: #empty
        row:
            cell: #empty
            cell: DSC
            cell: 0.9987
            cell: 0.9958
            cell: #empty
Because SAM does not support arbitrary attributes, this approach uses special cell content to indicate spanning. The application layer has to interpret the #vspan and #hspan markers. #empty is really just semantic sugar.
Another alternative is to use nested tables, though this will produce a slightly different layout:
table:
    header:
        row:
            cell: Paper Title
            cell: Performance Measure
            cell: Image Type
    body:
        row:
            cell:
            cell:
                table:
                    header:
                        cell: Evaluation Metric
                        cell: Proposed Method
                        cell: traditional method
                    body:
                        row:
                            cell: DC
                            cell: 0.0019
                            cell: 0.0021
                        row:
                            cell: JS
                            cell: 0.9975
                            cell: 0.9916
                        row:
                            cell: DSC
                            cell: 0.9987
                            cell: 0.9958
            cell:
This can be expressed more compactly like this:
table:
    header:
        row:
            cell: Paper Title
            cell: Performance Measure
            cell: Image Type
    body:
        row:
            cell:
            cell:
                +++
                    Evaluation Metric | Proposed Method | traditional method
                    DC | 0.0019 | 0.0021
                    JS | 0.9975 | 0.9916
                    DSC | 0.9987 | 0.9958
            cell:
The annotation lookup feature allows writers to avoid having to explicitly annotate phrases every time they mention them. If the parser detects an unannotated phrase, it will work its way back through the document, including any included files, looking for the same phrase with an annotation. It then copies the annotation from the first annotated phrase it finds. The question is, how does it match phrases? The parser provides two annotation lookup modes, case_sensitive and case_insensitive. However, these do not account for cases where you might want various forms of a word to be annotated the same way, such as "run", "ran", and "running". To match different forms of a word, you need to use a technique called "stemming".
Stemming is not universal. There are different stemming algorithms with different strengths and weaknesses, and different languages require different stemming algorithms. Rather than choosing one, therefore, the parser leaves it open to the user to supply their own annotation lookup mode, which may implement stemming or any other approach to annotation lookup that the user wants.
You can add other annotation lookup modes by writing a lookup function and adding it to the dictionary samparser.annotation_lookup_modes.
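The sketch below illustrates the matching logic such a lookup function might use. It is standalone and deliberately crude: `naive_stem`, `stem_key`, and `lookup_stemmed` are invented for this example, the suffix list and irregular-form table are placeholders for a real stemming library, and the signature that samparser expects of a lookup function is not described here, so the registration line is left as a comment to adapt after checking the parser source.

```python
# Placeholder data: a real stemmer would replace both of these.
IRREGULAR = {"ran": "run"}
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Reduce a word to a rough stem by stripping one common suffix."""
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run".
            if suffix in ("ing", "ed") and word[-1] == word[-2]:
                word = word[:-1]
            break
    return word


def stem_key(phrase):
    """Normalize a phrase to a space-joined sequence of stems."""
    return " ".join(naive_stem(w) for w in phrase.split())


def lookup_stemmed(phrase, annotated_phrases):
    """Return the annotation of the most recent phrase with the same stems.

    `annotated_phrases` is a list of (text, annotation) pairs in document
    order (a hypothetical stand-in for the parser's own data).
    """
    key = stem_key(phrase)
    for earlier_text, annotation in reversed(annotated_phrases):
        if stem_key(earlier_text) == key:
            return annotation
    return None


earlier = [("run", "verb"), ("the Duke", "actor")]
print(lookup_stemmed("Running", earlier))
print(lookup_stemmed("ran", earlier))

# Hypothetical registration, assuming the dictionary maps a mode name to a
# callable; wrap lookup_stemmed to match the signature samparser expects:
# samparser.annotation_lookup_modes["stemmed"] = my_lookup_function
```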