YAM: Yet Another Markup
THIS IS NOT A GOOD GUIDE TO THE NEW SYNTAX! SEE yam-minimal.yam INSTEAD
This page documents YAM's syntax and bugs/wishlist.
Yet Another Mark-up (YAM) is derived from Terrance's Mark-up Language 1. It embodies the simplicity of wiki editing languages, but it is open to extension through the use of plug-ins.
The YAM translator is easily extended with new output languages. Currently it can produce output in HTML and LaTeX and thence PDF.
Although simple, YAM is rich in features. From simple text processing features like bold, emphasis, and teletype to tables, a title page, and a table of contents, there is a range of features which can help create highly presentable documents in both LaTeX and HTML:
- Headings
- Text processing: bold, emphasis, teletype
- Line breaks
- Hrules
- Nested Lists
- Links and targets
- Tables
- Verbatim
- Quotations
- Plugins
Plug-ins in particular are very powerful and allow different extensions such as comments, boxed text, date and time support, etc.
1. YAM Syntax
1.1. Bold, italic and teletype #bold
Bold text is contained in stars: *this is bold* becomes this is bold.
Italic text is contained in underscores: _this is italic_ becomes this is italic.
Fixed-width text is contained in equals signs: =this is teletype= becomes this is teletype.
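The three inline rules above can be illustrated with a small sketch. This is not the YAM translator itself: the function name, the rule table and the choice of HTML tags are assumptions made for illustration, and teletype is assumed to use equals signs as the text describes.

```python
import re

# Hypothetical rule table: YAM inline delimiter -> HTML element.
# The tag names are assumptions, not taken from the real translator.
INLINE_RULES = [
    (re.compile(r"\*(.+?)\*"), r"<b>\1</b>"),    # *bold*
    (re.compile(r"_(.+?)_"), r"<em>\1</em>"),    # _italic_
    (re.compile(r"=(.+?)="), r"<tt>\1</tt>"),    # =teletype=
]

def translate_inline(text):
    """Replace each delimited span with the matching HTML element."""
    for pattern, replacement in INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text

print(translate_inline("*this is bold* and _this is italic_"))
# prints: <b>this is bold</b> and <em>this is italic</em>
```

A delimiter with no closing partner is left untouched by this sketch, which sidesteps the "unclosed =" problem listed under Bugs rather than solving it.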
1.2. Horizontal lines #hr
Horizontal lines are indicated by 2 or more '-' signs at the start of a line. For example:
---
and
---------------------------------
both result in a horizontal line.
1.3. Lists #lists
Unordered lists are indicated by '-' at the start of a line, and ordered lists by '#'. Nesting is indicated by two spaces preceding the item indicator. For example:
- This is an unordered list
- Second item
  # This is a nested...
  # ...ordered list
- Back to the third item of the enclosing list
results in:
- This is an unordered list
- Second item
  1. This is a nested...
  2. ...ordered list
- Back to the third item of the enclosing list
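As a rough sketch of how the nesting rule could be read by a parser, the function below classifies a single line. It assumes the markers used in the example ('-' for unordered, '#' for ordered) and two spaces per nesting level; the function name and return shape are hypothetical.

```python
# Assumed markers, following the example in this section.
MARKERS = {"-": "unordered", "#": "ordered"}

def parse_list_line(line):
    """Return (depth, kind, text) for a list item, or None for any other line."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    # A list item needs even indentation, a marker, and a space after it.
    if indent % 2 != 0 or len(stripped) < 2 or stripped[1] != " ":
        return None
    kind = MARKERS.get(stripped[0])
    if kind is None:
        return None
    return indent // 2, kind, stripped[2:]

for source_line in ["- Second item", "  # This is a nested..."]:
    print(parse_list_line(source_line))
```

The depth value is what a translator would compare against the current nesting level to decide whether to open or close a list.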
1.4. Verbatim output #verbatim
Verbatim output starts with '%<' at the start of a line and ends with '%>'. For example:
%<
This *will not* get translated
% >
(imagine there were no space between the % and the >)
When the target language is HTML, for example, the output will contain '<pre>' tags. It is also possible to tell the translator to write output directly without any intervention, using '%output':
<pre> %< This will not get translated either, but any markup in the target language will be interpreted in that language.
</pre> %>
1.5. Notes #notes
Notes are like this:
\%notes("This is a note")
The contents will be output to the translation file, but will be commented out in that file. The quotation marks around the note are necessary; notes cannot contain quotation marks (even if escaped).
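A minimal sketch of the note rule for an HTML target follows. The regular expression and function name are assumptions; the pattern deliberately rejects quotation marks inside the note, matching the restriction stated above.

```python
import re

# Hypothetical pattern for %notes("..."); the note body may not contain quotes.
NOTE = re.compile(r'%notes\("([^"]*)"\)')

def translate_notes(text):
    """Turn each note into an HTML comment so it is kept but not displayed."""
    return NOTE.sub(lambda m: "<!-- " + m.group(1) + " -->", text)

print(translate_notes('%notes("This is a note")'))
# prints: <!-- This is a note -->
```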
1.6. Escapes #escapes
To stop a special character from being interpreted, use a '\'. For example,
\\%%
will not generate a line.
Some syntax elements interact with each other and produce unexpected escaping behaviour. For example, in
'=http://gate.ac.uk/='
the equals signs are translated, but not the URL they contain (with the result 'http://gate.ac.uk/').
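The basic backslash rule at the start of this section can be sketched as a tokenising pass that records which characters were escaped, so that later stages can skip them. This is only an illustration under that assumption, not the real lexer, and it does not model the interaction with URLs described above.

```python
def tokenize_escapes(text):
    """Return (character, is_escaped) pairs; a backslash escapes the next character."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            # Drop the backslash and mark the following character as literal.
            tokens.append((text[i + 1], True))
            i += 2
        else:
            tokens.append((text[i], False))
            i += 1
    return tokens

print(tokenize_escapes("\\*a"))
# prints: [('*', True), ('a', False)]
```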
1.7. Headings #headings
Headings are lines starting with %1 (for first level), %2, %3 or %4. For example, the heading for this section is
%2 Headings
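For illustration, a heading line could be recognised as follows. The mapping from YAM level N to the HTML element hN is an assumption made for this sketch; the levels the real HTML translator emits are not stated here.

```python
import re

# %1 to %4, followed by whitespace and the heading text.
HEADING = re.compile(r"^%([1-4])\s+(.*)$")

def translate_heading(line):
    """Return an HTML heading for a YAM heading line, or None otherwise."""
    match = HEADING.match(line)
    if match is None:
        return None
    level = int(match.group(1))
    title = match.group(2).strip()
    # Assumed mapping: YAM level N becomes <hN>.
    return "<h{0}>{1}</h{0}>".format(level, title)

print(translate_heading("%2 Headings"))
# prints: <h2>Headings</h2>
```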
1.8. Links and anchors #links
Links can be specified in four ways:
- As plain text, e.g. 'http://gate.ac.uk/' will become http://gate.ac.uk/
- Using 'target', e.g. %(http://gate.ac.uk/) will become http://gate.ac.uk/
- Using 'label', e.g. %(http://gate.ac.uk/, GATE home) will become GATE home
- Using Wiki syntax (NOT DONE YET)
Anchors and labels are specified using '#name'. For example,
%2 A Heading #label
will result in a heading followed by the anchor label.
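Separating a trailing '#name' from the heading text might look like the sketch below. The set of characters allowed in a name is an assumption, as is the function name.

```python
import re

# Assumed anchor shape: whitespace, '#', then letters, digits, '_' or '-'.
ANCHOR = re.compile(r"^(.*?)\s+#([A-Za-z0-9_-]+)\s*$")

def split_anchor(heading_text):
    """Return (title, anchor); anchor is None when the heading has no '#name'."""
    match = ANCHOR.match(heading_text)
    if match is None:
        return heading_text.strip(), None
    return match.group(1), match.group(2)

print(split_anchor("A Heading #label"))
# prints: ('A Heading', 'label')
```

Returning the anchor separately lets the caller decide whether to place it before or after the heading.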
1.9. Quotations #quotations
Quotations are enclosed in '"' marks that are preceded by two spaces at the start of a line. For example,
%"This is a quote%"
becomes:
This is a quote
1.10. Line breaks #linebreaks
Line breaks are indicated by a backslash at the end of a line. For example:
This line is broken\
in two.
becomes:
This line is broken
in two.
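The line-break rule is simple enough to sketch directly. The '<br/>' output is an assumption for an HTML target, and the function name is hypothetical.

```python
def translate_line_breaks(lines):
    """Replace a trailing backslash on each line with an HTML line break."""
    translated = []
    for line in lines:
        if line.endswith("\\"):
            # Drop the backslash and emit an explicit break instead.
            translated.append(line[:-1] + "<br/>")
        else:
            translated.append(line)
    return translated

print(translate_line_breaks(["This line is broken\\", "in two."]))
# prints: ['This line is broken<br/>', 'in two.']
```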
1.11. Tables #tables
Tables use square brackets, bars and dashes. For example:
%[
|*header col 1* | *header col 2* |
---
|row 1 col 1 | col 2 |
---
|row 2 col 1 | col 2 |
---
%]
results in:
header col 1 | header col 2 |
row 1 col 1 | col 2 |
row 2 col 1 | col 2 |
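A rough sketch of reading the table source into rows of cells is given below. It assumes one row per line, with '%[' and '%]' on lines of their own and dash-only lines acting as row separators; the function name is hypothetical.

```python
def parse_table(lines):
    """Return the table as a list of rows, each row a list of cell strings."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip the brackets and the dash-only separator lines.
        if line in ("%[", "%]") or (line and set(line) == {"-"}):
            continue
        if line.startswith("|"):
            rows.append([cell.strip() for cell in line.strip("|").split("|")])
    return rows

source = [
    "%[",
    "|*header col 1* | *header col 2* |",
    "---",
    "|row 1 col 1 | col 2 |",
    "---",
    "%]",
]
for row in parse_table(source):
    print(row)
```

Cell contents are returned untranslated, so inline markup such as the bold header cells would still need a separate pass.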
1.12. Plugins #plugins
NOT DONE YET
2. Development Notes
2.1. Bugs:
- include plugin only works from command line, otherwise it looks for the file in the plugins directory, and that's no good. get resource doesn't appear to work well either in this case
- the target in a url gets interpreted as an anchor when the protocol isn't specified (e.g. antlr.org/doc/lexer.html#unicode)
- unclosed = causes open tt with no close
- text at the end of a URL can get included in the URL, e.g. http://antlr.org/doc/lexer.html#unicode: includes the ":"
- there's no way to end an embedded list element except by another list element (embedded or higher level)
- percentage signs need to be escaped in latex output
- odd behaviour on non-native newlines (e.g. ^M on macs)
- URLs that contain commas don't work properly with the %() syntax
- URLs don't work in headings
- the character set of the .yam file should be output to the HTML, e.g. UTF-8; a simplified way to get better encoding treatment would be:
  - put all input files in UTF-8 (e.g. by opening them in GUK)
  - check that YAM uses Readers and Writers and sets them to use UTF-8
- oddly enough, wc will report unusual characters
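One possible way to approach the trailing-text bug listed above (the ":" being absorbed into an automatically detected URL) is to peel punctuation off the end of the candidate before linking it. This is a sketch of the idea only, not a patch for the parser, and the punctuation set is an assumption.

```python
# Characters assumed unlikely to be the intended last character of a URL.
TRAILING_PUNCTUATION = ".,:;!?"

def split_trailing_punctuation(candidate):
    """Return (url, trailing_text) with sentence punctuation moved out of the URL."""
    url = candidate.rstrip(TRAILING_PUNCTUATION)
    return url, candidate[len(url):]

print(split_trailing_punctuation("http://antlr.org/doc/lexer.html#unicode:"))
# prints: ('http://antlr.org/doc/lexer.html#unicode', ':')
```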
2.1.1. Completed:
- the citation plugin is HTML specific
- the citation plugin closes any embedding lists; what is needed is to be able to tell the context not to do further processing on the results of the plugin, instead of using the output mechanism
- an empty notes field results in null pointer exception
2.2. Wish list:
- a syntax for "less than" symbol and other symbols / unicode escapes
- anchors that are at the end of a heading line should be sent to the translator as a separate call, link(url, title, anchor), so that the translator can position the anchor before or after the heading as appropriate
- change the use of backslash to trigger a newline to something else
- allow _ to be escaped within an emphasised phrase
- allow a string of percents at the end of a line to be a comment
- section level one should translate to H1
- WikiLinks (see below)
- definition lists like twiki?
- table spacing options like twiki?
- variables, e.g. like twiki's %TOC%, %WEB% (%path perhaps?)
- table summary attributes
2.2.1. Completed:
- "generated file" warning in the output file
- autogenerate anchor/label from normalised words of header + int?
- this is only done when building the table of contents, but can be easily changed to work otherwise too.
- the anchors are the same as the number preceding the heading, e.g. "1.1."
- citation
- auto parsing all in-line links like http:, mailto:, ...
- ability to create mailto and ftp links with text - mailto:... or ...
- auto-numbering of sections in the HTML translator
- images
- %contents with numbered links to sections
- double dashes: -- makes a long dash
- get escaping of * etc. to work except within the markup itself, i.e. I couldn't escape an underscore in this sentence
- allow - in title
- allow e.g. - in anchors
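The "normalised words of header + int" idea mentioned in the list above could be sketched as follows. This illustrates that wish only; as noted, the current anchors are the section numbers. The function name and the normalisation rules are assumptions.

```python
import re

def make_anchor(heading, used):
    """Build an anchor from the heading words, adding an integer on collision."""
    slug = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")
    candidate = slug
    counter = 1
    while candidate in used:
        counter += 1
        candidate = "{0}-{1}".format(slug, counter)
    used.add(candidate)
    return candidate

seen = set()
print(make_anchor("Links and anchors", seen))  # prints: links-and-anchors
print(make_anchor("Links and anchors", seen))  # prints: links-and-anchors-2
```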
2.3. WikiLinks
WikiLinks are just links, created either like other links (%(...)), or by typing a WikiWord. Some points:
- WikiSyntaxForLinks should include an optional relative path, e.g. path/to/WikiWord
- when the target of a WikiLink exists, it is treated exactly like a normal link
- when the target doesn't exist, the link is rendered specially (perhaps a different colour, or italic, or with a postfix ? like in Twiki), and clicking directs you to a create page
- therefore, the renderer (yam2...) has to know whether a local link exists or not in every case
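The points above might translate into a renderer along these lines. Everything here is hypothetical: the function name, the '.html' suffix, the '?create' query used for the create page and the '?' postfix. The existence check is passed in as a callable, since the renderer has to be told whether a local target exists.

```python
def render_wiki_link(target, exists):
    """Render a WikiLink; 'exists' is a callable reporting whether the target page exists."""
    name = target.rsplit("/", 1)[-1]  # drop the optional relative path for display
    if exists(target):
        # Existing target: treated exactly like a normal link.
        return '<a href="{0}.html">{1}</a>'.format(target, name)
    # Missing target: plain name with a '?' postfix linking to a create page.
    return '{1}<a href="{0}.html?create">?</a>'.format(target, name)

known_pages = {"path/to/WikiWord"}
print(render_wiki_link("path/to/WikiWord", known_pages.__contains__))
print(render_wiki_link("MissingPage", known_pages.__contains__))
```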
2.4. New Parser
To run the JspWiki converter from the command line (from the test/resources directory):
java -classpath ../../../target/yam-1.0-SNAPSHOT.jar:../../../lib/gate.jar gate.yam.convert.JspWiki2Yam jsp-comprehensive.txt