Txtmark - Java markdown processor
Copyright (C) 2011 René Jeschke rene_jeschke@yahoo.de
See LICENSE.txt for licensing information.
Txtmark is yet another markdown processor for the JVM.
-   It is easy to use:

        String result = txtmark.Processor.process("This is ***TXTMARK***");

-   It is fast (see below)
    ... well, it is the fastest markdown processor on the JVM right now.

-   It does not depend on other libraries, so putting `txtmark.jar` on the classpath
    is sufficient to use Txtmark in your project.

For an in-depth explanation of the markdown syntax have a look at daringfireball.net.
Where Txtmark is not like Markdown
-   Txtmark does not produce empty `title` attributes in link and image tags.

-   Unescaped `"` in link titles starting with `"` are not recognized and result
    in unexpected behaviour.

-   Due to a different list parsing approach some things get interpreted differently:

        * List
        > Quote

    will produce when processed with Markdown:

        <p><ul>
        <li>List</p>
        <blockquote>
        <p>Quote</li>
        </ul></p>
        </blockquote>

    and this when produced with Txtmark:

        <ul>
        <li>List<blockquote><p>Quote</p>
        </blockquote>
        </li>
        </ul>

    Another one:

        * List
        ====

    will produce when processed with Markdown:

        <h1>* List</h1>

    and this when produced with Txtmark:

        <ul>
        <li><h1>List</h1>
        </li>
        </ul>

Txtmark extensions
To enable Txtmark's extended markdown parsing you can use the PROFILE mechanism:
[$PROFILE$]: extended
This seemed to me the easiest and safest way to enable different behaviours. (All other markdown processors will ignore this line.)
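
As a rough illustration of the PROFILE mechanism, the following sketch puts the marker line in front of a markdown source before it is handed to the processor. `ProfileHeader` and `withExtendedProfile` are hypothetical helper names, not part of Txtmark, and placing the line at the top of the document is an assumption made here for simplicity.

```java
// Sketch only: ProfileHeader and withExtendedProfile are hypothetical names,
// not part of Txtmark's API.
final class ProfileHeader {
    // The marker line quoted in this README.
    static final String EXTENDED_PROFILE = "[$PROFILE$]: extended";

    // Returns the source with the profile line in front,
    // or the source unchanged if it already contains that line.
    static String withExtendedProfile(String markdown) {
        for (String line : markdown.split("\n", -1)) {
            if (line.trim().equals(EXTENDED_PROFILE)) {
                return markdown;
            }
        }
        return EXTENDED_PROFILE + "\n\n" + markdown;
    }

    public static void main(String[] args) {
        String source = "This is a paragraph\n* and this is a list";
        // The printed text is what would be passed on to the processor.
        System.out.println(withExtendedProfile(source));
    }
}
```

The returned string could then be passed to `txtmark.Processor.process(...)` as in the usage line at the top of this README.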
Behavior changes when using [$PROFILE$]: extended
-   Lists and code blocks end a paragraph (inspired by Actuarius)

    In normal markdown the following:

        This is a paragraph
        * and this is not a list

    will produce:

        <p>This is a paragraph
        * and this is not a list</p>

    When using Txtmark extensions this changes to:

        <p>This is a paragraph</p>
        <ul>
        <li>and this is not a list</li>
        </ul>

-   More to come ...

Markdown conformity
Txtmark passes all tests inside MarkdownTest_1.0_2007-05-09 except for two:
-   Images.text

    Fails because Txtmark doesn't produce empty 'title' image attributes.
    (IMHO: Images ... OK)

-   Literal quotes in titles.text

    What the frell ... this test will continue to FAIL.
    Sorry, but using unescaped `"` in a title which should be surrounded by `"` is unacceptable for me ;)

    Change:

        Foo [bar](/url/ "Title with "quotes" inside").
        [bar]: /url/ "Title with "quotes" inside"

    to:

        Foo [bar](/url/ "Title with \"quotes\" inside").
        [bar]: /url/ "Title with \"quotes\" inside"

    and Txtmark will produce the correct result.
    (IMHO: Literal quotes in titles ... OK)

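
If link titles are generated by a program, the change described above can be applied before the markdown is written out. The following is a minimal sketch under that assumption; `TitleQuotes` and `escapeInnerQuotes` are hypothetical names and not part of Txtmark. It takes the title text without its surrounding quotes and escapes every `"` that is not already escaped.

```java
// Sketch only: TitleQuotes and escapeInnerQuotes are hypothetical names,
// not part of Txtmark's API.
final class TitleQuotes {
    // Escapes each double quote in the title text that is not already
    // preceded by a backslash. The surrounding quotes are NOT part of the input.
    static String escapeInnerQuotes(String title) {
        StringBuilder out = new StringBuilder(title.length() + 8);
        for (int i = 0; i < title.length(); i++) {
            char c = title.charAt(i);
            if (c == '\\' && i + 1 < title.length()) {
                // Copy an existing escape sequence unchanged.
                out.append(c).append(title.charAt(++i));
            } else if (c == '"') {
                out.append("\\\"");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String title = "Title with \"quotes\" inside";
        // prints: [bar]: /url/ "Title with \"quotes\" inside"
        System.out.println("[bar]: /url/ \"" + escapeInnerQuotes(title) + "\"");
    }
}
```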
Performance comparison of markdown processors for the JVM
Based on this benchmark suite.
| Test | Actuarius 1st Run (ms) | Actuarius 2nd Run (ms) | PegDown 1st Run (ms) | PegDown 2nd Run (ms) | Knockoff 1st Run (ms) | Knockoff 2nd Run (ms) | Txtmark 1st Run (ms) | Txtmark 2nd Run (ms) |
|---|---|---|---|---|---|---|---|---|
| Plain Paragraphs | 887 | 461 | 2455 | 2236 | 764 | 568 | 89 | 47 |
| Every Word Emphasized | 2220 | 2077 | 3411 | 3406 | 30503 | 30514 | 72 | 66 |
| Every Word Strong | 2384 | 2270 | 2456 | 2466 | 23639 | 23577 | 62 | 57 |
| Every Word Inline Code | 824 | 804 | 2337 | 2237 | 23506 | 23622 | 54 | 55 |
| Every Word a Fast Link | 3942 | 3738 | 1164 | 1159 | 8621 | 8595 | 89 | 68 |
| Every Word Consisting of Special XML Chars | 9393 | 9312 | 7544 | 7314 | 801 | 608 | 3587 | 3614 |
| Every Word wrapped in manual HTML tags | 6843 | 6828 | 1850 | 1859 | 8699 | 8692 | 1169 | 1154 |
| Every Line with a manual line break | 859 | 724 | 2968 | 2946 | 2171 | 1990 | 58 | 56 |
| Every word with a full link | 528 | 501 | 2252 | 2280 | 3513 | 3512 | 66 | 60 |
| Every word with a full image | 395 | 374 | 2463 | 2569 | 3757 | 3726 | 56 | 55 |
| Every word with a reference link | 19208 | 19035 | 39183 | 38710 | 243450 | 244943 | 1826 | 1798 |
| Every block a quote | 465 | 449 | 2687 | 2684 | 978 | 977 | 48 | 48 |
| Every block a codeblock | 151 | 134 | 597 | 601 | 270 | 262 | 36 | 27 |
| Every block a list | 1209 | 1106 | 3448 | 3432 | 1411 | 1368 | 52 | 60 |
| All tests together | 6062 | 6042 | 11556 | 11589 | 19827 | 19637 | 452 | 448 |

-   Q: Why is Txtmark so slow when it comes to XML entities?

-   A: Because Txtmark does some sanity checks on XML entities to make sure it outputs valid XML. For example:

        &cutie;

    will produce (when processed with Markdown and most other markdown processors):

        &cutie;

    and

        &amp;cutie;

    when processed with Txtmark.

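
To make the idea of such a sanity check concrete, here is a small sketch of one possible approach. It is not Txtmark's actual implementation: the class `EntityCheck`, the method `sanitize`, and the tiny whitelist of entity names are all assumptions made for illustration. Numeric references and whitelisted names pass through untouched, while any other `&` is written as `&amp;`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: EntityCheck is a hypothetical class, not Txtmark's code.
final class EntityCheck {
    // Deliberately tiny, assumed whitelist; a real processor would know many more names.
    private static final Set<String> KNOWN =
            new HashSet<>(Arrays.asList("amp", "lt", "gt", "quot", "apos", "nbsp", "copy"));

    // A '&', optionally followed by something shaped like an entity body and ';'.
    private static final Pattern AMP =
            Pattern.compile("&(?:(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)?");

    // Keeps numeric references and whitelisted named entities,
    // escapes the '&' of everything else.
    static String sanitize(String text) {
        Matcher m = AMP.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            boolean valid = body != null && (body.startsWith("#") || KNOWN.contains(body));
            String replacement = valid ? m.group() : "&amp;" + m.group().substring(1);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("&cutie;")); // prints: &amp;cutie;
        System.out.println(sanitize("&amp;"));   // prints: &amp;
    }
}
```

A check like this has to inspect every `&` in the input, which is consistent with the extra time spent in the "Special XML Chars" benchmark row.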
Benchmarked versions:
Actuarius version: 0.2
PegDown version: 0.8.5.4
Knockoff version: 0.7.3-15
Markdown is copyright (c) 2004 by John Gruber
Actuarius is copyright (c) 2010 by Christoph Henkelmann
Knockoff is copyright (c) 2009-2011 by Tristan Juricek
PegDown is copyright (c) 2010 by Mathias Doenitz
Project link: https://github.com/rjeschke/txtmark