txtmark - Java markdown processor
Copyright (C) 2011 René Jeschke rene_jeschke@yahoo.de
See LICENSE.txt for licensing information.
txtmark is yet another markdown processor for the JVM.
- It is easy to use:

        String result = txtmark.Processor.process("This is ***TXTMARK***");

- It is fast (see below)
  ... well, it is the fastest markdown processor on the JVM right now.
This is an RC version, tagged v0.5.
For an in-depth explanation of the markdown syntax have a look at daringfireball.net.
Markdown conformity
Txtmark passes all tests inside MarkdownTest_1.0_2007-05-09 except for two:
- Images.text
  Fails because Txtmark doesn't produce empty 'title' image attributes.
  (IMHO: Images ... OK)

- Literal quotes in titles.text
  What the frell ... this test will continue to FAIL.
  Sorry, but using unescaped " in a title which should be surrounded by " is unacceptable for me ;)

  Change:

        Foo [bar](/url/ "Title with "quotes" inside").
        [bar]: /url/ "Title with "quotes" inside"

  to:

        Foo [bar](/url/ "Title with \"quotes\" inside").
        [bar]: /url/ "Title with \"quotes\" inside"

  and Txtmark will produce the correct result.
  (IMHO: Literal quotes in titles ... OK)
Performance comparison of markdown processors for the JVM
Based on this benchmark suite.
| Test | Actuarius 1st Run (ms) | Actuarius 2nd Run (ms) | PegDown 1st Run (ms) | PegDown 2nd Run (ms) | Knockoff 1st Run (ms) | Knockoff 2nd Run (ms) | Txtmark 1st Run (ms) | Txtmark 2nd Run (ms) |
|---|---|---|---|---|---|---|---|---|
| Plain Paragraphs | 887 | 461 | 2455 | 2236 | 764 | 568 | 89 | 47 |
| Every Word Emphasized | 2220 | 2077 | 3411 | 3406 | 30503 | 30514 | 72 | 66 |
| Every Word Strong | 2384 | 2270 | 2456 | 2466 | 23639 | 23577 | 62 | 57 |
| Every Word Inline Code | 824 | 804 | 2337 | 2237 | 23506 | 23622 | 54 | 55 |
| Every Word a Fast Link | 3942 | 3738 | 1164 | 1159 | 8621 | 8595 | 89 | 68 |
| Every Word Consisting of Special XML Chars | 9393 | 9312 | 7544 | 7314 | 801 | 608 | 3587 | 3614 |
| Every Word wrapped in manual HTML tags | 6843 | 6828 | 1850 | 1859 | 8699 | 8692 | 1169 | 1154 |
| Every Line with a manual line break | 859 | 724 | 2968 | 2946 | 2171 | 1990 | 58 | 56 |
| Every word with a full link | 528 | 501 | 2252 | 2280 | 3513 | 3512 | 66 | 60 |
| Every word with a full image | 395 | 374 | 2463 | 2569 | 3757 | 3726 | 56 | 55 |
| Every word with a reference link | 19208 | 19035 | 39183 | 38710 | 243450 | 244943 | 1826 | 1798 |
| Every block a quote | 465 | 449 | 2687 | 2684 | 978 | 977 | 48 | 48 |
| Every block a codeblock | 151 | 134 | 597 | 601 | 270 | 262 | 36 | 27 |
| Every block a list | 1209 | 1106 | 3448 | 3432 | 1411 | 1368 | 52 | 60 |
| All tests together | 6062 | 6042 | 11556 | 11589 | 19827 | 19637 | 452 | 448 |
Q: Why is Txtmark so slow when it comes to XML entities?

A: Because Txtmark does some sanity checks on XML entities to make sure it outputs valid XML. For example:

    &cutie;

will produce (when processed with Markdown and most other markdown processors):

    &cutie;

and

    &amp;cutie;

when processed with Txtmark.
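
To make the idea of an entity sanity check concrete, here is a minimal sketch in Java. It is not Txtmark's actual code: the class `EntitySanitizer`, the method `sanitize` and the small `KNOWN` whitelist are hypothetical names and data chosen for illustration only. It assumes that an `&` is kept only when it starts a known named entity or a numeric character reference, and is escaped to `&amp;` otherwise.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; NOT Txtmark's implementation.
final class EntitySanitizer {
    // Hypothetical, deliberately tiny whitelist of named entities.
    private static final Set<String> KNOWN = new HashSet<>(
            Arrays.asList("amp", "lt", "gt", "quot", "apos", "nbsp", "copy"));

    // Escapes every '&' that does not start a known named entity
    // or a numeric character reference such as &#169; or &#xA9;.
    static String sanitize(String in) {
        StringBuilder out = new StringBuilder(in.length() + 16);
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            if (c != '&') {
                out.append(c);
                i++;
                continue;
            }
            int end = in.indexOf(';', i + 1);
            if (end > i + 1 && isValidEntity(in.substring(i + 1, end))) {
                out.append(in, i, end + 1); // keep the entity untouched
                i = end + 1;
            } else {
                out.append("&amp;");        // unknown or malformed: escape the '&'
                i++;
            }
        }
        return out.toString();
    }

    // Checks the text between '&' and ';'.
    private static boolean isValidEntity(String name) {
        if (name.startsWith("#x") || name.startsWith("#X")) {
            return name.length() > 2
                    && name.substring(2).chars().allMatch(ch -> Character.digit(ch, 16) >= 0);
        }
        if (name.startsWith("#")) {
            return name.length() > 1
                    && name.substring(1).chars().allMatch(ch -> ch >= '0' && ch <= '9');
        }
        return KNOWN.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(sanitize("&cutie; &amp; &#169; AT&T"));
    }
}
```

Running `main` prints `&amp;cutie; &amp; &#169; AT&amp;T`. Every `&` triggers a lookahead and a lookup, which is the kind of extra work that could plausibly explain the slower "Special XML Chars" row in the table above.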
Tested versions:
[Actuarius] version: 0.2
[PegDown] version: 0.8.5.4
[Knockoff] version: 0.7.3-15
[Markdown] is copyright (c) 2004 by John Gruber
[Markdown]: http://daringfireball.net/projects/markdown/
[Actuarius] is copyright (c) 2010 by Christoph Henkelmann
[Actuarius]: http://henkelmann.eu/projects/actuarius/
[Knockoff] is copyright (c) 2009-2011 by Tristan Juricek
[Knockoff]: http://tristanhunt.com/projects/knockoff/
[PegDown] is copyright (c) 2010 by Mathias Doenitz
[PegDown]: https://github.com/sirthias/pegdown
Project link: https://github.com/rjeschke/txtmark