txtmark - Java markdown processor
Copyright (C) 2011 René Jeschke rene_jeschke@yahoo.de
See LICENSE.txt for licensing information.
txtmark is yet another markdown processor for the JVM.
... and is damn fast^^
Again, this is a WIP release.
TODO:
- block-level HTML element processing
- code clean-ups
- see below (markdown test suite)
MarkdownTest results so far
Based on MarkdownTest_1.0_2007-05-09
- Amps and angle encoding ... OK
- Auto links ... OK
- Backslash escapes ... OK
- Blockquotes with code blocks ... OK
- Code Blocks ... OK
- Code Spans ... OK
- Hard-wrapped paragraphs with list-like lines ... OK
- Horizontal rules ... OK
- Images ... FAILED (see Note 1)
- Inline HTML (Advanced) ... OK
- Inline HTML (Simple) ... FAILED (see Note 2)
- Inline HTML comments ... FAILED (see Note 2)
- Links, inline style ... OK
- Links, reference style ... OK
- Links, shortcut references ... OK
- Literal quotes in titles ... FAILED (see Note 3)
- Markdown Documentation - Basics ... OK
- Markdown Documentation - Syntax ... FAILED (see Note 2)
- Nested blockquotes ... OK
- Ordered and unordered lists ... OK
- Strong and em together ... OK
- Tabs ... OK
- Tidyness ... OK
18 passed; 5 failed.
Benchmark: 2 wallclock secs ( 0.02 usr 0.01 sys + 1.78 cusr 0.68 csys = 2.49 CPU)
Note 1:
Fails because Txtmark doesn't produce empty 'title' image attributes. (IMHO: Images ... OK)
Note 2:
Fails because of currently missing block-level HTML identification.
Note 3:
What the frell ... this test will continue to FAIL. Sorry, but using an unescaped '"' inside a link title, which is itself surrounded by '"', is unacceptable for me ;)
Change:

    Foo [bar](/url/ "Title with "quotes" inside").

    [bar]: /url/ "Title with "quotes" inside"

to:

    Foo [bar](/url/ "Title with \"quotes\" inside").

    [bar]: /url/ "Title with \"quotes\" inside"

and Txtmark will produce the correct result.
(IMHO: Literal quotes in titles ... OK)
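If link titles are generated by a program, the escaping that Note 3 asks for can be applied before the text ever reaches the processor. The following is only a minimal sketch: `TitleEscaper`, `escapeTitle` and `inlineLink` are hypothetical names made up for this example and are not part of Txtmark.

```java
// Hypothetical helper, not part of Txtmark: escapes double quotes in a
// link title so the title can be safely wrapped in '"'.
final class TitleEscaper {
    private TitleEscaper() {
    }

    /** Prefixes every unescaped '"' with a backslash; existing escapes are kept as they are. */
    static String escapeTitle(String title) {
        StringBuilder out = new StringBuilder(title.length() + 8);
        for (int i = 0; i < title.length(); i++) {
            char c = title.charAt(i);
            if (c == '\\' && i + 1 < title.length()) {
                // Copy an existing escape sequence (backslash + next char) unchanged.
                out.append(c).append(title.charAt(++i));
            } else if (c == '"') {
                out.append('\\').append('"');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds an inline-style markdown link with an escaped title. */
    static String inlineLink(String text, String url, String title) {
        return "[" + text + "](" + url + " \"" + escapeTitle(title) + "\")";
    }

    public static void main(String[] args) {
        // Prints: [bar](/url/ "Title with \"quotes\" inside")
        System.out.println(inlineLink("bar", "/url/", "Title with \"quotes\" inside"));
    }
}
```

Titles that are already escaped pass through unchanged, so the helper can be run over mixed input without doubling the backslashes.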
Performance comparison of markdown processors for the JVM
Based on this.
Txtmark's results should not be considered final; they may change in either direction
during the upcoming releases.
But I think you get the point.
| Test | Actuarius 1st Run (ms) | Actuarius 2nd Run (ms) | PegDown 1st Run (ms) | PegDown 2nd Run (ms) | Knockoff 1st Run (ms) | Knockoff 2nd Run (ms) | Txtmark 1st Run (ms) | Txtmark 2nd Run (ms) |
|---|---|---|---|---|---|---|---|---|
| Plain Paragraphs | 1213 | 301 | 1282 | 944 | 600 | 340 | 125 | 45 |
| Every Word Emphasized | 1579 | 913 | 1482 | 1473 | 12840 | 12663 | 51 | 43 |
| Every Word Strong | 1142 | 1004 | 1131 | 1110 | 9505 | 9615 | 39 | 39 |
| Every Word Inline Code | 374 | 275 | 1064 | 1030 | 9118 | 9075 | 49 | 35 |
| Every Word a Fast Link | 2172 | 1569 | 545 | 530 | 3951 | 3385 | 88 | 44 |
| Every Word Consisting of Special XML Chars | 4008 | 4243 | 3029 | 3319 | 316 | 363 | 1303 | 1270 |
| Every Word wrapped in manual HTML tags | 3041 | 2874 | 887 | 888 | 3776 | 3472 | 570 | 530 |
| Every Line with a manual line break | 457 | 530 | 1325 | 1297 | 1340 | 981 | 46 | 43 |
| Every word with a full link | 359 | 277 | 999 | 952 | 1713 | 1658 | 91 | 50 |
| Every word with a full image | 209 | 143 | 1097 | 1068 | 1852 | 1756 | 33 | 33 |
| Every word with a reference link | 9944 | 9098 | 18326 | 18318 | 116259 | 115617 | 1467 | 1313 |
| Every block a quote | 431 | 210 | 1319 | 1328 | 477 | 469 | 37 | 37 |
| Every block a codeblock | 67 | 95 | 374 | 378 | 166 | 174 | 62 | 22 |
| Every block a list | 852 | 865 | 1706 | 1673 | 599 | 622 | 47 | 39 |
| All tests together | 3313 | 2904 | 5273 | 5333 | 9732 | 9698 | 194 | 190 |
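The harness behind these numbers is not shown here, so the following is only a rough sketch of how a "1st Run" / "2nd Run" pair could be measured on the JVM. It assumes the processor can be treated as a plain `String -> String` function. `TwoRunBenchmark` and its methods are hypothetical names, and the stand-in processor in `main` is not a markdown processor at all.

```java
import java.util.function.Function;

// Hypothetical timing sketch; this is not the harness that produced the table.
final class TwoRunBenchmark {
    private TwoRunBenchmark() {
    }

    /** Runs the processor once on the input and returns the elapsed wall-clock time in ms. */
    static long timeMillis(Function<String, String> processor, String input) {
        long start = System.nanoTime();
        String output = processor.apply(input);
        long elapsed = System.nanoTime() - start;
        if (output == null) {
            // Using the result also keeps the call from looking like dead code.
            throw new IllegalStateException("processor returned null");
        }
        return elapsed / 1_000_000L;
    }

    /** Times the same input twice and returns {first run, second run} in ms. */
    static long[] firstAndSecondRun(Function<String, String> processor, String input) {
        long first = timeMillis(processor, input);
        long second = timeMillis(processor, input);
        return new long[] {first, second};
    }

    public static void main(String[] args) {
        // Stand-in for a real processor call; replace with the processor under test.
        Function<String, String> processor = s -> s.replace("&", "&amp;");

        StringBuilder input = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            input.append("Fish & chips\n\n");
        }

        long[] runs = firstAndSecondRun(processor, input.toString());
        System.out.println("1st run: " + runs[0] + " ms, 2nd run: " + runs[1] + " ms");
    }
}
```

Timing the same input twice makes it possible to compare a cold first call with a repeated one. For anything more serious than a quick comparison, a dedicated benchmarking tool is the safer choice.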
[Markdown] is copyright (c) 2004 by John Gruber
[Actuarius] is copyright (c) 2010 by Christoph Henkelmann
[Knockoff] is copyright (c) 2009-2011 by Tristan Juricek
[PegDown] is copyright (c) 2010 by Mathias Doenitz

[Markdown]: http://daringfireball.net/projects/markdown/
[Actuarius]: http://henkelmann.eu/projects/actuarius/
[Knockoff]: http://tristanhunt.com/projects/knockoff/
[PegDown]: https://github.com/sirthias/pegdown
Project link: https://github.com/rjeschke/txtmark