txtmark - Java markdown processor

Copyright (C) 2011 René Jeschke rene_jeschke@yahoo.de
See LICENSE.txt for licensing information.


txtmark is yet another markdown processor for the JVM.
... and is damn fast^^

Again, this is a work-in-progress (WIP) release.

TODO:

  • block-level HTML element processing
  • code clean-ups
  • see below (markdown test suite)

MarkdownTest results so far


Based on MarkdownTest_1.0_2007-05-09

  • Amps and angle encoding ... OK
  • Auto links ... OK
  • Backslash escapes ... OK
  • Blockquotes with code blocks ... OK
  • Code Blocks ... OK
  • Code Spans ... OK
  • Hard-wrapped paragraphs with list-like lines ... OK
  • Horizontal rules ... OK
  • Images ... FAILED (see Note 1)
  • Inline HTML (Advanced) ... FAILED (see Note 2)
  • Inline HTML (Simple) ... FAILED (see Note 2)
  • Inline HTML comments ... FAILED (see Note 2)
  • Links, inline style ... OK
  • Links, reference style ... OK
  • Links, shortcut references ... OK
  • Literal quotes in titles ... FAILED (see Note 3)
  • Markdown Documentation - Basics ... OK
  • Markdown Documentation - Syntax ... FAILED (see Note 2)
  • Nested blockquotes ... OK
  • Ordered and unordered lists ... OK
  • Strong and em together ... OK
  • Tabs ... OK
  • Tidyness ... OK

17 passed; 6 failed.


  1. Note:

    Fails because Txtmark doesn't produce empty 'title' image attributes. (IMHO: Images ... OK)

  2. Note:

    Fails because block-level HTML identification is not implemented yet.

  3. Note:

    What the frell ... this test will continue to FAIL. Sorry, but using an unescaped `"` inside a title that is itself delimited by `"` is unacceptable to me ;)

    Change:

     Foo [bar](/url/ "Title with "quotes" inside").
     [bar]: /url/ "Title with "quotes" inside"
    

    to:

     Foo [bar](/url/ "Title with \"quotes\" inside").
     [bar]: /url/ "Title with \"quotes\" inside"
    

    and Txtmark will produce the correct result.
    (IMHO: Literal quotes in titles ... OK)
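
The change described in Note 3 can also be applied programmatically before a document is handed to the processor. The following is only a minimal sketch: `TitleQuoteEscaper` and `escapeQuotes` are hypothetical names that are not part of Txtmark, and the helper assumes it is given the raw title text without its surrounding quotes.

```java
public class TitleQuoteEscaper {

    /**
     * Hypothetical helper (not part of Txtmark): backslash-escapes every
     * double quote in a raw title, leaving existing escape sequences alone.
     */
    public static String escapeQuotes(String title) {
        StringBuilder out = new StringBuilder(title.length() + 8);
        for (int i = 0; i < title.length(); i++) {
            char c = title.charAt(i);
            if (c == '\\' && i + 1 < title.length()) {
                // Copy an existing escape sequence (backslash + next char) unchanged.
                out.append(c).append(title.charAt(++i));
            } else if (c == '"') {
                // Escape a bare double quote.
                out.append('\\').append('"');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String title = escapeQuotes("Title with \"quotes\" inside");
        // Rebuild the inline link from Note 3 with the escaped title.
        System.out.println("Foo [bar](/url/ \"" + title + "\").");
    }
}
```

Running `main` prints the inline-link line from the "to:" example in Note 3. Titles that are already escaped pass through unchanged, so the helper can be applied more than once.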

Performance comparison of markdown processors for the JVM


Based on this.
Txtmark's results should not be considered final; they may change in either direction in upcoming releases.
But I think you get the point.

All times are in milliseconds (ms).

Test                                                Actuarius           PegDown          Knockoff           Txtmark
                                             1st Run  2nd Run  1st Run  2nd Run  1st Run  2nd Run  1st Run  2nd Run
Plain Paragraphs                                 969      300     1468      956      564      362      114       45
Every Word Emphasized                           1409      884     1435     1417    13161    12921       52       44
Every Word Strong                               1087      978     1125     1100     9717     9586       40       46
Every Word Inline Code                           351      278     1047     1037     9499     9245       45       35
Every Word a Fast Link                          2123     1580      523      512     4086     3470       78       50
Every Word Consisting of Special XML Chars      3981     3973     3341     3055      372      319     1842     1841
Every Word wrapped in manual HTML tags          3073     2907      901      888     3826     3529      492      453
Every Line with a manual line break              437      583     1370     1363     1352      957       42       44
Every word with a full link                      398      266     1057     1014     1755     1689       88       47
Every word with a full image                     228      139     1110     1101     1917     1773       37       33
Every word with a reference link                9726     9146    19019    20044   117632   118306     1431     1240
Every block a quote                              431      205     1366     1328      474      464       35       36
Every block a codeblock                           68       84      387      377      161      169       61       19
Every block a list                               863      912     1735     1762      602      686       46       36
All tests together                              3319     2959     5245     5305    10252     9751      222      173

[Actuarius] version: 0.2
[PegDown] version: 0.8.5.4
[Knockoff] version: 0.7.3-15
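
The table above reports a first and a second run for every test. How the original benchmark took its timings is not described here, so the following is only a rough sketch of one way to collect such a pair of numbers on the JVM. `TwoRunTimer`, `timeTwoRuns` and the stand-in lambda are hypothetical; swap the stand-in for a call to whichever markdown processor you want to measure.

```java
import java.util.function.UnaryOperator;

public class TwoRunTimer {

    /**
     * Runs the given processor twice on the same input and returns the
     * elapsed wall-clock time of each run in milliseconds: {first, second}.
     */
    public static long[] timeTwoRuns(UnaryOperator<String> processor, String input) {
        long[] millis = new long[2];
        for (int run = 0; run < 2; run++) {
            long start = System.nanoTime();
            String html = processor.apply(input);
            long elapsed = System.nanoTime() - start;
            // Inspect the result so the call is not trivially dead code.
            if (html == null) {
                throw new IllegalStateException("processor returned null");
            }
            millis[run] = elapsed / 1_000_000L;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in; replace with a real markdown processor call.
        UnaryOperator<String> standIn = s -> "<p>" + s + "</p>";

        // Build a document of plain paragraphs as sample input.
        StringBuilder doc = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            doc.append("plain paragraph text\n\n");
        }

        long[] t = timeTwoRuns(standIn, doc.toString());
        System.out.println("1st run: " + t[0] + " ms, 2nd run: " + t[1] + " ms");
    }
}
```

The first run on a fresh JVM typically includes class loading and JIT warm-up, which may be why a second run is worth reporting separately.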


[Markdown] is copyright (c) 2004 by John Gruber
[Actuarius] is copyright (c) 2010 by Christoph Henkelmann
[Knockoff] is copyright (c) 2009-2011 by Tristan Juricek
[PegDown] is copyright (c) 2010 by Mathias Doenitz

[Markdown]: http://daringfireball.net/projects/markdown/
[Actuarius]: http://henkelmann.eu/projects/actuarius/
[Knockoff]: http://tristanhunt.com/projects/knockoff/
[PegDown]: https://github.com/sirthias/pegdown


Project link: https://github.com/rjeschke/txtmark
