If a story file includes characters from the Unicode supplementary planes (code points U+10000 and above) and an HTML report is then requested for the test run, those characters are not HTML-escaped correctly.
For example, given a story file with the following scenario:
------------
Scenario: Some scenario
Given some situation
When I do something
Then the result is 𐐆
------------
(The "dagger"-type character is actually code point U+10406 - see http://en.wikibooks.org/wiki/Unicode/Character_reference/10000-10FFF)
The resulting HTML report will have the "dagger" character escaped as &#55297;&#56326; - these are numeric references to the two surrogate code units (U+D801 and U+DC06) that encode the character in UTF-16 only, and so the output is rendered as gibberish in HTML. The escape should be &#66566;
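The difference between the two escapes can be reproduced with a minimal sketch. This is NOT the commons-lang source: the class and method names below are hypothetical, and the code only assumes that the faulty escaper walks the string one UTF-16 char at a time, whereas a correct one walks it by code point. Only non-ASCII characters are handled here; markup characters such as < and & are left alone to keep the example short.

```java
public class SupplementaryEscapeSketch {

    // Hypothetical faulty approach: escapes each UTF-16 code unit separately,
    // so a supplementary character yields two references, one per surrogate.
    static String escapePerChar(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical corrected approach: reads whole code points, so a
    // surrogate pair is combined into one numeric character reference.
    static String escapePerCodePoint(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by 2 chars for a supplementary code point, else by 1.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+10406, the character used in the scenario above.
        String text = "Then the result is " + new String(Character.toChars(0x10406));
        System.out.println(escapePerChar(text));      // two surrogate references
        System.out.println(escapePerCodePoint(text)); // one code point reference
    }
}
```

The first line printed ends with the surrogate-pair escape seen in the report, the second with the single escape that HTML expects.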
NOTE: This is NOT a bug in JBehave per se - the bug is in the StringEscapeUtils class of commons-lang. A related bug has already been raised (and fixed) in commons-lang: https://issues.apache.org/jira/browse/LANG-617. Although the commons-lang bug report relates to XML escaping rather than HTML escaping, it seems likely that the fix will cover both. Unfortunately, the fix is in commons-lang 3.0...
Ah, the JIRA web interface renders character escapes as HTML rather than displaying them literally, so the escapes in the description may not appear as written. To restate the penultimate paragraph:
The resulting HTML report will have the "dagger" character escaped as &#55297;&#56326; - which represent surrogate-pair code points (used in UTF-16 only) and so is rendered as gibberish in HTML. The escape should be &#66566;