We can reproduce this issue as well with 3.9.1.
Our use case is very simple: We rerun our story a number of times by simply calling:
for (int i = 0; i < repetitions; i++) {
    embedder.runStoriesAsPaths(storyPaths);
}
This leaves one StoryRunner instance retained per repetition and, after many repetitions, ends in an OutOfMemoryError, because the reporters, which are retained as well, can hold a lot of recorded data.
Moreover, not calling remove() on the ThreadLocals after the story execution has finished isn't good practice anyway.
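To illustrate the cleanup pattern we have in mind, here is a minimal sketch. It is not JBehave code: the class, the RECORDED field, and the method names are hypothetical stand-ins for whatever per-run state the runner keeps in its ThreadLocals.

```java
// Hypothetical sketch, not JBehave's implementation.
final class ThreadLocalCleanupSketch {

    // Stand-in for per-run state, e.g. data recorded by a reporter.
    private static final ThreadLocal<StringBuilder> RECORDED = new ThreadLocal<>();

    // Runs one "story" and always clears the thread-local state afterwards.
    static int runOnce(String story) {
        RECORDED.set(new StringBuilder());
        try {
            RECORDED.get().append("ran ").append(story);
            return RECORDED.get().length();
        } finally {
            // Without this call the value stays reachable from the thread
            // until the thread dies or the entry is overwritten.
            RECORDED.remove();
        }
    }

    // True if the current thread still holds state from an earlier run.
    static boolean hasLeftoverState() {
        return RECORDED.get() != null;
    }
}
```

With remove() in the finally block, nothing recorded during a run stays attached to the calling thread once the run has finished, whether it succeeded or threw.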
Can you please provide more info about this leak? Given that the ThreadLocals are set at each run, it's not clear what benefit removing them at the end of the story run would bring.