Pretty interesting, but it's a shame their optimized version isn't much better than the original, because they had to introduce a copy to accommodate the in-place encryption library. :/
There's an interesting discussion on Hacker News that speculates about why they moved the crypto code into kernel space rather than moving the socket I/O to userspace: https://news.ycombinator.com/item?id=9387220
There are also equivalent facilities appearing in Clang. See this page: http://clang.llvm.org/docs/LanguageExtensions.html#threadsafety
EDIT: Sparse has also had similar facilities for a while; check it out here: https://sparse.wiki.kernel.org/
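
For anyone curious what the Clang annotations look like in practice, here's a minimal sketch using the thread-safety attributes. The `TSA` macro, the `Mutex` wrapper, and the `Counter` class are names I made up for illustration, not anything from the linked pages. Build with `clang++ -Wthread-safety` to get the warnings; other compilers just see plain code because the macro expands to nothing.

```cpp
#include <mutex>

// Hypothetical helper macro: expands to a Clang attribute, or to nothing elsewhere.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// Hypothetical wrapper that tells the analysis "this type is a lock".
class TSA(capability("mutex")) Mutex {
public:
    // The wrapper's own bodies are excluded from analysis; callers are still checked.
    void lock() TSA(acquire_capability()) TSA(no_thread_safety_analysis);
    void unlock() TSA(release_capability()) TSA(no_thread_safety_analysis);

private:
    std::mutex m_;
};

inline void Mutex::lock() { m_.lock(); }
inline void Mutex::unlock() { m_.unlock(); }

class Counter {
public:
    void increment() {
        mu_.lock();
        ++value_;  // fine: mu_ is held here
        mu_.unlock();
    }

    int get() {
        mu_.lock();
        int v = value_;
        mu_.unlock();
        return v;
    }

    // Uncommenting this should trigger a -Wthread-safety warning under Clang,
    // because value_ is written without holding mu_:
    // void racy() { ++value_; }

private:
    Mutex mu_;
    int value_ TSA(guarded_by(mu_)) = 0;  // accesses require mu_ to be held
};
```

The checking is purely static: the annotations don't change the generated code, they only let the compiler complain when a guarded member is touched without its lock.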