Optimize CGI.escapeHTML for ASCII-compatible encodings #1164
static VALUE rb_cCGI, rb_mUtil, rb_mEscape;

static VALUE
html_escaped_cat(VALUE str, char c)
this function returns no values.
Thank you. Fixed in k0kubun@39cdd83.
    if (modified) {
        rb_str_cat(dest, cstr + beg, len - beg);
        return dest;
the encoding of str is ignored and dest is always ASCII-8BIT.
I changed it to associate the original encoding with the result and added a test case for it in k0kubun@2162835.
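The encoding behavior discussed in this thread can be checked from Ruby with a short sketch like the one below. It assumes a Ruby build that includes this change; the sample string and the choice of ISO-8859-1 as a second ASCII-compatible encoding are made up for illustration.

```ruby
require "cgi"

# Hypothetical sample input, held in two ASCII-compatible encodings.
utf8   = "<b>caf\u00e9</b>"
latin1 = utf8.encode(Encoding::ISO_8859_1)

[utf8, latin1].each do |src|
  escaped = CGI.escapeHTML(src)
  # The result should carry the encoding of the source string,
  # not ASCII-8BIT.
  puts "#{src.encoding} (ascii_compatible: #{src.encoding.ascii_compatible?}) -> #{escaped.encoding}"
end
```

If the result reported ASCII-8BIT for either input, the issue raised in the review comment would still be present.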
Force-pushed from 419f087 to 2162835.
How much is the final performance gain after fixing the return value not being duped when no replacement takes place?
You can check the performance gain with this benchmark: https://gist.github.com/k0kubun/8e1c7efb1e29991e1382. This is the result with ruby compiled from the latest revision, b58b970.
Thanks. So, the new implementation is 3x faster even in the worst cases. Great job!
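The no-replacement case raised in the question above can be observed from Ruby with a minimal sketch like this one. The input string is a hypothetical example, and the check relies only on the behavior described in this thread: the caller gets a copy even when nothing was escaped.

```ruby
require "cgi"

# Hypothetical input that contains nothing to escape.
plain  = "no special characters here"
result = CGI.escapeHTML(plain)

# If the return value is a copy, appending to it leaves the input untouched.
result << "!"
puts plain
puts result
```

This copy is the extra cost that the worst-case numbers in the benchmark include.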
* cgi/escape/escape.c: Optimize CGI.escapeHTML for ASCII-compatible encodings. [Fix rubyGH-1164] git-svn-id: svn+ssh://svn.ruby-lang.org/ruby/trunk@53220 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
As commented in #156 (comment), I rewrote CGI.escapeHTML in C, which is used by ERB::Util#html_escape. Since escaping HTML is expensive when rendering a template, I want it to be faster.
For now, I optimized it only for strings whose encoding is ASCII-compatible.
With this benchmark https://gist.github.com/k0kubun/b6af6062bc876190e280, it's about 7 times faster than the original implementation at escaping HTML.
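For readers who want to try a comparison without opening the gist, a rough sketch follows. It is not the linked benchmark: the gsub-based baseline, the sample string, and the iteration count are all assumptions chosen for illustration, and the timings will vary by machine and Ruby version.

```ruby
require "cgi"

# Pure-Ruby baseline (an assumption: a gsub-with-hash escaper written for
# this sketch, not necessarily the exact code the PR replaced).
ESCAPE_TABLE = {
  "'" => "&#39;", "&" => "&amp;", '"' => "&quot;", "<" => "&lt;", ">" => "&gt;"
}.freeze

def ruby_escape_html(string)
  string.gsub(/['&"<>]/, ESCAPE_TABLE)
end

# Runs the given block `iterations` times and returns elapsed seconds.
def time_it(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical template-like input with a mix of escapable characters.
sample     = %q(<a href="/search?q=ruby&lang=en">it's here</a>) * 20
iterations = 5_000

ruby_time = time_it(iterations) { ruby_escape_html(sample) }
cgi_time  = time_it(iterations) { CGI.escapeHTML(sample) }

printf("gsub baseline: %.3fs  CGI.escapeHTML: %.3fs\n", ruby_time, cgi_time)
```

Both escapers should produce identical output for the same input, so only the elapsed time differs between the two runs.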