Hi!
On Mon, Oct 13, 2003 at 11:02:46AM +1300, Phillip Pearson wrote:
> I just checked out html_cleaner.py and got it to leave entities alone.
> Also got it to keep track of open tags and to:
I tried it and noticed that it removes SCRIPT stuff. But it might not remove
those on* handlers, right? So maybe we could plug stripogram into html_cleaner
to get all of it? Or could html_cleaner be reworked so that it only passes
on those attributes and tags we approve?
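
A rough sketch of the "only pass on what we approve" idea, using Python's standard html.parser module rather than stripogram or the real html_cleaner.py. The ALLOWED table, SAFE_SCHEMES, AllowListCleaner and clean() are all made-up names for illustration, and the thread does not say which tags PyCS should accept:

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list: tag name -> attribute names that may be kept.
ALLOWED = {
    "a": {"href"},
    "b": set(),
    "i": set(),
    "em": set(),
    "strong": set(),
    "p": set(),
    "br": set(),
}
VOID = {"br"}  # tags that never get a closing tag
SAFE_SCHEMES = ("http://", "https://", "mailto:")  # assumption, adjust as needed


class AllowListCleaner(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # unapproved tag: drop the tag, keep its text
        kept = []
        for name, value in attrs:
            # on* handlers are dropped simply because they are not listed.
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # e.g. javascript: URLs
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED and tag not in VOID:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            # The parser has decoded entities already, so re-escape the text.
            self.out.append(escape(data, quote=False))


def clean(text):
    parser = AllowListCleaner()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

With this, clean('<b onclick="evil()">hi</b>') keeps the b tag but loses the handler, and anything inside SCRIPT is dropped completely. It does not balance tags; that would still be a separate step.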
bye, Georg
> I only get connection refused on your servers? Weird. Too bad that Robertos
> Link showed up on linuxtoday just today, they all get just a connection
> refused. Or was your server linuxtodayed? ;-)
<sigh>
After I fix _my_ problems, the server gets rebooted, making it unavailable
_again_!
But it's working again now :-)
Cheers,
Phil
OK, looks like the last database rebuild kicked the server on pycs.net
back into life. It's using 47 megs of RAM now and has gone through 2
CPU minutes since the restart. That works out to about 30
hits per CPU second, though, so I'm not too worried.
If anybody sees the server going down or becoming unreachable again,
drop me a line. With any luck this problem will go away for another
year though! It looks like the same thing that happened about a year
ago.....hmm.
Cheers,
Phil :)
I just checked out html_cleaner.py and got it to leave entities alone.
Also got it to keep track of open tags and to:
- discard closing tags with no matching opening tag
- close all unclosed tags at the end of the comment
That means:
<b>foo
-> <b>foo</b>
and foo</b>
-> foo
also &lt;foo&gt;
-> &lt;foo&gt;
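
The two tag rules above can be sketched roughly like this. This is not the actual html_cleaner.py code, just an illustration built on Python's standard html.parser; TagBalancer, balance() and the VOID set are hypothetical names, and the sketch makes no attempt to filter out unsafe tags:

```python
from html.parser import HTMLParser


class TagBalancer(HTMLParser):
    VOID = {"br", "hr", "img"}  # assumption: tags that never need closing

    def __init__(self):
        # convert_charrefs=False so entities arrive separately and can be
        # written back out unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.open_tags = []  # stack of currently open tag names

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in self.VOID:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit as-is, nothing to track.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # closing tag with no matching opening tag: discard
        # Close anything opened after the matching tag, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)  # leave entities alone

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def balance(text):
    parser = TagBalancer()
    parser.feed(text)
    parser.close()
    # Close all tags still open at the end of the comment.
    while parser.open_tags:
        parser.out.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.out)
```

Here balance("<b>foo") gives "<b>foo</b>" and balance("foo</b>") gives "foo", matching the first two cases listed.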
Georg - this all happens when the comment is displayed, not saved,
because otherwise bugs like this would result in permanently damaged
comments. This way, as soon as a fix is checked in, all comments
suddenly start working :-)
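
To illustrate the display-time point with a toy example: the store below is a plain list and every function name is invented for the sketch, nothing here is taken from the PyCS source. The raw comment is saved untouched, and whichever cleaner is current gets applied on each render.

```python
import html

stored_comments = []  # hypothetical stand-in for the comment database


def save_comment(raw_text):
    # Store exactly what the user typed; no cleaning at save time.
    stored_comments.append(raw_text)


def render_comments(cleaner):
    # Clean on the way out, so a better cleaner fixes old comments too.
    return [cleaner(raw) for raw in stored_comments]


def buggy_cleaner(text):
    # Pretend bug: throws away every "<" instead of escaping it.
    return text.replace("<", "")


def fixed_cleaner(text):
    return html.escape(text, quote=False)


save_comment("1 < 2")
before_fix = render_comments(buggy_cleaner)
after_fix = render_comments(fixed_cleaner)
```

Had buggy_cleaner run at save time, the "<" would be gone for good. Because the stored text is still "1 < 2", swapping in fixed_cleaner repairs the displayed comment.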
Cheers,
Phil
> Maybe we should switch from html_cleaner to use stripogram? That's what
> I use in PyDS. Seems to work quite nicely. I _think_ that's what Yasushi
> says on his weblog, too. Babelfish is sometimes a bit problematic -
> sometimes the english version doesn't make more sense than the japanese
> one ;-)
Sounds fine to me. html_cleaner _is_ used (in comments/__init__.py)
but Yasushi's correct in that it doesn't attempt to strip out dodgy
html tags. I'm sure there's a bunch of XSS holes in PyCS - we should
drop stripogram in there and use it for things like referrer pages as
well ...
Cheers,
Phil
Hi!
Maybe we should switch from html_cleaner to use stripogram? That's what
I use in PyDS. Seems to work quite nicely. I _think_ that's what Yasushi
says on his weblog, too. Babelfish is sometimes a bit problematic -
sometimes the english version doesn't make more sense than the japanese
one ;-)
bye, Georg
Hi!
> I found that html_cleaner.py allows user input of any html tag. For
> example, if you input &lt; and &gt; in comment form, PyCS converts to
> < and >.
Weird. From the source it looks like html_cleaner isn't actually used,
at least not when storing the comment.
Phil: is there just the implementation missing, or is there something I
currently don't see?
bye, Georg
Hi,
I found that html_cleaner.py allows user input of any html tag. For
example, if you input &lt; and &gt; in comment form, PyCS converts to
< and >.