Outrageous memory usage with cElementTree (found it)

Shahms King shahms at shahms.com
Fri Apr 7 23:18:00 UTC 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The problem is not with ElementTree, but can be traced to a specific
regular expression:

_escape = re.compile(u'[&<>"\x80-\uffff]+')

The memory usage doesn't go down after:

>>> del _escape
>>> re.purge()
>>> gc.collect()

either.  I suspect it's a bug or misplaced optimization in the re module,
but I haven't investigated any further than that.  The specific problem
with the regular expression is the \x80-\uffff range.  Changing that to
'\ufffa-\uffff' gets rid of the astronomical memory usage.  I'm going to
keep looking into this as it's a particularly vexing bug ;-P
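
For anyone who wants to compare the two character ranges themselves, here
is a rough sketch.  It assumes a current Python 3 interpreter (the
original report was against Python 2's re module, so the numbers will
likely look very different), and the helper name compile_cost is made up
for illustration.  tracemalloc only sees allocations made through
Python's own allocators, so treat the figures as indicative only.

```python
import gc
import re
import tracemalloc

# Patterns taken from the message above; Python 3 strings are already Unicode.
WIDE = '[&<>"\x80-\uffff]+'
NARROW = '[&<>"\ufffa-\uffff]+'


def compile_cost(pattern):
    """Return (peak_bytes, remaining_bytes) for compiling and then discarding pattern.

    Hypothetical helper: peak covers the compile plus the cleanup steps,
    remaining is what tracemalloc still sees allocated after cleanup.
    """
    re.purge()      # empty the re module's pattern cache before measuring
    gc.collect()
    tracemalloc.start()
    compiled = re.compile(pattern)
    # Same cleanup sequence as in the message: drop the reference,
    # purge the cache, and force a collection.
    del compiled
    re.purge()
    gc.collect()
    remaining, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak, remaining


if __name__ == "__main__":
    for name, pattern in (("wide", WIDE), ("narrow", NARROW)):
        peak, remaining = compile_cost(pattern)
        print(f"{name}: peak={peak} bytes, still held={remaining} bytes")
```

If the "still held" figure for the wide range stays far above the narrow
one after the cleanup, that would point at the same behaviour described
above; if both are small, the interpreter in use probably isn't affected.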

I've updated the Bugzilla bug to reflect these findings.


- --
Shahms E. King <shahms at shahms.com>
Multnomah ESD

Public Key:
http://shahms.mesd.k12.or.us/~sking/shahms.asc
Fingerprint:
1612 054B CE92 8770 F1EA  AB1B FEAB 3636 45B2 D75B
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFENvMo/qs2NkWy11sRAkgUAKCiKIAgQK/2tZHLr9tQbRvErQpILQCeNue1
+yYKq95ChuYCcakFGWu9ZSA=
=w3yL
-----END PGP SIGNATURE-----



