Python's htmlentitydefs module provides tables of HTML entity definitions, but it doesn't include a function to unescape HTML entities into Unicode.
Python developer Fredrik Lundh (author of ElementTree, among other things) has published such a function on his website; it works with decimal, hex and named entities.
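
To make the idea concrete, here is a minimal sketch of what such a function might look like. It is not Fredrik Lundh's code: the name `unescape_entities` and the regular expression are my own assumptions, and it is written for Python 3, where htmlentitydefs has been renamed to `html.entities`.

```python
import re
from html.entities import name2codepoint  # Python 3 name for htmlentitydefs

# Matches &#65; (decimal), &#x41; (hex) and &amp; (named) style entities.
_ENTITY_RE = re.compile(r"&(#[xX]?[0-9a-fA-F]+|\w+);")


def unescape_entities(text):
    """Replace decimal, hex and named HTML entities with Unicode characters.

    Hypothetical helper for illustration; unknown or malformed entities
    are left in the text unchanged.
    """
    def replace(match):
        body = match.group(1)
        try:
            if body[:2].lower() == "#x":
                return chr(int(body[2:], 16))   # hex character reference
            if body.startswith("#"):
                return chr(int(body[1:]))       # decimal character reference
            return chr(name2codepoint[body])    # named entity lookup
        except (KeyError, ValueError, OverflowError):
            return match.group(0)               # leave it as it was
    return _ENTITY_RE.sub(replace, text)


print(unescape_entities("&lt;p&gt; caf&eacute; &#65;&#x42; &bogus;"))
```

If you are on Python 3.4 or later, the standard library's `html.unescape` already does this job, so a hand-rolled function like the one above is mostly useful for seeing how the three entity forms are handled.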