I'm trying to run the XML file and Python script below, but I don't know how to get rid of the error. Also, is there a way to deal with garbled characters (using UTF-8)?
Error Code (Results)
C:\Users\g21125\python_xml_ex>python copy-element.py
Traceback (most recent call last):
  File "copy-element.py", line 20, in <module>
    print(xdoc.toxml())
  File "C:\Python27\lib\xml\dom\minidom.py", line 46, in toxml
    return self.toprettyxml("", "", encoding)
  File "C:\Python27\lib\xml\dom\minidom.py", line 61, in toprettyxml
    return writer.getvalue()
  File "C:\Python27\lib\StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
C:\Users\g21125\python_xml_ex>
XML Files
<!--language:lang-xml-->
<?xml version="1.0" encoding="UTF-8"?>
<recipe name="bread" preparations-time="5 minutes" cooking-time="3 hours">
<dish>Basic Bread</dish>
<material quality='3' unit='cup'>flour</material>
<material quality='0.25' unit='ounce'>yeast</material>
<material quality='1.5' unit='cup'>water</material>
<material quality='1' unit='teaspoon'>salt</material>
<point>
<process>Mix all ingredients together.</process>
<process>Finely knead.</process>
<process>Cover with cloth and leave in a warm room for an hour.</process>
<process>Combine again.</process>
<process>Place in baking container.</process>
<process>Cover with cloth and leave in a warm room for an hour.</process>
<process>Place in oven and bake at 180°C for 30 minutes.</process>
</point>
</recipe>
Python Code
<!--language:lang-python -->
# coding: utf-8
# Clone an element and add it to the document
from xml.dom import minidom
# Load the sample.xml file
xdoc = minidom.parse("sample.xml")
# Get the recipe element
recipe = xdoc.documentElement
# Make a deep copy of the recipe (True also copies the child nodes)
recipe2 = recipe.cloneNode(True)
# Change the name of the dish in the copied recipe
recipe2.getElementsByTagName("dish").item(0).childNodes[0].data = "Inconvenient Bread"
# Add the copied recipe to the document, before the original
xdoc.insertBefore(recipe2, recipe)
# Convert the content to a string and display it
print(xdoc.toxml())
Run Results
<?xml version="1.0"?>
<recipe cooking-time="3 hours" name="bread" preparations-time="5 minutes">
<dish>Inconvenient Bread</dish>
<material quality="3" unit="cup">flour</material>
<material quality="0.25" unit="ounce">yeast</material>
<material quality="1.5" unit="cup">water</material>
<material quality="1" unit="teaspoon">salt</material>
<point>
<process>Mix all ingredients together.</process>
<process>Finely knead.</process>
<process>Cover with cloth and leave in a warm room for an hour.</process>
<process>Combine again.</process>
<process>Place in baking container.</process>
<process>Cover with cloth and leave in a warm room for an hour.</process>
<process>Place in oven and bake at 180°C for 30 minutes.</process>
</point>
</recipe>
<recipe cooking-time="3 hours" name="bread" preparations-time="5 minutes">
<dish>Basic Bread</dish>
<material quality="3" unit="cup">flour</material>
<material quality="0.25" unit="ounce">yeast</material>
<material quality="1.5" unit="cup">water</material>
<material quality="1" unit="teaspoon">salt</material>
<point>
<process>Mix all ingredients together.</process>
<process>Finely knead.</process>
<process>Cover with cloth and leave in a warm room for an hour.</process>
<process>Combine again.</process>
<process>Place in baking container.</process>
<process>Cover with cloth and leave in a warm room for an hour.</process>
<process>Place in oven and bake at 180°C for 30 minutes.</process>
</point>
</recipe>
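The direct cause of the error is a mixture of unicode and str values in the XML object. minidom.parse() returns the document's text as unicode, while a plain string literal in Python 2 is a byte str. When toxml() joins the pieces, Python tries to decode the str as ASCII and fails on the first non-ASCII byte (0xe4 here).
As noted in the documentation in StringIO.py, you need to pay deliberate attention to encoding when dealing with multi-byte characters such as Japanese.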
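The join that StringIO.getvalue() performs in the traceback can be imitated directly, as a rough illustration of the mechanism. The helper name try_join and the sample byte string are made up for this sketch. On Python 2 the failure is the same UnicodeDecodeError as in the traceback; Python 3 rejects the mixture with a TypeError instead.

```python
# -*- coding: utf-8 -*-
# Sketch: joining unicode text with a non-ASCII byte string.
# try_join is a hypothetical helper name, not part of any library.

def try_join(parts):
    """Return the name of the exception raised by joining parts, or None."""
    try:
        u"".join(parts)
    except (UnicodeDecodeError, TypeError) as exc:
        return type(exc).__name__
    return None

# b"\xe4\xb8\x8d" is the UTF-8 encoding of one Japanese character (assumed
# sample data); its first byte is the 0xe4 mentioned in the traceback.
mixed = [u"<dish>", b"\xe4\xb8\x8d", u"</dish>"]
all_unicode = [u"<dish>", b"\xe4\xb8\x8d".decode("utf-8"), u"</dish>"]

print(try_join(mixed))        # UnicodeDecodeError on Python 2, TypeError on Python 3
print(try_join(all_unicode))  # None
```

Once every piece is unicode, the join succeeds, which is why making the new value a unicode string removes the exception.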
To resolve this exception, add a u prefix to the string literal on the 26th line (the line that changes the dish name):
recipe2.getElementsByTagName("dish").item(0).childNodes[0].data = u"Inconvenient Bread"
If you correct the line as above and run the script again, you can confirm that it executes successfully.
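For reference, a version of the script that behaves the same way on Python 2.7 and Python 3 might look like the sketch below. To stay self-contained it parses an abbreviated inline recipe instead of sample.xml. The function name duplicate_recipe and the non-ASCII dish name are placeholders chosen for illustration, not taken from the original files.

```python
# -*- coding: utf-8 -*-
from xml.dom import minidom

# Abbreviated stand-in for sample.xml (assumption: the real file has more elements).
SAMPLE = u'<recipe name="bread"><dish>Basic Bread</dish></recipe>'

def duplicate_recipe(xml_text, new_dish_name):
    """Clone the root element, rename its dish, and return UTF-8 encoded XML."""
    xdoc = minidom.parseString(xml_text.encode("utf-8"))
    recipe = xdoc.documentElement
    # Deep copy, so the child nodes are copied as well
    recipe2 = recipe.cloneNode(True)
    # new_dish_name must be a unicode string (u"..." in Python 2)
    recipe2.getElementsByTagName("dish").item(0).childNodes[0].data = new_dish_name
    # Put the copy in front of the original, as the question's script does
    xdoc.insertBefore(recipe2, recipe)
    # Passing an encoding makes toxml() return encoded bytes
    return xdoc.toxml(encoding="utf-8")

# u"\u30d1\u30f3" is a hypothetical Japanese dish name written with escapes,
# so the example does not depend on the source file's encoding.
result = duplicate_recipe(SAMPLE, u"\u30d1\u30f3")
```

Because the new name is a unicode string and the output encoding is passed explicitly, no implicit ASCII decoding takes place.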
Also, is there a way to deal with garbled characters (using UTF-8)?
The code is probably running on Windows, so the console encoding is likely SJIS/CP932. If you pass the appropriate encoding to toxml(), the garbled output is resolved as follows:
print(xdoc.toxml(encoding='sjis'))
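Rather than hard-coding the encoding, one option is to ask Python which encoding the console uses. This is only a sketch: xml_for_console is a made-up helper name, and because sys.stdout.encoding can be None when output is redirected, it falls back to UTF-8 in that case.

```python
# -*- coding: utf-8 -*-
import sys
from xml.dom import minidom

def xml_for_console(xdoc, encoding=None):
    """Serialize xdoc as bytes in the given encoding (default: the console's)."""
    # Fall back to UTF-8 if the console encoding is unknown (e.g. redirected output)
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return xdoc.toxml(encoding=encoding)

# Hypothetical one-element document containing Japanese text
xdoc = minidom.parseString(u"<dish>\u30d1\u30f3</dish>".encode("utf-8"))

# Explicit CP932, as on a Japanese Windows console
data = xml_for_console(xdoc, encoding="cp932")
```

Note that on Python 3, toxml() with an encoding returns bytes, so print() would show a b'...' literal. There you would decode the result first or write it to sys.stdout.buffer.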