I want to read a PDF, translate its text, and write the result into an Outlook draft.
(I am using the IELTS Reading Samples PDF at the top of https://www.cpi-japan.com/services/.)
The following error occurred:
Exception has occurred: TypeError
the JSON object must be str, bytes or bytearray, not NoneType
  File "C:\Users\NAME\Documents\Practice\2023_01_rensyu\test7.py", line 27, in <module>
    result=tr.translate(txt_list, src="en", dest="ja").text
I would appreciate your help. I wrote the code by referring to the following sites.
1. Loading text data from a PDF file: https://python-work.com/pdf-get-text/
2. Translating with googletrans: https://plog.shinmaiblog.com/python-translator/
3. Creating an Outlook draft: https://gist.github.com/akilab/529c6954ddbc6024a0e8e2a381126e83
Version: Python 3.9.15
# Library settings
import fitz  # PyMuPDF library
import openpyxl as px
from openpyxl.styles import Alignment
from googletrans import Translator
import json
# Create a list to hold the PDF text
txt_list = []
# Load the PDF file
filename = 'IELTS.pdf'
doc = fitz.open(filename)
for page in range(len(doc)):
    text = doc[page].get_text()
    text = text.replace('\n', '')
    txt_list.append([page + 1, text])
# Translate
tr = Translator()
result = tr.translate(txt_list, src="en", dest="ja").text
# ↑ The error occurs on the line above
print(result)
# Create a draft in Outlook
_VERSION_ = 'v0.1.0'
def show_version():
    print("=" * 20)
    print("Draft Creator VERSION{}".format(_VERSION_).center(20))
    print("=" * 20)
def main():
    show_version()
    pythoncom.CoInitialize()
    outlook = win32com.client.Dispatch("Outlook.Application")
    mapi = outlook.GetNamespace("MAPI")
    # draft_box = mapi.GetDefaultFolder(16)  # 16 appears to be the Drafts folder number
    draft_box = mapi.Folders("My email address.co.jp").Folders("Draft")
    mail = outlook.CreateItem(0)
    mail.To = '[email protected]'
    mail.Subject = 'Title'
    mail.HtmlBody = result  # ← I want to put the translation result here
    mail.Move(draft_box)
if __name__ == '__main__':
    sys.exit(main())
What I tried
I read that this error means the object cannot be handled as JSON, so I thought converting it to a string-type object might help.
I inserted the following line before the line where the error occurred, but it raised another exception.
txt_list=json.loads(txt_list)
TypeError
the JSON object must be str, bytes or bytearray, not list
  File "C:\Users\NAME\Documents\For Practice\2023_01_rensyu\test7.py", line 24, in <module>
    txt_list=json.loads(txt_list)
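As a side note on this attempt: json.loads parses JSON text into Python objects, so it only accepts str, bytes, or bytearray, and handing it a list raises the TypeError shown above. json.dumps is the function that goes from a list to a string. Below is a minimal sketch using made-up sample data (sample_pages is hypothetical); it only illustrates the json module and is not claimed to fix the googletrans error.

```python
import json

# Hypothetical sample standing in for the list built from the PDF pages.
sample_pages = ["first page text", "second page text"]

# json.loads expects JSON text, so passing a list fails with TypeError.
try:
    json.loads(sample_pages)
    loads_error = None
except TypeError as exc:
    loads_error = str(exc)

# json.dumps serializes the list to a JSON string; json.loads reverses it.
as_json_text = json.dumps(sample_pages)
round_trip = json.loads(as_json_text)

print(loads_error)
print(as_json_text)
```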
Addendum
I revised the code based on the comments I received.
# Library settings
import fitz  # PyMuPDF library
import openpyxl as px
from openpyxl.styles import Alignment
from googletrans import Translator
import json
import re
# Create a list to hold the PDF text
txt_list = []
# Load the PDF file
filename = 'IELTS.pdf'
doc = fitz.open(filename)
for page in range(len(doc)):
    text = doc[page].get_text()
    text = text.replace('\n', '')
    txt_list.append(text)  # changed: list of strings only
# print(txt_list)
# JSON conversion
# txt_list = json.load(txt_list)
# Translate
tr = Translator()
result = tr.translate(txt_list, src="en", dest="ja")  # the error occurs here
for results in result:
    print(results.origin, '->', results.text)
"""
# Create a draft in Outlook
_VERSION_ = 'v0.1.0'
def show_version():
    print("=" * 20)
    print("Draft Creator VERSION{}".format(_VERSION_).center(20))
    print("=" * 20)
def main():
    show_version()
    pythoncom.CoInitialize()
    outlook = win32com.client.Dispatch("Outlook.Application")
    mapi = outlook.GetNamespace("MAPI")
    # draft_box = mapi.GetDefaultFolder(16)  # 16 appears to be the Drafts folder number
    draft_box = mapi.Folders("My Email Address co.jp").Folders("Draft")
    mail = outlook.CreateItem(0)
    mail.To = '[email protected]'
    mail.Subject = 'Title'
    mail.HtmlBody = page1
    mail.Move(draft_box)
if __name__ == '__main__':
    sys.exit(main())
"""
Error message raised by the code above:
Exception has occurred: TypeError
the JSON object must be str, bytes or bytearray, not NoneType
  File "C:\Users\NAME\Documents\For Practice\2023_01_rensyu\test7.py", line 33, in <module>
    result=tr.translate(txt_list, src="en", dest="ja")
In short, it looks like you tried to adapt the approach in the following article, which you referenced yourself, and the adapted version did not work.
[Python] Try using googletrans for automatic translation
That article passes the text to be translated to googletrans as a single string, not as a list.
The code in the question, on the other hand, passes a list of strings (originally a list of number-and-string pairs), probably with the aim of speeding things up.
Initially: a list of number-and-string pairs
txt_list.append([page+1,text])
After the change: a list of strings
txt_list.append(text)  # changed: list of strings only
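To make the difference between the two shapes concrete, here is a small sketch with made-up data (paired_pages is a hypothetical stand-in for the original txt_list). It only shows how the text can be pulled out of the pairs so that every element is a plain string.

```python
# Hypothetical sample data in the original shape: [page_number, text] pairs.
paired_pages = [[1, "Reading passage one."], [2, "Reading passage two."]]

# Keep only the text of each page so every element is a plain string.
text_only = [text for _page_number, text in paired_pages]

# Check that nothing but strings is left before handing the data to a translator.
all_strings = all(isinstance(item, str) for item in text_only)

print(text_only)
print(all_strings)
```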
In addition, according to the documentation for the latest stable version (3.0.0), a list of strings can also be passed as the text parameter:
translate(text, dest='en', src='auto', **kwargs)
Code from the question (after the revision)
tr = Translator()
result = tr.translate(txt_list, src="en", dest="ja")  # the error occurs here
However, the latest stable version (3.0.0) has a bug (or perhaps the service's specification changed), so you need to use the pre-release version (4.0.0-rc1), as described in the article mentioned at the start and in the following article.
Errors in googletrans can be avoided with the pre-release version
Also, the pre-release version (4.0.0-rc1) does not seem to support a list of strings; the text to be translated must be passed as a single string.
As in the article mentioned at the start and in the questioner's own comment, the translation should be done as follows. The place where print(result) is called is where you should build the data to pass to Outlook.
tr = Translator()
try:
    for pagetext in txt_list:
        # Each page is a single string; translate it into Japanese.
        result = tr.translate(pagetext, src="en", dest="ja").text
        print(result)
except Exception as e:
    print(e)
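If the translated pages need to be kept for Outlook rather than only printed, one option is to collect them in a list. The sketch below is only an illustration: translate_pages and fake_translate are hypothetical names, and the translator is passed in as a function so the example runs without googletrans or network access.

```python
from typing import Callable, List


def translate_pages(pages: List[str], translate_one: Callable[[str], str]) -> List[str]:
    """Translate each page separately and return the translated strings."""
    translated = []
    for page_text in pages:
        if not page_text.strip():
            # Skip empty pages rather than sending an empty request.
            translated.append("")
            continue
        try:
            translated.append(translate_one(page_text))
        except Exception as exc:
            # Keep going if a single page fails; record an empty result for it.
            print(f"translation failed: {exc}")
            translated.append("")
    return translated


def fake_translate(text: str) -> str:
    """Stand-in translator so the sketch runs offline."""
    return f"[ja] {text}"


results = translate_pages(["Page one.", "", "Page two."], fake_translate)
print(results)
```

With googletrans, translate_one could be a small wrapper around the call used above, for example lambda s: tr.translate(s, src="en", dest="ja").text.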
By the way, the error mentioned in the comment, "This time, the last line sys.exit(main()) raises an exception: NameError name 'sys' is not defined", occurs because the corresponding import sys is missing.
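Putting the imports together, a rough sketch of the Outlook part might look like the following. It assumes Windows with Outlook and pywin32 installed; build_html_body, save_outlook_draft, and the SAVE_DRAFT flag are hypothetical names, and using mail.Save() instead of moving the item to a named folder is a substitution on my part, not what the question's code does.

```python
import html
import sys
from typing import List

# Assumption: set this to True only on a Windows machine with Outlook and pywin32.
SAVE_DRAFT = False


def build_html_body(translated_pages: List[str]) -> str:
    """Join translated pages into a simple HTML body, one paragraph per page."""
    paragraphs = [f"<p>{html.escape(text)}</p>" for text in translated_pages if text]
    return "\n".join(paragraphs)


def save_outlook_draft(subject: str, html_body: str) -> None:
    """Create a mail item and save it without sending."""
    # Imported here so the rest of the script still loads on other systems.
    import pythoncom
    import win32com.client

    pythoncom.CoInitialize()
    outlook = win32com.client.Dispatch("Outlook.Application")
    mail = outlook.CreateItem(0)  # 0 = mail item
    mail.Subject = subject
    mail.HTMLBody = html_body
    mail.Save()  # an unsent item is normally saved to the Drafts folder


def main() -> int:
    body = build_html_body(["こんにちは", "1 < 2"])
    print(body)
    if SAVE_DRAFT and sys.platform == "win32":
        save_outlook_draft("Title", body)
    return 0


# A script would normally end with sys.exit(main()), which is why import sys is needed.
exit_code = main()
```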
© 2024 OneMinuteCode. All rights reserved.