I tried to load a file containing Japanese text (.prn) with NetworkX, but I got the error below.
I'm sorry for such a basic question, but I would appreciate any advice.
Error
QT ------------------------------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-f4f3d26af7f2> in <module>()
      2 G = nx.DiGraph()
      3 # Create edge (side) list by loading files
----> 4 G=nx.read_edgelist("sm10.prn", nodetype=int, create_using=nx.DiGraph())
      5
      6
<C:\Users\IWAMOTO MOMOKA\Anaconda2\lib\site-packages\decorator.pyc:decorator-gen-703> in read_edgelist(path, comments, delimiter, create_using, nodetype, data, edgetype, encoding)
C:\Users\IWAMOTO MOMOKA\Anaconda2\lib\site-packages\networkx\utils\decorators.pyc in _open_file(func_to_be_decorated, *args, **kwargs)
    238         # Finally, we call the original function, making sure to close the fobj
    239         try:
--> 240             result = func_to_be_decorated(*new_args, **kwargs)
    241         finally:
    242             if close_fobj:
C:\Users\IWAMOTO MOMOKA\Anaconda2\lib\site-packages\networkx\readwrite\edgelist.pyc in read_edgelist(path, comments, delimiter, create_using, nodetype, data, edgetype, encoding)
    367     return parse_edgelist(lines, comments=comments, delimiter=delimiter,
    368                           create_using=create_using, nodetype=nodetype,
--> 369                           data=data)
    370
    371
C:\Users\IWAMOTO MOMOKA\Anaconda2\lib\site-packages\networkx\readwrite\edgelist.pyc in parse_edgelist(lines, comments, delimiter, create_using, nodetype, data)
    267             except:
    268                 raise TypeError("Failed to convert nodes %s,%s to type %s."
--> 269                                 % (u, v, nodetype))
    270
    271             if len(d) == 0 or data is False:
<type 'str'>: (<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('ascii', u"Failed to convert nodes \u9d8f,\u305f\u307e\u3054 to type <type 'int'>.", 24, 25, ...))
UNQT -----------------------------------------------------------------
The original data file and code are as follows:
QT ---------------------------------------------------------------------
Chicken egg
rice egg
rice cooked with omelet
charha omelet rice
chicken with a long-nosed on its back
UNQT -------------------------------------------------------------
QT ----------------------------------------------------------
#coding=UTF-8
# Import libraries
import networkx as nx
import string
import pandas as pd
import collections
import itertools
import matplotlib.pyplot as plt
import numpy as np
# Specify a directed graph
G=nx.DiGraph()
# Build the edge list by loading the file
G=nx.read_edgelist("sm10.prn", nodetype=int, create_using=nx.DiGraph())
# Print the number of nodes (vertices)
print(nx.number_of_nodes(G))
# Print the number of edges
print(nx.number_of_edges(G))
# Print basic network information
print(nx.info(G))
# Degree distribution
print(nx.degree_histogram(G))
UNQT -------------------------------------------------------------
python networkx
Cause identified
To begin with, I got it to the point where it runs and prints results without any errors.
In addition to the two points already raised, the strings on the third and fourth lines of the data file are not separated into two fields, so they are not valid data. This may be a transcription error made when the question was written, but after I corrected the data as follows, it was processed.
Chicken egg
rice egg
rice with omelet rice
fried rice omelet rice
chicken with a long-nosed on its back
The output is as follows:
6
5
Name:
Type: DiGraph
Number of nodes: 6
Number of edges: 5
Average in degree: 0.8333
Average out degree: 0.8333
[0, 2, 4]
My environment is Windows 10 64-bit, Python 3.7.6, NetworkX 2.4, pandas 1.0.1, matplotlib 3.2.0, and numpy 1.18.1.
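The malformed lines described above (lines whose strings are not separated into two fields) can be found before the file is loaded. A minimal sketch, assuming the file holds one edge per line with exactly two whitespace-separated labels and no edge attributes; `find_malformed_lines` is a hypothetical helper, not a NetworkX function:

```python
def find_malformed_lines(lines, comment="#"):
    """Return (line_number, text) pairs for lines without exactly two fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Drop anything after the comment marker, then surrounding whitespace.
        stripped = line.split(comment, 1)[0].strip()
        if not stripped:
            continue  # blank or comment-only lines are not edges
        if len(stripped.split()) != 2:
            bad.append((number, stripped))
    return bad


# Made-up sample: the second and third lines are not valid two-field edges.
sample = ["A B", "ABC", "A B C", "# comment", ""]
for number, text in find_malformed_lines(sample):
    print("line %d is not a two-field edge: %s" % (number, text))
```

Running a check like this first makes it easier to tell a data problem apart from an encoding problem.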
Below is my initial answer.
The cause is simply that the data file was created in UTF-16 rather than UTF-8.
You can either convert the data file to UTF-8 or specify encoding='utf-16' when calling read_edgelist.
read_edgelist(path, comments='#', delimiter=None, create_using=None, nodetype=None, data=True, edgetype=None, encoding='utf-8')
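The first option, converting the file, needs only the standard library. A minimal sketch, assuming the source file really is UTF-16 with a byte order mark; the function name `convert_to_utf8` is made up for illustration:

```python
import io


def convert_to_utf8(src_path, dst_path, src_encoding="utf-16"):
    """Re-encode a text file as UTF-8 (hypothetical helper, not part of NetworkX)."""
    # newline="" disables newline translation so line endings are preserved.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

For example, `convert_to_utf8("sm10.prn", "sm10_utf8.prn")` would write a UTF-8 copy (the output file name here is an assumption), which can then be read with the default encoding='utf-8'.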
If you look at the character codes in the error message, they are UTF-16.
<type 'str'>: (<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('ascii', u"Failed to convert nodes \u9d8f,\u305f\u307e\u3054 to type <type 'int'>.", 24, 25, ...))
\u9d8f 鶏
\u305f た
\u307e ま
\u3054 ご
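The escapes in the message can be looked up with the standard library to see which node labels were being converted. A small sketch; `describe` is a hypothetical helper, and it prints only code points and Unicode names so that it does not depend on the console encoding:

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch)) for ch in text]


# The two node labels from the error message, written with the same escapes.
for label in (u"\u9d8f", u"\u305f\u307e\u3054"):
    for code_point, name in describe(label):
        print(code_point, name)
```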
The 0x0a in the error you mentioned in your comment is the newline code. In UTF-16 it would normally be one byte of a two-byte character, so it is a little strange that it raises an error.
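One possible explanation for the 0x0a error, offered as an assumption rather than a confirmed diagnosis: if the raw bytes are split into lines at 0x0a before each line is decoded, then in little-endian UTF-16 (where a newline is the two bytes 0x0a 0x00) every split leaves a dangling 0x0a at the end of a line. The sketch below reproduces that effect with made-up data and hypothetical helper names; it does not call NetworkX.

```python
import codecs
import io

# Made-up two-line edge list, encoded as little-endian UTF-16 with a BOM.
text = u"\u9d8f \u305f\u307e\u3054\nA B\n"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def decode_line_by_line(data, encoding):
    """Split raw bytes at 0x0a first, then decode each piece (the fragile order)."""
    return [piece.decode(encoding) for piece in io.BytesIO(data)]


def decode_then_split(data, encoding):
    """Decode the whole byte string first, then split it into lines."""
    return data.decode(encoding).splitlines()


try:
    decode_line_by_line(raw, "utf-16")
except UnicodeDecodeError as exc:
    print("line-by-line decoding failed:", exc.reason)

print(len(decode_then_split(raw, "utf-16")), "lines decoded")
```

If this is what happens, decoding the whole file before splitting it into lines (or converting it to UTF-8 first) would avoid the problem.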
So I looked into the data format. Judging from the articles below, why not specify nodetype=str instead of nodetype=int?
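The reason nodetype matters can be seen from the conversion step itself: each label read from the file is passed to nodetype, and int() cannot parse a label such as たまご. A minimal sketch; `convert_node` is hypothetical and only imitates the behaviour visible in the traceback (a failed conversion reported as a TypeError), not NetworkX's actual code:

```python
def convert_node(label, nodetype):
    """Apply nodetype to a label, reporting failures as TypeError (illustration only)."""
    try:
        return nodetype(label)
    except ValueError:
        raise TypeError(
            "Failed to convert node %r to type %s." % (label, nodetype.__name__)
        )


# Numeric labels convert cleanly with int.
print(convert_node("12", int))

# A Japanese label is fine as str but cannot be converted with int.
label = u"\u305f\u307e\u3054"
print(convert_node(label, str) == label)
try:
    convert_node(label, int)
except TypeError:
    print("int conversion failed for a non-numeric label")
```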
Introduction to NetworkX in Python
Here's an example of the data.
0 1
0 2
0 3
0 4
:
Here's a sample program
#Create Graph
G=nx.read_edgelist('facebook_combined.txt', nodetype=int)
Analyze and visualize your network with Python! Required Steps Summary
Here's an example of the data.
Actual data: edgelist.txt
A B
A C
A D
A E
A F
:
Here's a sample program
G=nx.read_edgelist('edgelist.txt', nodetype=str)
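Applied to a file with Japanese labels, the two suggestions can be combined as follows. This is only a sketch under assumptions: the file is taken to be UTF-16, it is decoded in text mode before the lines are handed to nx.parse_edgelist, and both the helper name `load_directed_edgelist` and the sample data are made up for illustration.

```python
import io
import os
import tempfile

import networkx as nx


def load_directed_edgelist(path, encoding="utf-16"):
    """Read a whitespace-separated edge list with string node labels (hypothetical helper)."""
    # Open in text mode so the whole file is decoded before it is split into lines.
    with io.open(path, "r", encoding=encoding) as handle:
        lines = handle.read().splitlines()
    # nodetype=str keeps labels as text instead of trying to parse them as integers.
    return nx.parse_edgelist(lines, nodetype=str, create_using=nx.DiGraph())


if __name__ == "__main__":
    # Made-up sample data, not the asker's file.
    sample = u"\u9d8f \u305f\u307e\u3054\nA \u305f\u307e\u3054\nA B\n"
    path = os.path.join(tempfile.mkdtemp(), "sample.prn")
    with io.open(path, "w", encoding="utf-16") as handle:
        handle.write(sample)
    graph = load_directed_edgelist(path)
    print(graph.number_of_nodes(), graph.number_of_edges())
    print(nx.degree_histogram(graph))
```

If the file has already been converted to UTF-8, passing `encoding="utf-8"` to the helper gives the same graph.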