I want to count how many times each word appears in a string (a text file) and write a program that lists the words in descending order of frequency. For example, a text file named fairy_tale.txt contains this sentence, split across two lines:
A long time ago, \n there lived a beautiful princes.
I want to convert it into a list.
What I want is:
First, convert the text file into a list: ['A', 'long', 'time', 'ago', 'there', 'lived', 'a', 'beautiful', 'princes'].
Second, count the frequencies and store them in a dictionary: {'A': 1, 'long': 1, 'time': 1, 'ago': 1, 'there': 1, 'lived': 1, 'a': 1, 'beautiful': 1, 'princes': 1}.
Third, convert the dictionary entries into a list of (count, word) pairs and sort it.
I don't know where the first part went wrong. When I run the code below, the output is
[(1, 'A'), (1, 'long'), (1, 'time'), (1, 'ago,')]
[(1, 'there'), (1, 'lived'), (1, 'a'), (1, 'beautiful'), (1, 'princes.')]
Why isn't everything in one list? I also removed the whitespace with strip... The code I wrote is below.
story = open('fairy_tale.txt')
for line in story:
    line = line.strip()
    words = line.split()
    counts = dict()
    for word in words:
        counts[word] = counts.get(word,0) + 1
    tmp = [(v , k) for (k),(v) in counts.items()]
    print(tmp)
There are two lists because the story has two lines, so the body of the for line in story loop runs twice.
The dictionary is created and printed inside that loop, so a separate list is printed for each line.
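The effect can be reproduced with a minimal sketch. It assumes io.StringIO as a stand-in for the file, so that it runs without fairy_tale.txt, and it collects the per-line results in a list (per_line, a name introduced only for this illustration) instead of printing them:

```python
import io

# Stand-in for open('fairy_tale.txt'); the two-line content mirrors the question.
story = io.StringIO("A long time ago,\nthere lived a beautiful princes.\n")

per_line = []
for line in story:
    counts = dict()  # re-created on every pass, so earlier lines are forgotten
    for word in line.split():
        counts[word] = counts.get(word, 0) + 1
    per_line.append([(v, k) for k, v in counts.items()])

print(len(per_line))  # one list per line of the file
```

Because counts is rebuilt for every line, each line gets its own list and nothing is accumulated across lines.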
I modified the questioner's code while keeping as much of it as possible.
The word extraction is separated out so that the counting part is not repeated for each line.
story = open('fairy_tale.txt')
words = (word for line in story for word in line.split())  # generator yielding every word (token) in the file
counts = dict()
for word in words:
    counts[word] = counts.get(word, 0) + 1
tmp = [(v, k) for k, v in counts.items()]
print(tmp)
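The question also asks for descending order of frequency and for words without attached punctuation ('ago' rather than 'ago,'), which the code above does not yet handle. One possible sketch follows. It assumes a hypothetical helper named count_words, assumes that stripping leading and trailing characters from string.punctuation is acceptable, and uses a made-up sample sentence with a repeated word so that the ordering is visible:

```python
import io
import string


def count_words(lines):
    """Return (count, word) pairs sorted by count, highest first.

    count_words is a hypothetical helper name, not part of the original code.
    """
    counts = dict()
    for line in lines:
        for word in line.split():
            # Drop punctuation at either end of the token, e.g. 'ago,' -> 'ago'.
            word = word.strip(string.punctuation)
            if word:
                counts[word] = counts.get(word, 0) + 1
    pairs = [(v, k) for k, v in counts.items()]
    # Sort on the count only; sorted() is stable, so ties keep first-seen order.
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# io.StringIO stands in for open('fairy_tale.txt'); the text is a made-up sample.
sample = io.StringIO("A long time ago,\nthere lived a long long road.\n")
print(count_words(sample))
```

With the real file, the open file object can be passed instead, as in count_words(open('fairy_tale.txt')). The standard library's collections.Counter and its most_common() method are an alternative to the manual dictionary if keeping the original structure is not a priority.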