Help me: linear regression function, characteristics of English words and word frequency

Characteristics of English words and frequency of use of words

The second project deals with the British National Corpus (BNC), a text corpus (word collection) that consists of about 100 million English words.

Let's plot the frequencies of the 10,000 words stored in the words.txt file and apply linear regression to them. Read the comments and complete the do_linear_regression() function so that the execution results are output.

<Task>

Read through the commented code and check the return value of each function.

Write the do_linear_regression() function (44th line).

Press the Run button and check the chart being printed.

Uncomment the 21st and 22nd lines of the main() function. Press the Run button and check the newly printed chart.

Compare the printed graph with the [execution results] below and press the Submit button.
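For the do_linear_regression() step, one possible shape is sketched below. It is only a sketch: it assumes X and Y arrive as 1-D NumPy arrays (which is what main() passes in) and it uses the LinearRegression class that the exercise already imports. The sample data at the bottom is made up purely to check the result.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def do_linear_regression(X, Y):
    # scikit-learn expects a 2-D feature matrix, so reshape X to (n_samples, 1).
    X = np.asarray(X, dtype=float).reshape(-1, 1)
    Y = np.asarray(Y, dtype=float)

    model = LinearRegression()
    model.fit(X, Y)

    # With a 1-D target, coef_ has shape (1,) and intercept_ is a scalar.
    slope = float(model.coef_[0])
    intercept = float(model.intercept_)
    return (slope, intercept)


# Made-up check data lying exactly on the line y = -2x + 5.
xs = np.array([1.0, 2.0, 3.0, 4.0])
ys = -2.0 * xs + 5.0
slope, intercept = do_linear_regression(xs, ys)
```

The reshape is the part that usually trips people up: passing a flat array as the features makes fit() raise an error, because it wants one row per sample and one column per feature.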

<Code>

import operator
from sklearn.linear_model import LinearRegression
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import elice_utils

def main():
    words = read_data()

    # Sort the words read from words.txt by frequency.
    words = sorted(____)

    # Store the words, expressed as integers, in the X-axis list and the frequency of each word in the Y-axis list.
    X = list(range(1, len(words)+1))
    Y = [x[1] for x in words]

    # Convert the X and Y lists into arrays and apply log() to each element.
    X, Y = np.array(X), np.array(Y)
    X, Y = np.log(X), np.log(Y)

    # Obtain the slope and intercept, then draw the chart.
    slope, intercept = do_linear_regression(X, Y)
    draw_chart(X, Y, slope, intercept)

    return slope, intercept

def read_data():
    # Read the words in words.txt, convert them to the form
    # [[Word 1, Frequency], [Word 2, Frequency], ...] and return them.
    words = []

    return words

def do_linear_regression(X, Y):
    # Write the do_linear_regression() function.

    return (slope, intercept)

def draw_chart(X, Y, slope, intercept):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    plt.scatter(X, Y)

    # Compute the end points of the fitted line over the X range and draw it on the chart.
    min_X = min(X)
    max_X = max(X)
    min_Y = min_X * slope + intercept
    max_Y = max_X * slope + intercept
    plt.plot([min_X, max_X], [min_Y, max_Y],
             color='red',
             linestyle='--',
             linewidth=3.0)

    # Write the equation of the line (slope and intercept) on the chart.
    ax.text(min_X, min_Y + 0.1, r'$y = %.2lfx + %.2lf$' % (slope, intercept), fontsize=15)

    plt.savefig('chart.png')
    elice_utils.send_image('chart.png')

if __name__ == "__main__":
    main()
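The blank in sorted(____) most likely wants the [word, frequency] pairs ordered from most to least frequent, because main() then uses each word's position in the list as its X value. A minimal sketch, assuming that descending order is what the exercise intends (the sample list is made up):

```python
import operator

# Made-up sample in the [[word, frequency], ...] shape that read_data() returns.
words = [["of", 50], ["the", 100], ["and", 75]]

# Sort by the second element (the frequency), largest first.
words = sorted(words, key=operator.itemgetter(1), reverse=True)

# Rank 1 is now the most frequent word, matching X = 1, 2, 3, ...
ranks = list(range(1, len(words) + 1))
freqs = [w[1] for w in words]
```

operator.itemgetter(1) is probably why the exercise imports operator; key=lambda w: w[1] would do the same thing.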

I have to write three parts. I have no idea how. Please help me.
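How read_data() should look depends on the layout of words.txt, which is not shown here. The sketch below assumes one comma-separated word,frequency pair per line. The parse_lines helper and the path and sep parameters are hypothetical names added for illustration, and would need adjusting to the real file.

```python
def parse_lines(lines, sep=","):
    # Hypothetical helper: assumes each non-empty line looks like "word,frequency".
    words = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last separator so the frequency is always the final field.
        word, freq = line.rsplit(sep, 1)
        words.append([word, int(freq)])
    return words


def read_data(path="words.txt"):
    # Returns [[word, frequency], ...]; the path default is an assumption.
    with open(path, encoding="utf-8") as f:
        return parse_lines(f)


# Made-up lines standing in for the contents of words.txt.
sample = ["the,100\n", "\n", "of,50\n"]
parsed = parse_lines(sample)
```

Converting the frequency with int() matters: if it stays a string, sorting by frequency and np.log(Y) in main() will not behave as expected.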

linear regression-analysis

2022-09-22 18:20