How do I define a tokenize function that produces N-grams in Python?

Asked 2 years ago, Updated 2 years ago, 107 views

Hello, I'm studying Python functions.

I want to define a function called tokenize that produces 2-grams, 3-grams, and so on, depending on the parameter N.


def tokenize(trg, N=1):
    a = a.split()
    print(a)
    for x in range(0,N+1):
        for i in range(len(a) - N):
            return (a[i+0], a[i+1])  # 2-gram: a[i+0], a[i+1]; 3-gram: a[i+0], a[i+1], a[i+2]

I understand how n-grams work in general. The return value should be a[i] for N=1, a[i], a[i+1] for N=2, and so on, according to N.

I added for x in range(0,N+1): to build this up with a loop, but I don't know how to write it because the return value is not a string.

How can I make the return value change according to N?

python nlp

2022-09-20 17:56

1 Answer

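def tokenize(trg, N=1):
    a = trg.split()  # split the input into words on whitespace
    d = []
    # there are len(a) - N + 1 windows of N consecutive words
    for i in range(len(a) - N + 1):
        b = a[i:i+N]     # slice out N consecutive words starting at i
        c = ' '.join(b)  # join them into a single N-gram string
        d.append(c)
    return d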


def main():
    a="There was a farmer who had a dog ."
    print(tokenize(a))
    print(tokenize(a, 2))
    print(tokenize(a, 3))

main()

My first attempt wasn't enough, so I solved it like this. Thank you!
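
The solution above returns each N-gram as a space-joined string, while the attempt in the question returned tuples such as (a[i], a[i+1]). If tuples are what you need, the same slicing idea still works. Below is a minimal sketch, assuming the same whitespace splitting; the name tokenize_tuples is made up for illustration and is not part of the original code.

```python
def tokenize_tuples(trg, N=1):
    """Return the N-grams of trg as tuples of words (illustrative variant)."""
    words = trg.split()
    # words[i:i+N] is words[i], ..., words[i+N-1], so its length follows N
    # without writing out a[i], a[i+1], ... by hand.
    return [tuple(words[i:i + N]) for i in range(len(words) - N + 1)]


print(tokenize_tuples("There was a farmer", 2))
# [('There', 'was'), ('was', 'a'), ('a', 'farmer')]
```

If N is larger than the number of words, the range is empty and the function returns an empty list.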


2022-09-20 17:56


