Transpose results when creating dataloader in torchtext.data.Iterator

Asked 2 years ago, Updated 2 years ago, 142 views

I want to load a CSV file (text, label) with torchtext and turn it into a dataloader, but the result comes out transposed.
I expected each batch from train_loader to have shape (batch_size, sequence_length), but it is (sequence_length, batch_size).

Could someone please point out the cause?

import torchtext

# Field definitions
max_len = 25
TEXT = torchtext.data.Field(sequential=True, use_vocab=True, fix_length=max_len)
LABEL = torchtext.data.Field(sequential=False, use_vocab=False, is_target=True)

# Loading data
dataset = torchtext.data.TabularDataset(
    path='./sample.csv',
    format='csv',
    fields=[('Text', TEXT), ('Label', LABEL)])

# Split into train and test
train_dataset, test_dataset = dataset.split(split_ratio=0.8)

# Build the vocabulary
TEXT.build_vocab(train_dataset, min_freq=5)

# Creating the data loaders
train_loader = torchtext.data.Iterator(train_dataset, train=True, batch_size=4)
test_loader = torchtext.data.Iterator(test_dataset, train=False, sort=False, batch_size=4)

python pytorch

2022-09-30 17:57

1 Answer

I think this is the library's intended default behavior.
The reason for this design is that (sequence_length, batch_size) is more convenient when feeding the data into an LSTM: you loop along the sequence dimension of the text and, at each time step, feed in that step's batch of tokens.
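For illustration, here is a minimal sketch of that time-step loop, using a dummy tensor in place of the Iterator's `batch.Text` (the shapes are assumptions matching the question's `fix_length=25` and `batch_size=4`). Note also that the legacy torchtext `Field` constructor accepts `batch_first=True` if you want (batch_size, sequence_length) batches directly.

```python
import torch

# Dummy batch in torchtext's default layout: (seq_len, batch_size)
seq_len, batch_size = 25, 4
batch_text = torch.randint(0, 100, (seq_len, batch_size))

# Sequence-major layout makes a manual time-step loop natural:
# batch_text[t] is the t-th token of every example in the batch.
for t in range(seq_len):
    tokens_at_t = batch_text[t]  # shape: (batch_size,)
    assert tokens_at_t.shape == (batch_size,)

# If you do want (batch_size, seq_len), you can simply transpose:
batch_first = batch_text.t()
print(batch_first.shape)  # torch.Size([4, 25])
```

The same transpose can be applied to `batch.Text` right after pulling a batch from the Iterator, without changing the Field definitions.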


2022-09-30 17:57



© 2024 OneMinuteCode. All rights reserved.