Transpose results when creating dataloader in torchtext.data.Iterator

Asked 2 years ago, Updated 2 years ago, 142 views

I want to load a CSV file (text, label) with torchtext and turn it into a dataloader, but the result comes out transposed.
I expected each batch from train_loader to have shape (batch_size, sequence_length), but it is (sequence_length, batch_size).

Could someone please point out the cause?

import torchtext

# Field definitions
max_len = 25
TEXT = torchtext.data.Field(sequential=True, use_vocab=True, fix_length=max_len)
LABEL = torchtext.data.Field(sequential=False, use_vocab=False, is_target=True)

# Loading data
dataset = torchtext.data.TabularDataset(
    path='./sample.csv',
    format='csv',
    fields=[('Text', TEXT), ('Label', LABEL)])

# Split into train and test
train_dataset, test_dataset = dataset.split(split_ratio=0.8)

# Build the vocabulary
TEXT.build_vocab(train_dataset, min_freq=5)

# Creating the data loaders
train_loader = torchtext.data.Iterator(train_dataset, train=True, batch_size=4)
test_loader = torchtext.data.Iterator(test_dataset, train=False, sort=False, batch_size=4)

python pytorch

2022-09-30 17:57

1 Answer

I think this is the library's intended default behavior.
The reason for this design is that (sequence_length, batch_size) is more convenient when feeding the data into an LSTM: you loop along the sequence dimension of the text and, at each time step, feed in that step's batch of tokens.
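For illustration, here is a minimal sketch of that time-step loop, using a dummy tensor in place of the Iterator's `batch.Text` (the shapes are assumptions matching the question's `fix_length=25` and `batch_size=4`). Note also that the legacy torchtext `Field` constructor accepts `batch_first=True` if you want (batch_size, sequence_length) batches directly.

```python
import torch

# Dummy batch in torchtext's default layout: (seq_len, batch_size)
seq_len, batch_size = 25, 4
batch_text = torch.randint(0, 100, (seq_len, batch_size))

# Sequence-major layout makes a manual time-step loop natural:
# batch_text[t] is the t-th token of every example in the batch.
for t in range(seq_len):
    tokens_at_t = batch_text[t]  # shape: (batch_size,)
    assert tokens_at_t.shape == (batch_size,)

# If you do want (batch_size, seq_len), you can simply transpose:
batch_first = batch_text.t()
print(batch_first.shape)  # torch.Size([4, 25])
```

The same transpose can be applied to `batch.Text` right after pulling a batch from the Iterator, without changing the Field definitions.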


2022-09-30 17:57



© 2024 OneMinuteCode. All rights reserved.