I want python to expose the histogram data in the txt file to quantitative data and store it in an array.

Asked 1 years ago, Updated 1 years ago, 442 views

I'm trying to create a program in Python 3 using the Jupiter Notebook.

What you want to do is to read the histogram data in the txt file into an array that is "dispersed into the original quantitative data".
The txt file contains information about multiple histograms.
The name of the histogram is written on the first row.(Example: Band1, Band2, Band3, ...)
In the second column, there is a histogram class value (DN) and
The third column describes the degree (Npts) corresponding to the class value in the second column.
Each histogram has a new line and is separated.
Also, the txt files are tabbed.
The contents of the txt file are as follows:
Enter a description of the image here

I would like to generate the following array from this txt file.

 Band 1 [178,179,179,180,180,180,181,181,...]
Band2 [179,179,179,179,179,179,179,180,181,181,...]
Band3 [181,181,...]

Additional information below

Thank you for your reply!I understand how to read the txt file.
Also, I'm sorry for the difficult question.
The sequence I want to generate is not like the one below.

 Band 1 [178,179,180,181,182,183,184,...] 

I would like to create an array similar to the one below.

 Band 1 [178,179,179,180,180,180,181,181,...] 

I think it will probably be a program such as turning the for statement about DN and Npts values and storing the same DN values in the array as many Npts values. For example, if the DN value is 198 and the Npts value is 4, place four values of 198 in the array.

python python3

2023-02-01 19:56

3 Answers

How to use groupby from Python standard package

(Updated a little)

with open(textfile) asfp:
    lst = [ln.split() for ln in fp
                if not ln.startswith('Histogram') and ln.strip()]

res = [ ]
for ln install:
    n=len(ln)
    if n>2:
        pre='.join(ln[:n-2])
    res.append([pre]+ln[-2:])

from itertools import groupby, repeat
fork,gin groupby(res,key=lambdaln:ln[0]):
    print(k,sum((list( repeat(int(ln[1], int(ln[2])))) for lning), start=[])))

# Band 1 [178,179,179,180,180,180,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181]
# Band2 [179, 179, 179, 179, 179, 179, 179, 180, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181,
# Band 3 [181, 181, 181, 181, 181, 181, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 18


2023-02-02 05:37

Assume that sample.txt exists in the same folder as the script file below.

I won't explain the code, so please check it out.

import pathlib

txt_path=str(pathlib.Path(__file__).parent) + '\\sample.txt'

data_dict=dict()

with open(txt_path, "r") asf:
    datas=f.readlines()
    usage=None
    for data in datas:
        if data:
            data_head=data.split()
            if data_head:
                if 'Histogram' in data_head:
                    pass
                eliflen(data_head) == 4:
                    usage='.join(data_head[:2])
                    DN,npts=list(map(int,data_head[2:]))
                    data_dict [usage] = {
                        'DN': [DN,],
                        'Npts': [npts,],
                    }
                eliflen(data_head) == 2:
                    DN,npts=list(map(int,data_head))
                    data_dict [usage]['DN'].append(DN)
                    data_dict [usage]['Npts'].append(npts)

print(data_dict)

Result Output

{'Band1':{'DN':[178,179,180,181,181,182,183], 'Npts':[1,2,3,6,8,10]}, 'Band2':{'DN':[179,180,181], 'Npts':[6,1,5]}, 'Band3':{'DN':[181,182,183], 'Npts', [10]}


2023-02-02 07:31

I would like to create an array similar to the one below.

If using Pandas.

import pandas as pd

df=pd.read_table('data.txt', sep=r'\t+', header=None, names=['Band', 'DN', 'Npts', engine='python')
df.loc [df['Npts'].isna()] = df.shift(axis=1)
band=df.query('Band!="Histogram"') .ffill().astype({'DN':int, 'Npts':int})\
         .groupby('Band').apply(lambdax:x['DN'].repeat(x['Npts']).to_dict())

print(band)

# {'Band1': [178,179,179,180,180,180,181,181,181,181,181,181,181,181,...,
#  'Band2': [179,179,179,179,179,179,179,180,181,181,181,181,181,181,...,
#  'Band 3': [181, 181, 181, 181, 181, 181, 182, 182, 182, 182, 182, 182, 182, ..., ]}


2023-02-02 08:38

If you have any answers or tips


© 2024 OneMinuteCode. All rights reserved.