I'm trying to create a program in Python 3 using the Jupiter Notebook.
What you want to do is to read the histogram data in the txt file into an array that is "dispersed into the original quantitative data".
The txt file contains information about multiple histograms.
The name of the histogram is written on the first row.(Example: Band1, Band2, Band3, ...)
In the second column, there is a histogram class value (DN) and
The third column describes the degree (Npts) corresponding to the class value in the second column.
Each histogram has a new line and is separated.
Also, the txt files are tabbed.
The contents of the txt file are as follows:
I would like to generate the following array from this txt file.
Band 1 [178,179,179,180,180,180,181,181,...]
Band2 [179,179,179,179,179,179,179,180,181,181,...]
Band3 [181,181,...]
Additional information below
Thank you for your reply!I understand how to read the txt file.
Also, I'm sorry for the difficult question.
The sequence I want to generate is not like the one below.
Band 1 [178,179,180,181,182,183,184,...]
I would like to create an array similar to the one below.
Band 1 [178,179,179,180,180,180,181,181,...]
I think it will probably be a program such as turning the for statement about DN and Npts values and storing the same DN values in the array as many Npts values. For example, if the DN value is 198 and the Npts value is 4, place four values of 198 in the array.
python python3
How to use groupby from Python standard package
(Updated a little)
with open(textfile) asfp:
lst = [ln.split() for ln in fp
if not ln.startswith('Histogram') and ln.strip()]
res = [ ]
for ln install:
n=len(ln)
if n>2:
pre='.join(ln[:n-2])
res.append([pre]+ln[-2:])
from itertools import groupby, repeat
fork,gin groupby(res,key=lambdaln:ln[0]):
print(k,sum((list( repeat(int(ln[1], int(ln[2])))) for lning), start=[])))
# Band 1 [178,179,179,180,180,180,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181,181]
# Band2 [179, 179, 179, 179, 179, 179, 179, 180, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181, 181,
# Band 3 [181, 181, 181, 181, 181, 181, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 182, 18
Assume that sample.txt exists in the same folder as the script file below.
I won't explain the code, so please check it out.
import pathlib
txt_path=str(pathlib.Path(__file__).parent) + '\\sample.txt'
data_dict=dict()
with open(txt_path, "r") asf:
datas=f.readlines()
usage=None
for data in datas:
if data:
data_head=data.split()
if data_head:
if 'Histogram' in data_head:
pass
eliflen(data_head) == 4:
usage='.join(data_head[:2])
DN,npts=list(map(int,data_head[2:]))
data_dict [usage] = {
'DN': [DN,],
'Npts': [npts,],
}
eliflen(data_head) == 2:
DN,npts=list(map(int,data_head))
data_dict [usage]['DN'].append(DN)
data_dict [usage]['Npts'].append(npts)
print(data_dict)
Result Output
{'Band1':{'DN':[178,179,180,181,181,182,183], 'Npts':[1,2,3,6,8,10]}, 'Band2':{'DN':[179,180,181], 'Npts':[6,1,5]}, 'Band3':{'DN':[181,182,183], 'Npts', [10]}
I would like to create an array similar to the one below.
If using Pandas.
import pandas as pd
df=pd.read_table('data.txt', sep=r'\t+', header=None, names=['Band', 'DN', 'Npts', engine='python')
df.loc [df['Npts'].isna()] = df.shift(axis=1)
band=df.query('Band!="Histogram"') .ffill().astype({'DN':int, 'Npts':int})\
.groupby('Band').apply(lambdax:x['DN'].repeat(x['Npts']).to_dict())
print(band)
# {'Band1': [178,179,179,180,180,180,181,181,181,181,181,181,181,181,...,
# 'Band2': [179,179,179,179,179,179,179,180,181,181,181,181,181,181,...,
# 'Band 3': [181, 181, 181, 181, 181, 181, 182, 182, 182, 182, 182, 182, 182, ..., ]}
© 2024 OneMinuteCode. All rights reserved.