Is there a way to count or sum the values of each key in a list of dicts in Python (Jython 2.7)?

Asked 2 years ago, Updated 2 years ago, 100 views

Last time, I asked how to count identical values in a list of dicts. Conversely, how do I sum the values of the same key across multiple dicts? For your information, ** unpacking does not work in Jython 2.7.

origin = [
    {'pk':'1', 'a':1, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':0, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':1, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':0, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':1, 'b':2, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'2', 'a':0, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'2', 'a':0, 'b':0, 'c':0, 'd':0, 'e':0, 'date':'20210308'},
]

Desired result:
[
    {'pk':'1', 'a':3, 'b':2, 'c':0, 'd':0, 'e':5, 'date':'20210308'},
    {'pk':'2', 'a':0, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'}
]

python python3.6 python2.7

2022-09-20 17:44

3 Answers

The simplest way is a pandas groupby. But I thought you said in your last question that you can't use pandas...

You can do it the same way as before.

Suppose the data is tabular and the column names are fixed.

from pprint import pprint

origin = [
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 2, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 0, "date": "20210308"},
]

sum_dict = {}  # no variable annotation: that syntax needs Python 3.6+ and fails on Jython 2.7
columns = ["a", "b", "c", "d", "e"]

for dat in origin:
    sum_ = sum_dict.get(dat["pk"], {col: 0 for col in columns})
    sum_ = {col: sum_[col] + dat[col] for col in columns}
    sum_dict[dat["pk"]] = sum_

pprint(sum_dict)

# {'1': {'a': 3, 'b': 2, 'c': 0, 'd': 0, 'e': 5},
#  '2': {'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 1}}

res = []
for key, val in sum_dict.items():
    r = {"pk": key}
    r.update(val)
    res.append(r)

pprint(res)

# [{'a': 3, 'b': 2, 'c': 0, 'd': 0, 'e': 5, 'pk': '1'},
#  {'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 1, 'pk': '2'}]
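The result above no longer contains the 'date' field that the desired result in the question includes. As a rough sketch of one way to keep it, the rows can be grouped on both 'pk' and 'date' in a single pass. The helper name sum_by_key and its group_keys parameter are made up for this illustration, and the code assumes every non-grouping field is numeric. It sticks to syntax that Python 2.7 accepts, so it should also run on Jython 2.7.

```python
from collections import OrderedDict

origin = [
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 2, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 0, "date": "20210308"},
]


def sum_by_key(rows, group_keys=("pk", "date")):
    """Sum every non-grouping field of the rows that share the same group_keys.

    Hypothetical helper for illustration; assumes all other fields are numbers.
    """
    totals = OrderedDict()  # remembers the first-seen order of groups, also on 2.7
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        if key not in totals:
            # Start the group from a copy so the input rows are not modified.
            totals[key] = dict(row)
        else:
            acc = totals[key]
            for name, value in row.items():
                if name not in group_keys:
                    acc[name] += value
    return list(totals.values())


result = sum_by_key(origin)
```

For the data in the question, result holds one summed row for pk '1' and one for pk '2', each still carrying its 'date'. The input does not need to be sorted, because groups are looked up by key rather than by position.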


2022-09-20 17:44

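The standard library has a groupby implementation in itertools. Note that the ** unpacking used here needs Python 3.5 or later, and that groupby only merges adjacent rows, so the input must already be sorted by the grouping key.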

origin = [
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 2, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 0, "date": "20210308"},
]

import itertools as it
from functools import reduce
import operator as op

[{'pk':key[0], 
  **reduce(lambda a, b: {k: a[k] + b[k] for k in a if k not in ['pk', 'date']}, grouped),
  'date':key[1]}  
 for key, grouped in it.groupby(origin, op.itemgetter('pk', 'date'))]

[{'pk': '1', 'a': 3, 'b': 2, 'c': 0, 'd': 0, 'e': 5, 'date': '20210308'},
 {'pk': '2', 'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 1, 'date': '20210308'}]
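Since the question states that ** is unavailable on Jython 2.7, here is a minimal sketch of the same itertools.groupby idea without it, using dict.update() instead. It also sorts by the grouping key first, because groupby only groups consecutive rows. The rows are the question's data in a deliberately shuffled order, and value_cols assumes the summed columns are known in advance.

```python
import itertools
import operator

# The question's rows, shuffled on purpose to show why sorting is needed.
origin = [
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "2", "a": 0, "b": 0, "c": 0, "d": 0, "e": 0, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 2, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
    {"pk": "1", "a": 1, "b": 0, "c": 0, "d": 0, "e": 1, "date": "20210308"},
]

group_key = operator.itemgetter("pk", "date")
value_cols = ["a", "b", "c", "d", "e"]  # assumed: columns to sum are fixed

grouped_result = []
# groupby() only merges adjacent rows, so sort by the same key first.
for (pk, date), rows in itertools.groupby(sorted(origin, key=group_key), key=group_key):
    rows = list(rows)  # the group iterator can be consumed only once
    row_out = {"pk": pk, "date": date}
    # dict.update() stands in for the ** unpacking that Python 2.7 lacks.
    row_out.update(dict((col, sum(r[col] for r in rows)) for col in value_cols))
    grouped_result.append(row_out)
```

The groups come out in sorted key order, so pk '1' precedes pk '2' here regardless of the input order. On Python 2.7 the key order inside each dict is not guaranteed, which only affects how the result prints.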


2022-09-20 17:44

If the keys are fixed, couldn't you do it this way?

origin = [
    {'pk':'1', 'a':1, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':0, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':1, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':0, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'1', 'a':1, 'b':2, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'2', 'a':0, 'b':0, 'c':0, 'd':0, 'e':1, 'date':'20210308'},
    {'pk':'2', 'a':0, 'b':0, 'c':0, 'd':0, 'e':0, 'date':'20210308'},
]

f = []
for i in origin:
    pk = i['pk']
    if pk not in f:
        f.append(pk)

g = []
for i in f:
    h = {'pk':i, 'a':0, 'b':0, 'c':0, 'd':0, 'e':0}
    for j in origin:
        if j['pk'] == i:
            h['a'] += j['a']
            h['b'] += j['b']
            h['c'] += j['c']
            h['d'] += j['d']
            h['e'] += j['e']
    g.append(h)

print(g)
# [{'pk': '1', 'a': 3, 'b': 2, 'c': 0, 'd': 0, 'e': 5}, {'pk': '2', 'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 1}]


2022-09-20 17:44



© 2024 OneMinuteCode. All rights reserved.