To aggregate the latest records by user_id

Asked 1 years ago, Updated 1 years ago, 67 views

I want Rails to prepare an ActiveRecord::Relation object to pass to view, but I can't aggregate it as intended. It's to aggregate the latest records for each user_id.

There is an entrance/exit model called Status, which has the following structure:

  • Status
    • id
    • user_id
    • logged_in
    • logged_out
    • created_at
    • updated_at
  • id
  • user_id
  • logged_in
  • logged_out
  • created_at
  • updated_at

I'd like to see the latest record of each user (user_id) in the view, but I don't know how to aggregate it.

Thank you very much for letting me know.

  • Rails 4.2.5
  • ruby 2.2.3

ruby-on-rails ruby rails-activerecord

2022-09-30 21:12

3 Answers

If possible, I would like you to include RDBMS, sample data, and expected results.
First of all, I'll answer while imagining, "Maybe this is what it is."

First of all, I wrote a test code like this.

require'test_helper'

class StatusTest<ActiveSupport::TestCase
  test "Total latest records by user" do
    alice=User.create!(name: 'Alice')
    bob=User.create!(name: 'Bob')

    alice_status_1 = alice.status.create!(
      logged_in: '2015-01-01 00:00:00', logged_out: '2015-01-01 01:00:00')
    alice_status_2 = alice.status.create!(
      logged_in: '2015-01-0200:00:00', logged_out:nil)#<=Latest

    bob_status_1 = bob.status.create!(
      logged_in: '2015-01-0100:00:00', logged_out: '2015-01-3101:00:00')#<= Latest
    bob_status_2 = bob.status.create!(
      logged_in: '2015-01-0200:00:00', logged_out: '2015-01-02 01:00:00')

    states=Status.latest_by_user.order(:user_id)
    assert_equal [alice_status_2, bob_status_1], states
  end
end

If this test code matches the expected results, it can be implemented with the following code:

class Status <ActiveRecord::Base
  belongs_to —user

  scope:latest_by_user, ->{
    sub_query=<<- SQL
SELECT user_id,
       MAX (CASE WHEN logged_out IS NULL THE logged_in ELSE logged_out END) AS max_datetime
FROM states
GROUP BY user_id
    SQL

    join_sql=<<- SQL
INNER JOIN(#{sub_query})AS sub
ON sub.user_id=status.user_id
AND sub.max_datetime = CASE WHEN states.logged_out IS NULL THEEN states.logged_in ELSE states.logged_out END
    SQL

    joins(join_sql)
  }
end

For each userd_id, the maximum logged_in or logged_out time is obtained (sub_query) and the status join is narrowed down.(join_sql)

Sometimes I see people who say, "I don't want to write raw SQL!" but when the aggregation function (SUM, MAX, etc.) is involved, it becomes difficult to write ActiveRecord, so my implementation policy is to write SQL without overdoing it.

SQLite was used to verify operation.
I think it's within the standard SQL range, so I think PostgreSQL and MySQL will work, but I haven't checked it yet.

I also uploaded the code on GitHub, so if you need it, please try to move it by hand.

https://github.com/JunichiIto/show-latest-sandbox


2022-09-30 21:12

By creating a subquery, you can aggregate by user_id and select the row with the largest logged_in.

Status
  .where(%{
    NOT EXISTS (
      SELECT1
      FROM status s1
      WHERE states.user_id=s1.user_id AND
            states.logged_in<s1.logged_in
    )
  }).order(:user_id).pluck(:user_id,:logged_in)

Doing so will result in the following SQL queries:

SELECT*
FROM states
WHERE(
  NOT EXISTS (
    SELECT1
    FROM status s1
    WHERE states.user_id=s1.user_id AND
          states.logged_in<s1.logged_in
  )
)

AR It would be better to abstract using AREL, but I don't know how to put the table in the subquery under another name...(probably not)

If you only need to find the maximum value of user_id and logged_in, you can get it with a squid-like code.In this case, id and so on will be nil, so be careful.

Status
  .group(:user_id)
  .select("user_id,MAX(logged_in)AS logged_in")
SELECT user_id, MAX(logged_in)AS logged_in
FROM states
GROUP BY user_id

If you want fields other than user_id and logged_in, you will get a SQL grammar error.This is because the SELECT clause cannot contain anything other than the field specified in GROUP BY and the aggregation function.

Some RDBMS you use do not return errors, but See the documentation to see if it works as intended.

Status
  .group(:user_id)
  .select("*,MAX(logged_in)AS max_logged_in")
SELECT*,MAX(logged_in)AS max_logged_in
FROM states
GROUP BY user_id


2022-09-30 21:12

I don't know what aggregation refers to, but if you just want to get the latest user data in the column with the id autoincrement,

Status.where(user_id:user_id).order(:id).reverse_order.limit(1)

I think that would be fine, but what do you think?


2022-09-30 21:12

If you have any answers or tips


© 2024 OneMinuteCode. All rights reserved.