I'm having a hard time studying SQL Server.
What we tried to do was to combine 'table_a' in two columns of no, timestamp with 'table_b' in three columns of no, datetime, value (both tables have no, datetime as the primary key).
The bonding conditions are as follows:
This is not a very good explanation, so please see the code below as an example.
IF object_id('table_a')IS NOT NULL
DROP TABLE table_a;
US>CREATE TABLE table_a(
no INTEGER not null
, timestamp DATE not null
,CONSTRAINT pk_table_a PRIMARY KEY CLUSTERED (no, timestamp)
);
IF object_id('table_b') IS NOT NULL
DROP TABLE table_b;
US>CREATE TABLE table_b(
no INTEGER not null
, timestamp DATE not null
, value INTEGER not null
,CONSTRAINT pk_table_b PRIMARY KEY CLUSTERED (no, timestamp)
);
BEGIN TRANSACTION;
INSERT INTO table_a(no, timestamp)
VALUES (1,'2018-01-01'))
,(1, '2018-01-04')
,(1, '2018-01-05')
,(2, '2018-01-01')
,(2, '2018-01-03')
,(2, '2018-01-09')
,(3, '2018-01-10')
,(3, '2018-01-11')
,(3, '2018-01-12')
;
INSERT INTO table_b (no, timestamp, value)
VALUES (1, '2018-01-01', 101)
,(1, '2018-01-02', 102)
,(2, '2018-01-02', 202)
,(3, '2017-01-01', 301)
,(3, '2018-01-12', 312)
;
COMMIT;
SELECT*
FROM table_a
;
SELECT*
FROM table_b
;
-- Query A: I can get what I want.
SELECT A.no
, A. timestamp
, B.value
FROM(
SELECT A.no
, A. timestamp
, (SELECT MAX (B.timestamp)
FROM table_bas B
WHERE B.no = A.no
AND B.timestamp<=A.timestamp) as a_b_timestamp
FROM table_a ASA
as A
LEFT OUTER JOIN TABLE_b ASB
ON A.no = B.no
ANDA.a_b_timestamp = B.timestamp
WHERE A.no in (1, 2)
;
-- Query B: This query's result set is the same with A.
SELECT A.no
, A. timestamp
, B.value
FROM table_a ASA
LEFT OUTER JOIN TABLE_b ASB
ON A.no = B.no
AND B.timestamp= (SELECT MAX (B.timestamp)
FROM table_bas B
WHERE B.no = A.no
AND B.timestamp<=A.timestamp)
WHERE A.no in (1, 2)
;
-- Query C: The result set is far from what I expected, but I don't know reason.
SELECT A.no
, A. timestamp
, B.value
FROM table_a ASA
LEFT OUTER JOIN TABLE_b ASB
ON A.no = B.no
ANDA.timestamp= (SELECT MAX (B.timestamp)
FROM table_bas B
WHERE B.no = A.no
AND B.timestamp<=A.timestamp)
WHERE A.no in (1, 2)
;
The following data will be prepared in the table:
table_a:
no timestamp
1 2018-01-01
1 2018-01-04
1 2018-01-05
2 2018-01-01
2 2018-01-03
2 2018-01-09
3 2018-01-10
3 2018-01-11
3 2018-01-12
table_b:
no timestamp value
1 2018-01-01101
1 2018-01-02102
2 2018-01-02 202
3 2017-01-01301
3 2018-01-12312
There is a query in the code that commented Query A, B and C, but the result set is as follows:What I wanted was A, B, but only C returned a different set of results.
Query A, B result set (I wanted this!):
no timestamp value
1 2018-01-01101
1 2018-01-04102
1 2018-01-05102
2 2018-01-01 NULL
2 2018-01-03 202
2 2018-01-09 202
Query C result set (wanted different):
no timestamp value
1 2018-01-01101
1 2018-01-01102
1 2018-01-04 NULL
1 2018-01-05 NULL
2 2018-01-01 NULL
2 2018-01-03 NULL
2 2018-01-09 NULL
The problem is that I can't explain well why the result of C is as above.
Could someone please explain the reason clearly?
You should be aware that if you do not have enough conditions for table_b to bind table_b externally, you will not get the expected results.
Query C subqueries are table_a-dependent correlation subqueries, so each row of table_a has a different value.Finally, A.no in(1,2)
will narrow it down, so if you omit anything else, it will look like this.
MAX(B.timestamp)
(1) 1 2018-01-01 2018-01-01
(2) 1 2018-01-04 2018-01-02
(3) 1 2018-01-05 2018-01-02
(4) 2 2018-01-01 NULL
(5) 2 2018-01-03 2018-01-02
(6) 2 2018-01-09 2018-01-02
Therefore, the external coupling condition for record (1) is
A.no=B.no ANDA.timestamp='2018-01-01'
That's what it means.From table_b, combine all records that match this condition with this record in table_a.A.timestamp='2018-01-01'
has been established, so you will combine all records that match A.no=B.no
from table_b externally.
That's this part of the Query C result.
1 2018-01-01101
1 2018-01-01 102
On the other hand, for records (2) through (6), the A.timestamp=(...MAX(B.timestamp)...)
part is never established, so the join condition is always false and there is no record of B to be externally joined.As a result, the value corresponding to column B is NULL.
If you could comment on which part was difficult to understand, I might be able to write in more detail.
The ON clause in Query B and C is
The last timestamp is not the timestamp of table_a, but the timestamp of table_b.
In other words, the conditions are for table_a like Query C
ANDA.timestamp=(...)
but for table_b like Query B
AND B.timestamp=(...)
should be .
Also, I thought that using the alias name B
was confusing (it looks like only two characters).
If you write Query B in a more Japanese style, it will look like the following.
SELECT*
FROM table_a A
LEFT JOIN TABLE_bB1
ON B1.no = A.no
AND
B1.timestamp=(
SELECT MAX (B2.timestamp)
FROM table_b B2
WHERE B2.no = B1.no
AND
B2.timestamp<=A.timestamp
GROUP BY B2.no
)
;
© 2024 OneMinuteCode. All rights reserved.