Using Teradata SQL: How to find the first or last record in a sequence
If your reading a lot of records from a database and you need to find the first or last record in a sequence, you can do this in MS SQL by using TOP 1, or in MySQL by using LIMIT 1.
QUALIFY OVER PARTITION
But TOP and LIMIT don't exist in Teradata SQL so we need another way to find it. That way is by using the QUALIFY ... OVER ... PARTITION construct.
For example -
SELECT * FROM EMP QUALIFY ROW_NUMBER() OVER ( PARTITION BY EMP_NAME ORDER BY DATE_EFFECTIVE DESC) = 1;`
will give you the most recent record in a set by date.
How does it work?
Basically in this example the SQL groups all the records for a particular employee together, sorts them by the effective date, assign a sequential number to each record and then returns the first record it found. Because we specified that the records were to be sorted in descending order, this will be the latest record.
If we wanted to find the earliest record, we'd simply change DESC to ASC.
Similarly if we want to find the top 5 records, we'd change the QUALIFY line to read
QUALIFY ROW_NUMBER() OVER ( PARTITION BY EMP_NAME ORDER BY DATE_EFFECTIVE DESC) <= 5;
Note: If you try and do this in a sub-query you’ll get an error -
 Syntax error: Order Analytical Functions are not allowed in subqueries,
So you'll either need to put the result in a temporary table and use that, or have a look at an alternative method (I do have one in mind, but I haven't written the blog post for that yet so I can't link to it :-) ).
The QUALIFY OVER PARTITION BY construct is a very powerful combination that allows you to find the first or last records in a sequence easily.
"The only thing to do with good advice is pass it on. It is never of any use to oneself." (Oscar Wilde)
Header image: 123rf.com