SELECT SINGLE vs. SELECT UP TO 1 ROWS
Some time ago I´ve written a small block Selecting One Row From a Database Table in order to explain the usage of SELECT SINGLE and SELECT UP TO 1 ROWS. Obviously, there are still discussions about this. OK, let´s derive some rules based on my previous blog:
TASK: You want to select one row with a fully specified key into a tabular result
You need neither SINGLE nor UP TO 1 ROWS:
SELECT …
FROM dbtab
WHERE full_key
INTO TABLE itab.
The result is tabular containing one line. It is written to an internal table.
TASK: You want to select one row with a fully specified key into a structure
You use SINGLE:
SELECT SINGLE …
FROM dbtab
WHERE full_key
INTO wa.
The result is tabular containing one line. You use SINGLE to prevent a SELECT loop. The line is copied directly to wa.
TASK: You want to select one row with a partly specified key
You use UP TO 1 ROWS:
SELECT …
FROM dbtab
WHERE part_key
ORDER BY …
INTO TABLE itab
UP TO 1 ROWS.
or
SELECT …
FROM dbtab
WHERE part_key
ORDER BY …
INTO wa
UP TO 1 ROWS.
…
ENDSELECT.
The result is tabular containing one line for which the usage of ORDER BY is highly recommended. Usage of SINGLE is not appropriate here, because the resulting line is undefined. The line can be written to an internal table or a workarea.
TASK: You want to check the existence of a row
You use SINGLE:
SELECT SINGLE col
FROM dbtab
WHERE any_key
INTO (field)
##warn_ok.
IF sy-subrc = 0.
…
ENDIF.
The result is written to a single data object. There is no need for a tabular evaluation of the result and the usage of ORDER BY. From 7.40, SP05 on, you can even specify a literal ´X´ or a constant for col in order to prevent any data transport from DB to ABAP:
SELECT SINGLE ‘X’
FROM dbtab
WHERE any_key
INTO (field)
##warn_ok.
IF sy-subrc = 0.
…
ENDIF.
If you or your Q-Manager don´t like the pragma ##warn_ok, you can also use:
SELECT ‘X’
FROM dbtab
WHERE any_key
INTO (field)
UP TO 1 ROWS.
ENDSELECT.
IF sy-subrc = 0.
…
ENDIF.
Practically there is no difference in performance.
The paper is open for discussion.
Hello Horst,
in the TASK: You want to check the existence of a row
There is a performance recommendation (by Hermann Gahm) to fetch the first column (col in the SELECT statement should be the first column or at least a part of the primary key) to avoid reading many blocks at database level.
best regards,
JNN
OK, thanks for the hint. But you don`t need to fetch anything any more from 7.40, SP05 on. From then on the recommendation is to use a literal or a constant for col.
Horst
I think you should use one of the fields from the index you hit by your selection. Otherwise you have the same effect: The database has to read at least one block for the table (depending on the organization of the data inside the database). Using a field from the index, the database can take the already read value by the index access and return it.
But the suggestion from Horst I'd like much more: simply using a constant. So the database don't need any additional read or determination.
@Horst
Many thanks for this blog, especially after the discussion you mentioned 🙂
Hi Horst,
Thanks for such a nice blog on SELECT SINGLE. Since i was checking the new SELECT SINGLE statement for checking the existence of row.
SELECT SINGLE @abap_true
FROM t100
WHERE sprsl = @sy-langu AND
arbgb = 'SABAPDEMOS'
INTO @DATA(result).
IF sy-subrc = 0.
...
ENDIF.
Since you have already stated in one of the blog that the above SQL expression will not transport a single row from DB to application server so this is even faster than regular SELECT SINGLE statement??? or in case the row exist on DB level then 'X' value will be populated to 'result' variable, making both SQL expressions same in the performance wise???
Hi Rakshit,
transporting a fixed single character value from database to server should be the fastest thing you can do. The database does no actual reading of a column and there is no dependency from column size. For small columns the difference might be marginal. But using that way on ABAP side offers the databse the best opportunity to optimize.
Hi Horst,
Like Rakshith, I too remember a recent discussion about the same "SELECT SINGLE @abap_true". I thought I would share another "new" ABAP way I used this along with the CASE statement to flip a field's contents.
We have a settings table where we can turn off a function, sort of like a deletion flag. However, to make other parts of code more readable I want to return TRUE when the function is turned on. So instead of...
SELECT SINGLE am_i_off INTO l_am_i_off
FROM z_table
WHERE my_setting = 'MY_FEATURE'.
IF sy-subrc = 0.
IF l_am_i_off = abap_true.
return_i_am_alive = abap_false.
ELSE.
return_i_am_alive = abap_true.
ENDIF.
ELSE.
return_i_am_alive = abap_true.
ENDIF.
I was able to write this in the new ABAP as...
SELECT SINGLE
CASE WHEN am_i_off = @abab_true THEN @abap_false
WHEN am_i_off = @abap_false THEN @abab_true
END
FROM z_table
WHERE my_setting = 'MY_FEATURE'
INTO @return_i_am_alive.
IF sy-subrc <> 0.
return_i_am_alive = abap_true.
ENDIF.
On "TASK: You want to check the existence of a row" I'm wondering:
Shouldn't there be a way to get rid of the "INTO" part? Something similar to "TRANSPORTING NO FIELDS" in a loop?
-> that would make it a lot clearer (without a comment) that you're just checking for existence.
Forwarded to development and under discussion ...
Hi Horst
I just have to ask. I normally use SELECT COUNT(*) when I want to check for existence, there is no fetching of any columns from the database or storing it into any variable during runtime.
Sample below (ignore the table and condition - they are dynamically determined - just sample of what I could find from one of my programs)
I've been doing this since before 7.40. How does this differ from the new 7.40 SP05 syntax that prevents data transport from DB to ABAP?
Regards
Eng Swee
SELECT COUNT isn't efficient performance-wise.
Depending in your selection criteria (Full key or not), the DB fetches multiple rows instead of only (the first) one.
Please do not count rows, if you only want to know, if there is at least one entry. The database has to read all the affected data by your selection. In a worst case (no index applies on your selection) the database has to analyse the whole table data.
In real life you won't have the idea to count all, if you only want to know that there is anything (like people in a room).
Hi Shai, Armin
Thank you for your replies.
I might have missed certain aspects of performance in my logic to check for existence of a row. It's a good time to revisit this.
The main intention of my logic above was to prevent data transfer from DB to ABAP from a performance perspective. I did not consider the performance cost of evaluation of the aggregate expression at the database level.
I think the cost is only relevant when a full key is not specified.
Can you confirm that evaluation of the aggregate expression COUNT is costly at database level and this is avoided by using UP TO 1 ROWS instead?
Regards
Eng Swee
PS: Don't have developer key at my current role so can't do the performance comparison myself
I ran a number of tests on the LIPS table with about 4.5 million records dating back to 1999 and came across some interesting findings...
Based on these findings, I think it is safe to say that the newer SELECT SINGLE @abap_true is the best performing, caveat, you need ABAP 7.40 and above to use this form of the SELECT statement.
Hope this helps...
I have posted my raw results in a google sheet
Raghu
What I'm wondering about ...
Of course one should always use the best method possible, but is the performance of an existence check really that important?
Isn't the performance of a mass access much more important?
So, if you have mass existence checks, Ok, but otherwise, hmm.
What do you think?
PS: If you wonder why I say this after writing such a blog: the blog is more about semantics, less about performance, assuming that the performance of the shown alternatives isn't too different.
Depends on the application... in my test of finding an entry at the end of the table without a primary key, the difference in times were about 2000 fold - from 1 microsecond to 2 seconds!
For the most part, I agree with you that in most cases this is semantics and splitting hairs. But I think this post of yours is very valuable, because I have worked for countless clients who have an internal ABAP rule book that states NEVER to use SELECT SINGLE. My simple question to them is "then why does the statement even exist?". Between your post and the tests I just ran, I will have something to show the next client who puts up a fuss about me using SELECT SINGLE 😛
However, regarding performance, I have seen applications which work fine in Development only to crash miserably in Production because of relatively small performance things like this. I also deal a lot with shop floor barcoding systems where a 2 second delay between scans or other data entry would be unacceptable... another reason why I think this post is valuable.
" I also deal a lot with shop floor barcoding systems where a 2 second delay between scans or other data entry would be unacceptable… another reason why I think this post is valuable".
Exactly, 2 seconds delay matter a lot in such instance.
Thanks Horst, for sharing your expertise.
K.Kiran.
Hi!,
Very interesting topic!
I always use this way (with the 'SINGLE' addition):
Is it incorrect from a praxis point of view? (Also counting that the Netweaver versions that I work in are not so up-to-date...).
And that raises another doubt for me: If I want to check the existence of a certain value in a field of the target row, is it correct to extend the WHERE condition even further than the full key? Or should I get the value in a local variable and check it afterwards?
Thanks for your contributions!,
Eloi
So glad I found this blog again, to look up on this issue.
I had an existence check with select count(*) today, and ATC told me that this would bypass the buffer - an aspect I think that wasn't mentioned here yet.
So I changed it as suggested here, into:
I'm still a little unhappy, that I have to define a field I don't need (into @data(unused_dummy_field) )....
@Horst, any news on the "TRANSPORTING NO FIELDS"-equivalent ?
Best
Joachim
"any news on the “TRANSPORTING NO FIELDS”-equivalent ?"
Sorry, no.
Horst Keller
How about now, 2 Years later?
Hi Horst,
Thanks for sharing this blog with the community. I have a question. I generally use COUNT(*) variant of SELECT when i need to check for existence. Is it not a good practice? We are into release 731.
Thanks,
Abdul Hakim
Please see the already done discusion about this topic above. In short words: No, it's not a good practice. I vote for: It's an ugly practice.
Thanks Keller this blog helpfull to understand select query
Using SELECT SINGLE COL for checking the existence is fine for buffered tables,
but we faced that for non-buffered tables it is even better to do
SELECT SINGLE COUNT( COL )
as 1) you really don't fetch any data and 2) you don't have to define a variable, which you're not using anywhere else...
It's quite bad from performance perspective.
Please avoid it.
Really? How do you determine this result?
To 1): Of course you fetch data, at least your database do it because it has to count rows
... and of course you also need a variable behind INTO 😉
Need or must have?
Until NW 7.31 it was even mentioned in the official documentation (but was omitted on later releases):
Ooops, right. I always forget about that 😐
Unbeleivable 😛 I always thought you knew the ABAP documentation like the back of your hand 😀 justjoking
“Usage of SINGLE is not appropriate here, because the resulting line is undefined.”
If you use SELECT UP TO 1 ROWS with a partial key, and without ORDER BY, the result set is similarly undefined. In those cases, you should use SELECT SINGLE as is it is simpler – and it gives you a warning!
Really, SELECT UP TO 1 ROWS should give the same warning as SELECT SINGLE, if the ORDER BY is missing.
But with UP TO you can and should use ORDER BY. With SINGLE you simply can't.
I understand what you say, but I can't think in 20 years I've ever used a select single where it mattered that the key was not completely specified. I guess it only really comes into play when people think there's some kind of order that they're relying on. (Which is where I came in 🙂 )
I don't know if I understand you correctly? There is no difference in performance if I use SELECT SINGLE or SELECT...UP TO 1 ROWS, independent from WHERE clauses content? What I learned (many years ago) is: Use SELECT SINGLE only with fully qualified key to prevent a sequentially DB search (which can effect in bad cases), use SELECT....UP TO 1 ROWS if you want to read without having a fully qualified key (because the statement uses secondary keys, if needed).
And I learned it here in SCN, boy!
Ralf *shocked