SELECT ... FOR UPDATE not working?

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

SELECT ... FOR UPDATE not working?

Kristian Waagan-4
Hello,

I stumbled across a possible bug with the SELECT ... FOR UPDATE clause.
I have found several sources of information regarding this statement,
with partly conflicting contents.

First of all, I wrote a simple JDBC test application making use of the
statement. I ran this on Derby and two other database systems. Derby did
not behave as the two others (more on this later).

Second, the reference manual states that the statement is supported, and
that it must be used to obtain updateable resultsets.

Third, the JIRA issue 231
(http://issues.apache.org/jira/browse/DERBY-231) is regarding support
for SELECT ... FOR UPDATE. It is unresolved and unassigned. Is this a
stale issue?

The JDBC application I wrote consisted of two threads accessing a single
table: locktesttbl (ID INT, VALUE INT). I inserted 5 rows
(1,1),(2,2),...,(5,5). The first thread executes 'SELECT * FROM
locktesttbl WHERE ID=1 FOR UPDATE', waits 10 seconds, closes the
resultset, executes 'SELECT * FROM locktesttbl WHERE ID=1', closes
resultset and commits. The second thread, which is started 2 seconds
after the first one, executes 'UPDATE locktesttbl SET VALUE=100 WHERE
ID=1' then commits.

The only time the selected VALUE field in the first thread was equal at
the beginning and the end of the transcation, was when the transcation
isolation level was set to SERIALIZABLE. At all other levels, VALUE was
100 at the end of the transaction (before commit). When I did this with
the two other systems (MySQL and PostgreSQL), VALUE was always 1 within
the transaction. This suggests SELECT .. FOR UPDATE is broken in Derby,
and that the single instance of correct behavior seen is due to the
transaction isolation level alone. I have not looked into the source
code on this.

Does anyone have any comments on this?

I will add a JIRA bug issue under category SQL for this one in a few
days (awaiting comments).


--
Kristian


--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.338 / Virus Database: 267.10.12/75 - Release Date: 17.08.2005

Reply | Threaded
Open this post in threaded view
|

Re: SELECT ... FOR UPDATE not working?

mikem_app
In order to understand and compare across multiple db's there are 2
important things we need to know:  transaction isolation and autocommit
behavior.

The default in derby is read committed isolation level, and
autocommit=true.  I don't know what the defaults are for the other
databases.  To be safe when comparing I suggest any test explicitly
set both as part of the test.  From your description it sounds like
you have autocommit=false, but it is good to be explicit.

Does your test do any next() calls on the result set?  Derby almost
always streams rows to the application, so locking for all rows
affected by a query is only guaranteed if the application actually
does a next() through all rows.  Because your table does not have
any indexes as described, derby will use row locking for repeatable
read, and read committed, but will use table locking for serializable.

Derby in read committed will release locks at end of statement for
any row which is actually not updated as part of select for update,
in read committed isolation level.
During processing of the query it actually gets update locks which
are either upgraded to exclusive and held to end of transaction or
released if the row is not updated.   If your application needs
to hold locks until end transaction on selects it must use either
repeatable read or serializable isolation.

Did you see the described behavior using repeatable read and doing
next() calls in both the first and second select?

Kristian Waagan wrote:

> Hello,
>
> I stumbled across a possible bug with the SELECT ... FOR UPDATE clause.
> I have found several sources of information regarding this statement,
> with partly conflicting contents.
>
> First of all, I wrote a simple JDBC test application making use of the
> statement. I ran this on Derby and two other database systems. Derby did
> not behave as the two others (more on this later).
>
> Second, the reference manual states that the statement is supported, and
> that it must be used to obtain updateable resultsets.
>
> Third, the JIRA issue 231
> (http://issues.apache.org/jira/browse/DERBY-231) is regarding support
> for SELECT ... FOR UPDATE. It is unresolved and unassigned. Is this a
> stale issue?
>
> The JDBC application I wrote consisted of two threads accessing a single
> table: locktesttbl (ID INT, VALUE INT). I inserted 5 rows
> (1,1),(2,2),...,(5,5). The first thread executes 'SELECT * FROM
> locktesttbl WHERE ID=1 FOR UPDATE', waits 10 seconds, closes the
> resultset, executes 'SELECT * FROM locktesttbl WHERE ID=1', closes
> resultset and commits. The second thread, which is started 2 seconds
> after the first one, executes 'UPDATE locktesttbl SET VALUE=100 WHERE
> ID=1' then commits.
>
> The only time the selected VALUE field in the first thread was equal at
> the beginning and the end of the transcation, was when the transcation
> isolation level was set to SERIALIZABLE. At all other levels, VALUE was
> 100 at the end of the transaction (before commit). When I did this with
> the two other systems (MySQL and PostgreSQL), VALUE was always 1 within
> the transaction. This suggests SELECT .. FOR UPDATE is broken in Derby,
> and that the single instance of correct behavior seen is due to the
> transaction isolation level alone. I have not looked into the source
> code on this.
>
> Does anyone have any comments on this?
>
> I will add a JIRA bug issue under category SQL for this one in a few
> days (awaiting comments).
>
>
> --
> Kristian
>
>
Reply | Threaded
Open this post in threaded view
|

Re: SELECT ... FOR UPDATE not working?

Kristian Waagan-4
Thanks for the input Mike. Great explanation!
Should a link to "Locking, concurrency, and isolation" (or even "Types
and scope of locks in Derby systems" - "Update locks") in the "Tuning
Derby" manual  be added to the reference manual for the description of
the SELECT statement?

Regarding my test, you are right about the Derby behavior - my mistake;
the update is not allowed when running with repeatable read or
serializable. My test sets both isolation level and autocommit
explicitly. It also does next() on all resultsets.

The behavior of Derby compared with MySQL (started with
--default-storage-engine=innodb) and PostgreSQL is still different
though. PostgreSQL and MySQL does not allow the update no matter what
transaction isolation level is used. Can you say anything about which
behavior is correct/expected?



--
Kristian

Mike Matrigali wrote:

>In order to understand and compare across multiple db's there are 2
>important things we need to know:  transaction isolation and autocommit
>behavior.
>
>The default in derby is read committed isolation level, and
>autocommit=true.  I don't know what the defaults are for the other
>databases.  To be safe when comparing I suggest any test explicitly
>set both as part of the test.  From your description it sounds like
>you have autocommit=false, but it is good to be explicit.
>
>Does your test do any next() calls on the result set?  Derby almost
>always streams rows to the application, so locking for all rows
>affected by a query is only guaranteed if the application actually
>does a next() through all rows.  Because your table does not have
>any indexes as described, derby will use row locking for repeatable
>read, and read committed, but will use table locking for serializable.
>
>Derby in read committed will release locks at end of statement for
>any row which is actually not updated as part of select for update,
>in read committed isolation level.
>During processing of the query it actually gets update locks which
>are either upgraded to exclusive and held to end of transaction or
>released if the row is not updated.   If your application needs
>to hold locks until end transaction on selects it must use either
>repeatable read or serializable isolation.
>
>Did you see the described behavior using repeatable read and doing
>next() calls in both the first and second select?
>
>Kristian Waagan wrote:
>
>  
>
>>Hello,
>>
>>I stumbled across a possible bug with the SELECT ... FOR UPDATE clause.
>>I have found several sources of information regarding this statement,
>>with partly conflicting contents.
>>
>>First of all, I wrote a simple JDBC test application making use of the
>>statement. I ran this on Derby and two other database systems. Derby did
>>not behave as the two others (more on this later).
>>
>>Second, the reference manual states that the statement is supported, and
>>that it must be used to obtain updateable resultsets.
>>
>>Third, the JIRA issue 231
>>(http://issues.apache.org/jira/browse/DERBY-231) is regarding support
>>for SELECT ... FOR UPDATE. It is unresolved and unassigned. Is this a
>>stale issue?
>>
>>The JDBC application I wrote consisted of two threads accessing a single
>>table: locktesttbl (ID INT, VALUE INT). I inserted 5 rows
>>(1,1),(2,2),...,(5,5). The first thread executes 'SELECT * FROM
>>locktesttbl WHERE ID=1 FOR UPDATE', waits 10 seconds, closes the
>>resultset, executes 'SELECT * FROM locktesttbl WHERE ID=1', closes
>>resultset and commits. The second thread, which is started 2 seconds
>>after the first one, executes 'UPDATE locktesttbl SET VALUE=100 WHERE
>>ID=1' then commits.
>>
>>The only time the selected VALUE field in the first thread was equal at
>>the beginning and the end of the transcation, was when the transcation
>>isolation level was set to SERIALIZABLE. At all other levels, VALUE was
>>100 at the end of the transaction (before commit). When I did this with
>>the two other systems (MySQL and PostgreSQL), VALUE was always 1 within
>>the transaction. This suggests SELECT .. FOR UPDATE is broken in Derby,
>>and that the single instance of correct behavior seen is due to the
>>transaction isolation level alone. I have not looked into the source
>>code on this.
>>
>>Does anyone have any comments on this?
>>
>>I will add a JIRA bug issue under category SQL for this one in a few
>>days (awaiting comments).
>>
>>
>>--
>>Kristian
>>
>>
>>    
>>



--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.338 / Virus Database: 267.10.12/75 - Release Date: 17.08.2005

Reply | Threaded
Open this post in threaded view
|

Re: SELECT ... FOR UPDATE not working?

oysteing
>>>>> "KW" == Kristian Waagan <[hidden email]> writes:

    KW> The   behavior   of   Derby   compared  with   MySQL   (started   with
    KW> --default-storage-engine=innodb)  and  PostgreSQL  is still  different
    KW> though. PostgreSQL and MySQL does  not allow the update no matter what
    KW> transaction isolation level is used.  Can you say anything about which
    KW> behavior is correct/expected?

I think what creates confusion here is that there is two different
uses of "FOR UPDATE" in SQL:

  1. The standard "DECLARE CURSOR ... FOR UPDATE" which declares
     updatability of a cursor.  In JDBC, one will use ResultSets
     instead of SQL Cursors and Connection.prepareStatement() allows
     you to specify the updatability of the result set.

  2. The non-standard "SELECT ... FOR UPDATE" which is implemented in
     many databases.  Traditionally, it seems like the intention is to
     indicate that the records selected by such a select statement may
     be updated by later statements of the same transaction.  In most
     databases this will cause the records to be locked exclusively by
     the select statement.  This way, later update/delete statements
     will not have to upgrade shared locks to exclusive and potential
     lock conflicts at that stage are avoided.

Derby seems to have mixed these two cases and requires "SELECT ... FOR
UPDATE" for the result set to be updatable.  As Kristian's example
shows the locking behavior is not the same as you would expect from
other databases that have this extension.  In other words, "SELECT
... FOR UPDATE" is only used to indicate updatability, not to indicate
what may be updated by succeeding statements.

In my opinion, the best thing would be to deprecate the use of "SELECT
... FOR UPDATE".  For JDBC programmers, I am not sure the "SELECT
... FOR UPDATE" extension to the standard is necessary.  Using
updatable result set should in most cases be preferred to using two
statements to select and update a record.  I also think it is a
mistake to reuse this non-standard syntax for another purpose, i.e.,
to specify updatability, when there are standard ways to achieve this
through JDBC.  It is even worse that Derby requires this non-standard
syntax in order to get updatable result sets.

--
?ystein

Reply | Threaded
Open this post in threaded view
|

Re: SELECT ... FOR UPDATE not working?

Edward Rayl

>I think what creates confusion here is that there is two different
>uses of "FOR UPDATE" in SQL:
>
>  1. The standard "DECLARE CURSOR ... FOR UPDATE" which declares
>     updatability of a cursor.  In JDBC, one will use ResultSets
>     instead of SQL Cursors and Connection.prepareStatement() allows
>     you to specify the updatability of the result set.
>
>  2. The non-standard "SELECT ... FOR UPDATE" which is implemented in
>     many databases.  Traditionally, it seems like the intention is to
>     indicate that the records selected by such a select statement may
>     be updated by later statements of the same transaction.  In most
>     databases this will cause the records to be locked exclusively by
>     the select statement.  This way, later update/delete statements
>     will not have to upgrade shared locks to exclusive and potential
>     lock conflicts at that stage are avoided.
>
>Derby seems to have mixed these two cases and requires "SELECT ... FOR
>UPDATE" for the result set to be updatable.  As Kristian's example
>shows the locking behavior is not the same as you would expect from
>other databases that have this extension.  In other words, "SELECT
>... FOR UPDATE" is only used to indicate updatability, not to indicate
>what may be updated by succeeding statements.
>
>In my opinion, the best thing would be to deprecate the use of "SELECT
>... FOR UPDATE".  For JDBC programmers, I am not sure the "SELECT
>... FOR UPDATE" extension to the standard is necessary.  Using
>updatable result set should in most cases be preferred to using two
>statements to select and update a record.  I also think it is a
>mistake to reuse this non-standard syntax for another purpose, i.e.,
>to specify updatability, when there are standard ways to achieve this
>through JDBC.  It is even worse that Derby requires this non-standard
>syntax in order to get updatable result sets.
>  
>
I agree.  See Derby-231, "FOR UPDATE" required for updatable result set
to work.  All major databases seem to support "select...for update", but
don't _require_ it.  Derby is the exception.  Deprecation might be
impractical if source code compatibility with other databases is important.

Ed

Reply | Threaded
Open this post in threaded view
|

Re: SELECT ... FOR UPDATE not working?

mikem_app
In reply to this post by Kristian Waagan-4
I welcome SQL standard experts opinion, but I believe either
implementation is correct.  Especially with read committed I don't
think the standard says anything one way or another.

My experience with users is that most would prefer the Derby
implementation, as they tend to have the system retain the least
amount of locks possible - especially in read committed mode.  I
agree that it is confusing.  I also think some DB's add to the
SQL to allow the behavior that you were expecting, ie. lock every
row in the select exclusively and hold the lock until end of
transaction (or maybe in UPDATE vs. exclusive mode).  But I don't
think the standard addresses this, but again I am not an expert
in the SQL standards.


The above comment assumes a lock based isolation control system.
Versioning would actually be better for many db clients.  Does anyone
know of public domain information about implementing a version
isolation system - sort of like what Oracle does.

Kristian Waagan wrote:

> Thanks for the input Mike. Great explanation!
> Should a link to "Locking, concurrency, and isolation" (or even "Types
> and scope of locks in Derby systems" - "Update locks") in the "Tuning
> Derby" manual  be added to the reference manual for the description of
> the SELECT statement?
>
> Regarding my test, you are right about the Derby behavior - my mistake;
> the update is not allowed when running with repeatable read or
> serializable. My test sets both isolation level and autocommit
> explicitly. It also does next() on all resultsets.
>
> The behavior of Derby compared with MySQL (started with
> --default-storage-engine=innodb) and PostgreSQL is still different
> though. PostgreSQL and MySQL does not allow the update no matter what
> transaction isolation level is used. Can you say anything about which
> behavior is correct/expected?
>
>
>
> --
> Kristian
>
> Mike Matrigali wrote:
>
>> In order to understand and compare across multiple db's there are 2
>> important things we need to know:  transaction isolation and autocommit
>> behavior.
>>
>> The default in derby is read committed isolation level, and
>> autocommit=true.  I don't know what the defaults are for the other
>> databases.  To be safe when comparing I suggest any test explicitly
>> set both as part of the test.  From your description it sounds like
>> you have autocommit=false, but it is good to be explicit.
>>
>> Does your test do any next() calls on the result set?  Derby almost
>> always streams rows to the application, so locking for all rows
>> affected by a query is only guaranteed if the application actually
>> does a next() through all rows.  Because your table does not have
>> any indexes as described, derby will use row locking for repeatable
>> read, and read committed, but will use table locking for serializable.
>>
>> Derby in read committed will release locks at end of statement for
>> any row which is actually not updated as part of select for update,
>> in read committed isolation level.
>> During processing of the query it actually gets update locks which
>> are either upgraded to exclusive and held to end of transaction or
>> released if the row is not updated.   If your application needs
>> to hold locks until end transaction on selects it must use either
>> repeatable read or serializable isolation.
>>
>> Did you see the described behavior using repeatable read and doing
>> next() calls in both the first and second select?
>>
>> Kristian Waagan wrote:
>>
>>  
>>
>>> Hello,
>>>
>>> I stumbled across a possible bug with the SELECT ... FOR UPDATE clause.
>>> I have found several sources of information regarding this statement,
>>> with partly conflicting contents.
>>>
>>> First of all, I wrote a simple JDBC test application making use of the
>>> statement. I ran this on Derby and two other database systems. Derby did
>>> not behave as the two others (more on this later).
>>>
>>> Second, the reference manual states that the statement is supported, and
>>> that it must be used to obtain updateable resultsets.
>>>
>>> Third, the JIRA issue 231
>>> (http://issues.apache.org/jira/browse/DERBY-231) is regarding support
>>> for SELECT ... FOR UPDATE. It is unresolved and unassigned. Is this a
>>> stale issue?
>>>
>>> The JDBC application I wrote consisted of two threads accessing a single
>>> table: locktesttbl (ID INT, VALUE INT). I inserted 5 rows
>>> (1,1),(2,2),...,(5,5). The first thread executes 'SELECT * FROM
>>> locktesttbl WHERE ID=1 FOR UPDATE', waits 10 seconds, closes the
>>> resultset, executes 'SELECT * FROM locktesttbl WHERE ID=1', closes
>>> resultset and commits. The second thread, which is started 2 seconds
>>> after the first one, executes 'UPDATE locktesttbl SET VALUE=100 WHERE
>>> ID=1' then commits.
>>>
>>> The only time the selected VALUE field in the first thread was equal at
>>> the beginning and the end of the transcation, was when the transcation
>>> isolation level was set to SERIALIZABLE. At all other levels, VALUE was
>>> 100 at the end of the transaction (before commit). When I did this with
>>> the two other systems (MySQL and PostgreSQL), VALUE was always 1 within
>>> the transaction. This suggests SELECT .. FOR UPDATE is broken in Derby,
>>> and that the single instance of correct behavior seen is due to the
>>> transaction isolation level alone. I have not looked into the source
>>> code on this.
>>>
>>> Does anyone have any comments on this?
>>>
>>> I will add a JIRA bug issue under category SQL for this one in a few
>>> days (awaiting comments).
>>>
>>>
>>> --
>>> Kristian
>>>
>>>
>>>  
>
>
>
>