Quadratic unconstrained binary optimization by Ciro Santilli 35 Updated 2024-12-23 +Created 1970-01-01
Exampes under nodejs/sequelize/raw:
- nodejs/sequelize/raw/index.js: Sequelize raw query hello world. Ideally one should never use a raw query in a real project. We use raw examples mostly as a SQL tutorial under SQL example, and will not comment on them much further on this section.
@cirosantilli/_file/nodejs/sequelize/nodejs/sequelize/parallel_select_and_update.js by Ciro Santilli 35 Updated 2024-12-23 +Created 1970-01-01
This example is the same as nodejs/sequelize/raw/parallel_select_and_update.js, but going through Sequelize rather than with Sequelize raw queries.
NONE
is not supported for now to not have a transaction at all because lazy.The examples illustrates: stackoverflow.com/questions/55452441/for-share-and-for-update-statements-in-sequelize
Sample invocation:where:
node --unhandled-rejections=strict ./parallel_select_and_update.js p 10 100 READ_COMMITTED UPDATE
READ_COMMITTED
: one of the keys documented at: sequelize.org/master/class/lib/transaction.js~Transaction.html which correspond to the standard sQL isolation levels. It not given, don't set one, defaulting to the database's/sequelize's default level.UPDATE
: one of the keys documented at: sequelize.org/master/class/lib/transaction.js~Transaction.html#static-get-LOCK. Update generates aSELECT FOR UPDATE
in PostgreSQL for example. If not given, don't use anyFOR xxx
explicit locking.
Other examples:
node --unhandled-rejections=strict ./parallel_select_and_update.js p 10 100 READ_COMMITTED UPDATE
Then, the outcome is exactly as described at: nodejs/sequelize/raw/parallel_select_and_update.js:
READ_COMMITTED
: failsREAD_COMMITTED UPDATE
: worksREPEATABLE_READ
: works, but is a bit slower, as it does rollbacksThis case also illustrates Sequelize transaction retries, since in this transaction isolation level transactions may fail:
@cirosantilli/_file/nodejs/sequelize/raw/nodejs/sequelize/raw/parallel_update_async.js by Ciro Santilli 35 Updated 2024-12-23 +Created 1970-01-01
nodejs/sequelize/raw/parallel_update_worker_threads.js contains a base example that can be used to test what can happen when queries are being run in parallel. But it is broken due to a
sqlite3
Node.js package bug: github.com/mapbox/node-sqlite3/issues/1381...nodejs/sequelize/raw/parallel_update_async.js is an
async
version of it. It should be just parallel enough to allow observing the same effects.This is an example of a transaction where the SQL READ COMMITTED isolation level if sufficient.
These examples run queries of type:
UPDATE "MyInt" SET i = i + 1
Sample execution:which does:
node --unhandled-rejections=strict ./parallel_update_async.js p 10 100
- PostgreSQL, see other databases options at SQL example
- 10 threads
- 100 increments on each thread
The fear then is that of a classic read-modify-write failure.
But as www.postgresql.org/docs/14/transaction-iso.html page makes very clear, including with an explicit example of type
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
, that the default isolation level, SQL READ COMMITTED isolation level, already prevents any problems with this, as the update always re-reads selected rows in case they were previously modified.If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row
Since in PostgreSQL "Read uncommitted" appears to be effectively the same as "Read committed", we won't be able to observe any failures on that database system for this example.
nodejs/sequelize/raw/parallel_create_delete_empty_tag.js contains an example where things can actually blow up in read committed.
@cirosantilli/_file/nodejs/sequelize/raw/nodejs/sequelize/raw/parallel_select_and_update.js by Ciro Santilli 35 Updated 2024-12-23 +Created 1970-01-01
This example is similar to nodejs/sequelize/raw/parallel_update_async.js, but now we are doing a separate SELECT, later followed by an update:
SELECT FROM
to get i- update on Js code
newI = i + 1
UPDATE SET
thenewI
Although this specific example is useless in itself, as we could just use
UPDATE "MyInt" SET i = i + 1
as in nodejs/sequelize/raw/parallel_update_async.js, which automatically solves any concurrency issue, this kind of code could be required for example if the update was a complex function not suitably implemented in SQL, or if the update depends on some external data source.Sample execution:which does:
node --unhandled-rejections=strict ./parallel_select_and_update.js p 2 10 'READ COMMITTED'
- PostgreSQL, see other databases options at SQL example
- 2 threads
- 10 increments on each thread
Another one:this will run SELECT FOR UPDATE rather than just SELECT
node --unhandled-rejections=strict ./parallel_select_and_update.js p 2 10 'READ COMMITTED' 'FOR UPDATE'
Observed behaviour under different SQL transaction isolation levels:
READ COMMITTED
: fails. Nothing in this case prevents:- thread 1: SELECT, obtains i = 0
- thread 2: SELECT, obtains i = 0
- thread 2: newI = 1
- thread 2: UPDATE i = 1
- thread 1: newI = 1
- thread 1: UPDATE i = 1
REPEATABLE READ
: works. the manual mentions that if multiple concurrent updates would happen, only the first commit succeeds, and the following ones fail and rollback and retry, therefore preventing the loss of an update.READ COMMITTED
+SELECT FOR UPDATE
: works. And does not do rollbacks, which probably makes it faster. Withp 10 100
,REPEATABLE READ
was about 4.2s andREAD COMMITTED
+SELECT FOR UPDATE
3.2s on Lenovo ThinkPad P51 (2017).SELECT FOR UPDATE
should be enough as mentioned at: www.postgresql.org/docs/13/explicit-locking.html#LOCKING-ROWSFOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being locked, modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE or SELECT FOR KEY SHARE of these rows will be blocked until the current transaction ends; conversely, SELECT FOR UPDATE will wait for a concurrent transaction that has run any of those commands on the same row, and will then lock and return the updated row (or no row, if the row was deleted). Within a REPEATABLE READ or SERIALIZABLE transaction, however, an error will be thrown if a row to be locked has changed since the transaction started. For further discussion see Section 13.4.
A non-raw version of this example can be seen at: nodejs/sequelize/parallel_select_and_update.js.
@cirosantilli/_file/nodejs/sequelize/raw/nodejs/sequelize/raw/parallel_select_and_update_deterministic.js by Ciro Santilli 35 Updated 2024-12-23 +Created 1970-01-01
This example contains a deterministic demo of when postgreSQL serialization failures may happen.
Tested on PostgreSQL 13.5.
@cirosantilli/_file/nodejs/sequelize/raw/nodejs/sequelize/raw/parallel_create_delete_empty_tag.js by Ciro Santilli 35 Updated 2024-12-23 +Created 1970-01-01
In this example, posts have tags. When a post is deleted, we check to see if there are now any empty tags, and now we want to delete any empty tags that the post deletion may have created.
If we are creating and deleting posts concurrently, a naive implementation might wrongly delete the tags of a newly created post.
This could be due to a concurrency issue of the following types.
Failure case 1:which would result in the new post incorrectly not having the
- thread 2: delete old post
- thread 2: find all tags with 0 posts. Finds
tag0
from the deleted old post which is now empty. - thread 1: create new post, which we want to have tag
tag0
- thread 1: try to create a new tag
tag0
, but don't because it already exists, this is done using SQLite'sINSERT OR IGNORE INTO
or PostgreSQL'sINSERT ... ON CONFLICT DO NOTHING
- thread 1: assign
tag0
to the new post by adding an entry to the join table - thread 2: delete all tags with 0 posts. It still sees from its previous search that
tag0
is empty, and deletes it, which then cascades into the join table
tag0
.Failure case 2:which leads to a foreign key failure, because the tag does not exist anymore when the assignment happens.
- thread 2: delete old post
- thread 2: find all tags with 0 posts
- thread 1: create new post
- thread 1: try to create a new tag
tag0
, but don't because it already exists - thread 2: delete all tags with 0 posts. It still sees from its previous search that
tag0
is empty, and deletes it - thread 1: assign
tag0
to the new post
Failure case 3:which leads to a foreign key failure, because the tag does not exist anymore when the assignment happens.
- thread 2: delete old post
- thread 1: create new post, which we want to have tag
tag0
- thread 1: try to create a new tag
tag0
, and succeed because it wasn't present - thread 2: find all tags with 0 posts, finds the tag that was just created
- thread 2: delete all tags with 0 posts, deleting the new tag
- thread 1: assign
tag0
to the new post
Sample executions:All executions use 2 threads.
node --unhandled-rejections=strict ./parallel_create_delete_empty_tag.js p 9 1000 'READ COMMITTED'
: PostgreSQL, 9 tags, DELETE/CREATE thetag0
test tag 1000 times, useREAD COMMITTED
Execution often fails, although not always. The failure is always:because the:error: insert or update on table "PostTag" violates foreign key constraint "PostTag_tagId_fkey"
tries to insert a tag that was deleted in the other thread, as it didn't have any corresponding posts, so this is the foreign key failure.INSERT INTO "PostTag"
TODO: we've never managed to observe the failure case in whichtag0
is deleted. Is it truly possible? And if not, by which guarantee?node --unhandled-rejections=strict ./parallel_create_delete_empty_tag.js p 9 1000 'READ COMMITTED' 'FOR UPDATE'
: do aSELECT ... FOR UPDATE
before trying toINSERT
.This is likely correct and the fastest correct method according to our quick benchmarking, about 20% faster thanREPEATABLE READ
.We are just now 100% sure it is corret becase we can't find out if theSELECT
in theDELETE
subquery could first select some rows, which are then locked by the tag creator, and only then locked byDELETE
after selection. Or does it re-evaludate theSELECT
even though it is in a subquery?node --unhandled-rejections=strict ./parallel_create_delete_empty_tag.js p 9 1000 'REPEATABLE READ'
: repeatable readWe've never observed any failures with this level. This should likely fix the foreign key issue according to the PostgreSQL docs, since:- the
DELETE "Post"
commit cannot start to be seen only in the middle of the thread 1 transaction - and then if DELETE happened, the thread 1 transaction will detect it, ROLLBACK, and re-run. TODO how does it detect the need rollback? Is it because of the foreign key? It is very hard to be sure about this kind of thing, just can't find the information. Related: postgreSQL serialization failure.
- the
node --unhandled-rejections=strict ./parallel_create_delete_empty_tag.js p 9 1000 'SERIALIZABLE'
: serializablenode --unhandled-rejections=strict ./parallel_create_delete_empty_tag.js p 9 1000 'NONE'
: magic value, don't use any transaction. Can blow up of course, since even less restrictions thanREAD COMMITTED
Some theoretical notes:
- Failure case 3 is averted by a
READ COMMITTED
transaction, because thread 2 won't see the uncommitted tag that thread 1 created, and therefore won't be able to delete it
stackoverflow.com/questions/10935850/when-to-use-select-for-update from SELECT FOR UPDATE also talks about a similar example, and has relevant answers.
Unlisted articles are being shown, click here to show only listed articles.