A software that implements some database system, e.g. PostgreSQL or MySQL are two (widely extended) SQL implementations.
Infinitely many SQL answers.
As mentioned at Ciro Santilli's Stack Overflow contributions, he just answers every semi-duplicate immediatly as it is asked, and is therefore able to overcome the Stack Overflow maximum 200 daily reputation limit by far. E.g. in 2018, Gordon reached 135k (archive), thus almost double the 73k yearly limit due to the 200 daily limit, all of that with accepts.
This is in contrast to Ciro Santilli's contribution style which is to only answer questions as he needs the subject, or generally important questions that aroused his interest.
2014 Blog post describing his activity: blog.data-miners.com/2014/08/an-achievement-on-stack-overflow.html, key quote:so that suggests his contributions also take a meditative value.
For a few months, I sporadically answered questions. Then, in the first week of May, my Mom's younger brother passed away. That meant lots of time hanging around family, planning the funeral, and the like. Answering questions on Stack Overflow turned out to be a good way to get away from things. So, I became more intent.
www.data-miners.com/linoff.htm mentions he's an SQL consultant that consulted for several big companies.
2021 Reddit thread about him: www.reddit.com/r/programming/comments/puok1h/a_single_person_answered_76k_questions_about_sql/ mentions that by then he had:
answered 76k questions about SQL on StackOverflow. Averaging 22.8 answers per day, every day, for the past 8.6 years.
How to decide if an ORM is decent? Just try to replicate every SQL query from nodejs/sequelize/raw/many_to_many.js on PostgreSQL and SQLite.
There is only a very finite number of possible reasonable queries on a two table many to many relationship with a join table. A decent ORM has to be able to do them all.
If it can do all those queries, then the ORM can actually do a good subset of SQL and is decent. If not, it can't, and this will make you suffer. E.g. Sequelize v5 is such an ORM that makes you suffer.
The next thing to check are transactions.
Basically, all of those come up if you try to implement a blog hello world world such as gothinkster/realworld correctly, i.e. without unnecessary inefficiencies due to your ORM on top of underlying SQL, and dealing with concurrency.
As of the 2020's, a slumbering giant.
But the pre-Internet impact of IBM was insane! Including notably:
- some of the most important business computers of the pre-personal computer era
- SQL
- IBM Generalized Markup Language, which is a predecessor to XML and HTML
Determines what can or cannot happen when multiple queries are running in parallel.
See Section "SQL transaction isolation level" for the most common context under which this is discussed: SQL.
This is particularly important in SQL: Nested set model in SQL, as it is an efficient way to transverse trees there, since querying parents every time would require multiple disk accesses.
The ASCII art visualizations from stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree/194031#194031 are worth reproducing.
As a tree:
- Root 1
- Child 1.1
- Child 1.1.1
- Child 1.1.2
- Child 1.2
- Child 1.2.1
- Child 1.2.2
- Child 1.1
As the sets:
__________________________________________________________________________
| Root 1 |
| ________________________________ ________________________________ |
| | Child 1.1 | | Child 1.2 | |
| | ___________ ___________ | | ___________ ___________ | |
| | | C 1.1.1 | | C 1.1.2 | | | | C 1.2.1 | | C 1.2.2 | | |
1 2 3___________4 5___________6 7 8 9___________10 11__________12 13 14
| |________________________________| |________________________________| |
|__________________________________________________________________________|
Consider the following nested set:
0, 8, root
1, 7, mathematics
2, 3, geometry
3, 6, calculus
4, 5, derivative
5, 6, integral
6, 7, algebra
7, 8, physics
When we want to insert one element, e.g. so we have a method:
limit
, normally under calculus
, we have to specify:- parent
- index within parent
insert(parent, previousSibling)
How to implement Nested set model in SQL:
- stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree/42781302#42781302 contains the correct left/size representation and update queries, which makes it much easier to maintain the tree without having to worry about the sizes of siblings which are constant
- stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree/194031#194031 amazing ASCII art representations of the structure. Unfortunatly uses a wonky left/right representation, rather than the much more natural left/size representation from the other post
Evil company that desecrated the beauty created by Sun Microsystems, and was trying to bury Java once and or all in the 2010's.
Their database is already matched by open source e.g. PostgreSQL, and ERP and CRM specific systems are boring.
Oracle basically grew out of selling one of the first SQL implementations in the late 70's, and notably to the United States Government and particularly the CIA. They did deliver a lot of value in those early pre-internet days, but now open source is and will supplant them entirely.
Exampes under nodejs/sequelize/raw:
- nodejs/sequelize/raw/index.js: Sequelize raw query hello world. Ideally one should never use a raw query in a real project. We use raw examples mostly as a SQL tutorial under SQL example, and will not comment on them much further on this section.
The new default homepage for a logged out user how shows a list of the topics with the most articles.
This is a reasonable choice for default homepage, and it immediately exposes users to this central feature of the website: the topic system.
Doing this required in particular calculating the best title for a topic, since it is possible to have different titles with the same ID, the most common way being with capitalization changes, e.g.:would both have topic ID
JavaScript
Javascript
javascript
.With this in place we also added the preferred topic title to the top topic page.
The algorithm chosen is to pick the top 10 most upvoted topics, and select the most common title from amongst them. This should make topic title vandalism quite hard. This was made in a single SQL query, and became the most complext SQL query Ciro Santilli has ever written so far: twitter.com/cirosantilli2/status/1549721815832043522
These examples are written in the Sequelize library using raw queries.
Sequelize is used minimally, just to feed raw queries in transparently to any underlying database, and get minimally parsed results out for us, which we then assert with standard JavaScript. The queries themselves are all written by hand.
By default the examples run on SQLite. Just like the examples from sequelize example, you can set the database at runtime as:
./index.js
or./index.js l
: SQLite./index.js p
: PostgreSQL. You must manually create a database calledtmp
and ensure that peer authentication works for it
Here we list only examples which we believe are standard SQL, and should therefore work across different SQL implementations:
- nodejs/sequelize/raw/index.js: basic hello world to demonstrate the setup and very simple functionality
- nodejs/sequelize/raw/many_to_many.js: illustrates many-to-many relations with JOIN. Contains:
- SQL transaction examples:
- nodejs/sequelize/raw/commit_error.js: stackoverflow.com/questions/27245101/why-should-we-use-rollback-in-sql-explicitly/27245234#27245234 and stackoverflow.com/questions/48277519/how-to-use-commit-and-rollback-in-a-postgresql-function/48277708#48277708 suggest that on PostgreSQL, once something fails inside a transaction, all queries in the current transaction are ignored, and
COMMIT
simply does aROLLBACK
. Let's check. Yup, true for Postgres, but false for SQLite, SQLite just happily runs anything it can, you really needROLLBACK
for it. - SQL isolation level example
- nodejs/sequelize/raw/commit_error.js: stackoverflow.com/questions/27245101/why-should-we-use-rollback-in-sql-explicitly/27245234#27245234 and stackoverflow.com/questions/48277519/how-to-use-commit-and-rollback-in-a-postgresql-function/48277708#48277708 suggest that on PostgreSQL, once something fails inside a transaction, all queries in the current transaction are ignored, and
- GROUP BY and SQL aggregate functions:
- nodejs/sequelize/raw/group_by_extra_column.js: let's see if it blows up or not on different DB systems,
sqlite3
Node.js package allows it:- github.com/sequelize/sequelize/issues/5481#issuecomment-964387232
- dba.stackexchange.com/questions/141594/how-select-column-does-not-list-in-group-by-clause/141600 says that it was allowed in SQL:1999 when there are no ambiguities due to constraints, e.g. when grouping by unique columns
- github.com/postgres/postgres/blob/REL_13_5/src/test/regress/sql/functional_deps.sql#L27 shows that PostgreSQL wants it to work for
UNIQUE NOT NULL
, but they just haven't implemented it as of 13.5, where it only works if you group byPRIMARY KEY
- dba.stackexchange.com/questions/158015/why-can-i-select-all-fields-when-grouping-by-primary-key-but-not-when-grouping-b also says that
UNIQUE NOT NULL
doesn't work. Dan Lenski then points to rationale mailing list thread:
- nodejs/sequelize/raw/group_by_max_full_row.js: here we try to get the full row of each group at which a given column reaches the max of the group
- Postgres: has
SELECT DISCINTCT ON
which works perfectly if you only want one row in case of multiple rows attaining the max.ON
is an extension to the standard unfortunately: www.postgresql.org/docs/9.3/sql-select.html#SQL-DISTINCT Docs specify that it always respectsORDER BY
when selecting the row.- stackoverflow.com/questions/586781/postgresql-fetch-the-row-which-has-the-max-value-for-a-column asks it without the multiple matches use case
- stackoverflow.com/questions/586781/postgresql-fetch-the-rows-which-have-the-max-value-for-a-column-in-each-group/587209#587209 also present in simpler form at stackoverflow.com/questions/121387/fetch-the-rows-which-have-the-max-value-for-a-column-for-each-distinct-value-of/123481#123481 gives a very nice OUTER JOIN only solution! Incredible, very elegant.
- dba.stackexchange.com/questions/171938/get-only-rows-with-max-group-value asks specifically the case of multiple matches to the max
- stackoverflow.com/questions/586781/postgresql-fetch-the-row-which-has-the-max-value-for-a-column asks it without the multiple matches use case
- SQLite:
- stackoverflow.com/questions/48326957/row-with-max-value-per-group-sqlite
- stackoverflow.com/questions/48326957/row-with-max-value-per-group-sqlite/48328243#48328243 teaches us that in SQLite min and max are magic and guarantee that the matching row is returned
- stackoverflow.com/questions/48326957/row-with-max-value-per-group-sqlite/72996649#72996649 Ciro Santilli uses the magic of
ROW_NUMBER
- stackoverflow.com/questions/17277152/sqlite-select-distinct-of-one-column-and-get-the-others/71924314#71924314 get any full row without specifying which, we teach how to specify
- code.djangoproject.com/ticket/22696 WONTFIXed
DISTINCT ON
- stackoverflow.com/questions/50846722/what-is-the-difference-between-postgres-distinct-vs-distinct-on/72997494#72997494
DISTINCT
vsDISTINCT ON
, somewhat related question
- stackoverflow.com/questions/50846722/what-is-the-difference-between-postgres-distinct-vs-distinct-on/72997494#72997494
- stackoverflow.com/questions/48326957/row-with-max-value-per-group-sqlite
- stackoverflow.com/questions/5803032/group-by-to-return-entire-row asks how to take the top N with distinct after order limit. I don't know how to do it in Postgres
- Postgres: has
- nodejs/sequelize/raw/most_frequent.js: illustrates a few variants of findind the mode, including across GROUP
- nodejs/sequelize/raw/group_by_max_n.js: get the top N in each group
- nodejs/sequelize/raw/group_by_extra_column.js: let's see if it blows up or not on different DB systems,
- order results in the same order as
IN
: - LIMIT by a running total: TODO links
SQL's implementation of database triggers.
This feature is really cool, as it allows you to keep caches up to date!
In particular, everything that happens in a trigger happens as if it were in a transaction. This way, you can do less explicit transactions when you use triggers. It is a bit like the advantages of SQL CASCADE.
DBMS: