Any solution for solving DB reserved word conflict?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Folk, I would like to ask for some support and suggestion for a 6-year-old issue: DB reserved word conflict within Drupal core (http://drupal.org/node/371). Support multiple databases within core should be one of the possible solution for expand market sharing. BTW, the conflict with DB specific reserved words is an always pain for abstraction layer designer and developer, and so affect the daily work of contribute developers. Drupal has also faced this problem since June 30, 2002. As Drupal 7.x is now open for public development, this issues is activated once again. Based on research and compare with other OSS web application, there is some possible solutions: ~ 1. Avoid the use of ALL reserved words within core. This is our traditional approach since 2002; on the other hand, Moodle 1.7+ also apply this solution, which support totally 4 databases: MySQL, PostgreSQL, Oracle and MSSQL (http://docs.moodle.org/en/XMLDB_reserved_words). ~ 2. Escape ALL potential reserved words with escape characters. Gallery2 is using this solution, and is able to support totally 6 databases: MySQL, PostgreSQL, Oracle, DB2, MSSQL and SQLite (http://www.garfieldtech.com/blog/database-abstraction#comment-446). Some hybrid suggestions are also proposed, but most of them can't solve this problem perfectly. E.g. Handling the escape of reserved word within each database drivers implementation (slow, complicated, and always miss), only handle the conflict of table name (can't solve the problem), using non-pair-up token as escape character (potentially buggy and not the optimal solution), etc. Both Moodle and Gallery2 are using ADOdb for DB abstraction, but finally solve the reserved word conflict problem in totally different direction. We can find that what DB abstraction can do for this issues is very limited, and usually require for the help from standardizing core query syntax, and be careful in our daily development. This issue can't be solved by normal simple patch, and usually need the help from most core developers. We may need to find out the best solution which we are all agreed to follow it. If you are interested in solving this 6-year-old issue, please refer to http://drupal.org/node/371 for more technical detail. Thanks a lot. Regard - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHt+zABPIQaq+ZRd8RAtQQAJ9/cT7ewpFdWwG4ckStooiz/MlcBwCfS/rW 5g8txeXaitkSqkSN62wTpjg= =vZiU -----END PGP SIGNATURE-----
I would like to ask for some support and suggestion for a 6-year-old issue: DB reserved word conflict within Drupal core (http://drupal.org/node/371).
I don't really know the particulars of the issue, but in general reserved word issues can be corrected by using delimited identifiers. Unfortunately, delimited identifiers can be dbms specific. Still, that wheedles the problem down to defining the delimiter for the DBMSs you want to support.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 |> I would like to ask for some support and suggestion for a 6-year-old |> issue: DB reserved word conflict within Drupal core |> (http://drupal.org/node/371). | | I don't really know the particulars of the issue, but in general | reserved word issues can be corrected by using delimited identifiers. | Unfortunately, delimited identifiers can be dbms specific. Still, | that wheedles the problem down to defining the delimiter for the DBMSs | you want to support. Totally correct. Each DB come with its specific delimited identifiers, e.g. ` (MySQL), " (PostgreSQL, Oracle, DB2, SQLite), [ ] (MSSQL). So what I am proposing in http://drupal.org/node/371 is quite simple: use an universal token, as like as the idea of { } for table prefix. First of all, quota all potential reserved word with [ ] (since it is a pair up token, easy for regex), and further more replace it as DB specific delimited identifiers within each driver implementation. Because we define a black-and-white definition manually and under control, this solution is much accuracy than any others :-) BTW, since { } is target for table prefix replacement under our existing DB API implementation, I am not going to merge it with the idea of using [ ] (target as universal token of delimited identifier). Less merge, less dependent and conflict, so more flexible :-) - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHuR2lBPIQaq+ZRd8RAh8rAKCexy2IFJbZMQNdujbn0kzPJ7fNyQCfYEvl f5UnXLwEvmkKuRdwxIEbs5E= =Dn4W -----END PGP SIGNATURE-----
First of all, quota all potential reserved word with [ ] (since it is a pair up token, easy for regex), and further more replace it as DB specific delimited identifiers within each driver implementation.
If we go that route, I suggest using `` over []. As you said, backticks are the MySQL quotes and since 90% of all Drupal sites use MySQL it seems straightforward to use that method. Konstantin
On Monday, 18. February 2008, Konstantin Käfer wrote:
First of all, quota all potential reserved word with [ ] (since it is a pair up token, easy for regex), and further more replace it as DB specific delimited identifiers within each driver implementation.
If we go that route, I suggest using `` over []. As you said, backticks are the MySQL quotes and since 90% of all Drupal sites use MySQL it seems straightforward to use that method.
[] has the advantage of being easily translatable to ``, whereas for the other way round you'd need to parse the string in order to find out which one is the opening character and which one is the closing one.
[] has the advantage of being easily translatable to ``, whereas for the other way round you'd need to parse the string in order to find out which one is the opening character and which one is the closing one.
No. Since identifiers don't really contain whitespace, it's just s/ `([^`\s]+)`/[$1]/. You can add an additional whitespace to be matched before and after the backticks, but they are not necessary when the backticks are used correctly. Konstantin
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Well, the useful of pair up syntax is not only about a simple regex implementation: /\[([A-Za-z0-9_]+)\]/, but also a simple syntax which is more human friendly. It is more readable than ``, so the debug process will be much simpler. Most of our developers are MySQL based, but not everyone like/know the useful of `` and use it daily. For me, I am usually lazy with it if possible. The [ ] syntax may bother those professional MySQL developers; BTW, at least I just seems [ ] as like as the use of { } (a special drupal-like query syntax), and hopefully this may also be the case for normal contribute developers? Konstantin Käfer wrote: |> [] has the advantage of being easily translatable to ``, whereas for the other |> way round you'd need to parse the string in order to find out which one is |> the opening character and which one is the closing one. | | No. Since identifiers don't really contain whitespace, it's just s/`([^`\s]+)`/[$1]/. You can add an additional whitespace to be matched before and after the backticks, but they are not necessary when the backticks are used correctly. | | Konstantin - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHuczJBPIQaq+ZRd8RAihNAKCbjfipMXoQ2J0WZWMCdmy1qGPxZQCeNP2V /x61Jr7/6P6UAwJCxUra4uI= =ffoR -----END PGP SIGNATURE-----
On Mon, 18 Feb 2008 18:52:29 +0100 Konstantin Käfer <kkaefer@gmail.com> wrote:
[] has the advantage of being easily translatable to ``, whereas for the other way round you'd need to parse the string in order to find out which one is the opening character and which one is the closing one.
No. Since identifiers don't really contain whitespace, it's just s/ `([^`\s]+)`/[$1]/. You can add an additional whitespace to be matched before and after the backticks, but they are not necessary when the backticks are used correctly.
I wouldn't bet on this but I think MS SQL can have such beasts as "my table". Or did I get it wrong? Furthermore I'd prefer readability over MySQL friendliness at this level. -- Ivan Sergio Borgonovo http://www.webthatworks.it
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I have an indeed brain storming about using `` or [ ], too. The main point of choosing [ ] is about its pair up style, so the regex for replace will be much simpler, and similar idea as { } (not new to drupaler). I am not trying to trade off our MySQL users; BTW, just because `` is less optimal in this case, if we are asking for a suitable and universal escape character token ;-( Konstantin Käfer wrote: |> First of all, quota all potential reserved word with [ ] (since it is a pair up token, easy for regex), and further more replace it as DB specific delimited identifiers within each driver implementation. | | If we go that route, I suggest using `` over []. As you said, backticks are the MySQL quotes and since 90% of all Drupal sites use MySQL it seems straightforward to use that method. | | Konstantin - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHucr1BPIQaq+ZRd8RAokZAJ96pcp/taKXaF1sruCkqAldE5G+IgCglWDy GXQJVUsYgZ9ErGpNOgpzF6I= =/Xc3 -----END PGP SIGNATURE-----
Totally correct. Each DB come with its specific delimited identifiers, e.g. ` (MySQL), " (PostgreSQL, Oracle, DB2, SQLite), [ ] (MSSQL). So what I am proposing in http://drupal.org/node/371 is quite simple: use an universal token, as like as the idea of { } for table prefix.
Well, I see you're already on the right track. I should have read that first ... :) One thing to keep in mind is that delimited identifiers are case sensitive per the SQL standard. That's probably not an issue, I just didn't see it brought up, and it's something to be aware of. This is something that an ORM deals with easily because it already has the concept of identifiers. -Dave
On Mon, 18 Feb 2008 11:18:13 -0600, "David Durham, Jr." <david.durham.jr@gmail.com> wrote:
Totally correct. Each DB come with its specific delimited identifiers, e.g. ` (MySQL), " (PostgreSQL, Oracle, DB2, SQLite), [ ] (MSSQL). So what I am proposing in http://drupal.org/node/371 is quite simple: use an universal token, as like as the idea of { } for table prefix.
Well, I see you're already on the right track. I should have read that first ... :) One thing to keep in mind is that delimited identifiers are case sensitive per the SQL standard. That's probably not an issue, I just didn't see it brought up, and it's something to be aware of. This is something that an ORM deals with easily because it already has the concept of identifiers.
-Dave
Actually, that's something a query builder helps with since it deals with structured data and the builder can do whatever escaping is needed without a regex. An ORM is a different matter. One of the take-aways for me from the Data API Design Sprint was that we do not, in fact, want a classic ORM. An ORM is an Object-*Relational*-Mapper. We need to be able to handle more data than just that coming from a local relational database. That means we do not want an ORM, we want an object-storage-mapper. A query builder, on the other hand, is a different and still perfectly usable animal. It's entirely SQL-centric, which for many uses is fine. I'm actually against adding more regexes to the database system. Regexing a serialized data structure is a design flaw. If we're colliding with reserved words, let's just update the schema to not use reserved words per the SQL 99 spec. Problem solved. --Larry Garfield
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 | I'm actually against adding more regexes to the database system. Regexing a serialized data structure is a design flaw. If we're colliding with reserved words, let's just update the schema to not use reserved words per the SQL 99 spec. Problem solved. Hope to point out an interesting founding: the performance is not much different from our existing implementation, even I try to add some stuff in order to support totally 5 databases, both legacy and PDO implementation. This is proved by benchmarking result (http://edin.no-ip.com/project/siren/siren-1.0#benchmarking_result). We may not need to seems this minor regex as a monster. Using escape character can solve the conflict *forever*; BTW, we always need to keep trace our source code in order to avoid the use of conflict wording (e.g. if database upgrade with some new reserved word, we will also need to upgrade our core; who can for sure that MySQL won't add some new wordings?). IMHO, I would prefer a permanent solution, rather than an open issue always stay beside us ;-) - - Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHudunBPIQaq+ZRd8RAifOAJ0XUr2E15rPvIIo6aEy6EnXWTtlRQCeLlTg YCbLUbow/JiGjPYGn/HqZjM= =e0GS -----END PGP SIGNATURE-----
On Feb 18, 2008 12:57 PM, Larry Garfield <larry@garfieldtech.com> wrote:
Actually, that's something a query builder helps with since it deals with structured data and the builder can do whatever escaping is needed without a regex. An ORM is a different matter.
You're technically right, of course, but can you really call something an ORM if it doesn't have one or more query builders? And that's the bug right there, you need more than one query builder, so how to neatly plug those in? -Dave
On Mon, 18 Feb 2008 13:40:40 -0600, "David Durham, Jr." <david.durham.jr@gmail.com> wrote:
On Feb 18, 2008 12:57 PM, Larry Garfield <larry@garfieldtech.com> wrote:
Actually, that's something a query builder helps with since it deals with structured data and the builder can do whatever escaping is needed without a regex. An ORM is a different matter.
You're technically right, of course, but can you really call something an ORM if it doesn't have one or more query builders? And that's the bug right there, you need more than one query builder, so how to neatly plug those in?
-Dave
An ORM requires a query builder. A query builder alone is not an ORM. I don't think we need/want a complete database-ORM, but a query builder is very valuable to have. --Larry Garfield
On Feb 18, 2008 2:07 PM, Larry Garfield <larry@garfieldtech.com> wrote:
An ORM requires a query builder. A query builder alone is not an ORM. I don't think we need/want a complete database-ORM, but a query builder is very valuable to have.
I know what you're saying, but a single query builder doesn't seem to suffice. You need one for mysql, postgres .. etc.
On Feb 18, 2008 2:19 PM, David Durham, Jr. <david.durham.jr@gmail.com> wrote:
On Feb 18, 2008 2:07 PM, Larry Garfield <larry@garfieldtech.com> wrote:
An ORM requires a query builder. A query builder alone is not an ORM. I don't think we need/want a complete database-ORM, but a query builder is very valuable to have.
I know what you're saying, but a single query builder doesn't seem to suffice. You need one for mysql, postgres .. etc.
forgot to mention that something as seemingly trivial as join tables can required different sql. Think someone brought up a point about the new version of postgres requires casts when the datatypes used to join are different.
On Mon, 18 Feb 2008 14:20:38 -0600 "David Durham, Jr." <david.durham.jr@gmail.com> wrote:
forgot to mention that something as seemingly trivial as join tables can required different sql. Think someone brought up a point about the new version of postgres requires casts when the datatypes used to join are different.
Last version became much stricter about auto casts. I do think it is a good thing even if it may have an initial cost. Better fail early. -- Ivan Sergio Borgonovo http://www.webthatworks.it
On Monday 18 February 2008, David Durham, Jr. wrote:
On Feb 18, 2008 2:19 PM, David Durham, Jr. <david.durham.jr@gmail.com> wrote:
On Feb 18, 2008 2:07 PM, Larry Garfield <larry@garfieldtech.com> wrote:
An ORM requires a query builder. A query builder alone is not an ORM. I don't think we need/want a complete database-ORM, but a query builder is very valuable to have.
I know what you're saying, but a single query builder doesn't seem to suffice. You need one for mysql, postgres .. etc.
forgot to mention that something as seemingly trivial as join tables can required different sql. Think someone brought up a point about the new version of postgres requires casts when the datatypes used to join are different.
First, there's only so far a database can customize its syntax before it doesn't count as SQL anymore. :-) Second, please have a look at the database_query_builder.inc file in the pdo directory of my sandbox. (My CVS name is crell.) It should handle any reasonably standard SQL database, and is fully overridable for a specific database type using simple class inheritance. I am working on the patch to integrate it into core, which I hope to have ready for testing by the end of the week. At that point I welcome feedback on its compatibility with other databases. -- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
I poked around in your sandbox a bit Larry. Saw some cool ideas pretty well on their way to implementation there! :-) ..chris On Feb 19, 2008 2:57 AM, Larry Garfield <larry@garfieldtech.com> wrote:
On Monday 18 February 2008, David Durham, Jr. wrote:
On Feb 18, 2008 2:19 PM, David Durham, Jr. <david.durham.jr@gmail.com> wrote:
On Feb 18, 2008 2:07 PM, Larry Garfield <larry@garfieldtech.com> wrote:
An ORM requires a query builder. A query builder alone is not an ORM. I don't think we need/want a complete database-ORM, but a query builder is very valuable to have.
I know what you're saying, but a single query builder doesn't seem to suffice. You need one for mysql, postgres .. etc.
forgot to mention that something as seemingly trivial as join tables can required different sql. Think someone brought up a point about the new version of postgres requires casts when the datatypes used to join are different.
First, there's only so far a database can customize its syntax before it doesn't count as SQL anymore. :-)
Second, please have a look at the database_query_builder.inc file in the pdo directory of my sandbox. (My CVS name is crell.) It should handle any reasonably standard SQL database, and is fully overridable for a specific database type using simple class inheritance. I am working on the patch to integrate it into core, which I hope to have ready for testing by the end of the week. At that point I welcome feedback on its compatibility with other databases.
-- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012
"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
participants (7)
-
Chris Johnson -
David Durham, Jr. -
Edison Wong -
Ivan Sergio Borgonovo -
Jakob Petsovits -
Konstantin Käfer -
Larry Garfield