SQL and connection strings
I'm working on a contrib module that will be used as an enterprise report writer for drupal and other external databases and XML feeds. Some of my users are interested in the capability to store ad-hoc queries and then write reports off of them, (kind of like Crystal Reports). My question is two fold: 1. Do people think its untenable to store ad hoc sql queries either in the drupal site files structure, or in the db? Right now my module assumes that all SQL that will be used for reporting is stored on the file system. I did this because for most enterprise reporting systems I feel it would be advantageous to be able to version control the SQL that is used in these reports. But ad hoc reporting really means writing your own queries, right? So do you think it's wrong to store them on file system or in db? Is it irresponsible to provide such a feature? 2. The connection string (with passwords). Right now I have the site admins hack in connection strings for external databases into settings.php in the sites folder. That probably won't work for a true ad hoc reporting solution, since we may want users to be able to connect to databases from the UI? Is it untenable to store these using variable_set? Are there any ideas for securing them inside the drupal DB? I realize that there will be many opinions on this topic, but I really want to get a feel for the opinions and the why of seasoned developers. Opinions that talk about risk trade offs would be much more welcome than simple, "no you should never do that" statements. More info on the project can be found at: http://drupal.org/project/forena Thanks for any input you all are willing to provide. Dave
For the connection string in D6, I believe you can muck with $GLOBAL['db_url'] at any time to add connections to it. You may need $db_url to already be using the array format; not sure. (In D7 you absolutely can add connections at any time, although there's a dedicated API for it.) That said, if you're talking about totally arbitrary SQL strings without variable components you may be better off bypassing the D6 DB layer entirely and just using PDO directly. Or, for that matter, use the D6 backport of the D7 DB layer, which is far more robust: http://drupal.org/project/dbtng As far as the queries themselves, it depends on how much you trust your users. You certainly could setup a system quite easily where users can enter SQL into a node body or CCK text field and then a formatter or something like that takes that query, runs it against the DB, and shows the results in a table. That's not at all hard to do, and much more robust and extensible than a variable_set(). The question is security. Anyone who can write a data modification query (insert, update, delete) has root access to your site. End of story. I presume you don't want that. In your case you're hitting a DB other than the main Drupal DB so that's the DB that would be at risk. So first off, you'd need to whitelist SELECT queries and disallow anything else. Then you'd need to be extremely careful to make sure the queries run only against the 3rd party database. Then you need to determine if there are any tables in that DB that users shouldn't have access to and whitelist/blacklist to protect those. That will likely take some interesting regular expressions. All doable, but touchy depending on the details. And it requires having intimate knowledge of SQL and that particular DB to write a query, and if you make a syntax error you have to re-edit the node, which means your query runner cannot possibly crash the request if there is a syntax error because you'll make it impossible to use the site. Alternatively, what type of SQL DB are we talking about here? If it's MySQL... Views can support 3rd party SQL databases just fine, even in Views 2. Just throw Views at it, give selected users access to the Views UI, and call it a day. :-) On Thursday 15 July 2010 10:42:19 pm David Metzler wrote:
I'm working on a contrib module that will be used as an enterprise report writer for drupal and other external databases and XML feeds. Some of my users are interested in the capability to store ad-hoc queries and then write reports off of them, (kind of like Crystal Reports).
My question is two fold: 1. Do people think its untenable to store ad hoc sql queries either in the drupal site files structure, or in the db? Right now my module assumes that all SQL that will be used for reporting is stored on the file system. I did this because for most enterprise reporting systems I feel it would be advantageous to be able to version control the SQL that is used in these reports. But ad hoc reporting really means writing your own queries, right? So do you think it's wrong to store them on file system or in db? Is it irresponsible to provide such a feature?
2. The connection string (with passwords). Right now I have the site admins hack in connection strings for external databases into settings.php in the sites folder. That probably won't work for a true ad hoc reporting solution, since we may want users to be able to connect to databases from the UI? Is it untenable to store these using variable_set? Are there any ideas for securing them inside the drupal DB?
I realize that there will be many opinions on this topic, but I really want to get a feel for the opinions and the why of seasoned developers. Opinions that talk about risk trade offs would be much more welcome than simple, "no you should never do that" statements.
More info on the project can be found at:
http://drupal.org/project/forena
Thanks for any input you all are willing to provide.
Dave
Or, for that matter, use the D6 backport of the D7 DB layer, which is far more robust: http://drupal.org/project/dbtng ... The question is security. Anyone who can write a data modification query (insert, update, delete) has root access to your site. End of story. I presume you don't want that. In your case you're hitting a DB other than the main Drupal DB so that's the DB that would be at risk. ... So first off, you'd need to whitelist SELECT queries and disallow anything else.
Crell's considerations are awesome, as always. Reading them, I'd recommend to 1) Create a new database user, having read (SELECT) privileges only, on selected tables only. 2) Switch to that user at runtime to perform the queries. Ideally using the backport of DBTNG. Perhaps try to parse the user input, or as Crell suggests, use Views + Views UI as query builder. 3) When thinking Views, your question about storage translates into "export into code" for file storage. However, I'm not sure whether Views is ready for end-user consumption, usability-wise. Of course, if you didn't have any query builder UI in mind in the first place, then Views sounds like ice-cream. Lastly, don't forget to report about your evaluation, progress, and decisions on Planet Drupal. :) I guess this topic is quite interesting for many users, perhaps especially enterprise users. sun
On Fri, 16 Jul 2010 00:16:48 -0500 Larry Garfield <larry@garfieldtech.com> wrote:
As far as the queries themselves, it depends on how much you trust your users. You certainly could setup a system quite easily where users can enter SQL into a node body or CCK text field and then a formatter or something like that takes that query, runs it against the DB, and shows the results in a table. That's not at all hard to do, and much more robust and extensible than a variable_set().
That's pretty nice since you can use revisions. Of course it is not going to work as a full fledged rcs.
The question is security. Anyone who can write a data modification query (insert, update, delete) has root access to your site. End of story. I presume you don't want that. In your case you're hitting a DB other than the main Drupal DB so that's the DB that would be at risk.
So first off, you'd need to whitelist SELECT queries and disallow anything else. Then you'd need to be extremely careful to make sure the queries run only against the 3rd party database. Then you need to determine if there are any tables in that DB that users shouldn't have access to and whitelist/blacklist to protect those. That will likely take some interesting regular expressions.
Unfortunately if he is going to support multiple engines that's not going to be easy. Filtering SELECT may not be enough: The SQL standard should forbid stuff like select * from (insert into pippo values('pluto') returning topolino); Still: - I wouldn't blindly trust all the implementations. Maybe they offer some "feature" that let circumvent the standard. (dynamic definition of statement assigned to a variable that get "EXEC", "PERFORM") - functions that produce side effects eg. PostgreSQL has nextval() but other engines may have others (that shouldn't be a problem if you can use GRANT) For engines that support GRANT and REVOKE it is easier to enforce who can do what in the DB. Then he could associate roles with connection strings (different users with different access levels already enforced in the DB). Unfortunately not all the DB engine supported by PDO support GRANT and REVOKE. If he wants to offer some layer of security for those DB he'll actually have to filter SELECT on his own. Maybe in real cases where you don't have GRANT the DB engine won't make it hard to prevent side effects of SELECT. He'd be able to find pre-packaged php snippet to filter SQL and avoid to forget all things other people thought before (UN/* */ION -- etc...). Anyway I don't think programs like Crystal Report solve this problem. So people plugging a reporting engine in a DB shouldn't expect it to take care of their security concern. It may be useful for reporting to be able to insert and execute multiple statement (many things get faster and even easier to think with temporary tables). That's something that is not generally permitted through drupal abstraction layer. He'd think if Drupal DB abstraction layer will mostly come into his way or still be useful. I'd keep using Drupal DB AL mainly to reuse Views for the most simple queries. I'd say that the most challenging part of emulating something like Crystal Report would be the UI and the WYSWYG part, and if he is planning to go that route then... Views is going to be just a Pre-view. Summing it up he may have roles that: - map to different connections, db users with different GRANT - have different SQL filters: - GUI only - SELECT only - Full SQL And add a paragraph to the module documentation that warns about security of letting people enter unfiltered SQL. I'd warn not only about data modification but also about resource exhaustion. Maybe hiding too well the DB behind a nice UI will make them forget there is still a DB sweating behind ;)
All doable, but touchy depending on the details. And it requires having intimate knowledge of SQL and that particular DB to write a query, and if you make a syntax error you have to re-edit the node, which means your query runner cannot possibly crash the request if there is a syntax error because you'll make it impossible to use the site.
?
Alternatively, what type of SQL DB are we talking about here? If it's MySQL... Views can support 3rd party SQL databases just fine, even in Views 2. Just throw Views at it, give selected users access to the Views UI, and call it a day. :-)
If he is going after the same users of Crystal Report he will have to let people insert SQL. There are things that simply can't be done with aggregates and join only and most of them are the real juice of reporting. -- Ivan Sergio Borgonovo http://www.webthatworks.it
2. The connection string (with passwords). Right now I have the site admins hack in connection strings for external databases into settings.php in the sites folder. That probably won't work for a true ad hoc reporting solution, since we may want users to be able to connect to databases from the UI? Is it untenable to store these using variable_set? Are there any ideas for securing them inside the drupal DB?
I can confirm that you can connect to dbs by dynamically adding connection strings stored in the DB to $GLOBAL['db_url'] at any time, which does save having people editing settings.php We do it to connect to other Drupal databases for queries. For external DBs might be interesting to see if you could use Data ( drupal.org/project/data) to integrate with Views and let users build the queries via the Views UI. Never tried dynamically doing that but I think it is worth considering. Best, Ronald --- http://www.istos.it
Adding connection strings to globals avoids the question of, "where are they stored and where do they come from?" If the user is "configuring a database connection" for ad hoc queries, do you think the security team would consider it a vulnerability to be storing these connection strings with variable_set, or some other created table? I haven't really heard a "that's crazy,man" response yet... so maybe this is not a concern. After all, we do store password hashes in the DB right? Oath tokens, etc. ? FYI: When I use forena, I'll be sure and create users that have read only access to the database. That way you can create data sources that use the power of posgtres and oracles XML generating functions, and seems more secure. I like the idea of white listing SELECT statements, but I don't think that you could really write one that is purely, and I think I might just always wrap the ad-hoc querys in SELECT * FROM ( ad hoc query here ) . Make sure there are no comment characters and I think I'm good to go? Regarding Views: Views is great! I love and use this module all the time. It does not however pretend to really provide ad hoc query mechanisms, nor an ad hoc reporting system, nor does it provide a database centric reporting engine, and the development of the api is far too much overhead for my purposes. With my prior XML reporting systems you can go from the SQL to a presentable report in less that 5 minutes. I simply cannot match that pace with views. Views 2 doesn't provide an api that will let you select from say a table of moodle users or the financial transactions of a company or any other non-drupal content without significant non-sql code development, and certainly not in 5minutes or less. With forena you'll be able to make form letters, etc. and use a wysiwyg editor to design custom report stuff. All you need to know is HTML, plus some custom report tags. Really, I'm not trying to compete with the views developers here. You guys ROCK for sure! On Jul 15, 2010, at 10:34 PM, Ronald Ashri wrote:
2. The connection string (with passwords). Right now I have the site admins hack in connection strings for external databases into settings.php in the sites folder. That probably won't work for a true ad hoc reporting solution, since we may want users to be able to connect to databases from the UI? Is it untenable to store these using variable_set? Are there any ideas for securing them inside the drupal DB?
I can confirm that you can connect to dbs by dynamically adding connection strings stored in the DB to $GLOBAL['db_url'] at any time, which does save having people editing settings.php
We do it to connect to other Drupal databases for queries.
For external DBs might be interesting to see if you could use Data (drupal.org/project/data) to integrate with Views and let users build the queries via the Views UI. Never tried dynamically doing that but I think it is worth considering.
Best,
Ronald --- http://www.istos.it
Adding connection strings to globals avoids the question of, "where are they stored and where do they come from?" If the user is "configuring a database connection" for ad hoc queries, do you think the security team would consider it a vulnerability to be storing these connection strings with variable_set, or some other created table? I haven't really heard a "that's crazy,man" response yet... so maybe this is not a concern. After all, we do store password hashes in the DB right? Oath tokens, etc. ?
You want reversible encryption here. Seems like there is a handy API module for it: http://drupal.org/project/encrypt
Excellent suggestion. Looks really promising. Thanks, Dave On Jul 16, 2010, at 6:53 AM, Moshe Weitzman wrote:
Adding connection strings to globals avoids the question of, "where are they stored and where do they come from?" If the user is "configuring a database connection" for ad hoc queries, do you think the security team would consider it a vulnerability to be storing these connection strings with variable_set, or some other created table? I haven't really heard a "that's crazy,man" response yet... so maybe this is not a concern. After all, we do store password hashes in the DB right? Oath tokens, etc. ?
You want reversible encryption here. Seems like there is a handy API module for it: http://drupal.org/project/encrypt
David Metzler wrote:
Regarding Views:
Have you considered creating database views for your user created ad-hoc query? Then the reporting is ready to go and the next user can leverage on an already created view. -- Earnie -- http://progw.com -- http://www.for-my-kids.com
Not really. It's a clever idea, but I think views tend to be too permanent. We use them extensively, of course, in all of our enterprise dbs. We've got almost 100 reports that have been converted to this tool already, and they use views as their data sources, but those are for what I'd classify as operational reports. Truly ad hoc queries are typically to support research requests, (e.g. government inquiries) and tend to be one time or very occasional use. Users may export to excel, or they may make a web report and run with it. We hope that they'll have a coder write sql for the operational ones, to make sure that they perform well. In most cases the ad-hoc queries are already against views that our developers have created to begin with. Views often also require special/elevated permissions to create. Another concern is database portability. Views in mysql take some uncomfortable file system access, I've heard, and it might be hard to create database agnostic view creation interface. Thanks for the thoughts though. P.S. I don't want to clog this list any further with a single project specific issue, but I welcome ideas and collaboration on this project. Forena is nearing beta release, and I'm getting pretty excited about it's potential. Meet me in the issue queue for more info. Dave On Jul 16, 2010, at 7:20 AM, Earnie Boyd wrote:
David Metzler wrote:
Regarding Views:
Have you considered creating database views for your user created ad-hoc query? Then the reporting is ready to go and the next user can leverage on an already created view.
-- Earnie -- http://progw.com -- http://www.for-my-kids.com
participants (7)
-
Daniel F. Kudwien -
David Metzler -
Earnie Boyd -
Ivan Sergio Borgonovo -
Larry Garfield -
Moshe Weitzman -
Ronald Ashri