Curl-based content generation
Hi!

I'm working on a small curl-based script to add content to a drupal-6 site. Logging in (getting cookie etc) works fine but when I try to add content I get a validation error. I guess it's about security. Is it at all possible to do what I'm trying? Is there documentation on how these form tokens actually work and what is required? I couldn't find that in the formAPI docs.

Here's the code:

# login
curl http://localhost/USA08/user \
  -F 'name=Martin' \
  -F 'pass=pass' \
  -F 'form_id=user_login' \
  -F 'op=Log in' \
  --output /Users/martin/response.html \
  -s \
  -c /tmp/curl-cookies.txt \
  -b /tmp/curl-cookies.txt
# -> works: response.html says I'm logged-in

# add page
curl http://localhost/USA08/node/add/page \
  -F 'title=xyz' \
  -F 'teaser_include=1' \
  -F 'body=abc' \
  -F 'format=1' \
  -F 'changed=' \
  -F 'form_id=page_node_form' \
  -F 'log=' \
  -F 'comment=0' \
  -F 'name=Martin' \
  -F 'date=' \
  -F 'status=1' \
  -F 'op=Save' \
  --output /Users/martin/response2.html \
  -s \
  -c /tmp/curl-cookies.txt \
  -b /tmp/curl-cookies.txt
# -> does not work: response2.html says 'Validation error, please try again. If this error persists, please contact the site administrator.'

Regards,
Martin
On Thu, Feb 05, 2009 13:46:07 PM +0100, Martin Stadler wrote:
Hi!
I'm working on a small curl-based script to add content to a drupal-6 site. Logging-in (getting cookie etc) works fine but when I try to add content I get a validation error. I guess it's about security. Is it at all possible to do what I'm trying?
Martin, it's possible. I *had* made something like this work with Drupal 4.x. Then I abandoned that project, but just one month ago I asked here about the best way to resurrect it (the thread "Info needed to add content to Drupal via shell/curl script"). Due to more urgent tasks I have had to abandon that project again for the moment, but I'm still very interested, so please keep me posted, even off list, if you succeed. This said,
Is there documentation on how these form tokens actually work and what is required? I couldn't find that in the formAPI docs.
this is the same problem I had, IIRC. The basic flow is simple, conceptually. The problem is:
1) how does one know in advance what is the exact sequence of pages to visit, that is of Urls to pass to curl?
2) how does one know **in advance**, without going by trial and error, what are all the fields one should pass to curl each time, via -F?
3) how can one know **in advance** the list of admissible values for each field of each form (especially taxonomy terms??)?
4) can we be confident minor upgrades won't break the script, that is that the answers above won't change going from 6.x to 6.x+1?
5) Am I missing something else here?
As a wild, really wild guess, I **believe** that this problem you have:
# -> does not work: response2.html says 'Validation error, please try again. If this error persists, please contact the site administrator.'
may be due to point 1 above, that is, you didn't call all the pages Drupal wants you to visit, in the right order.

One part of the problem is that the answers to questions 2 and 3 are different for each site, that is, they depend on which modules and node types you have active, how many categories there are and how many values they have, whether you can upload attachments or not, etc. For me, what would be great would be to have some simple method to get this information via MySQL. In other words, what MySQL query or queries on the server would return a list of all the parameters asked for in questions 2 and 3 above?

So, I have no specific answers, but I hope that this post helps us to get them from the developers.

HTH,
Marco
--
Your own civil rights and the quality of your life heavily depend on how software is used *around* you: http://digifreedom.net/node/84
This cannot be a general approach to add content to an arbitrary Drupal site. Your points are valid, and if you need that, you need to write a module providing an API or use SQL directly.

What I'm trying to do is write a very custom one-time script to add content automatically to a certain site. So I can find out about the form details once in advance and that should be it. Yes, it may depend on what modules you have enabled, and form details may change even in minor Drupal versions (security fixes, for example). I'm aware of that and I'd adjust my script accordingly in that case.

I just want to know what I need to send to add content to the default page content type with no modifications for the latest stable Drupal 6 release.

Martin
Hi Martin,

The FormAPI generates a security token. I'm not sure whether it's something in the session or returned in the form itself (the latter, I think). So you'll probably want to visit the page your form is on, grab the variables and post them back in your submission.

Regards,
Steven Jones
ComputerMinds ltd
http://www.computerminds.co.uk
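Steven's suggestion (fetch the form page, grab its variables, post them back) could be sketched in shell roughly as follows. This is only a sketch under assumptions: the function name hidden_fields and the sample markup are made up for illustration, and it assumes every hidden input sits on a single line with its name attribute before its value attribute, which a real site's markup need not honour.

```shell
#!/bin/sh
# Print name=value for each hidden <input> read from stdin.
# Assumption: one tag per line, attributes in the order type, name, ..., value.
hidden_fields() {
  grep -o '<input type="hidden"[^>]*>' |
    sed -n 's/.*name="\([^"]*\)".*value="\([^"]*\)".*/\1=\2/p'
}

# Hypothetical excerpt of a node form, for illustration only.
sample='<form action="/drupal/?q=node/add/page" method="post" id="node-form">
<input type="text" name="title" id="edit-title" value="" />
<input type="hidden" name="form_build_id" id="form-abc123" value="form-abc123"  />
<input type="hidden" name="form_token" id="edit-page-node-form-form-token" value="63fe773e820d2a4565720ab3bd0fc991"  />
<input type="hidden" name="form_id" id="edit-page-node-form" value="page_node_form"  />
</form>'

# Lists only the hidden fields; the visible title field is skipped.
printf '%s\n' "$sample" | hidden_fields
```

Against a live site the input would be the saved form page (for example `hidden_fields < response.html`), and each printed pair could be handed back to curl as a -F argument.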
On Thu, Feb 05, 2009 15:41:15 PM +0100, Martin Stadler wrote:
This cannot be a general approach to add content to an arbitrary Drupal site. Your points are valid and if you need that you need to write a module providing an API or use SQL directly.
I am well aware that this job must be (partly) redone for each Drupal site one needs to work with. I'm fine with that, but I want a solution which works even if I have no shell access to the server where Drupal runs, or if I cannot install modules there.
What I'm trying to do is writing a very custom one-time script to add content automatically to a certain site.
same for me, really. Ditto for the rest of your reply. So we agree (I think) that the problem is not writing scripts manually for each site we want to access, but avoiding, if possible, lots of guesses and blind trials just to know what we should write in those scripts. This is where, I think, we'd both need help from the core developers who know the Drupal SQL tables in detail.

Important: at least for me, there is no problem running SQL queries on the server to get all the fields I should fill in via curl. That is a one-time job I can do myself or ask the webmaster to do for me. When I said above "solution which works even without shell access" I was referring to when I _use_ the script (**), which I'd write using the info I hope to find through this list.

(**) Once written, a curl script would only need the HTTP port open, so it would work behind any firewall I've come across so far. Whereas to run scripts on the server, even if I had a shell account, I would need the SSH port, which may or may not be open in some places.

Marco
Quoting Martin Stadler <martin@siarp.de>:
This cannot be a general approach to add content to an arbitrary Drupal site. Your points are valid and if you need that you need to write a module providing an API or use SQL directly.
What I'm trying to do is writing a very custom one-time script to add content automatically to a certain site. So I can find out about the form details once in advance and that should be it. Yes, it may depend on what modules you have enabled and form details may change even in minor Drupal versions (security fixes i.e.). I'm aware of that and I'd adjust my script accordingly in that case.
I just want to know what I need to send to add content to the default page content type with no modifications for the latest stable Drupal 6 release.
I build a node object and pass it to node_save [1] and execute cron batch style.

[1] http://api.drupal.org/api/function/node_save

--
Earnie
http://r-feed.com
Make a Drupal difference and review core patches.
-- http://for-my-kids.com/
-- http://give-me-an-offer.com/
Use drupal_execute().
http://api.drupal.org/api/function/drupal_execute/6
On Thu, Feb 05, 2009 13:50:46 PM -0500, Earl Dunovant wrote:
Use drupal_execute().
I don't know about Martin, but as far as I'm concerned, this goes hand in hand with Jimmy's suggestion to use the SimpleTest browser. Both are elegant solutions, but they require shell access to the server and/or changes/additions to the Drupal installation and/or a working PHP command line interpreter to do things remotely: all things much harder to have or set up than a bash/curl script running from home, or even from the barest live Linux distro on a USB key.

The only problem is that to write such a script one needs to know in advance:
- the exact sequence of URLs to call (isn't this documented anywhere but in the source code itself?)
- a list of the names and admissible values of all the form variables to pass to curl when it calls each one of those URLs. (**)

With all respect, can it really be that getting/providing just this information is such a complex task that it's simpler to learn the Drupal API or set up a full Drupal testing environment?

Thanks in advance for any pointer to a shell+curl-only solution to this problem.

Marco

(**) As I already said, I'd have no problem extracting those lists myself with a MySQL query to the database, if only I knew what query or queries to make to get all and only that information. Is this documented somewhere?
On Thu, Feb 5, 2009 at 3:37 PM, M. Fioretti <mfioretti@nexaima.net> wrote:
On Thu, Feb 05, 2009 13:50:46 PM -0500, Earl Dunovant wrote:
Use drupal_execute().
The only problem is that to write such a script one needs to know in advance:
- the exact sequence of URLs to call (isn't this documented anywhere but in the source code itself?) - a list of the names and admissible values of all the form variables to pass to curl when it calls each one of those URLs. (**)
You need to know the fields in the form to create the node type you want to create. You put the data in $form_state['values'], just as it is passed into hook_form_alter(). You can get that by writing a hook_form_alter() function that calls var_export($form_state['values'], 1);

Then,
1: call drupal_get_form() to get the form-defining array
2: populate your $form_state['values'] array with your new node data
3: call drupal_execute($form_id, $form, $form_state);

Lather, rinse, repeat until you're done. No URLs involved. Stick it in a module and call it from your script or a menu as you prefer. Read the docs and you'll see how it's done.
While I agree that this is the right approach, it also has some problems. The most important one is that access permissions for the forms aren't checked. You can get around that by checking them yourself in hook_form_alter() (you also need to unset #programmed in case you want to deny access for some reason). I haven't gotten field-level CCK permissions to work. Also be careful to set #redirect to FALSE in case you want to get useful feedback (e.g. from the services module).

Cheers,
Gerhard
On 5-Feb-09, at 1:50 PM, Earl Dunovant wrote:
Use drupal_execute().
Unfortunately, there are some static caches which can cause problems when you call drupal_execute() multiple times in one request. You don't say what content type you're using, but there are some specific issues with the book module and the menu system. If you do end up using drupal_execute(), either do one node add per POST, or look at applying and reviewing the following patches:

http://drupal.org/node/360377
book_get_books() cache becomes stale when batch-inserting book pages

http://drupal.org/node/364529
menu_tree_all_data() static cache can become stale

--Andrew
Andrew Berry wrote:
On 5-Feb-09, at 1:50 PM, Earl Dunovant wrote:
Use drupal_execute().
Unfortunately, there are some static caches which can cause problems when you call drupal_execute() multiple times in one request.
Ah, yes, I should have mentioned that. hook_forms is your friend there (but you will have to re-attach taxonomy in hook_form_alter and other things might be a bit strange too). Another problem is that you will need a lot of memory if you have either a lot of forms or quite complicated ones.

Cheers,
Gerhard
You can look at the name attributes of the input elements in the source page. The thing you won't get is the form token: the form token is a mechanism designed to prevent exactly what you're trying to do. The form token is an input field that is generated with the rest of the form and stored server side. If you send a valid token back to the server, it processes your request; otherwise it doesn't, because it thinks you're a spammer.

Dmitri
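Dmitri's first point, reading the name attributes from the page source, can be approximated with grep and sed. What follows is a rough sketch that assumes the form page has already been saved with curl. The function name field_names and the sample markup are invented for illustration, and the pattern only sees tags whose name attribute is on the same line as the tag opening.

```shell
#!/bin/sh
# Print the distinct name attributes of input, select and textarea tags
# read from stdin. Assumption: each tag fits on one line.
field_names() {
  grep -E -o '<(input|select|textarea)[^>]*>' |
    sed -n 's/.*[[:space:]]name="\([^"]*\)".*/\1/p' |
    LC_ALL=C sort -u
}

# Hypothetical excerpt of a page node form, for illustration only.
sample='<form action="/drupal/?q=node/add/page" method="post" id="node-form">
<input type="text" maxlength="255" name="title" id="edit-title" size="60" value="" />
<textarea cols="60" rows="20" name="body" id="edit-body"></textarea>
<input type="radio" name="format" id="edit-format-1" value="1" checked="checked" />
<input type="radio" name="format" id="edit-format-2" value="2" />
<input type="hidden" name="form_token" id="edit-page-node-form-form-token" value="63fe773e820d2a4565720ab3bd0fc991" />
<input type="hidden" name="form_id" id="edit-page-node-form" value="page_node_form" />
<input type="submit" name="op" id="edit-submit" value="Save" />
</form>'

# Each field name is printed once, even when several tags share it (the radios).
printf '%s\n' "$sample" | field_names
```

This lists the names to pass via -F, but not the admissible values. Those would still have to be read from the value attributes and option tags of the same page.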
Quoting Dmitri Gaskin <dmitrig01@gmail.com>:
You can look at the name attributes of the input elements in the source page.
The devel module will give you this on the form itself.
The thing you won't get is the form token: the form token is a mechanism designed to prevent exactly what you're trying to do.
Unfortunately, the devel module doesn't give you this either, as far as I've been able to find. I tend to create a hook_form_alter() and display it in the log table.

--
Earnie
I was pretty close. I needed to add the security token and the form ID. What I was really searching for was this documentation about cURL and Drupal: http://drupal.org/node/80548. I'm basically using this now, but originally I wanted to get by with a shell script only. Actually, I'm trying to export images from iPhoto to Drupal using AppleScript, as I didn't want to dive into plugin development, and you can call shell scripts from AppleScript.

What I learned: The security token (form_token) depends on the session ID (plus the form ID and a private key), so you need to create a session with the client you want to use to create the content (or create a cookie with such a session ID). Just send a request for the add page and extract the token from the HTML form. This token can be used from then on to add content using a tool like cURL. Checking the resulting HTML pages also confused me: the redirect did not work with cURL, so I didn't get the success message I expected, even though the creation was successful.

Here's my working cURL code:

# login
curl 'http://localhost/drupal/?q=user' \
  -s \
  -c cookie.txt \
  -b cookie.txt \
  -F 'name=user' \
  -F 'pass=pass' \
  -F 'form_id=user_login' \
  -F 'op=Log in' \
  --output response0.html

# get form
curl 'http://localhost/drupal/?q=node/add/page' \
  -s \
  -c cookie.txt \
  -b cookie.txt \
  --output response1.html
# -> extract token from response1.html (/edit-page-node-form-form-token" *value="([^"]*)"/)

# add page
# -> use extracted token
curl 'http://localhost/drupal/?q=node/add/page' \
  -s \
  -c cookie.txt \
  -b cookie.txt \
  -F 'title=xyz' \
  -F 'body=abc' \
  -F 'form_id=page_node_form' \
  -F 'form_token=63fe773e820d2a4565720ab3bd0fc991' \
  -F 'status=1' \
  -F 'revision=1' \
  -F 'op=Save' \
  --output response2.html
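The token extraction and the final POST can be tied together so that the token no longer has to be pasted in by hand. The following is a sketch, not a tested client. The function names extract_token and add_page and their parameters are hypothetical, and the sed pattern assumes the markup that the regular expression above implies (the id immediately followed by value, on one line). Treating an HTTP 302 as success is likewise an assumption, based on the redirect mentioned above.

```shell
#!/bin/sh
# Read a saved node form from stdin and print the form_token value.
extract_token() {
  sed -n 's/.*id="edit-page-node-form-form-token" *value="\([^"]*\)".*/\1/p' |
    head -n 1
}

# Fetch the add-page form, extract the token, submit a page, and print the
# HTTP status code. Arguments (all hypothetical names): site base URL,
# cookie jar already filled by the login request, title, body.
# Note: the variables below are global, since plain sh has no "local".
add_page() {
  base=$1; jar=$2; title=$3; body=$4
  token=$(curl -s -b "$jar" -c "$jar" "$base/?q=node/add/page" | extract_token)
  if [ -z "$token" ]; then
    echo "no form_token found; is the session logged in?" >&2
    return 1
  fi
  # --form-string sends the value literally, so a leading @ or < in the
  # title or body is not treated as a file reference.
  curl -s -b "$jar" -c "$jar" -o /dev/null -w '%{http_code}\n' \
    --form-string "title=$title" \
    --form-string "body=$body" \
    -F 'form_id=page_node_form' \
    --form-string "form_token=$token" \
    -F 'status=1' \
    -F 'op=Save' \
    "$base/?q=node/add/page"
}

# Offline check of the extraction step on a hypothetical line of markup.
printf '%s\n' '<input type="hidden" name="form_token" id="edit-page-node-form-form-token" value="63fe773e820d2a4565720ab3bd0fc991"  />' |
  extract_token
```

After the login request has written cookie.txt, a call such as `add_page http://localhost/drupal cookie.txt 'xyz' 'abc'` would print the status of the POST. Since curl does not follow the redirect without -L, a 302 would suggest the node was saved, while a 200 would suggest the form was shown again.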
participants (8)
- Andrew Berry
- Dmitri Gaskin
- Earl Dunovant
- Earnie Boyd
- Gerhard Killesreiter
- M. Fioretti
- Martin Stadler
- Steven Jones