Info needed to add content to Drupal via shell/curl script
Greetings,
I would like to ask all Drupal developers where to find the info to do what I describe below. I have already found almost identical questions asked both on the support list and in the forums at drupal.org, but they did not receive complete answers, so here I am.
I want to write a shell script which takes as input an HTML file and other parameters (title, category, etc...) and then, using curl and the POST method, logs into a Drupal website, adds a node with that text and parameters, logs off and returns the complete URL assigned by drupal to that page.
The information I need is:
- what is the exact sequence of pages (relative URLs) that drupal presents to users who login and then want to add a node?
- what is the complete list of POST parameters (assuming there is only one custom category CAT_1) that drupal wants to see POSTed to each of those pages?
- how much do the two answers above depend on the drupal version, and how likely are they to change in the future?
Important: I know I can look at all the http headers going back and forth between browser and drupal, and using this approach I have *already* written a working draft of the script myself, but I'm looking for a better, more reliable way to do this. I found that the script won't work consistently, meaning that I'd have to tweak it every time the Drupal version changes or (using it on other sites) depending on which modules are installed.
So, is there any official documentation which contains complete, reliable answers to the questions above? In other words, is there a better, more reliable and future-proof way to get those answers than studying source code or raw http sessions by trial and error?
Thank you in advance for any feedback and happy 2009!
Marco
-- Help *everybody* love Free Standards and Software: http://digifreedom.net
On Sun, Jan 4, 2009 at 12:39 AM, M. Fioretti <mfioretti@nexaima.net> wrote:
I want to write a shell script which takes as input an HTML file and other parameters (title, category, etc...) and then, using curl and the POST method, logs into a Drupal website, adds a node with that text and parameters, logs off and returns the complete URL assigned by drupal to that page.
The information I need is:
- what is the exact sequence of pages (relative URLs) that drupal presents to users who login and then want to add a node?
- what is the complete list of POST parameters (assuming there is only one custom category CAT_1) that drupal wants to see POSTed to each of those pages?
- how much the two answers above depend on drupal version, or will change in the future?
There was a discussion at the Portland Drupal Group about using the SimpleTest module to do this type of thing. If nothing else it should be a good place to steal some code. There are likely unit tests for HEAD that would do what you're describing.
Good luck,
andrew
On Sun, January 4, 2009 12:13 pm, andrew morton wrote:
There was a discussion at the Portland Drupal Group about using the SimpleTest module to do this type of thing.... There are likely unit tests for HEAD that would do what you're describing.
I have just downloaded simpletest and will study it later today, thanks, but I have one question anyway. From what I read, simpletest looks like:
1) a set of functions/classes/code to do in PHP what I was planning to do with bash and curl, but...
2) something that anyway requires, in advance, the same information I am asking for, which is the real subject of my search.
I am looking for documentation that tells me:
- define this exact series of form fields
- send them to this url
- define this second series of form fields
- send them to this other url
- etc...
I mean, if using simpletest from the command line eventually means typing less than writing a bash/curl implementation, that's great, I'll do that. But simpletest looks to me like a way to do automatically and much more quickly something you *already* know in all details, that is, a way to go through a constant sequence of predefined steps. Maybe I'm missing something, but can it make it faster to learn what that sequence *is*?
thanks,
Marco
-- Help *everybody* love Free Standards and Software: http://digifreedom.net
On Sun, Jan 4, 2009 at 3:27 AM, M. Fioretti <mfioretti@nexaima.net> wrote:
But simpletest looks to me as a way to do automatically and much more quickly something you *already* know in all details, that is a way to go through a constant sequence of predefined steps: maybe I'm missing something, but can it make faster to learn what that sequence *is* ?
Which is exactly why I suggested looking at the tests. There are tests for logging in and tests for creating nodes: http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/simpletest/test...
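As an illustration of the idea behind those tests (a sketch written for this thread, not code taken from SimpleTest): instead of hard-coding every POST parameter, a script can fetch the form page first, copy whatever hidden fields Drupal put in the form, and then add only the values it cares about. That way tokens and build ids never need to be known in advance. The form id `node-form`, the field names and the sample HTML below are assumptions made up for the example; a real site may differ, which is exactly why the fields are read from the page. Python's standard library is used only to keep the sketch short; the same steps could be done with curl and a small parser.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical page fragment, loosely modeled on a node form; not real output.
SAMPLE_FORM = """
<form action="/node/add/page" method="post" id="node-form">
<input type="text" name="title" id="edit-title" value="" />
<input type="hidden" name="form_build_id" id="form-1234" value="form-1234" />
<input type="hidden" name="form_token" id="edit-token" value="abcdef" />
<input type="hidden" name="form_id" id="edit-page-node-form" value="page_node_form" />
<input type="submit" name="op" id="edit-submit" value="Save" />
</form>
"""


class HiddenFieldCollector(HTMLParser):
    """Collect the hidden <input> fields of the form with the given id."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # Only fields inside the wanted form are of interest.
            self.in_form = attrs.get("id") == self.form_id
        elif tag == "input" and self.in_form and attrs.get("type") == "hidden":
            if attrs.get("name"):
                self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def build_post_body(page_html, form_id, edit):
    """Merge the page's hidden fields with our own values and URL-encode them."""
    collector = HiddenFieldCollector(form_id)
    collector.feed(page_html)
    fields = dict(collector.fields)
    fields.update(edit)  # our values override anything found on the page
    return urlencode(fields)
```

Calling `build_post_body(SAMPLE_FORM, "node-form", {"title": "Hello", "op": "Save"})` yields a body that can be handed to curl with `--data`. The session cookie from the login step still has to be sent along with it, which this sketch does not cover.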
I developed something similar which uses the XML-RPC mechanism to create nodes. I wrote a custom module to receive the data, create a node object, and call node_save() on it. To submit the data, I used Perl.
If you're interested, I can send you the code.
Ricky
On Jan 4, 2009, at 3:39 AM, M. Fioretti wrote:
I want to write a shell script which takes as input an HTML file and other parameters (title, category, etc...) and then, using curl and the POST method, logs into a Drupal website, adds a node with that text and parameters, logs off and returns the complete URL assigned by drupal to that page.
The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.
On the Drupal side, you should also look at the Services module, which offers XML-RPC services for creating nodes as well as many other Drupal functions.
On Wed, Jan 7, 2009 at 8:01 AM, Richard Morse <remorse@partners.org> wrote:
I developed something similar which uses the XML-RPC mechanism to create nodes. I wrote a custom module to receive the data, create a node object, and call node_save() on it. To submit the data, I used Perl.
If you're interested, I can send you the code.
Ricky
Hi!
In this particular case, I had some special variables I needed added, and I wanted to restrict the ability to post to certain IP addresses and user(s). I did look at services, and it was more than I needed or wanted...
But thanks!
Ricky
On Jan 7, 2009, at 11:16 AM, Greg Dunlap wrote:
On the Drupal side, you should also look at the Services module, which offers XML-RPC services for creating nodes as well as many other Drupal functions.
Hi, Ricky,
Seems very useful. Are you planning to package and provide it as a module at d.o.? Then people could comment and supply patches to develop it further.
If there are no such plans, I'll appreciate getting the existing code.
Cheers,
Tomáš
-- Tomáš J. Fülöpp http://vacilando.net
On Wed, Jan 7, 2009 at 17:01, Richard Morse <remorse@partners.org> wrote:
I developed something similar which uses the XML-RPC mechanism to create nodes. I wrote a custom module to receive the data, create a node object, and call node_save() on it. To submit the data, I used Perl.
If you're interested, I can send you the code.
Ricky
Hi!
The code I developed is a small custom module that handles one particular case very well. The more general case is probably better served by services.module. However, I will send you the code I have, which also demonstrates the Perl side of it...
Ricky
On Jan 7, 2009, at 11:18 AM, Tomáš Fülöpp wrote:
Hi, Ricky,
Seems very useful. Are you planning to package and provide it as a module at d.o.? Then people could comment and supply patches to develop it further.
If there are no such plans, I'll appreciate getting the existing code.
Cheers,
Tomáš
-- Tomáš J. Fülöpp http://vacilando.net
On Jan 7, 2009, at 11:18 AM, Tomáš Fülöpp wrote:
Hi, Ricky,
Seems very useful. Are you planning to package and provide it as a module at d.o.? Then people could comment and supply patches to develop it further.
If there are no such plans, I'll appreciate getting the existing code.
Hi!
Here is the code. First is the PHP code, then the Perl code. I've obfuscated some specific values (IP addresses, user names, etc.), but it should be fairly clear...

<?php

/**
 * Implementation of hook_xmlrpc.
 *
 * This function returns a list of the available XML-RPC calls provided
 * by this module.
 *
 * This module provides one call, which allows XXXX to submit reports
 * without having to do it manually.
 */
function reports_upload_xmlrpc() {
  return array(
    array('reports.submit', 'reports_upload_receive', array('string', 'struct'), t('Receive a report')),
  );
}

/**
 * This function is the actual processor of the XML-RPC call
 */
function reports_upload_receive($params) {
  // first, make sure that the remote computer is XXXX, otherwise fail
  if ($_SERVER['REMOTE_ADDR'] != 'XXX.XXX.XXX.XXX') {
    return xmlrpc_error(1, t('Not allowed'));
  }

  /* we receive:
       [title] => title,
       [report_date] => date,
       [body] => body,
       [input_format] => format (if empty, default format is used)
       [oi_restrict] => oi_restriction,
       [terms] => [ taxonomy_identifier, * ]
       [attachments] =>        ** this one is optional
         0 =>
           [name]
           [mime]
           [data]
  */

  // validate the structure of the submission
  if (empty($params['title'])) {
    return xmlrpc_error(10, t('Missing title'));
  }
  if (empty($params['report_date'])) {
    return xmlrpc_error(11, t('Missing report date'));
  }
  if (empty($params['body'])) {
    return xmlrpc_error(12, t('Missing body'));
  }
  if (empty($params['input_format'])) {
    $params['input_format'] = variable_get('filter_default_format', 1);
  }
  else if (is_numeric($params['input_format'])) {
    // do nothing
  }
  else {
    // this is broken right now, I don't know why...
    $input_filter_name = $params['input_format'];
    if (false !== strpos($input_filter_name, '"')) {
      return xmlrpc_error(12, t('Invalid filter format specified: quotes not allowed'));
    }
    $matching_formats = array_filter(filter_formats(), create_function('$f', 'return $f->name == "' . $input_filter_name . '";'));
    if (count($matching_formats) == 1) {
      $params['input_format'] = array_shift($matching_formats)->format;
    }
    else {
      return xmlrpc_error(12, t('Invalid filter format specified. No matching formats to !format', array('!format' => $input_filter_name)));
    }
  }
  if (empty($params['oi_restrict'])) {
    return xmlrpc_error(13, t('Missing restriction'));
  }
  if (!is_array($params['terms'])) {
    return xmlrpc_error(14, t('Bad terms parameter'));
  }
  if (!empty($params['attachments'])) {
    foreach ($params['attachments'] as $attachment_num => $attachment_info) {
      if (!is_numeric($attachment_num)) {
        return xmlrpc_error(15, t('Bad attachment index: !index', array('!index' => $attachment_num)));
      }
      if (empty($attachment_info['name'])) {
        return xmlrpc_error(16, t('Missing attachment name for index !index', array('!index' => $attachment_num)));
      }
      if (empty($attachment_info['mime'])) {
        return xmlrpc_error(16, t('Missing attachment mime-type for index !index', array('!index' => $attachment_num)));
      }
      if (empty($attachment_info['data'])) {
        return xmlrpc_error(16, t('Missing attachment data for index !index', array('!index' => $attachment_num)));
      }
    }
  }

  // set the proper user
  global $user;
  $orig_user = $user;
  $new_user = user_load(array('name' => 'XXXX'));
  $user = $new_user;

  // create a new node as an array
  $node = array(
    // our particular information
    'type' => 'report',
    'title' => $params['title'],
    'field_report_date' => array(array('value' => $params['report_date'])),
    'body' => $params['body'],
    'format' => $params['input_format'],
    'oi_nodeapi_restrict_eid' => oi_get_entity_eid($params['oi_restrict']),
    // some variable, but for our purposes static, information
    'uid' => $user->uid,
    // misc stuff that is needed to make things function
    'status' => 1,
    'promote' => 0,
    'sticky' => 0,
  );

  // add in the taxonomy for the report type
  $node['taxonomy'] = array();
  foreach ($params['terms'] as $term) {
    $tids = taxonomy_get_term_by_name($term);
    if (count($tids) != 1) {
      // restore the original user, then stop (the error must be returned)
      $user = $orig_user;
      return xmlrpc_error(14, t('Invalid report type -- returned !count terms for !term', array('!term' => $term, '!count' => count($tids))));
    }
    $tid_obj = $tids[0];
    $node['taxonomy'][$tid_obj->vid][$tid_obj->tid] = $tid_obj->tid;
  }
  if (!count($node['taxonomy'])) {
    unset($node['taxonomy']);
  }

  // add in any attachments
  if (!empty($params['attachments'])) {
    $node['files'] = array();
    foreach ($params['attachments'] as $a_index => $a_info) {
      $file_temp = file_save_data($a_info['data'], file_directory_path() . '/' . $a_info['name'], FILE_EXISTS_RENAME);
      $node['files']['upload_' . $a_index] = array(
        'fid' => 'upload_' . $a_index,
        'title' => basename($file_temp),
        'description' => $a_info['name'],
        'filename' => basename($file_temp),
        'filepath' => $file_temp,
        'filesize' => filesize($file_temp),
        'filemime' => $a_info['mime'],
        'remove' => 0,
        'list' => 1,
      );
    }
  }

  // convert to an object
  $node_obj = (object) $node;

  // save it
  node_save($node_obj);

  // fix the user to the original user
  $user = $orig_user;

  // return the node id created
  return "created : " . $node_obj->nid;
}

?>

#------------------------ Perl ---------

#!/usr/bin/perl

use strict;
use warnings;

use Getopt::Long;             # handle command line parameters
use Date::Format;             # provides date formatting
use File::Slurp qw( slurp );  # makes it easier to read a file
use XMLRPC::Lite;             # communicate with the website
use Data::Dumper;             # display any error messages

# the directory and file to take input from
my $in_dir = '';
my $in_file = '';

GetOptions('indir=s' => \$in_dir, 'file=s' => \$in_file);

if ($in_dir eq '') {
  die("You must supply the '-indir' parameter\n");
}
if ($in_file eq '') {
  die("You must supply the '-file' parameter\n");
}

# two date formats we'll need
my $todays_date = strftime('%Y-%m-%dT00:00:00', @{[localtime()]});
my $rep_month = strftime('%Y-%m', @{[localtime()]});

# read in the 'processed.html' file
my $body = slurp($in_dir . '/' . $in_file);

# find all of the referenced images
my @imgs = ($body =~ m/\[inline:([-A-Za-z0-9_]+\.(?:png|gif|jpg))\]/g);

# start creating the submission object
my $submission = {
  'title' => "Report - $rep_month",
  'report_date' => $todays_date,
  'body' => $body,
  'input_format' => 6, # should be: 'Markdown (full html)', but this doesn't work right now
  'oi_restrict' => 'XXXX',
  'terms' => [ 'XXXX', 'YYYY', 'ZZZZ', 'Report' ],
};

my $count = 0;
foreach my $file (@imgs) {
  my $fdata = slurp($in_dir . '/' . $file, 'binmode' => ':raw');
  $submission->{'attachments'}{"$count"} = {
    'name' => $file,
    'mime' => 'image/' . get_mime_type($file),
    'data' => $fdata,
  };
  $count++;
}

print "Submitting .. ";

my $xmlobj = XMLRPC::Lite
  ->proxy('http://www.example.com/xmlrpc.php')
  ->call('reports.submit', $submission);

if ($xmlobj->fault()) {
  print "error:\n", Dumper($xmlobj->fault()), "\n";
}
else {
  print $xmlobj->result(), "\n";
}

print "done\n";

##########

sub get_mime_type {
  my ($fname) = @_;
  my ($ext) = ($fname =~ m/\.(png|gif|jpg)$/);
  return {
    'png' => 'png',
    'gif' => 'gif',
    'jpg' => 'jpeg',
  }->{$ext};
}

__END__
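For readers who do not use Perl, the client side of the same call can be sketched with Python's standard `xmlrpc.client` module. This is only an illustration under assumptions, not part of the code posted above: the helper names `build_submission`, `encode_call` and `submit` are made up for the example, the `XXXX` values are the same placeholders as in the Perl script, and attachments are left out to keep it short. `encode_call` serializes the request without sending it, which may help when checking what actually goes over the wire.

```python
import xmlrpc.client
from datetime import date


def build_submission(body, today=None):
    """Build the struct that the reports.submit handler above validates."""
    today = today or date.today()
    return {
        "title": "Report - " + today.strftime("%Y-%m"),
        "report_date": today.strftime("%Y-%m-%dT00:00:00"),
        "body": body,
        "input_format": 6,      # numeric format id, as in the Perl script
        "oi_restrict": "XXXX",  # placeholder, obfuscated in the original
        "terms": ["XXXX", "YYYY", "ZZZZ", "Report"],
    }


def encode_call(submission):
    """Serialize the XML-RPC request body without sending it."""
    return xmlrpc.client.dumps((submission,), methodname="reports.submit")


def submit(submission, url="http://www.example.com/xmlrpc.php"):
    """Send the call; xmlrpc.client.Fault is raised if the server reports an error."""
    server = xmlrpc.client.ServerProxy(url)
    return server.reports.submit(submission)
```

A typical use would be `submit(build_submission(open("processed.html").read()))`, with the URL replaced by the real site's xmlrpc.php.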
Bah humbug. I forgot to check the "to" field...
Sorry for spamming everyone.
Ricky
On Jan 7, 2009, at 11:33 AM, Richard Morse wrote:
Hi! Here is the code. First is the php code, then the perl code. I've obfuscated some specific values (ip addresses, user names, etc), but it should be fairly clear...
On Wed, 2009-01-07 at 11:35 -0500, Richard Morse wrote:
Bah humbug. I forgot to check the "to" field...
Sorry for spamming everyone.
I was going to email you offlist to ask for a copy of the code anyway. :-) I'm developing an unrelated XMLRPC module for a one-off internal app for my employer, and the Perl example is very helpful.
Thanks!
Scott
-- Syscrusher <syscrusher@4th.com>
On Sun, January 4, 2009 9:39 am, M. Fioretti wrote:
Greetings,
I would like to ask all Drupal developers where to find the info to... write a shell script which takes as input an HTML file and other parameters (title, category, etc...) and then, using curl and the POST method, logs into a Drupal website, adds a node with that text and parameters, logs off and returns the complete URL assigned by drupal to that page.
I would just like to thank all the developers who answered, either here or privately. I haven't found a solution yet, primarily because I didn't explain myself clearly; sorry for the confusion.
When I wrote "shell script" above, I didn't make clear that I am (almost) only interested in a shell running remotely, not on the server which is running drupal. The tips and pointers I got are less useful (= require more extra tweaking to work) in such a context, so I *think* I'll retry, when I have time again, with the bash+curl approach.
Thanks anyway for the info, it was quite useful to learn something about how drupal works and is developed.
Ciao,
Marco
-- Help *everybody* love Free Standards and Software: http://digifreedom.net
participants (6)
- andrew morton
- Greg Dunlap
- M. Fioretti
- Richard Morse
- Syscrusher
- Tomáš Fülöpp