On Wednesday, 7 December 2011, 20:19:58, Florian Auer wrote:
Is there someone who successfully got Drupal 7, Search API and Tika working together?
I finally got it working. The documentation is somewhat incomplete, so here's what I did to get Drupal 7, Search API and Tika running on Debian Squeeze:
== 1. Download Tika source archive ==
Go to [1], copy the link URL to the archive on your favourite mirror and download it using wget:
$ wget [URL]
== 2. Extract Tika source archive to /opt ==
$ cd /opt
# unzip /path/to/apache-tika-X.Y-src.zip
== 3. Install maven2 package ==
# apt-get install maven2
== 4. Compile Tika using Maven ==
$ cd tika-X.Y
# MAVEN_OPTS=-Xmx256m mvn clean install
(This might take a while…)
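Once the build has finished, it can help to confirm that a runnable Tika jar was actually produced before moving on to Drupal. The following is only a minimal sketch: the helper name find_tika_jar is made up for illustration, and it assumes the Maven build places the application jar under tika-app/target/ inside the source tree, which may differ between Tika versions.

```shell
# Hypothetical helper (not part of Tika or Drupal): print the path of the
# first tika-app jar found under a Tika source tree.
# Assumption: the Maven build puts the jar in <source dir>/tika-app/target/.
find_tika_jar() {
    src_dir="$1"
    for jar in "$src_dir"/tika-app/target/tika-app-*.jar; do
        # If the glob matched nothing it stays literal, so test for a real file.
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    echo "no tika-app jar found under $src_dir" >&2
    return 1
}

# Example usage (paths are placeholders, as in the steps above):
#   jar=$(find_tika_jar /opt/tika-X.Y) && java -jar "$jar" --text /path/to/some.pdf
```

If the jar is found, running it with --text on a sample document should print the extracted plain text, which is a quick way to check Tika independently of Drupal.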
== 5. Download and enable the required Drupal modules ==
drush dl search_api search_api_attachments search_api_db
drush en search_api search_api_attachments search_api_db
== 6. Configure Drupal to use Tika ==
- Log in to the Drupal admin backend
- Open the Search API settings
- Create a new server (Database)
- Create a new index or use an existing one
- In your index settings, switch to the "Workflow" tab
- In the "Data alterations" area, enable "File attachments"
- Go to the "Fields" tab
- Enable the "File content" field for indexing
== 7. Edit Search API attachment module ==
Note: This is only needed if you use version 7.x-1.0; it should already be fixed in newer versions (see patch 3048482a89a1a587feab78f2d5ea92c4b5642898 on [2]).
- Go to the module's directory (if you used drush, this should be DRUPAL_HOME/sites/all/modules/search_api_attachments)
- Open file include/callback_attachments_settings.inc in your favourite editor
- Replace all occurrences of "entity_type" with "item_type" (see issue on [3])
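The search-and-replace in step 7 can also be done from the command line instead of an editor. This is a rough sketch rather than an official fix: the function name patch_attachments_settings is hypothetical, and it performs a plain substring replacement with sed, so it is worth reviewing the result against the backup copy afterwards.

```shell
# Hypothetical helper: replace every "entity_type" with "item_type" in the
# given file, keeping the original as <file>.bak.
patch_attachments_settings() {
    file="$1"
    sed -i.bak 's/entity_type/item_type/g' "$file"
}

# Example usage, run from the module's directory:
#   patch_attachments_settings include/callback_attachments_settings.inc
#   diff include/callback_attachments_settings.inc.bak include/callback_attachments_settings.inc
```

The .bak copy makes it easy to undo the change if a newer module version already contains the fix.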
== 8. Verify Tika is working and called by Drupal ==
- Open file include/callback_attachments_settings.inc again
- Add the following PHP code at the end of the file, right before the last return command (line 141-ish)
syslog(LOG_INFO, 'Calling Tika: ' . $cmd);
- Save and close the file
- Tail your syslog (# tail -f /var/log/syslog)
- Go to the Search API settings in the Drupal backend
- Re-index your site
- You should see messages showing the Tika command and the file being indexed
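On a busy machine the syslog contains a lot of unrelated entries, so filtering for the debug marker makes the Tika calls easier to spot. The sketch below assumes the marker text 'Calling Tika: ' from the syslog() line above and a log at /var/log/syslog; the function name show_tika_calls is invented for this example.

```shell
# Hypothetical helper: print only the log lines written by the temporary
# debug statement from step 8. Defaults to /var/log/syslog.
show_tika_calls() {
    log_file="${1:-/var/log/syslog}"
    # -F treats the marker as a fixed string, not a regular expression.
    grep -F 'Calling Tika: ' "$log_file"
}

# Example usage (as root):
#   show_tika_calls                    # search the whole log after re-indexing
#   tail -f /var/log/syslog | grep -F 'Calling Tika: '   # watch live instead
```

If nothing is printed after a re-index, the attachment callback is probably not being reached at all, which points back to the index configuration in step 6.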
This is rather quick'n'dirty documentation, but I don't have time for more, and the git repo for Search API attachments isn't working properly, so I cannot create patches right now. If you have any questions, let me know!