[Elastica](https://github.com/ruflin/Elastica) integration in Symfony2

### Installation

#### Bundle and Dependencies

For Symfony 2.0.x projects, you must use a 1.x release of this bundle. Please
check the bundle
[tags](https://github.com/FriendsOfSymfony/FOSElasticaBundle/tags) or the
[Packagist](https://packagist.org/packages/friendsofsymfony/elastica-bundle)
page for information on Symfony and Elastica compatibility.

Add FOSElasticaBundle to your application's `composer.json` file:

```json
{
    "require": {
        "friendsofsymfony/elastica-bundle": "3.0.*@dev"
    }
}
```

Install the bundle and its dependencies with the following command:

```bash
$ php composer.phar update friendsofsymfony/elastica-bundle
```

You may rely on Composer to fetch the appropriate version of Elastica.

Lastly, enable the bundle in your application kernel:

```php
// app/AppKernel.php

public function registerBundles()
{
    $bundles = array(
        // ...
        new FOS\ElasticaBundle\FOSElasticaBundle(),
    );
}
```

#### Elasticsearch

Instructions for installing and deploying Elasticsearch may be found
[here](http://www.elasticsearch.org/guide/reference/setup/installation/).

### Basic configuration

#### Declare a client

An Elasticsearch client is comparable to a database connection.
Most of the time, you will need only one.

    #app/config/config.yml
    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }

If your client requires Basic HTTP Authentication, you can specify an Authorization header to
include in HTTP requests. The header value is the word `Basic` followed by a ``base64`` encoded
string of the username and password joined by a colon. It can be obtained by running the following
command in your terminal:

    php -r "print 'Basic ' . base64_encode('your_auth_username' . ':' . 'your_auth_password');"

A sample configuration with Basic HTTP Authentication is:

    #app/config/config.yml
    fos_elastica:
        clients:
            default:
                host: example.com
                port: 80
                headers:
                    Authorization: "Basic jdumrGK7rY9TMuQOPng7GZycmxyMHNoir=="

A client configuration can also override the Elastica logger, either to use a different logger
through the client's `logger` option or to simply disable it with ```logger: false```. The logger
should be disabled in production because it can cause a memory leak.

#### Declare a serializer

Elastica can handle objects instead of data arrays if a serializer callable is configured:

    #app/config/config.yml
    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: callback_class
            serializer: serializer

``callback_class`` is the name of a class that has a public method `serialize($object)` and should
extend ``FOS\ElasticaBundle\Serializer\Callback``.

``serializer`` is the service id of the actual serializer, e.g. ``serializer`` if you're using
JMSSerializerBundle. If this is configured, you can use ``\Elastica\Type::addObject`` instead of
``\Elastica\Type::addDocument`` to add data to the index. The bundle provides a default
implementation with the serializer service id `serializer`, which can be turned on by adding the
following line to your config:

    #app/config/config.yml
    fos_elastica:
        serializer: ~

#### Declare an index

An Elasticsearch index is comparable to a Doctrine entity manager.
Most of the time, you will need only one.

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: FOS\ElasticaBundle\Serializer\Callback
            serializer: serializer
        indexes:
            website:
                client: default

Here we created a "website" index that uses our "default" client.

Our index is now available as a service: `fos_elastica.index.website`. It is an instance of `\Elastica\Index`.


If you need the index name to differ from the service name, for example to use different indexes
in different environments, use the ```index_name``` key to change the index name.
The service name will remain the same across environments:

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        indexes:
            website:
                client: default
                index_name: website_qa

The service id will be `fos_elastica.index.website` but the underlying index name is `website_qa`.

#### Declare a type

An Elasticsearch type is comparable to a Doctrine entity repository.

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: FOS\ElasticaBundle\Serializer\Callback
            serializer: serializer
        indexes:
            website:
                client: default
                types:
                    user:
                        mappings:
                            username: { boost: 5 }
                            firstName: { boost: 3 }
                            lastName: { boost: 3 }
                            aboutMe: ~

Our type is now available as a service: `fos_elastica.index.website.user`. It is an instance of `\Elastica\Type`.

### Declaring serializer groups

If you are using the JMSSerializerBundle for serializing objects passed to Elastica,
you can define serializer groups per type.

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: "%classname%"
            serializer: serializer
        indexes:
            website:
                client: default
                types:
                    user:
                        mappings:
                            username: { boost: 5 }
                            firstName: { boost: 3 }
                            lastName: { boost: 3 }
                            aboutMe:
                        serializer:
                            groups: [elastica, Default]

### Declaring parent field

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: FOS\ElasticaBundle\Serializer\Callback
            serializer: serializer
        indexes:
            website:
                client: default
                types:
                    comment:
                        mappings:
                            date: { boost: 5 }
                            content: ~
                        _parent: { type: "post", property: "post", identifier: "id" }

The parent field declaration has the following values:

* `type`: The parent type.
* `property`: The property in the child entity in which to look for the parent entity. It may be omitted if it is equal to the parent type.
* `identifier`: The property in the parent entity which holds the parent identifier. Defaults to `id`.

Note that to create a document with a parent, you need to call `setParent` on the document rather
than setting a `_parent` field. If you do this wrong, you will see a `RoutingMissingException`,
because Elasticsearch does not know where to store a document that should have a parent but does
not specify it.

### Declaring `nested` or `object`

Note that `object` can autodetect properties.

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: FOS\ElasticaBundle\Serializer\Callback
            serializer: serializer
        indexes:
            website:
                client: default
                types:
                    post:
                        mappings:
                            date: { boost: 5 }
                            title: { boost: 3 }
                            content: ~
                            comments:
                                type: "nested"
                                properties:
                                    date: { boost: 5 }
                                    content: ~
                            user:
                                type: "object"
                            approver:
                                type: "object"
                                properties:
                                    date: { boost: 5 }

#### Doctrine ORM and `object` mappings

Objects operate in the same way as the nested results, but they need to have associations set up
in Doctrine ORM so that they can be referenced correctly when indexing.

If an "Entity was not found" error occurs while indexing, a null association has been discovered
in the database. A custom Doctrine query must be used to perform left joins instead of the default
inner join.

### Populate the types

    php app/console fos:elastica:populate

This command deletes and creates the declared indexes and types.
It applies the configured mappings to the types.

This command needs providers to insert new documents in the Elasticsearch types.
There are 2 ways to create providers.
If your Elasticsearch type matches a Doctrine repository or a Propel query, go for the persistence automatic provider.
Or, for complete flexibility, go for a manual provider.

#### Persistence automatic provider

If we want to index the entities from a Doctrine repository or a Propel query,
some configuration will let ElasticaBundle do it for us.


    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: FOS\ElasticaBundle\Serializer\Callback
            serializer: serializer
        indexes:
            website:
                client: default
                types:
                    user:
                        mappings:
                            username: { boost: 5 }
                            firstName: { boost: 3 }
                            # more mappings...
                        persistence:
                            driver: orm # orm, mongodb, propel are available
                            model: Application\UserBundle\Entity\User
                            provider: ~

Three drivers are currently supported: orm, mongodb, and propel.

##### Use a custom Doctrine query builder

You can control which entities will be indexed by specifying a custom query builder method.

                        persistence:
                            driver: orm
                            model: Application\UserBundle\Entity\User
                            provider:
                                query_builder_method: createIsActiveQueryBuilder

Your repository must implement this method and return a Doctrine query builder.

> **Propel** doesn't support this feature yet.

##### Change the batch size

By default, ElasticaBundle will index documents in batches of 100.
You can change this value in the provider configuration.

                        persistence:
                            driver: orm
                            model: Application\UserBundle\Entity\User
                            provider:
                                batch_size: 100

##### Change the document identifier field

By default, ElasticaBundle will use the `id` field of your entities as the Elasticsearch document identifier.
You can change this value in the persistence configuration.

                        persistence:
                            driver: orm
                            model: Application\UserBundle\Entity\User
                            identifier: id

#### Manual provider

Create a service with the tag "fos_elastica.provider" and attributes for the index and type for
which the service will provide documents.

Its class must implement `FOS\ElasticaBundle\Provider\ProviderInterface`.


    <?php

    // The namespace and class name are illustrative; the opening of this
    // example was lost in this copy of the document.
    namespace Acme\UserBundle\Provider;

    use FOS\ElasticaBundle\Provider\ProviderInterface;
    use Elastica\Type;
    use Elastica\Document;

    class UserProvider implements ProviderInterface
    {
        protected $userType;

        public function __construct(Type $userType)
        {
            $this->userType = $userType;
        }

        /**
         * Insert the repository objects in the type index
         *
         * @param \Closure $loggerClosure
         * @param array    $options
         */
        public function populate(\Closure $loggerClosure = null, array $options = array())
        {
            if ($loggerClosure) {
                $loggerClosure('Indexing users');
            }

            $document = new Document();
            $document->setData(array('username' => 'Bob'));
            $this->userType->addDocuments(array($document));
        }
    }

You will find a more complete implementation example in `src/FOS/ElasticaBundle/Doctrine/AbstractProvider.php`.

### Search

You can just use the index and type Elastica objects, provided as services, to perform searches.

    /** var Elastica\Type */
    $userType = $this->container->get('fos_elastica.index.website.user');

    /** var Elastica\ResultSet */
    $resultSet = $userType->search('bob');

#### Doctrine/Propel finder

If your Elasticsearch type is bound to a Doctrine entity repository or a Propel query,
you can get your entities instead of Elastica results when you perform a search.
Declare that you want a Doctrine/Propel finder in your configuration:

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        serializer:
            callback_class: FOS\ElasticaBundle\Serializer\Callback
            serializer: serializer
        indexes:
            website:
                client: default
                types:
                    user:
                        mappings:
                            # your mappings
                        persistence:
                            driver: orm
                            model: Application\UserBundle\Entity\User
                            provider: ~
                            finder: ~

You can now use the `fos_elastica.finder.website.user` service:

    /** var FOS\ElasticaBundle\Finder\TransformedFinder */
    $finder = $container->get('fos_elastica.finder.website.user');

    /** var array of Acme\UserBundle\Entity\User */
    $users = $finder->find('bob');

    /** var array of Acme\UserBundle\Entity\User limited to 10 results */
    $users = $finder->find('bob', 10);

You can even get paginated results!


Pagerfanta:

    /** var Pagerfanta\Pagerfanta */
    $userPaginator = $finder->findPaginated('bob');

    /** Number of results to be used for paging the results */
    $countOfResults = $userPaginator->getNbResults();

Knp paginator:

    $paginator = $this->get('knp_paginator');
    $userPaginator = $paginator->paginate($finder->createPaginatorAdapter('bob'));

You can also get both the Elastica results and the entities together from the finder.
You can then access the score, highlights etc. from the `Elastica\Result` whilst
still also getting the entity.

    /** var array of FOS\ElasticaBundle\HybridResult */
    $hybridResults = $finder->findHybrid('bob');
    foreach ($hybridResults as $hybridResult) {

        /** var Acme\UserBundle\Entity\User */
        $user = $hybridResult->getTransformed();

        /** var Elastica\Result */
        $result = $hybridResult->getResult();
    }

If you would like to access facets while using Pagerfanta, they can be accessed through
the adapter, as in the example below.

```php
$query = new \Elastica\Query();
$facet = new \Elastica\Facet\Terms('tags');
$facet->setField('companyGroup');
$query->addFacet($facet);

$companies = $finder->findPaginated($query);
$companies->setMaxPerPage($params['limit']);
$companies->setCurrentPage($params['page']);

// The facets are exposed by the paginator's adapter.
$facets = $companies->getAdapter()->getFacets();
```

##### Index wide finder

You can also define a finder that will work on the entire index. Adjust your index
configuration as per below:

    fos_elastica:
        indexes:
            website:
                client: default
                finder: ~

You can now use the index wide finder service `fos_elastica.finder.website`:

    /** var FOS\ElasticaBundle\Finder\MappedFinder */
    $finder = $container->get('fos_elastica.finder.website');

    // Returns a mixed array of any objects mapped
    $results = $finder->find('bob');

#### Repositories

As well as using the finder service for a particular Doctrine/Propel entity, you
can use a manager service for each driver and get a repository for an entity to search
against.
This allows you to use the same service rather than the particular finder. For example:

    /** var FOS\ElasticaBundle\Manager\RepositoryManager */
    $repositoryManager = $container->get('fos_elastica.manager.orm');

    /** var FOS\ElasticaBundle\Repository */
    $repository = $repositoryManager->getRepository('UserBundle:User');

    /** var array of Acme\UserBundle\Entity\User */
    $users = $repository->find('bob');

You can also specify the full name of the entity instead of the shortcut syntax:

    /** var FOS\ElasticaBundle\Repository */
    $repository = $repositoryManager->getRepository('Application\UserBundle\Entity\User');

> The **2.0** branch doesn't support using `UserBundle:User` style syntax and you must use the full name of the entity.

##### Default Manager

If you are only using one driver then its manager service is automatically aliased
to `fos_elastica.manager`. So the above example could be simplified to:

    /** var FOS\ElasticaBundle\Manager\RepositoryManager */
    $repositoryManager = $container->get('fos_elastica.manager');

    /** var FOS\ElasticaBundle\Repository */
    $repository = $repositoryManager->getRepository('UserBundle:User');

    /** var array of Acme\UserBundle\Entity\User */
    $users = $repository->find('bob');

If you use multiple drivers then you can choose which one is aliased to `fos_elastica.manager`
using the `default_manager` parameter:

    fos_elastica:
        default_manager: mongodb #defaults to orm
        clients:
            default: { host: localhost, port: 9200 }
        #--

##### Custom Repositories

As well as the default repository, you can create a custom repository for an entity and add
methods for particular searches.
These need to extend `FOS\ElasticaBundle\Repository` to have access to the finder:

```
<?php

namespace Acme\ElasticaBundle\SearchRepository;

use FOS\ElasticaBundle\Repository;

class UserRepository extends Repository
{
    public function findWithCustomQuery($searchText)
    {
        // Build $query with Elastica objects; a query string is used here as a placeholder.
        $query = new \Elastica\Query\QueryString($searchText);

        return $this->find($query);
    }
}
```

To use the custom repository, specify it in the mapping for the entity:

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        indexes:
            website:
                client: default
                types:
                    user:
                        mappings:
                            # your mappings
                        persistence:
                            driver: orm
                            model: Application\UserBundle\Entity\User
                            provider: ~
                            finder: ~
                            repository: Acme\ElasticaBundle\SearchRepository\UserRepository

Then the custom queries will be available when using the repository returned from the manager:

    /** var FOS\ElasticaBundle\Manager\RepositoryManager */
    $repositoryManager = $container->get('fos_elastica.manager');

    /** var FOS\ElasticaBundle\Repository */
    $repository = $repositoryManager->getRepository('UserBundle:User');

    /** var array of Acme\UserBundle\Entity\User */
    $users = $repository->findWithCustomQuery('bob');

Alternatively, you can specify the custom repository using an annotation in the entity:

```
<?php

namespace Application\UserBundle\Entity;

use FOS\ElasticaBundle\Configuration\Search;

/**
 * @Search(repositoryClass="Acme\ElasticaBundle\SearchRepository\UserRepository")
 */
class User
{
    // ...
}
```

### Realtime, selective index update

If you use the Doctrine integration, you can let ElasticaBundle update the indexes automatically
when an object is added, updated or removed.

> **Propel** doesn't support this feature yet.

Declare that you want to update the index in real time:

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        indexes:
            website:
                client: default
                types:
                    user:
                        mappings:
                            # your mappings
                        persistence:
                            driver: orm
                            model: Application\UserBundle\Entity\User
                            listener: ~ # by default, listens to "insert", "update" and "delete" and updates `postFlush`

Now the index is automatically updated each time the state of the bound Doctrine repository changes.
No need to repopulate the whole "user" index when a new `User` is created.

You can also choose to only listen for some of the events:

                        persistence:
                            listener:
                                insert: true
                                update: false
                                delete: true

By default, the ElasticSearch index will be updated after flush.
To update before flushing, set `immediate` to `true`:

                        persistence:
                            listener:
                                insert: true
                                update: false
                                delete: true
                                immediate: true

> Using `immediate` to update ElasticSearch before flush completes may cause the ElasticSearch index to fall out of
> sync with the source database in the event of a crash during the flush itself, such as in the case of a bad query.

### Checking an entity method for listener

If you use listeners to update your index, you may need to validate your
entities before you index them (e.g. only index "public" entities). Typically,
you'll want the listener to be consistent with the provider's query criteria.
This may be achieved by using the `is_indexable_callback` config parameter:

                        persistence:
                            listener:
                                is_indexable_callback: "isPublic"

If `is_indexable_callback` is a string and the entity has a method with the
specified name, the listener will only index entities for which the method
returns `true`. Additionally, you may provide a service and method name pair:

                        persistence:
                            listener:
                                is_indexable_callback: [ "%custom_service_id%", "isIndexable" ]

In this case, the callback will be the `isIndexable()` method on the specified
service and the object being considered for indexing will be passed as the only
argument. This allows you to do more complex validation (e.g. ACL checks).

If you have the [Symfony ExpressionLanguage](https://github.com/symfony/expression-language)
component installed, you can use expressions to evaluate the callback:

                        persistence:
                            listener:
                                is_indexable_callback: "user.isActive() && user.hasRole('ROLE_USER')"

As you might expect, new entities will only be indexed if the callback returns
`true`. Additionally, modified entities will be updated or removed from the
index depending on whether the callback returns `true` or `false`, respectively.
The delete listener disregards the callback.

> **Propel** doesn't support this feature yet.

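
The service-and-method form of `is_indexable_callback` can be illustrated with a minimal sketch. The `Article` and `IndexableChecker` classes below are hypothetical and not part of the bundle; the only contract taken from the text above is that the method receives the object being considered as its only argument and returns a boolean.

```php
<?php

// Hypothetical entity, used only for this illustration.
class Article
{
    private $title;
    private $public;

    public function __construct($title, $public)
    {
        $this->title = $title;
        $this->public = (bool) $public;
    }

    // Could be referenced directly with: is_indexable_callback: "isPublic"
    public function isPublic()
    {
        return $this->public;
    }

    public function getTitle()
    {
        return $this->title;
    }
}

// Hypothetical service class. Registered as a service, it would be referenced
// with a [ service id, "isIndexable" ] pair in the listener configuration.
class IndexableChecker
{
    // Receives the object being considered for indexing as its only argument.
    public function isIndexable($object)
    {
        return $object instanceof Article
            && $object->isPublic()
            && trim($object->getTitle()) !== '';
    }
}

$checker = new IndexableChecker();
var_dump($checker->isIndexable(new Article('Hello', true)));
var_dump($checker->isIndexable(new Article('Draft', false)));
```

Keeping the check in a service rather than on the entity makes it possible to inject collaborators (for example, a security context for ACL checks) without coupling the entity to them.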
### Ignoring missing index results

By default, FOSElasticaBundle will throw an exception if the results returned from
Elasticsearch are different from the results it finds from the chosen persistence
provider. This may pose problems for a large index where updates do not occur instantly,
or where another process has removed the results from your persistence provider without
updating Elasticsearch.

The error you're likely to see is something like:
'Cannot find corresponding Doctrine objects for all Elastica results.'

To solve this issue, each mapped object can be configured to ignore the missing results:

                        persistence:
                            elastica_to_model_transformer:
                                ignore_missing: true

### Advanced elasticsearch configuration

Any setting can be specified when declaring an index. For example, to enable a custom analyzer, you could write:

    fos_elastica:
        indexes:
            doc:
                settings:
                    index:
                        analysis:
                            analyzer:
                                my_analyzer:
                                    type: custom
                                    tokenizer: lowercase
                                    filter   : [my_ngram]
                            filter:
                                my_ngram:
                                    type: "nGram"
                                    min_gram: 3
                                    max_gram: 5
                types:
                    blog:
                        mappings:
                            title: { boost: 8, analyzer: my_analyzer }

### Overriding the Client class to suppress exceptions

By default, exceptions from the Elastica client library will propagate through
the bundle's Client class. For instance, if the Elasticsearch server is offline,
issuing a request will result in an `Elastica\Exception\Connection` being thrown.
Depending on your needs, it may be desirable to suppress these exceptions and
allow searches to fail silently.

One way to achieve this is to override the `fos_elastica.client.class` service
container parameter with a custom class. In the following example, we override
the `Client::request()` method and return the equivalent of an empty search
response if an exception occurred.
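```
<?php

// The namespace is illustrative; the body of this example was lost in this
// copy of the document and is rebuilt from the description above.
namespace Acme\ElasticaBundle;

use Elastica\Exception\ExceptionInterface;
use Elastica\Request;
use Elastica\Response;
use FOS\ElasticaBundle\Client as BaseClient;

class Client extends BaseClient
{
    public function request($path, $method = Request::GET, $data = array(), array $query = array())
    {
        try {
            return parent::request($path, $method, $data, $query);
        } catch (ExceptionInterface $e) {
            // Equivalent of an empty search response.
            return new Response('{"took":0,"timed_out":false,"hits":{"total":0,"max_score":0,"hits":[]}}');
        }
    }
}
```

### Example of Advanced Query

If you would like to perform more advanced queries, here is one example using
the snowball stemming algorithm.

It searches for Article entities using `title`, `tags`, and `categoryIds`.
Results must match at least one of the specified `categoryIds`, and should match the
`title` or `tags` criteria. A snowball analyzer is applied to queries against the
`title` field.

```php
$finder = $this->container->get('fos_elastica.finder.site.article');
$boolQuery = new \Elastica\Query\Bool();

$fieldQuery = new \Elastica\Query\Text();
$fieldQuery->setFieldQuery('title', 'I am a title string');
$fieldQuery->setFieldParam('title', 'analyzer', 'my_analyzer');
$boolQuery->addShould($fieldQuery);

$tagsQuery = new \Elastica\Query\Terms();
$tagsQuery->setTerms('tags', array('tag1', 'tag2'));
$boolQuery->addShould($tagsQuery);

$categoryQuery = new \Elastica\Query\Terms();
$categoryQuery->setTerms('categoryIds', array('1', '2', '3'));
$boolQuery->addMust($categoryQuery);

$data = $finder->find($boolQuery);
```

Configuration:

```yaml
fos_elastica:
    clients:
        default: { host: localhost, port: 9200 }
    indexes:
        site:
            settings:
                index:
                    analysis:
                        analyzer:
                            my_analyzer:
                                type: snowball
                                language: English
            types:
                article:
                    mappings:
                        title: { boost: 10, analyzer: my_analyzer }
                        tags:
                        categoryIds:
                    persistence:
                        driver: orm
                        model: Acme\DemoBundle\Entity\Article
                        provider:
                        finder:
```

### Filtering Results and Executing a Default Query

If you want to omit certain results from a query, filtering can be more
performant than a basic query because the filter results can be cached. In turn,
the query is run against only a subset of the results. A common use case for
filtering would be if your data has fields that indicate whether records are
"active" or "inactive".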
The following example illustrates how to issue such a query with Elastica:

```php
$query = new \Elastica\Query\QueryString($queryString);
$term = new \Elastica\Filter\Term(array('active' => true));

$filteredQuery = new \Elastica\Query\Filtered($query, $term);
$results = $this->container->get('fos_elastica.finder.index.type')->find($filteredQuery);
```

### Date format example

If you want to specify a [date format](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html):

```yaml
fos_elastica:
    clients:
        default: { host: localhost, port: 9200 }
    indexes:
        site:
            types:
                user:
                    mappings:
                        username: { type: string }
                        lastlogin: { type: date, format: basic_date_time }
                        birthday: { type: date, format: "yyyy-MM-dd" }
```

#### Dynamic templates

Dynamic templates let you define mapping templates that are applied when
fields or objects are introduced dynamically.

[Documentation](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html#_dynamic_templates)

```yaml
fos_elastica:
    clients:
        default: { host: localhost, port: 9200 }
    indexes:
        site:
            types:
                user:
                    dynamic_templates:
                        my_template_1:
                            match: apples_*
                            mapping:
                                type: float
                        my_template_2:
                            match: "*"
                            match_mapping_type: string
                            mapping:
                                type: string
                                index: not_analyzed
                    mappings:
                        username: { type: string }
```
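
To see which of the two dynamic templates above would apply to a new field, the selection rule can be sketched in plain PHP. This is a simplified illustration and not Elasticsearch's implementation: it assumes templates are tried in declaration order, that the first one whose `match` wildcard and optional `match_mapping_type` both fit is used, and that `*` is the only wildcard. The helper names `wildcardToRegex` and `pickTemplate` are hypothetical.

```php
<?php

// Translate a simple wildcard pattern such as "apples_*" into an anchored regex.
function wildcardToRegex($pattern)
{
    // preg_quote() escapes "*" as "\*", which is then widened to ".*".
    return '/^' . str_replace('\*', '.*', preg_quote($pattern, '/')) . '$/';
}

// Return the name of the first template that fits the field, or null if none does.
function pickTemplate(array $templates, $fieldName, $detectedType)
{
    foreach ($templates as $name => $template) {
        if (isset($template['match'])
            && !preg_match(wildcardToRegex($template['match']), $fieldName)) {
            continue;
        }
        if (isset($template['match_mapping_type'])
            && $template['match_mapping_type'] !== $detectedType) {
            continue;
        }

        return $name;
    }

    return null;
}

// Same templates as in the YAML configuration above.
$templates = array(
    'my_template_1' => array(
        'match' => 'apples_*',
        'mapping' => array('type' => 'float'),
    ),
    'my_template_2' => array(
        'match' => '*',
        'match_mapping_type' => 'string',
        'mapping' => array('type' => 'string', 'index' => 'not_analyzed'),
    ),
);

var_dump(pickTemplate($templates, 'apples_count', 'double'));
var_dump(pickTemplate($templates, 'nickname', 'string'));
var_dump(pickTemplate($templates, 'age', 'long'));
```

Under these assumptions, a field named `apples_count` is caught by `my_template_1` regardless of its detected type, any other string field falls through to `my_template_2`, and a non-string field such as `age` matches no template and keeps the default dynamic mapping.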