Exporter

A base class to help export documents to elasticsearch.

reference

class elastipy.Exporter(client=None, index_prefix: Optional[str] = None, index_postfix: Optional[str] = None, update_index: bool = True)

Bases: object

Base class that helps export documents to elasticsearch.

Derive from this class and define the class attributes:

  • INDEX_NAME: str Name of index, might contain a wildcard *

  • MAPPINGS: dict The mapping definition for the index.

Optionally override methods such as transform_document(), get_document_id() and get_document_index().
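A minimal sketch of such a subclass follows. The `BlogExporter` name, the field names and the mapping are hypothetical, and the fallback base class exists only so the snippet runs where elastipy is not installed; it is not the real implementation.

```python
try:
    from elastipy import Exporter
except ImportError:
    # Stand-in so the sketch runs without elastipy installed.
    # It is NOT the real base class.
    class Exporter:
        def __init__(self, client=None, index_prefix=None,
                     index_postfix=None, update_index=True):
            self.index_prefix = index_prefix
            self.index_postfix = index_postfix


class BlogExporter(Exporter):
    # Required class attributes (values are hypothetical)
    INDEX_NAME = "blog-posts"
    MAPPINGS = {
        "properties": {
            "title": {"type": "text"},
            "published": {"type": "date"},
        }
    }

    def transform_document(self, data):
        # Map one source record to the document stored in the index
        return {
            "id": data["id"],
            "title": data["title"].strip(),
            "published": data["date"],
        }

    def get_document_id(self, es_data):
        # Receives the dict returned by transform_document()
        return es_data["id"]


exporter = BlogExporter()
doc = exporter.transform_document(
    {"id": 7, "title": "  Hello  ", "date": "2021-01-01"}
)
```

With a reachable cluster, passing the source records to `exporter.export_list(...)` would presumably run each of them through these overrides before the bulk request is sent.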

property client

Access to the elasticsearch client. If none was defined in constructor then elastipy.connections.get("default") is returned.

delete_index() → bool

Try to delete the index. Ignore if not found.

Returns

bool – True if deleted, False otherwise.

If the index name contains a wildcard *, True is always returned.

export_list(object_list: Iterable[Any], chunk_size: int = 500, refresh: bool = False, verbose: bool = False, verbose_total: Optional[int] = None, file=None, **kwargs)

Export a list of objects.

Parameters
  • object_list (sequence of dict) – A list or generator of dictionaries containing the objects that should be exported.

  • chunk_size (int) – Number of objects per bulk request.

  • refresh (bool) – If True, request an immediate refresh of the index once exporting has finished.

  • verbose (bool) – If True, print progress to stderr (using tqdm if present).

  • verbose_total (int) – The number of objects, used for progress output when object_list is a generator.

  • file – Optional text stream for the verbose output; the default is stderr.

All other parameters are passed to elasticsearch.helpers.bulk

Returns

dict – Response of the elasticsearch bulk call.
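To illustrate what chunk_size and verbose_total mean in practice, the sketch below splits a generator into batches of at most chunk_size objects. It is only an illustration of the batching idea, not elastipy's internal code; `chunked` and `records` are hypothetical helpers.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def chunked(objects: Iterable[Any], chunk_size: int = 500) -> Iterator[List[Any]]:
    """Yield lists of at most chunk_size items from any iterable."""
    iterator = iter(objects)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def records():
    # A generator has no len(), so a progress display cannot know
    # the total unless it is supplied separately (cf. verbose_total).
    for i in range(1200):
        yield {"id": i}


sizes = [len(chunk) for chunk in chunked(records(), chunk_size=500)]
```

Here 1200 objects with the default chunk size of 500 result in three batches, the last one partially filled.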

get_document_id(es_data: Mapping)

Override this to return a single elasticsearch object’s id.

Parameters

es_data (dict) – Single object as returned by transform_document()

Returns

str, int, etc.

get_document_index(es_data: Mapping) → str

Override to define an index per document.

The default function returns the result from index_name() but it’s possible to put objects into separate indices.

For example, you might define INDEX_NAME = "documents-*" and have get_document_index return

self.index_name().replace("*", es_data["type"])

Parameters

es_data (dict) – Single document as returned by transform_document()

Returns

str
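The wildcard replacement described above can be tried in isolation. In this sketch `resolve_index` is a hypothetical standalone function that stands in for an overridden get_document_index, and the documents are made-up sample data.

```python
def resolve_index(index_name: str, es_data: dict) -> str:
    # Same replace() call as in the example above, applied to one document
    return index_name.replace("*", es_data["type"])


documents = [
    {"type": "article", "title": "A"},
    {"type": "comment", "text": "B"},
]

# Each document is routed to an index derived from its "type" field
targets = [resolve_index("documents-*", d) for d in documents]
```

Documents of different types end up in `documents-article` and `documents-comment` respectively, while the wildcard name `documents-*` still matches both.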

get_index_params() → dict

Returns the complete index parameters.

Override if you need to specialize things.

Returns

dict

index_name() → str

Returns the full index name, built from the configured index_prefix, INDEX_NAME and index_postfix.

Returns

str

search(**kwargs) → Search

Return a new Search object for this index and client.

Returns

Search instance

transform_document(data: Mapping) → Union[Mapping, Iterable[Mapping]]

Override this to transform each document’s data into an elasticsearch document.

It’s possible to return a list or yield multiple elasticsearch documents.

Parameters

data – dict

Returns

dict or iterable of dict
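As a sketch of the "yield multiple documents" case, the function below turns one source record into one document per comment. It is written as a standalone function for brevity (in a subclass it would be a method taking `self`), and the record layout is hypothetical.

```python
from typing import Iterable, Mapping


def transform_document(data: Mapping) -> Iterable[Mapping]:
    # One source record with several comments becomes
    # one elasticsearch document per comment.
    for position, text in enumerate(data["comments"]):
        yield {
            "post_id": data["id"],
            "position": position,
            "text": text,
        }


docs = list(transform_document({"id": 3, "comments": ["first", "second"]}))
```

Returning a plain dict remains the simplest choice when a record maps to exactly one document.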

update_index() → None

Create the index or update changes to the mapping.

Can only be called if INDEX_NAME does not contain a wildcard *.

Returns

None