Exporter
A base class to help export documents to elasticsearch.
reference
- class elastipy.Exporter(client=None, index_prefix: Optional[str] = None, index_postfix: Optional[str] = None, update_index: bool = True)
Bases: object
Base class helper to export stuff to elasticsearch.
Derive from this class and define the class attributes:
INDEX_NAME : str
Name of the index; may contain a wildcard *.
MAPPINGS : dict
The mapping definition for the index.
And optionally override the methods:
transform_document()
Convert a document to elasticsearch.
get_document_id()
Return a unique id for the elasticsearch document.
get_document_index()
Return an alternative index name for the document.
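As an illustration of the attributes and overrides listed above, here is a minimal sketch of a derived class. The class name, the field names, the layout of MAPPINGS and the signature of transform_document() are assumptions made for the example; they are not taken from this reference. The guarded import only exists so the sketch can run where elastipy is not installed.

```python
from typing import Mapping

try:
    from elastipy import Exporter
except ImportError:
    # Stand-in base class so the sketch also runs without elastipy installed.
    class Exporter:
        pass


class ArticleExporter(Exporter):
    """Hypothetical exporter for blog articles."""

    # The two class attributes named in the reference.
    INDEX_NAME = "articles"
    MAPPINGS = {
        # Assumed layout: a regular elasticsearch mapping body.
        "properties": {
            "slug": {"type": "keyword"},
            "title": {"type": "text"},
            "tags": {"type": "keyword"},
        }
    }

    def transform_document(self, data: Mapping) -> dict:
        # Assumed signature: one source object in, one elasticsearch document out.
        return {
            "slug": data["slug"],
            "title": data["title"].strip(),
            "tags": sorted(data.get("tags", [])),
        }

    def get_document_id(self, es_data: Mapping) -> str:
        # Use the slug as a stable, unique document id.
        return es_data["slug"]
```

With a reachable cluster, an instance of such a class would presumably be used through export_list(), which is documented further down.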
- property client
Access to the elasticsearch client. If none was defined in the constructor, elastipy.connections.get("default") is returned.
- delete_index() -> bool
Try to delete the index. Ignore if not found.
- Returns
bool
True if deleted, False otherwise. If the index name contains a wildcard *, True is always returned.
- export_list(object_list: Iterable[Any], chunk_size: int = 500, refresh: bool = False, verbose: bool = False, verbose_total: Optional[int] = None, file=None, **kwargs)
Export a list of objects.
- Parameters
object_list – sequence of dict
A list or generator of dictionaries containing the objects that should be exported.
chunk_size – int
Number of objects per bulk request.
refresh – bool
If True, require an immediate refresh of the index when the export has finished.
verbose – bool
If True, print some progress to stderr (using tqdm if present).
verbose_total – int
The number of objects, used for the progress output if object_list is a generator.
file – Optional string stream for the verbose output; the default is stderr.
All other parameters are passed to elasticsearch.helpers.bulk.
- Returns
dict
Response of elasticsearch bulk call.
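To make the meaning of chunk_size concrete, the following standalone sketch groups an iterable into batches of at most chunk_size objects. It is not the library's implementation (the batching is delegated to elasticsearch.helpers.bulk, which may apply further limits); the helper name iter_chunks is hypothetical.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def iter_chunks(objects: Iterable[Any], chunk_size: int = 500) -> Iterator[List[Any]]:
    """Yield lists of at most chunk_size objects from any iterable."""
    iterator = iter(objects)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# A generator has no len(), which is why verbose_total exists as a separate hint.
documents = ({"id": i} for i in range(1200))
chunk_lengths = [len(chunk) for chunk in iter_chunks(documents, chunk_size=500)]
```

Under this simplified model, 1200 objects with the default chunk_size of 500 are sent in three batches, the last one holding the remaining 200.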
- get_document_id(es_data: Mapping)
Override this to return a single elasticsearch object’s id.
- Parameters
es_data – dict
Single object as returned by transform_document()
- Returns
str, int, etc.
- get_document_index(es_data: Mapping) -> str
Override to define an index per document.
The default function returns the result from index_name(), but it’s possible to put objects into separate indices.
For example, you might define INDEX_NAME = "documents-*", and get_document_index might return self.index_name().replace("*", es_data["type"]).
- Parameters
es_data – dict
Single document as returned by transform_document()
- Returns
str
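The wildcard substitution described for get_document_index() can be tried out in isolation. The sketch below uses a plain function instead of a method override so that it runs without a client; the function name route_index and the sample documents are made up for the example.

```python
from typing import Dict, List, Mapping


def route_index(index_name: str, es_data: Mapping) -> str:
    """Replace the wildcard in index_name with the document's type."""
    return index_name.replace("*", es_data["type"])


documents = [
    {"type": "pdf", "title": "Annual report"},
    {"type": "html", "title": "Landing page"},
    {"type": "pdf", "title": "Invoice"},
]

# Group the titles by the index each document would be routed to.
by_index: Dict[str, List[str]] = {}
for doc in documents:
    by_index.setdefault(route_index("documents-*", doc), []).append(doc["title"])
```

In an actual override, the first argument would be the result of self.index_name() and the function body would become the return statement of get_document_index().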
- get_index_params() -> dict
Returns the complete index parameters.
Override if you need to specialize things.
- Returns
dict
- index_name() -> str
Returns the configured index_prefix - INDEX_NAME - index_postfix
- Returns
str
- search(**kwargs) -> Search
Return a new Search object for this index and client.
- Returns
Search instance