Search¶
The Search class is the main entry point for all queries and aggregation requests against Elasticsearch.
supported queries¶
- compound
- full-text
- match
- term-level
search interface¶
The Search class combines the query and the aggregation interface.
class elastipy.Search(index: Optional[str] = None, client: Optional[Union[str, Callable, elasticsearch.client.Elasticsearch, Any]] = None, timestamp_field: str = 'timestamp')[source]¶
Bases: elastipy.query.generated_interface.QueryInterface, elastipy.aggregation.generated_interface.AggregationInterface
Interface to the Elasticsearch /search endpoint.
All changes to a search object create and return a copy, except for aggregations, which are attached to the search instance.
agg(*aggregation_name_type, **params) → elastipy.aggregation.aggregation.Aggregation¶
Creates an aggregation.
- Either call aggregation("sum", field=…) to create an automatic name
- or call aggregation("my_name", "sum", field=…) to set the aggregation name explicitly
- Parameters
aggregation_name_type – one or two strings: either just the "type", or the "name" and "type" of the aggregation
params – all parameters of the aggregation function
- Returns
Aggregation instance
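The one-or-two-strings convention can be sketched as follows. This is a stand-in for elastipy's internal argument handling, not its actual implementation, and the auto-name scheme "a0", "a1", … is hypothetical:

```python
import itertools

_auto_names = itertools.count()

def split_name_type(*aggregation_name_type):
    """Resolve agg()'s positional strings into a (name, type) pair."""
    if len(aggregation_name_type) == 1:
        # only the type was given: auto-generate a name
        return f"a{next(_auto_names)}", aggregation_name_type[0]
    if len(aggregation_name_type) == 2:
        # explicit ("name", "type")
        return tuple(aggregation_name_type)
    raise ValueError("expected one or two strings")
```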
agg_adjacency_matrix(*aggregation_name: Optional[str], filters: Mapping[str, Union[Mapping, QueryInterface]], separator: Optional[str] = None)¶
A bucket aggregation returning a form of adjacency matrix. The request provides a collection of named filter expressions, similar to the filters aggregation request. Each bucket in the response represents a non-empty cell in the matrix of intersecting filters.
The matrix is said to be symmetric so we only return half of it. To do this we sort the filter name strings and always use the lowest of a pair as the value to the left of the "&" separator.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
filters – Mapping[str, Union[Mapping, 'QueryInterface']]
separator – Optional[str]
An alternative separator can be passed in the request if clients wish to use a separator string other than the default ampersand.
- Returns
'AggregationInterface'
A new instance is created and returned
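The sorted-pair key rule quoted above can be written out in Python; the filter names here are illustrative:

```python
def intersection_key(name_a, name_b, separator="&"):
    # sort the two filter names so the lower one is always left of
    # the separator - one key per symmetric pair of filters
    low, high = sorted((name_a, name_b))
    return f"{low}{separator}{high}"
```

So the cells for ("grpB", "grpA") and ("grpA", "grpB") share the key grpA&grpB, and only one of them is returned.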
agg_auto_date_histogram(*aggregation_name: Optional[str], field: Optional[str] = None, buckets: int = 10, minimum_interval: Optional[str] = None, time_zone: Optional[str] = None, format: Optional[str] = None, keyed: bool = False, missing: Optional[Any] = None, script: Optional[dict] = None)¶
A multi-bucket aggregation similar to the Date histogram, except that instead of providing an interval to use as the width of each bucket, a target number of buckets is provided; the interval of the buckets is automatically chosen to best achieve that target. The number of buckets returned will always be less than or equal to this target number.
The buckets field is optional, and will default to 10 buckets if not specified.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field – Optional[str]
If no field is specified it will default to the 'timestamp_field' of the Search class.
buckets – int
The number of buckets that are to be returned.
minimum_interval – Optional[str]
The minimum_interval allows the caller to specify the minimum rounding interval that should be used. This can make the collection process more efficient, as the aggregation will not attempt to round at any interval lower than minimum_interval. The accepted units for minimum_interval are: year, month, day, hour, minute, second
time_zone – Optional[str]
Date-times are stored in Elasticsearch in UTC. By default, all bucketing and rounding is also done in UTC. The time_zone parameter can be used to indicate that bucketing should use a different time zone. Time zones may either be specified as an ISO 8601 UTC offset (e.g. +01:00 or -08:00) or as a timezone id, an identifier used in the TZ database like America/Los_Angeles.
Warning
When using time zones that follow DST (daylight savings time) changes, buckets close to the moment when those changes happen can have slightly different sizes than neighbouring buckets. For example, consider a DST start in the CET time zone: on 27 March 2016 at 2am, clocks were turned forward 1 hour to 3am local time. If the result of the aggregation was daily buckets, the bucket covering that day will only hold data for 23 hours instead of the usual 24 hours for other buckets. The same is true for shorter intervals, e.g. 12h: here, we will have only an 11h bucket on the morning of 27 March when the DST shift happens.
format – Optional[str]
Specifies the format of the 'key_as_string' response. See: mapping date format
keyed – bool
Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array.
missing – Optional[Any]
The missing parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value.
script – Optional[dict]
Generating the terms using a script
- Returns
'AggregationInterface'
A new instance is created and returned
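The bucket-target behaviour can be approximated like this; the interval ladder below is illustrative and not Elasticsearch's exact rounding table:

```python
# pick the smallest interval whose bucket count does not exceed the
# target - the number of buckets returned is therefore always <= target
INTERVALS = [
    ("second", 1), ("minute", 60), ("hour", 3_600),
    ("day", 86_400), ("month", 2_592_000), ("year", 31_536_000),
]

def choose_interval(span_seconds, buckets=10):
    for name, seconds in INTERVALS:
        if span_seconds / seconds <= buckets:
            return name
    return INTERVALS[-1][0]  # fall back to the coarsest interval
```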
agg_children(*aggregation_name: Optional[str], type: str)¶
A special single bucket aggregation that selects child documents that have the specified type, as defined in a join field.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
type – str
The child type that should be selected.
- Returns
'AggregationInterface'
A new instance is created and returned
agg_composite(*aggregation_name: Optional[str], sources: Sequence[Mapping], size: int = 10, after: Optional[Union[str, int, float, datetime.datetime]] = None)¶
A multi-bucket aggregation that creates composite buckets from different sources.
Unlike the other multi-bucket aggregations, you can use the composite aggregation to paginate all buckets from a multi-level aggregation efficiently. This aggregation provides a way to stream all buckets of a specific aggregation, similar to what scroll does for documents.
The composite buckets are built from the combinations of the values extracted/created for each document and each combination is considered as a composite bucket.
For optimal performance the index sort should be set on the index so that it partially or fully matches the source order in the composite aggregation.
Sub-buckets: Like any multi-bucket aggregation, the composite aggregation can hold sub-aggregations. These sub-aggregations can be used to compute other buckets or statistics on each composite bucket created by this parent aggregation.
Pipeline aggregations: The composite agg is not currently compatible with pipeline aggregations, nor does it make sense in most cases. E.g. due to the paging nature of composite aggs, a single logical partition (one day for example) might be spread over multiple pages. Since pipeline aggregations are purely post-processing on the final list of buckets, running something like a derivative on a composite page could lead to inaccurate results as it is only taking into account a “partial” result on that page.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
sources – Sequence[Mapping]
The sources parameter defines the source fields to use when building composite buckets. The order that the sources are defined controls the order that the keys are returned. The sources parameter can be any of the following types:
- Terms
- Histogram
- Date histogram
- GeoTile grid
Note
You must use a unique name when defining sources.
size – int
The size parameter can be set to define how many composite buckets should be returned. Each composite bucket is considered as a single bucket, so setting a size of 10 will return the first 10 composite buckets created from the value sources. The response contains the values for each composite bucket in an array containing the values extracted from each value source.
Pagination: If the number of composite buckets is too high (or unknown) to be returned in a single response it is possible to split the retrieval into multiple requests. Since the composite buckets are flat by nature, the requested size is exactly the number of composite buckets that will be returned in the response (assuming that there are at least size composite buckets to return). If all composite buckets should be retrieved it is preferable to use a small size (100 or 1000 for instance) and then use the after parameter to retrieve the next results.
after – Optional[Union[str, int, float, datetime]]
To get the next set of buckets, resend the same aggregation with the after parameter set to the after_key value returned in the response.
Note
The after_key is usually the key to the last bucket returned in the response, but that isn't guaranteed. Always use the returned after_key instead of deriving it from the buckets.
In order to optimize early termination it is advised to set track_total_hits to false in the request. The number of total hits that match the request can be retrieved on the first request and it would be costly to compute this number on every page.
- Returns
'AggregationInterface'
A new instance is created and returned
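The after-based pagination described above looks roughly like the loop below. fetch_page stands in for a single composite-aggregation request; in reality Elasticsearch does the paging, keyed on the returned after_key:

```python
def fetch_page(all_keys, size, after=None):
    # stand-in for one composite request: composite buckets come back
    # in key order, starting after the "after" key if one is given
    keys = sorted(all_keys)
    start = keys.index(after) + 1 if after is not None else 0
    page = keys[start:start + size]
    return {"buckets": page, "after_key": page[-1] if page else None}

def all_composite_buckets(all_keys, size):
    after, collected = None, []
    while True:
        response = fetch_page(all_keys, size, after)
        collected += response["buckets"]
        if len(response["buckets"]) < size:
            return collected  # short page: nothing left to fetch
        after = response["after_key"]  # always reuse the returned key
```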
agg_date_histogram(*aggregation_name: Optional[str], field: Optional[str] = None, calendar_interval: Optional[str] = None, fixed_interval: Optional[str] = None, min_doc_count: int = 1, offset: Optional[str] = None, time_zone: Optional[str] = None, format: Optional[str] = None, keyed: bool = False, missing: Optional[Any] = None, script: Optional[dict] = None)¶
This multi-bucket aggregation is similar to the normal histogram, but it can only be used with date or date range values. Because dates are represented internally in Elasticsearch as long values, it is possible, but not as accurate, to use the normal histogram on dates as well. The main difference in the two APIs is that here the interval can be specified using date/time expressions. Time-based data requires special support because time-based intervals are not always a fixed length.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field – Optional[str]
If no field is specified it will default to the 'timestamp_field' of the Search class.
calendar_interval – Optional[str]
Calendar-aware intervals are configured with the calendar_interval parameter. You can specify calendar intervals using the unit name, such as month, or as a single unit quantity, such as 1M. For example, day and 1d are equivalent. Multiple quantities, such as 2d, are not supported.
fixed_interval – Optional[str]
In contrast to calendar-aware intervals, fixed intervals are a fixed number of SI units and never deviate, regardless of where they fall on the calendar. One second is always composed of 1000ms. This allows fixed intervals to be specified in any multiple of the supported units. However, it means fixed intervals cannot express other units such as months, since the duration of a month is not a fixed quantity. Attempting to specify a calendar interval like month or quarter will throw an exception.
The accepted units for fixed intervals are:
- milliseconds (ms): A single millisecond. This is a very, very small interval.
- seconds (s): Defined as 1000 milliseconds each.
- minutes (m): Defined as 60 seconds each (60,000 milliseconds). All minutes begin at 00 seconds.
- hours (h): Defined as 60 minutes each (3,600,000 milliseconds). All hours begin at 00 minutes and 00 seconds.
- days (d): Defined as 24 hours (86,400,000 milliseconds). All days begin at the earliest possible time, which is usually 00:00:00 (midnight).
min_doc_count – int
Minimum documents required for a bucket. Set to 0 to allow creating empty buckets.
offset – Optional[str]
Use the offset parameter to change the start value of each bucket by the specified positive (+) or negative (-) offset duration, such as 1h for an hour, or 1d for a day. See Time units for more possible time duration options. For example, when using an interval of day, each bucket runs from midnight to midnight. Setting the offset parameter to +6h changes each bucket to run from 6am to 6am.
time_zone – Optional[str]
Elasticsearch stores date-times in Coordinated Universal Time (UTC). By default, all bucketing and rounding is also done in UTC. Use the time_zone parameter to indicate that bucketing should use a different time zone. For example, if the interval is a calendar day and the time zone is America/New_York, then 2020-01-03T01:00:01Z is
- converted to 2020-01-02T18:00:01
- rounded down to 2020-01-02T00:00:00
- then converted back to UTC to produce 2020-01-02T05:00:00Z
- finally, when the bucket is turned into a string key it is printed in America/New_York, so it'll display as "2020-01-02T00:00:00"
It looks like:
bucket_key = localToUtc(Math.floor(utcToLocal(value) / interval) * interval)
You can specify time zones as an ISO 8601 UTC offset (e.g. +01:00 or -08:00) or as an IANA time zone ID, such as America/Los_Angeles.
format – Optional[str]
Specifies the format of the 'key_as_string' response. See: mapping date format
keyed – bool
Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array.
missing – Optional[Any]
The missing parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value.
script – Optional[dict]
Generating the terms using a script
- Returns
'AggregationInterface'
A new instance is created and returned
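The fixed-interval units above map to fixed millisecond counts, which can be sketched as a small parser. Calendar units such as month deliberately raise, mirroring the exception described above:

```python
import re

FIXED_UNIT_MS = {
    "ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000,
}

def fixed_interval_ms(expr):
    """Parse a fixed interval such as '30s' or '12h' into milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", expr)
    if not match:
        # months, quarters etc. are calendar intervals, not fixed ones
        raise ValueError(f"not a fixed interval: {expr!r}")
    return int(match.group(1)) * FIXED_UNIT_MS[match.group(2)]
```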
agg_date_range(*aggregation_name: Optional[str], ranges: Sequence[Union[Mapping[str, str], str]], field: Optional[str] = None, format: Optional[str] = None, time_zone: Optional[str] = None, keyed: bool = False, missing: Optional[Any] = None, script: Optional[dict] = None)¶
A range aggregation dedicated to date values. The main difference between this aggregation and the normal range aggregation is that the from and to values can be expressed in Date Math expressions, and it is also possible to specify a date format by which the from and to response fields will be returned.
Note
Note that this aggregation includes the from value and excludes the to value for each range.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
ranges – Sequence[Union[Mapping[str, str], str]]
List of ranges to define the buckets. Example:
[ {"to": "1970-01-01"}, {"from": "1970-01-01", "to": "1980-01-01"}, {"from": "1980-01-01"} ]
Instead of date values any Date Math expression can be used as well.
Alternatively this parameter can be a list of strings. The above example can be rewritten as:
["1970-01-01", "1980-01-01"]
Note
This aggregation includes the from value and excludes the to value for each range.
field – Optional[str]
The date field. If no field is specified it will default to the 'timestamp_field' of the Search class.
format – Optional[str]
The format of the response bucket keys, as available for the DateTimeFormatter
time_zone – Optional[str]
Dates can be converted from another time zone to UTC by specifying the time_zone parameter. Time zones may either be specified as an ISO 8601 UTC offset (e.g. +01:00 or -08:00) or as one of the time zone ids from the TZ database. The time_zone parameter is also applied to rounding in date math expressions.
keyed – bool
Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array.
missing – Optional[Any]
The missing parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value.
script – Optional[dict]
Generating the terms using a script
- Returns
'AggregationInterface'
A new instance is created and returned
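The string shorthand above can be expanded into the explicit from/to mappings with a few lines. This mirrors the documented equivalence, though the helper itself is only a sketch, not elastipy's implementation:

```python
def expand_ranges(edges):
    # turn a list of boundary dates into consecutive from/to mappings:
    # below the first edge, between each pair, and above the last edge
    ranges = [{"to": edges[0]}]
    for lo, hi in zip(edges, edges[1:]):
        ranges.append({"from": lo, "to": hi})
    ranges.append({"from": edges[-1]})
    return ranges
```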
agg_diversified_sampler(*aggregation_name: Optional[str], field: Optional[str] = None, script: Optional[Mapping] = None, shard_size: int = 100, max_docs_per_value: int = 1)¶
Like the sampler aggregation this is a filtering aggregation used to limit any sub aggregations' processing to a sample of the top-scoring documents. The diversified_sampler aggregation adds the ability to limit the number of matches that share a common value such as an "author".
Note
Any good market researcher will tell you that when working with samples of data it is important that the sample represents a healthy variety of opinions rather than being skewed by any single voice. The same is true with aggregations, and sampling with these diversify settings can offer a way to remove the bias in your content (an over-populated geography, a large spike in a timeline or an over-active forum spammer).
Example use cases:
- Tightening the focus of analytics to high-relevance matches rather than the potentially very long tail of low-quality matches
- Removing bias from analytics by ensuring fair representation of content from different sources
- Reducing the running cost of aggregations that can produce useful results using only samples, e.g. significant_terms
A choice of field or script setting is used to provide values used for de-duplication, and the max_docs_per_value setting controls the maximum number of documents collected on any one shard which share a common value. The default setting for max_docs_per_value is 1.
Note
The aggregation will throw an error if the choice of field or script produces multiple values for a single document (de-duplication using multi-valued fields is not supported due to efficiency concerns).
Cannot be nested under breadth_first aggregations: Being a quality-based filter the diversified_sampler aggregation needs access to the relevance score produced for each document. It therefore cannot be nested under a terms aggregation which has the collect_mode switched from the default depth_first mode to breadth_first, as this discards scores. In this situation an error will be thrown.
Limited de-dup logic: The de-duplication logic applies only at a shard level, so it will not apply across shards.
No specialized syntax for geo/date fields: Currently the syntax for defining the diversifying values is defined by a choice of field or script - there is no added syntactical sugar for expressing geo or date units such as "7d" (7 days). This support may be added in a later release; users will currently have to create these sorts of values using a script.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field – Optional[str]
The field to search on. Can alternatively be a script.
script – Optional[Mapping]
The script that specifies the aggregation. Can alternatively be a 'field'.
shard_size – int
The shard_size parameter limits how many top-scoring documents are collected in the sample processed on each shard. The default value is 100.
max_docs_per_value – int
The max_docs_per_value is an optional parameter and limits how many documents are permitted per choice of de-duplicating value. The default setting is 1.
- Returns
'AggregationInterface'
A new instance is created and returned
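The per-shard de-duplication idea can be sketched like this; documents are assumed to be already sorted by score, and the field names are illustrative:

```python
def diversified_sample(docs, key, max_docs_per_value=1):
    # keep at most max_docs_per_value documents per de-duplicating
    # value, preserving the incoming (score-sorted) order
    counts, sample = {}, []
    for doc in docs:
        value = doc[key]
        if counts.get(value, 0) < max_docs_per_value:
            counts[value] = counts.get(value, 0) + 1
            sample.append(doc)
    return sample
```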
agg_filter(*aggregation_name: Optional[str], filter: Union[Mapping, QueryInterface])¶
Defines a single bucket of all the documents in the current document set context that match a specified filter. Often this will be used to narrow down the current aggregation context to a specific set of documents.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
filter – Union[Mapping, 'QueryInterface']
- Returns
'AggregationInterface'
A new instance is created and returned
agg_filters(*aggregation_name: Optional[str], filters: Mapping[str, Union[Mapping, QueryInterface]])¶
Defines a multi-bucket aggregation where each bucket is associated with a filter. Each bucket will collect all documents that match its associated filter.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
filters – Mapping[str, Union[Mapping, 'QueryInterface']]
- Returns
'AggregationInterface'
A new instance is created and returned
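On the Elasticsearch side, a named-filters aggregation is a mapping from bucket name to query, roughly like the body below; the field names and queries are illustrative:

```python
# one bucket per named filter; every matching document is collected
# into the bucket whose filter it satisfies
body = {
    "filters": {
        "filters": {
            "errors": {"match": {"message": "error"}},
            "warnings": {"match": {"message": "warning"}},
        }
    }
}
```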
agg_geo_distance(*aggregation_name: Optional[str], field: str, ranges: Sequence[Union[Mapping[str, float], float]], origin: Union[str, Mapping[str, float], Sequence[float]], unit: str = 'm', distance_type: str = 'arc', keyed: bool = False)¶
A multi-bucket aggregation that works on geo_point fields and conceptually works very similar to the range aggregation. The user can define a point of origin and a set of distance range buckets. The aggregation evaluates the distance of each document value from the origin point and determines the bucket it belongs to based on the ranges (a document belongs to a bucket if the distance between the document and the origin falls within the distance range of the bucket).
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field – str
The specified field must be of type geo_point (which can only be set explicitly in the mappings). It can also hold an array of geo_point fields, in which case all will be taken into account during aggregation.
ranges – Sequence[Union[Mapping[str, float], float]]
A list of ranges that define the separate buckets, e.g.:
[ { "to": 100000 }, { "from": 100000, "to": 300000 }, { "from": 300000 } ]
Alternatively this parameter can be a list of numbers. The above example can be rewritten as:
[100000, 300000]
origin – Union[str, Mapping[str, float], Sequence[float]]
The origin point can accept all formats supported by the geo_point type:
- Object format: { "lat" : 52.3760, "lon" : 4.894 } - this is the safest format as it is the most explicit about the lat & lon values
- String format: "52.3760, 4.894" - where the first number is the lat and the second is the lon
- Array format: [4.894, 52.3760] - which is based on the GeoJSON standard and where the first number is the lon and the second one is the lat
unit – str
By default, the distance unit is m (meters) but it can also accept: mi (miles), in (inches), yd (yards), km (kilometers), cm (centimeters), mm (millimeters).
distance_type – str
There are two distance calculation modes: arc (the default) and plane. The arc calculation is the most accurate. The plane is the fastest but least accurate. Consider using plane when your search context is "narrow" and spans smaller geographical areas (~5km). plane will return higher error margins for searches across very large areas (e.g. cross continent search).
keyed – bool
Setting the keyed flag to true will associate a unique string key with each bucket and return the ranges as a hash rather than an array.
- Returns
'AggregationInterface'
A new instance is created and returned
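The three origin formats above differ in lat/lon ordering; a small normalizer makes the difference explicit (a sketch for illustration, not part of elastipy):

```python
def normalize_origin(origin):
    """Return (lat, lon) from any of the three accepted origin formats."""
    if isinstance(origin, dict):                 # object format
        return origin["lat"], origin["lon"]
    if isinstance(origin, str):                  # string format: lat first
        lat, lon = (float(part) for part in origin.split(","))
        return lat, lon
    lon, lat = origin                            # GeoJSON array: lon first
    return lat, lon
```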
agg_geohash_grid(*aggregation_name: Optional[str], field: str, precision: Union[int, str] = 5, bounds: Optional[Mapping] = None, size: int = 10000, shard_size: Optional[int] = None)¶
A multi-bucket aggregation that works on geo_point fields and groups points into buckets that represent cells in a grid. The resulting grid can be sparse and only contains cells that have matching data. Each cell is labeled using a geohash which is of user-definable precision.
- High precision geohashes have a long string length and represent cells that cover only a small area.
- Low precision geohashes have a short string length and represent cells that each cover a large area.
Geohashes used in this aggregation can have a choice of precision between 1 and 12.
The highest-precision geohash of length 12 produces cells that cover less than a square metre of land and so high-precision requests can be very costly in terms of RAM and result sizes.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field – str
The specified field must be of type geo_point or geo_shape (which can only be set explicitly in the mappings). It can also hold an array of geo_point fields, in which case all will be taken into account during aggregation. Aggregating on geo_shape fields works just as it does for points, except that a single shape can be counted in multiple tiles. A shape will contribute to the count of matching values if any part of its shape intersects with that tile.
precision – Union[int, str]
The required precision of the grid in the range [1, 12]. Higher means more precise. Alternatively, the precision level can be approximated from a distance measure like "1km" or "10m". The precision level is calculated such that cells will not exceed the specified size (diagonal) of the required precision. When this would lead to precision levels higher than the supported 12 levels (e.g. for distances <5.6cm) the value is rejected.
Note
When requesting detailed buckets (typically for displaying a "zoomed in" map) a filter like geo_bounding_box should be applied to narrow the subject area, otherwise potentially millions of buckets will be created and returned.
bounds – Optional[Mapping]
The geohash_grid aggregation supports an optional bounds parameter that restricts the points considered to those that fall within the bounds provided. The bounds parameter accepts the bounding box in all the same accepted formats of the bounds specified in the Geo Bounding Box Query. This bounding box can be used with or without an additional geo_bounding_box query filtering the points prior to aggregating. It is an independent bounding box that can intersect with, be equal to, or be disjoint to any additional geo_bounding_box queries defined in the context of the aggregation.
size – int
The maximum number of geohash buckets to return (defaults to 10,000). When results are trimmed, buckets are prioritised based on the volumes of documents they contain.
shard_size – Optional[int]
To allow for more accurate counting of the top cells returned in the final result, the aggregation defaults to returning max(10, (size x number-of-shards)) buckets from each shard. If this heuristic is undesirable, the number considered from each shard can be overridden using this parameter.
- Returns
'AggregationInterface'
A new instance is created and returned
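The default per-shard bucket count quoted above is simply:

```python
def default_shard_size(size, number_of_shards):
    # max(10, (size x number-of-shards)) buckets are requested per shard
    return max(10, size * number_of_shards)
```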
agg_geotile_grid(*aggregation_name: Optional[str], field: str, precision: Union[int, str] = 7, bounds: Optional[Mapping] = None, size: int = 10000, shard_size: Optional[int] = None)¶
A multi-bucket aggregation that works on geo_point fields and groups points into buckets that represent cells in a grid. The resulting grid can be sparse and only contains cells that have matching data. Each cell corresponds to a map tile as used by many online map sites. Each cell is labeled using a "{zoom}/{x}/{y}" format, where zoom is equal to the user-specified precision.
- High precision keys have a larger range for x and y, and represent tiles that cover only a small area.
- Low precision keys have a smaller range for x and y, and represent tiles that each cover a large area.
Warning
The highest-precision geotile of length 29 produces cells that cover less than 10cm by 10cm of land, so high-precision requests can be very costly in terms of RAM and result sizes. Please first filter the aggregation to a smaller geographic area before requesting high levels of detail.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field – str
The specified field must be of type geo_point (which can only be set explicitly in the mappings). It can also hold an array of geo_point fields, in which case all will be taken into account during aggregation.
precision – Union[int, str]
The required precision of the grid in the range [1, 29]. Higher means more precise.
Note
When requesting detailed buckets (typically for displaying a "zoomed in" map) a filter like geo_bounding_box should be applied to narrow the subject area, otherwise potentially millions of buckets will be created and returned.
bounds – Optional[Mapping]
The geotile_grid aggregation supports an optional bounds parameter that restricts the points considered to those that fall within the bounds provided. The bounds parameter accepts the bounding box in all the same accepted formats of the bounds specified in the Geo Bounding Box Query. This bounding box can be used with or without an additional geo_bounding_box query filtering the points prior to aggregating. It is an independent bounding box that can intersect with, be equal to, or be disjoint to any additional geo_bounding_box queries defined in the context of the aggregation.
size – int
The maximum number of geotile buckets to return (defaults to 10,000). When results are trimmed, buckets are prioritised based on the volumes of documents they contain.
shard_size – Optional[int]
To allow for more accurate counting of the top cells returned in the final result, the aggregation defaults to returning max(10, (size x number-of-shards)) buckets from each shard. If this heuristic is undesirable, the number considered from each shard can be overridden using this parameter.
- Returns
'AggregationInterface'
A new instance is created and returned
agg_global(*aggregation_name: Optional[str])¶
Defines a single bucket of all the documents within the search execution context. This context is defined by the indices and the document types you're searching on, but is not influenced by the search query itself.
Note
Global aggregators can only be placed as top level aggregators because it doesn’t make sense to embed a global aggregator within another bucket aggregator.
- Parameters
aggregation_name – Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
- Returns
'AggregationInterface'
A new instance is created and returned
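Because a global aggregation ignores the search query, it is typically used to compare query-scoped results against index-wide statistics. The request body looks roughly like this; the field names and the sub-aggregation are illustrative:

```python
body = {
    "query": {"match": {"type": "t-shirt"}},
    "aggs": {
        "all_products": {
            "global": {},  # bucket spans all documents, not just matches
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
}
```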
agg_histogram(*aggregation_name: Optional[str], field: str, interval: int, min_doc_count: int = 0, offset: Optional[int] = None, extended_bounds: Optional[Mapping[str, int]] = None, hard_bounds: Optional[Mapping[str, int]] = None, format: Optional[str] = None, order: Optional[Union[Mapping, str]] = None, keyed: bool = False, missing: Optional[Any] = None)¶
A multi-bucket values source based aggregation that can be applied on numeric values or numeric range values extracted from the documents. It dynamically builds fixed size (a.k.a. interval) buckets over the values. For example, if the documents have a field that holds a price (numeric), we can configure this aggregation to dynamically build buckets with interval 5 (in case of price it may represent $5). When the aggregation executes, the price field of every document will be evaluated and rounded down to its closest bucket - for example, if the price is 32 and the bucket size is 5 then the rounding will yield 30 and thus the document will "fall" into the bucket that is associated with the key 30. To make this more formal, here is the rounding function that is used:
bucket_key = Math.floor((value - offset) / interval) * interval + offset
For range values, a document can fall into multiple buckets. The first bucket is computed from the lower bound of the range in the same way as a bucket for a single value is computed. The final bucket is computed in the same way from the upper bound of the range, and the range is counted in all buckets in between and including those two.
The interval must be a positive decimal, while the offset must be a decimal in [0, interval) (a decimal greater than or equal to 0 and less than interval)
Histogram fields: Running a histogram aggregation over histogram fields computes the total number of counts for each interval. See example
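The rounding function above translates directly to Python:

```python
import math

def histogram_bucket_key(value, interval, offset=0):
    # bucket_key = Math.floor((value - offset) / interval) * interval + offset
    return math.floor((value - offset) / interval) * interval + offset
```

For the price example above, histogram_bucket_key(32, 5) yields 30; with interval 10 and offset 5, values from 5 to 14 all fall into the [5, 15) bucket keyed 5.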
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
A numeric field to be indexed by the histogram.
interval –
int
A positive decimal defining the interval between buckets.
min_doc_count –
int
By default the response will fill gaps in the histogram with empty buckets. It is possible to change that and request buckets with a higher minimum count using the min_doc_count setting.
By default the histogram returns all the buckets within the range of the data itself, that is, the documents with the smallest values (on which the histogram is computed) will determine the min bucket (the bucket with the smallest key) and the documents with the highest values will determine the max bucket (the bucket with the highest key). Often, when requesting empty buckets, this causes confusion, specifically when the data is also filtered.
To understand why, let's look at an example:
Let's say you're filtering your request to get all docs with values between 0 and 500, and in addition you'd like to slice the data per price using a histogram with an interval of 50. You also specify "min_doc_count": 0 as you'd like to get all buckets, even the empty ones. If it happens that all products (documents) have prices higher than 100, the first bucket you'll get will be the one with 100 as its key. This is confusing, as many times you'd also like to get the buckets between 0 and 100.
offset –
Optional[int]
By default the bucket keys start with 0 and then continue in evenly spaced steps of interval, e.g. if the interval is 10, the first three buckets (assuming there is data inside them) will be [0, 10), [10, 20), [20, 30). The bucket boundaries can be shifted by using the offset option.
This can be best illustrated with an example. If there are 10 documents with values ranging from 5 to 14, using interval 10 will result in two buckets with 5 documents each. If an additional offset of 5 is used, there will be only one single bucket [5, 15) containing all 10 documents.
extended_bounds –
Optional[Mapping[str, int]]
With the extended_bounds setting, you can "force" the histogram aggregation to start building buckets at a specific min value and to keep building buckets up to a max value (even if there are no documents anymore). Using extended_bounds only makes sense when min_doc_count is 0 (the empty buckets will never be returned if min_doc_count is greater than 0).
Note that (as the name suggests) extended_bounds is not filtering buckets. Meaning, if extended_bounds.min is higher than the values extracted from the documents, the documents will still dictate what the first bucket will be (and the same goes for extended_bounds.max and the last bucket). For filtering buckets, one should nest the histogram aggregation under a range filter aggregation with the appropriate from/to settings.
When aggregating ranges, buckets are based on the values of the returned documents. This means the response may include buckets outside of a query's range. For example, if your query looks for values greater than 100, and you have a range covering 50 to 150, and an interval of 50, that document will land in 3 buckets - 50, 100, and 150. In general, it's best to think of the query and aggregation steps as independent - the query selects a set of documents, and then the aggregation buckets those documents without regard to how they were selected. See the note on bucketing range fields for more information and an example.
hard_bounds –
Optional[Mapping[str, int]]
The hard_bounds setting is a counterpart of extended_bounds and can limit the range of buckets in the histogram. It is particularly useful in the case of open data ranges that can result in a very large number of buckets.
format –
Optional[str]
Specifies the format of the 'key_as_string' response. See: mapping date format
order –
Optional[Union[Mapping, str]]
By default the returned buckets are sorted by their key ascending, though the order behaviour can be controlled using the order setting. Supports the same order functionality as the Terms Aggregation.
keyed –
bool
Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array.
missing –
Optional[Any]
The missing parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value.
- Returns
'AggregationInterface'
A new instance is created and returned
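The bucket rounding and offset behaviour described above can be sketched in plain Python (a standalone illustration, not part of elastipy):

```python
import math

def bucket_key(value: float, interval: float, offset: float = 0) -> float:
    """Round a value down to its histogram bucket key, per the formula above."""
    return math.floor((value - offset) / interval) * interval + offset

# A price of 32 with bucket size 5 falls into the bucket with key 30.
print(bucket_key(32, 5))  # 30

# The offset example: 10 documents with values 5..14 and interval 10
# span two buckets, [0, 10) and [10, 20) ...
print(sorted({bucket_key(v, 10) for v in range(5, 15)}))  # [0, 10]

# ... but with offset 5 they all land in the single bucket [5, 15).
print(sorted({bucket_key(v, 10, offset=5) for v in range(5, 15)}))  # [5]
```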
-
agg_ip_range
(*aggregation_name: Optional[str], field: str, ranges: Sequence[Union[Mapping[str, str], str]], keyed: bool = False)¶ Just like the dedicated date range aggregation, there is also a dedicated range aggregation for IP typed fields:
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
The IPv4 field
ranges –
Sequence[Union[Mapping[str, str], str]]
List of ranges to define the buckets, either as straight IPv4 or as CIDR masks.
Example:
[ {"to": "10.0.0.5"}, {"from": "10.0.0.5", "to": "10.0.0.127"}, {"from": "10.0.0.127"}, ]
Alternatively this parameter can be a list of strings. The above example can be rewritten as:
["10.0.0.5", "10.0.0.127"]
keyed –
bool
Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array.
- Returns
'AggregationInterface'
A new instance is created and returned
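The rewrite between the two equivalent forms above can be sketched with a small hypothetical helper (not part of elastipy; it only mirrors the documented shorthand):

```python
def expand_ip_ranges(boundaries):
    """Hypothetical helper: expand the string shorthand into the explicit
    from/to mappings shown in the first example above."""
    ranges = []
    previous = None
    for bound in boundaries:
        entry = {}
        if previous is not None:
            entry["from"] = previous
        entry["to"] = bound
        ranges.append(entry)
        previous = bound
    # the last range is open-ended
    ranges.append({"from": previous})
    return ranges

print(expand_ip_ranges(["10.0.0.5", "10.0.0.127"]))
# [{'to': '10.0.0.5'}, {'from': '10.0.0.5', 'to': '10.0.0.127'}, {'from': '10.0.0.127'}]
```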
-
agg_missing
(*aggregation_name: Optional[str], field: str)¶ A field data based single bucket aggregation, that creates a bucket of all documents in the current document set context that are missing a field value (effectively, missing a field or having the configured NULL value set). This aggregator will often be used in conjunction with other field data bucket aggregators (such as ranges) to return information for all the documents that could not be placed in any of the other buckets due to missing field data values.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
The field we wish to investigate for missing values
- Returns
'AggregationInterface'
A new instance is created and returned
-
agg_nested
(*aggregation_name: Optional[str], path: str)¶ A special single bucket aggregation that enables aggregating nested documents.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
path –
str
The field of the nested document(s)
- Returns
'AggregationInterface'
A new instance is created and returned
-
agg_range
(*aggregation_name: Optional[str], ranges: Sequence[Union[Mapping[str, Any], Any]], field: Optional[str] = None, keyed: bool = False, script: Optional[dict] = None)¶ A multi-bucket value source based aggregation that enables the user to define a set of ranges - each representing a bucket. During the aggregation process, the values extracted from each document will be checked against each bucket range and “bucket” the relevant/matching document.
Note
Note that this aggregation includes the from value and excludes the to value for each range.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
ranges –
Sequence[Union[Mapping[str, Any], Any]]
List of ranges to define the buckets.
Example:
[ {"to": 10}, {"from": 10, "to": 20}, {"from": 20}, ]
Alternatively this parameter can be a plain list of values. The above example can be rewritten as:
[10, 20]
Note
This aggregation includes the from value and excludes the to value for each range.
field –
Optional[str]
The field to index by the aggregation.
keyed –
bool
Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array.
script –
Optional[dict]
Generating the terms using a script
- Returns
'AggregationInterface'
A new instance is created and returned
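The from-inclusive / to-exclusive semantics noted above can be demonstrated in plain Python (illustrative only, not how elastipy executes the aggregation):

```python
def range_buckets(values, ranges):
    """Illustrative only: bucket values the way the range aggregation
    does, including the "from" value and excluding the "to" value."""
    counts = {}
    for r in ranges:
        key = f"{r.get('from', '*')}-{r.get('to', '*')}"
        counts[key] = sum(
            1 for v in values
            if ("from" not in r or v >= r["from"])
            and ("to" not in r or v < r["to"])
        )
    return counts

ranges = [{"to": 10}, {"from": 10, "to": 20}, {"from": 20}]
# 10 lands in the [10, 20) bucket, 20 in the open-ended one.
print(range_buckets([5, 10, 15, 20, 25], ranges))
# {'*-10': 1, '10-20': 2, '20-*': 2}
```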
-
agg_rare_terms
(*aggregation_name: Optional[str], field: str, max_doc_count: int = 1, include: Optional[Union[str, Sequence[str], Mapping[str, int]]] = None, exclude: Optional[Union[str, Sequence[str]]] = None, missing: Optional[Any] = None)¶ A multi-bucket value source based aggregation which finds “rare” terms — terms that are at the long-tail of the distribution and are not frequent. Conceptually, this is like a terms aggregation that is sorted by
_count
ascending. As noted in the terms aggregation docs, actually ordering a terms agg by count ascending has unbounded error. Instead, you should use the rare_terms aggregation.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
The field we wish to find rare terms in
max_doc_count –
int
The maximum number of documents a term should appear in.
The max_doc_count parameter is used to control the upper bound of document counts that a term can have. There is not a size limitation on the rare_terms agg like the terms agg has. This means that terms which match the max_doc_count criteria will be returned. The aggregation functions in this manner to avoid the order-by-ascending issues that afflict the terms aggregation.
This does, however, mean that a large number of results can be returned if chosen incorrectly. To limit the danger of this setting, the maximum max_doc_count is 100.
include –
Optional[Union[str, Sequence[str], Mapping[str, int]]]
A regexp pattern that filters the terms which will be aggregated. Alternatively this can be a list of strings.
Partition expressions are also possible.
exclude –
Optional[Union[str, Sequence[str]]]
A regexp pattern of terms to be excluded from the aggregation. Alternatively this can be a list of strings.
missing –
Optional[Any]
The missing parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value.
- Returns
'AggregationInterface'
A new instance is created and returned
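The "long tail" selection that rare_terms performs can be sketched in plain Python (illustrative only; field and values here are hypothetical):

```python
from collections import Counter

def rare_terms(values, max_doc_count=1):
    """Illustrative only: select the terms appearing in at most
    max_doc_count documents, i.e. the long tail of the distribution."""
    counts = Counter(values)
    return sorted(t for t, n in counts.items() if n <= max_doc_count)

genres = ["rock", "rock", "rock", "jazz", "swing", "rock", "jazz"]
print(rare_terms(genres, max_doc_count=1))  # ['swing']
print(rare_terms(genres, max_doc_count=2))  # ['jazz', 'swing']
```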
-
agg_sampler
(*aggregation_name: Optional[str], shard_size: int = 100)¶ A filtering aggregation used to limit any sub aggregations’ processing to a sample of the top-scoring documents.
Example use cases:
Tightening the focus of analytics to high-relevance matches rather than the potentially very long tail of low-quality matches
Reducing the running cost of aggregations that can produce useful results using only samples e.g. significant_terms
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
shard_size –
int
The shard_size parameter limits how many top-scoring documents are collected in the sample processed on each shard. The default value is 100.
- Returns
'AggregationInterface'
A new instance is created and returned
-
agg_significant_terms
(*aggregation_name: Optional[str], field: str, size: int = 10, shard_size: Optional[int] = None, min_doc_count: int = 1, shard_min_doc_count: Optional[int] = None, execution_hint: str = 'global_ordinals', include: Optional[Union[str, Sequence[str], Mapping[str, int]]] = None, exclude: Optional[Union[str, Sequence[str]]] = None, script: Optional[dict] = None)¶ An aggregation that returns interesting or unusual occurrences of terms in a set.
Example use cases:
Suggesting “H5N1” when users search for “bird flu” in text
Identifying the merchant that is the “common point of compromise” from the transaction history of credit card owners reporting loss
Suggesting keywords relating to stock symbol $ATI for an automated news classifier
Spotting the fraudulent doctor who is diagnosing more than their fair share of whiplash injuries
Spotting the tire manufacturer who has a disproportionate number of blow-outs
In all these cases the terms being selected are not simply the most popular terms in a set. They are the terms that have undergone a significant change in popularity measured between a foreground and background set. If the term “H5N1” only exists in 5 documents in a 10 million document index and yet is found in 4 of the 100 documents that make up a user’s search results that is significant and probably very relevant to their search.
5/10,000,000 vs 4/100 is a big swing in frequency.
Warning
Picking a free-text field as the subject of a significant terms analysis can be expensive! It will attempt to load every unique word into RAM. It is recommended to only use this on smaller indices.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
size –
int
The size parameter can be set to define how many term buckets should be returned out of the overall terms list. By default, the node coordinating the search process will request each shard to provide its own top size term buckets and once all shards respond, it will reduce the results to the final list that will then be returned to the client. This means that if the number of unique terms is greater than size, the returned list is slightly off and not accurate (it could be that the term counts are slightly off and it could even be that a term that should have been in the top size buckets was not returned).
shard_size –
Optional[int]
The higher the requested size is, the more accurate the results will be, but also, the more expensive it will be to compute the final results (both due to bigger priority queues that are managed on a shard level and due to bigger data transfers between the nodes and the client).
The shard_size parameter can be used to minimize the extra work that comes with a bigger requested size. When defined, it will determine how many terms the coordinating node will request from each shard. Once all the shards have responded, the coordinating node will then reduce them to a final result which will be based on the size parameter - this way, one can increase the accuracy of the returned terms and avoid the overhead of streaming a big list of buckets back to the client.
min_doc_count –
int
It is possible to only return terms that match more than a configured number of hits using the min_doc_count option. Default value is 1.
Terms are collected and ordered on a shard level and merged with the terms collected from other shards in a second step. However, the shard does not have the information about the global document count available. The decision if a term is added to a candidate list depends only on the order computed on the shard using local shard frequencies. The min_doc_count criterion is only applied after merging local terms statistics of all shards. In a way the decision to add the term as a candidate is made without being very certain about if the term will actually reach the required min_doc_count. This might cause many (globally) high frequent terms to be missing in the final result if low frequent terms populated the candidate lists. To avoid this, the shard_size parameter can be increased to allow more candidate terms on the shards. However, this increases memory consumption and network traffic.
shard_min_doc_count –
Optional[int]
The parameter shard_min_doc_count regulates the certainty a shard has if the term should actually be added to the candidate list or not with respect to the min_doc_count. Terms will only be considered if their local shard frequency within the set is higher than the shard_min_doc_count. If your dictionary contains many low frequent terms and you are not interested in those (for example misspellings), then you can set the shard_min_doc_count parameter to filter out candidate terms on a shard level that will with a reasonable certainty not reach the required min_doc_count even after merging the local counts. shard_min_doc_count is set to 0 per default and has no effect unless you explicitly set it.
Note
Setting min_doc_count=0 will also return buckets for terms that didn't match any hit. However, some of the returned terms which have a document count of zero might only belong to deleted documents or documents from other types, so there is no guarantee that a match_all query would find a positive document count for those terms.
Warning
When NOT sorting on doc_count descending, high values of min_doc_count may return a number of buckets which is less than size because not enough data was gathered from the shards. Missing buckets can be brought back by increasing shard_size. Setting shard_min_doc_count too high will cause terms to be filtered out on a shard level. This value should be set much lower than min_doc_count/#shards.
execution_hint –
str
There are different mechanisms by which terms aggregations can be executed:
by using field values directly in order to aggregate data per-bucket (map)
by using global ordinals of the field and allocating one bucket per global ordinal (global_ordinals)
Elasticsearch tries to have sensible defaults so this is something that generally doesn't need to be configured.
global_ordinals is the default option for keyword fields; it uses global ordinals to allocate buckets dynamically, so memory usage is linear to the number of values of the documents that are part of the aggregation scope.
map should only be considered when very few documents match a query. Otherwise the ordinals-based execution mode is significantly faster. By default, map is only used when running an aggregation on scripts, since they don't have ordinals.
include –
Optional[Union[str, Sequence[str], Mapping[str, int]]]
A regexp pattern that filters the terms which will be aggregated. Alternatively this can be a list of strings.
Partition expressions are also possible.
exclude –
Optional[Union[str, Sequence[str]]]
A regexp pattern of terms to be excluded from the aggregation. Alternatively this can be a list of strings.
script –
Optional[dict]
Generating the terms using a script
- Returns
'AggregationInterface'
A new instance is created and returned
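The foreground-versus-background "swing" from the H5N1 example above can be computed directly (a standalone illustration of the idea, not elastipy's scoring heuristic):

```python
def frequency_swing(fg_count, fg_total, bg_count, bg_total):
    """Illustrative only: ratio between a term's frequency in the
    foreground set (e.g. search results) and the background set
    (the whole index)."""
    return (fg_count / fg_total) / (bg_count / bg_total)

# "H5N1": 4 of 100 search results vs 5 of 10,000,000 indexed documents.
print(round(frequency_swing(4, 100, 5, 10_000_000)))  # 80000
```

A plain terms aggregation would never surface such a term, since 5 occurrences in 10 million documents is nowhere near the top of the overall distribution.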
-
agg_terms
(*aggregation_name: Optional[str], field: str, size: int = 10, shard_size: Optional[int] = None, show_term_doc_count_error: Optional[bool] = None, order: Optional[Union[Mapping, str]] = None, min_doc_count: int = 1, shard_min_doc_count: Optional[int] = None, include: Optional[Union[str, Sequence[str], Mapping[str, int]]] = None, exclude: Optional[Union[str, Sequence[str]]] = None, missing: Optional[Any] = None, script: Optional[dict] = None)¶ A multi-bucket value source based aggregation where buckets are dynamically built - one per unique value.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
size –
int
The size parameter can be set to define how many term buckets should be returned out of the overall terms list. By default, the node coordinating the search process will request each shard to provide its own top size term buckets and once all shards respond, it will reduce the results to the final list that will then be returned to the client. This means that if the number of unique terms is greater than size, the returned list is slightly off and not accurate (it could be that the term counts are slightly off and it could even be that a term that should have been in the top size buckets was not returned).
shard_size –
Optional[int]
The higher the requested size is, the more accurate the results will be, but also, the more expensive it will be to compute the final results (both due to bigger priority queues that are managed on a shard level and due to bigger data transfers between the nodes and the client).
The shard_size parameter can be used to minimize the extra work that comes with a bigger requested size. When defined, it will determine how many terms the coordinating node will request from each shard. Once all the shards have responded, the coordinating node will then reduce them to a final result which will be based on the size parameter - this way, one can increase the accuracy of the returned terms and avoid the overhead of streaming a big list of buckets back to the client.
show_term_doc_count_error –
Optional[bool]
This shows an error value for each term returned by the aggregation which represents the worst case error in the document count and can be useful when deciding on a value for the shard_size parameter. This is calculated by summing the document counts for the last term returned by all shards which did not return the term.
These errors can only be calculated in this way when the terms are ordered by descending document count. When the aggregation is ordered by the terms values themselves (either ascending or descending) there is no error in the document count since if a shard does not return a particular term which appears in the results from another shard, it must not have that term in its index. When the aggregation is either sorted by a sub aggregation or in order of ascending document count, the error in the document counts cannot be determined and is given a value of -1 to indicate this.
order –
Optional[Union[Mapping, str]]
The order of the buckets can be customized by setting the order parameter. By default, the buckets are ordered by their doc_count descending.
Warning
Sorting by ascending _count or by sub aggregation is discouraged as it increases the error on document counts. It is fine when a single shard is queried, or when the field that is being aggregated was used as a routing key at index time: in these cases results will be accurate since shards have disjoint values. However otherwise, errors are unbounded. One particular case that could still be useful is sorting by min or max aggregation: counts will not be accurate but at least the top buckets will be correctly picked.
min_doc_count –
int
It is possible to only return terms that match more than a configured number of hits using the min_doc_count option. Default value is 1.
Terms are collected and ordered on a shard level and merged with the terms collected from other shards in a second step. However, the shard does not have the information about the global document count available. The decision if a term is added to a candidate list depends only on the order computed on the shard using local shard frequencies. The min_doc_count criterion is only applied after merging local terms statistics of all shards. In a way the decision to add the term as a candidate is made without being very certain about if the term will actually reach the required min_doc_count. This might cause many (globally) high frequent terms to be missing in the final result if low frequent terms populated the candidate lists. To avoid this, the shard_size parameter can be increased to allow more candidate terms on the shards. However, this increases memory consumption and network traffic.
shard_min_doc_count –
Optional[int]
The parameter shard_min_doc_count regulates the certainty a shard has if the term should actually be added to the candidate list or not with respect to the min_doc_count. Terms will only be considered if their local shard frequency within the set is higher than the shard_min_doc_count. If your dictionary contains many low frequent terms and you are not interested in those (for example misspellings), then you can set the shard_min_doc_count parameter to filter out candidate terms on a shard level that will with a reasonable certainty not reach the required min_doc_count even after merging the local counts. shard_min_doc_count is set to 0 per default and has no effect unless you explicitly set it.
Note
Setting min_doc_count=0 will also return buckets for terms that didn't match any hit. However, some of the returned terms which have a document count of zero might only belong to deleted documents or documents from other types, so there is no guarantee that a match_all query would find a positive document count for those terms.
Warning
When NOT sorting on doc_count descending, high values of min_doc_count may return a number of buckets which is less than size because not enough data was gathered from the shards. Missing buckets can be brought back by increasing shard_size. Setting shard_min_doc_count too high will cause terms to be filtered out on a shard level. This value should be set much lower than min_doc_count/#shards.
include –
Optional[Union[str, Sequence[str], Mapping[str, int]]]
A regexp pattern that filters the terms which will be aggregated. Alternatively this can be a list of strings.
Partition expressions are also possible.
exclude –
Optional[Union[str, Sequence[str]]]
A regexp pattern of terms to be excluded from the aggregation. Alternatively this can be a list of strings.
missing –
Optional[Any]
The missing parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value.
script –
Optional[dict]
Generating the terms using a script
- Returns
'AggregationInterface'
A new instance is created and returned
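As a sketch of what such an aggregation looks like on the wire, the parameters above map to an Elasticsearch request body of roughly this shape (hand-built here for a hypothetical "genre" field; elastipy generates it for you):

```python
# Hand-built request body for a terms aggregation, mirroring
# agg_terms("genres", field="genre", size=10).
body = {
    "aggregations": {
        "genres": {
            "terms": {
                "field": "genre",
                "size": 10,
                "min_doc_count": 1,
                "order": {"_count": "desc"},  # the documented default
            }
        }
    }
}
print(list(body["aggregations"]["genres"]))  # ['terms']
```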
-
aggregation
(*aggregation_name_type, **params) → elastipy.aggregation.aggregation.Aggregation[source]¶ Creates an aggregation.
- Either call
aggregation(“sum”, field=…) to create an automatic name
- or call
aggregation(“my_name”, “sum”, field=…) to set aggregation name explicitly
- Parameters
aggregation_name_type – one or two strings, meaning either “type” or “name”, “type”
params – all parameters of the aggregation function
- Returns
Aggregation instance
-
bool
(must: Optional[Union[elastipy.query.generated_interface.QueryInterface, Mapping, Sequence[Union[elastipy.query.generated_interface.QueryInterface, Mapping]]]] = None, must_not: Optional[Union[elastipy.query.generated_interface.QueryInterface, Mapping, Sequence[Union[elastipy.query.generated_interface.QueryInterface, Mapping]]]] = None, should: Optional[Union[elastipy.query.generated_interface.QueryInterface, Mapping, Sequence[Union[elastipy.query.generated_interface.QueryInterface, Mapping]]]] = None, filter: Optional[Union[elastipy.query.generated_interface.QueryInterface, Mapping, Sequence[Union[elastipy.query.generated_interface.QueryInterface, Mapping]]]] = None) → elastipy.query.generated_interface.QueryInterface¶ A query that matches documents matching boolean combinations of other queries. The bool query maps to Lucene BooleanQuery. It is built using one or more boolean clauses, each clause with a typed occurrence.
The bool query takes a more-matches-is-better approach, so the score from each matching must or should clause will be added together to provide the final _score for each document.
- Parameters
must –
Optional[Union['QueryInterface', Mapping, Sequence[Union['QueryInterface', Mapping]]]]
The clause (query) must appear in matching documents and will contribute to the score.
must_not –
Optional[Union['QueryInterface', Mapping, Sequence[Union['QueryInterface', Mapping]]]]
The clause (query) must not appear in the matching documents. Clauses are executed in filter context meaning that scoring is ignored and clauses are considered for caching. Because scoring is ignored, a score of 0 for all documents is returned.
should –
Optional[Union['QueryInterface', Mapping, Sequence[Union['QueryInterface', Mapping]]]]
The clause (query) should appear in the matching document.
filter –
Optional[Union['QueryInterface', Mapping, Sequence[Union['QueryInterface', Mapping]]]]
The clause (query) must appear in matching documents. However unlike must the score of the query will be ignored. Filter clauses are executed in filter context, meaning that scoring is ignored and clauses are considered for caching.
- Returns
'QueryInterface'
A new instance is created
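The four clause types combine into a single Elasticsearch bool query body of this shape (hand-built sketch; the field names are hypothetical and elastipy builds the equivalent structure for you):

```python
# Hand-built Elasticsearch bool query combining all four clause types.
query = {
    "bool": {
        "must": [{"match": {"title": "search"}}],
        "must_not": [{"term": {"status": "draft"}}],
        "should": [{"match": {"tags": "elasticsearch"}}],
        "filter": [{"range": {"published": {"gte": "2020-01-01"}}}],
    }
}
# must and should contribute to _score; must_not and filter run in
# filter context and are ignored for scoring.
print(sorted(query["bool"]))  # ['filter', 'must', 'must_not', 'should']
```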
-
client
(client)[source]¶ Replace the client that will be used for requests.
- Parameters
client – an elasticsearch.Elasticsearch client or compatible
- Returns
new Search instance
-
copy
()[source]¶ Make a copy of this instance and its queries.
Warning
Copying of Aggregations is currently not supported so aggregations must be added at the last step, after all queries are applied.
- Returns
a new Search instance
-
property
dump
¶ Access the print interface
-
execute
() → elastipy.search.Response[source]¶ Sends the search against the current client and returns the response. If no client is specified, elastipy.connections.get(“default”) will be used.
- Returns
Response, a dict wrapper with some convenience methods
-
match
(field: str, query: Union[str, int, float, elastipy.query.generated_interface.QueryInterface.bool], auto_generate_synonyms_phrase_query: elastipy.query.generated_interface.QueryInterface.bool = True, fuzziness: Optional[str] = None, max_expansions: int = 50, prefix_length: int = 0, fuzzy_transpositions: elastipy.query.generated_interface.QueryInterface.bool = True, fuzzy_rewrite: Optional[str] = None, lenient: elastipy.query.generated_interface.QueryInterface.bool = False, operator: Optional[str] = None, minimum_should_match: Optional[str] = None, zero_terms_query: str = 'none') → elastipy.query.generated_interface.QueryInterface¶ Returns documents that match a provided text, number, date or boolean value. The provided text is analyzed before matching.
The match query is the standard query for performing a full-text search, including options for fuzzy matching.
- Parameters
field –
str
Field you wish to search.
query –
Union[str, int, float, bool]
Text, number, boolean value or date you wish to find in the provided <field>.
The match query analyzes any provided text before performing a search. This means the match query can search text fields for analyzed tokens rather than an exact term.
auto_generate_synonyms_phrase_query –
bool
If true, match phrase queries are automatically created for multi-term synonyms. Defaults to true.
fuzziness –
Optional[str]
Maximum edit distance allowed for matching. See Fuzziness for valid values and more information. See Fuzziness in the match query for an example.
max_expansions –
int
Maximum number of terms to which the query will expand. Defaults to 50.
prefix_length –
int
Number of beginning characters left unchanged for fuzzy matching. Defaults to 0.
fuzzy_transpositions –
bool
If true, edits for fuzzy matching include transpositions of two adjacent characters (ab → ba). Defaults to true.
fuzzy_rewrite –
Optional[str]
Method used to rewrite the query. See the rewrite parameter for valid values and more information.
If the fuzziness parameter is not 0, the match query uses a fuzzy_rewrite method of top_terms_blended_freqs_${max_expansions} by default.
lenient –
bool
If true, format-based errors, such as providing a text query value for a numeric field, are ignored. Defaults to false.
operator –
Optional[str]
Boolean logic used to interpret text in the query value. Valid values are:
OR (Default) For example, a query value of capital of Hungary is interpreted as capital OR of OR Hungary.
AND For example, a query value of capital of Hungary is interpreted as capital AND of AND Hungary.
minimum_should_match –
Optional[str]
Minimum number of clauses that must match for a document to be returned. See the minimum_should_match parameter for valid values and more information.
zero_terms_query –
str
Indicates whether no documents are returned if the analyzer removes all tokens, such as when using a stop filter. Valid values are:
none (Default) No documents are returned if the analyzer removes all tokens.
all Returns all documents, similar to a match_all query.
- Returns
'QueryInterface'
A new instance is created
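On the wire, these parameters produce a standard Elasticsearch match query body, roughly like this (hand-built sketch with a hypothetical "message" field):

```python
# Hand-built Elasticsearch match query body, mirroring a call like
# match("message", "this is a test", operator="AND").
query = {
    "match": {
        "message": {
            "query": "this is a test",
            # interpret the analyzed terms as "this AND is AND a AND test"
            "operator": "AND",
        }
    }
}
print(query["match"]["message"]["operator"])  # AND
```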
-
match_all
(boost: Optional[float] = None) → elastipy.query.generated_interface.QueryInterface¶ The simplest query, which matches all documents, giving them all a _score of 1.0.
The _score can be changed with the boost parameter.
- Parameters
boost –
Optional[float]
The _score can be changed with the boost parameter.
- Returns
'QueryInterface'
A new instance is created
-
match_none
() → elastipy.query.generated_interface.QueryInterface¶ This is the inverse of the match_all query, which matches no documents.
- Returns
'QueryInterface'
A new instance is created
-
metric
(*aggregation_name_type, **params)¶ Alias for aggregation()
-
metric_avg
(*aggregation_name: Optional[str], field: str, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ A single-value metrics aggregation that computes the average of numeric values that are extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents, or be generated by a provided script.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_boxplot
(*aggregation_name: Optional[str], field: str, compression: int = 100, missing: Optional[Any] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
compression –
int
missing –
Optional[Any]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_cardinality
(*aggregation_name: Optional[str], field: str, precision_threshold: int = 3000, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
precision_threshold –
int
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_extended_stats
(*aggregation_name: Optional[str], field: str, sigma: float = 3.0, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
sigma –
float
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_geo_bounds
(*aggregation_name: Optional[str], field: str, wrap_longitude: bool = True, return_self: bool = False)¶ A metric aggregation that computes the bounding box containing all geo values for a field.
The Geo Bounds Aggregation is also supported on geo_shape fields.
If wrap_longitude is set to true (the default), the bounding box can overlap the international date line and return a bounds where the top_left longitude is larger than the top_right longitude.
For example, the upper right longitude will typically be greater than the lower left longitude of a geographic bounding box. However, when the area crosses the 180° meridian, the value of the lower left longitude will be greater than the value of the upper right longitude. See Geographic bounding box on the Open Geospatial Consortium website for more information.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
The field defining the geo_point or geo_shape.
wrap_longitude –
bool
An optional parameter which specifies whether the bounding box should be allowed to overlap the international date line. The default value is true.
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_geo_centroid
(*aggregation_name: Optional[str], field: str, return_self: bool = False)¶ A metric aggregation that computes the weighted centroid from all coordinate values for geo fields.
The centroid metric for geo-shapes is more nuanced than for points. The centroid of a specific aggregation bucket containing shapes is the centroid of the highest-dimensionality shape type in the bucket. For example, if a bucket contains shapes comprising polygons and lines, then the lines do not contribute to the centroid metric. Each type of shape’s centroid is calculated differently. Envelopes and circles ingested via the Circle ingest processor are treated as polygons.
Warning
Using geo_centroid as a sub-aggregation of
geohash_grid
:The geohash_grid aggregation places documents, not individual geo-points, into buckets. If a document’s geo_point field contains multiple values, the document could be assigned to multiple buckets, even if one or more of its geo-points are outside the bucket boundaries.
If a geo_centroid sub-aggregation is also used, each centroid is calculated using all geo-points in a bucket, including those outside the bucket boundaries. This can result in centroids outside of bucket boundaries.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
The field defining the geo_point or geo_shape.
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_matrix_stats
(*aggregation_name: Optional[str], fields: list, mode: str = 'avg', missing: Optional[Any] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
fields –
list
mode –
str
missing –
Optional[Any]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_max
(*aggregation_name: Optional[str], field: str, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_median_absolute_deviation
(*aggregation_name: Optional[str], field: str, compression: int = 1000, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
compression –
int
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_min
(*aggregation_name: Optional[str], field: str, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_percentile_ranks
(*aggregation_name: Optional[str], field: str, values: list, keyed: bool = True, hdr__number_of_significant_value_digits: Optional[int] = None, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
values –
list
keyed –
bool
hdr__number_of_significant_value_digits –
Optional[int]
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_percentiles
(*aggregation_name: Optional[str], field: str, percents: list = (1, 5, 25, 50, 75, 95, 99), keyed: bool = True, tdigest__compression: int = 100, hdr__number_of_significant_value_digits: Optional[int] = None, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
percents –
list
keyed –
bool
tdigest__compression –
int
hdr__number_of_significant_value_digits –
Optional[int]
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
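Elasticsearch computes percentiles approximately (t-digest by default, tunable via tdigest__compression, or HDR histogram). As an exact, standard-library stand-in for the same idea:

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

# 99 cut points ~ the 1st..99th percentiles (exact, unlike the
# t-digest approximation Elasticsearch uses internally).
cuts = statistics.quantiles(data, n=100)
percentiles = {p: cuts[p - 1] for p in (1, 5, 25, 50, 75, 95, 99)}
print(percentiles[50])  # 10.0, the exact median of the sample
```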
-
metric_rate
(*aggregation_name: Optional[str], unit: str, field: Optional[str] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
unit –
str
field –
Optional[str]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_scripted_metric
(*aggregation_name: Optional[str], map_script: str, combine_script: str, reduce_script: str, init_script: Optional[str] = None, params: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
map_script –
str
combine_script –
str
reduce_script –
str
init_script –
Optional[str]
params –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
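The four scripts run as a map-reduce over shards: init builds per-shard state, map runs per document, combine produces one result per shard, and reduce merges the shard results. A plain-Python sketch of that control flow (document fields and the summing logic are illustrative, not tied to any real index):

```python
def scripted_metric(shards):
    """Simulate the scripted_metric phases over a list of shards,
    each shard being a list of documents."""
    shard_results = []
    for docs in shards:
        state = {"sum": 0}                   # init_script
        for doc in docs:
            state["sum"] += doc["price"]     # map_script
        shard_results.append(state["sum"])   # combine_script
    return sum(shard_results)                # reduce_script

shards = [[{"price": 10}, {"price": 5}], [{"price": 7}]]
print(scripted_metric(shards))  # 22
```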
-
metric_stats
(*aggregation_name: Optional[str], field: str, missing: Optional[Any] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
missing –
Optional[Any]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_string_stats
(*aggregation_name: Optional[str], field: str, show_distribution: bool = False, missing: Optional[Any] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
show_distribution –
bool
missing –
Optional[Any]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_sum
(*aggregation_name: Optional[str], field: str, missing: Optional[Any] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
str
missing –
Optional[Any]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_t_test
(*aggregation_name: Optional[str], a__field: str, b__field: str, type: str, a__filter: Optional[dict] = None, b__filter: Optional[dict] = None, script: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
a__field –
str
b__field –
str
type –
str
a__filter –
Optional[dict]
b__filter –
Optional[dict]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_top_hits
(*aggregation_name: Optional[str], size: int, sort: Optional[dict] = None, _source: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
size –
int
sort –
Optional[dict]
_source –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_top_metrics
(*aggregation_name: Optional[str], metrics: dict, sort: Optional[dict] = None, return_self: bool = False)¶ -
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
metrics –
dict
sort –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
-
metric_value_count
(*aggregation_name: Optional[str], field: Optional[str] = None, script: Optional[dict] = None, return_self: bool = False)¶ A single-value metrics aggregation that counts the number of values that are extracted from the aggregated documents. These values can be extracted either from specific fields in the documents, or be generated by a provided script. Typically, this aggregator will be used in conjunction with other single-value aggregations. For example, when computing the avg one might be interested in the number of values the average is computed over.
value_count does not de-duplicate values, so even if a field has duplicates (or a script generates multiple identical values for a single document), each value will be counted individually.
Note
Because value_count is designed to work with any field, it internally treats all values as simple bytes. Due to this implementation, if the _value script variable is used to fetch a value instead of accessing the field directly (e.g. a “value script”), the field value will be returned as a string instead of its native format.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
field –
Optional[str]
The field whose values should be counted.
script –
Optional[dict]
Alternatively, count the values generated by a script.
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
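A sketch of the counting semantics, including multi-valued fields and the absence of de-duplication (the `value_count` function below is hypothetical, mirroring the server-side behavior):

```python
def value_count(docs, field):
    """Count every extracted value, duplicates included (no de-duplication)."""
    count = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        # multi-valued fields contribute one count per value
        count += len(value) if isinstance(value, list) else 1
    return count

docs = [{"tag": ["a", "a"]}, {"tag": "b"}, {}]
print(value_count(docs, "tag"))  # 3 -- the duplicate "a" counts twice
```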
-
metric_weighted_avg
(*aggregation_name: Optional[str], value__field: str, weight__field: str, value__missing: Optional[Any] = None, weight__missing: Optional[Any] = None, format: Optional[str] = None, value_type: Optional[str] = None, script: Optional[dict] = None, return_self: bool = False)¶ A single-value metrics aggregation that computes the weighted average of numeric values that are extracted from the aggregated documents. These values can be extracted from specific numeric fields in the documents.
When calculating a regular average, each datapoint has an equal “weight” … it contributes equally to the final value. Weighted averages, on the other hand, weight each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the document, or provided by a script.
As a formula, a weighted average is ∑(value * weight) / ∑(weight).
A regular average can be thought of as a weighted average where every value has an implicit weight of 1.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
value__field –
str
The field that values should be extracted from.
weight__field –
str
The field that weights should be extracted from.
value__missing –
Optional[Any]
A value to use if the field is missing entirely.
weight__missing –
Optional[Any]
A weight to use if the field is missing entirely.
format –
Optional[str]
value_type –
Optional[str]
script –
Optional[dict]
return_self –
bool
If True, this call returns the created metric, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
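The formula above, as runnable Python with the missing-value defaults folded in (function name hypothetical):

```python
def weighted_avg(pairs, value_missing=None, weight_missing=None):
    """Compute sum(value * weight) / sum(weight) over (value, weight) pairs,
    substituting the optional missing defaults and skipping unresolvable pairs."""
    num = den = 0.0
    for value, weight in pairs:
        value = value if value is not None else value_missing
        weight = weight if weight is not None else weight_missing
        if value is None or weight is None:
            continue
        num += value * weight
        den += weight
    return num / den if den else None

grades = [(1, 2), (2, 3), (3, 5)]
print(weighted_avg(grades))  # (1*2 + 2*3 + 3*5) / 10 = 2.3
```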
-
property
param
¶ Access to the search parameters
-
pipeline
(*aggregation_name_type, **params)¶ Alias for aggregation()
-
pipeline_avg_bucket
(*aggregation_name: Optional[str], buckets_path: str, gap_policy: str = 'skip', format: Optional[str] = None, return_self: bool = False)¶ A sibling pipeline aggregation which calculates the (mean) average value of a specified metric in a sibling aggregation. The specified metric must be numeric and the sibling aggregation must be a multi-bucket aggregation.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
buckets_path –
str
The path to the buckets we wish to find the average for. See: bucket path syntax.
gap_policy –
str
The policy to apply when gaps are found in the data. See: gap policy.
format –
Optional[str]
Format to apply to the output value of this aggregation.
return_self –
bool
If True, this call returns the created pipeline, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
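A sketch of what the sibling aggregation computes, including the two gap policies skip and insert_zeros; the bucket shape is simplified to flat dicts rather than real aggregation responses:

```python
def avg_bucket(buckets, path, gap_policy="skip"):
    """Average one metric across sibling buckets.
    gap_policy='skip' ignores buckets with a missing value,
    gap_policy='insert_zeros' treats missing values as 0."""
    values = []
    for bucket in buckets:
        value = bucket.get(path)
        if value is None:
            if gap_policy == "insert_zeros":
                values.append(0.0)
            continue
        values.append(value)
    return sum(values) / len(values) if values else None

monthly = [{"sales": 100.0}, {"sales": 50.0}, {}]
print(avg_bucket(monthly, "sales"))                             # 75.0
print(avg_bucket(monthly, "sales", gap_policy="insert_zeros"))  # 50.0
```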
-
pipeline_bucket_script
(*aggregation_name: Optional[str], script: str, buckets_path: Mapping[str, str], gap_policy: str = 'skip', format: Optional[str] = None, return_self: bool = False)¶ A parent pipeline aggregation which executes a script which can perform per bucket computations on specified metrics in the parent multi-bucket aggregation. The specified metric must be numeric and the script must return a numeric value.
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
script –
str
The script to run for this aggregation. The script can be inline, file or indexed (see Scripting for more details).
buckets_path –
Mapping[str, str]
A map of script variables and their associated path to the buckets we wish to use for the variable (see buckets_path Syntax for more details).
gap_policy –
str
The policy to apply when gaps are found in the data (see Dealing with gaps in the data for more details).
format –
Optional[str]
Format to apply to the output value of this aggregation.
return_self –
bool
If True, this call returns the created pipeline, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
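A sketch of the per-bucket evaluation, with a Python lambda standing in for the Painless script and `buckets_path` resolving the script variables (bucket shape simplified to flat dicts):

```python
def bucket_script(buckets, buckets_path, script):
    """Evaluate a per-bucket expression over metrics resolved via buckets_path."""
    results = []
    for bucket in buckets:
        variables = {name: bucket[path] for name, path in buckets_path.items()}
        results.append(script(**variables))
    return results

buckets = [{"total": 200.0, "hats": 50.0}, {"total": 100.0, "hats": 10.0}]
ratios = bucket_script(
    buckets,
    buckets_path={"t": "total", "h": "hats"},
    script=lambda t, h: h / t * 100,  # stands in for a Painless expression
)
print(ratios)  # [25.0, 10.0]
```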
-
pipeline_derivative
(*aggregation_name: Optional[str], buckets_path: str, gap_policy: str = 'skip', format: Optional[str] = None, units: Optional[str] = None, return_self: bool = False)¶ A parent pipeline aggregation which calculates the derivative of a specified metric in a parent histogram (or date_histogram) aggregation. The specified metric must be numeric and the enclosing histogram must have min_doc_count set to 0 (default for histogram aggregations).
- Parameters
aggregation_name –
Optional[str]
Optional name of the aggregation. Otherwise it will be auto-generated.
buckets_path –
str
The path to the buckets we wish to find the derivative for. See: bucket path syntax.
gap_policy –
str
The policy to apply when gaps are found in the data. See: gap policy.
format –
Optional[str]
Format to apply to the output value of this aggregation.
units –
Optional[str]
The derivative aggregation allows the units of the derivative values to be specified. This returns an extra field in the response, normalized_value, which reports the derivative value in the desired x-axis units.
return_self –
bool
If True, this call returns the created pipeline, otherwise the parent is returned.
- Returns
'AggregationInterface'
A new instance is created and attached to the parent and the parent is returned, unless ‘return_self’ is True, in which case the new instance is returned.
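A sketch of the first-order difference the derivative computes between consecutive histogram buckets; Elasticsearch simply omits the value for the first bucket, represented here as None:

```python
def derivative(buckets, path):
    """First-order difference between consecutive buckets;
    the first bucket has no derivative."""
    results = [None]
    for prev, cur in zip(buckets, buckets[1:]):
        results.append(cur[path] - prev[path])
    return results

monthly_sales = [{"sales": 550.0}, {"sales": 60.0}, {"sales": 375.0}]
print(derivative(monthly_sales, "sales"))  # [None, -490.0, 315.0]
```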
-
query
(query: elastipy.query.generated_interface.QueryInterface)[source]¶ Replace the query.
- Parameters
query – a QueryInterface sub-class
- Returns
new Search instance
-
query_string
(query: str, default_field: Optional[str] = None, allow_leading_wildcard: elastipy.query.generated_interface.QueryInterface.bool = True, analyze_wildcard: elastipy.query.generated_interface.QueryInterface.bool = False, analyzer: Optional[str] = None, auto_generate_synonyms_phrase_query: Optional[elastipy.query.generated_interface.QueryInterface.bool] = None, boost: float = 1.0, default_operator: Optional[str] = None, enable_position_increments: elastipy.query.generated_interface.QueryInterface.bool = True, fields: Optional[Sequence[str]] = None, fuzziness: Optional[str] = None, fuzzy_max_expansions: int = 50, fuzzy_prefix_length: int = 0, fuzzy_transpositions: elastipy.query.generated_interface.QueryInterface.bool = True, lenient: elastipy.query.generated_interface.QueryInterface.bool = False, max_determinized_states: int = 10000, minimum_should_match: Optional[str] = None, quote_analyzer: Optional[str] = None, phrase_slop: int = 0, quote_field_suffix: Optional[str] = None, rewrite: Optional[str] = None, time_zone: Optional[str] = None) → elastipy.query.generated_interface.QueryInterface¶ Returns documents based on a provided query string, using a parser with a strict syntax.
This query uses a syntax to parse and split the provided query string based on operators, such as AND or NOT. The query then analyzes each split text independently before returning matching documents.
You can use the query_string query to create a complex search that includes wildcard characters, searches across multiple fields, and more. While versatile, the query is strict and returns an error if the query string includes any invalid syntax.
Warning
Because it returns an error for any invalid syntax, we don’t recommend using the query_string query for search boxes.
If you don’t need to support a query syntax, consider using the match query. If you need the features of a query syntax, use the simple_query_string query, which is less strict.
- Parameters
query –
str
Query string you wish to parse and use for search. See Query string syntax.
default_field –
Optional[str]
Default field you wish to search if no field is provided in the query string.
Defaults to the index.query.default_field index setting, which has a default value of *. The * value extracts all fields that are eligible for term queries and filters the metadata fields. All extracted fields are then combined to build a query if no prefix is specified.
Searching across all eligible fields does not include nested documents. Use a nested query to search those documents.
For mappings with a large number of fields, searching across all eligible fields could be expensive.
There is a limit on the number of fields that can be queried at once. It is defined by the indices.query.bool.max_clause_count search setting, which defaults to 1024.
allow_leading_wildcard –
bool
If true, the wildcard characters * and ? are allowed as the first character of the query string. Defaults to true.
analyze_wildcard –
bool
If true, the query attempts to analyze wildcard terms in the query string. Defaults to false.
analyzer –
Optional[str]
Analyzer used to convert text in the query string into tokens. Defaults to the index-time analyzer mapped for the default_field. If no analyzer is mapped, the index’s default analyzer is used.
auto_generate_synonyms_phrase_query –
Optional[bool]
If true, match phrase queries are automatically created for multi-term synonyms. Defaults to true. See Synonyms and the query_string query for an example.
boost –
float
Floating point number used to decrease or increase the relevance scores of the query. Defaults to 1.0.
Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.
default_operator –
Optional[str]
Default boolean logic used to interpret text in the query string if no operators are specified. Valid values are:
OR (Default) For example, a query string of capital of Hungary is interpreted as capital OR of OR Hungary.
AND For example, a query string of capital of Hungary is interpreted as capital AND of AND Hungary.
enable_position_increments –
bool
If true, enable position increments in queries constructed from a query_string search. Defaults to true.
fields –
Optional[Sequence[str]]
Array of fields you wish to search. You can use this parameter to search across multiple fields. See Search multiple fields.
fuzziness –
Optional[str]
Maximum edit distance allowed for matching. See Fuzziness for valid values and more information.
fuzzy_max_expansions –
int
Maximum number of terms to which the query will expand. Defaults to 50.
fuzzy_prefix_length –
int
Number of beginning characters left unchanged for fuzzy matching. Defaults to 0.
fuzzy_transpositions –
bool
If true, edits for fuzzy matching include transpositions of two adjacent characters (ab → ba). Defaults to true.
lenient –
bool
If true, format-based errors, such as providing a text query value for a numeric field, are ignored. Defaults to false.
max_determinized_states –
int
Maximum number of automaton states required for the query. Default is 10000.
Elasticsearch uses Apache Lucene internally to parse regular expressions. Lucene converts each regular expression to a finite automaton containing a number of determinized states.
You can use this parameter to prevent that conversion from unintentionally consuming too many resources. You may need to increase this limit to run complex regular expressions.
minimum_should_match –
Optional[str]
Minimum number of clauses that must match for a document to be returned. See the minimum_should_match parameter for valid values and more information. See How minimum_should_match works for an example.
quote_analyzer –
Optional[str]
Analyzer used to convert quoted text in the query string into tokens. Defaults to the search_quote_analyzer mapped for the default_field.
For quoted text, this parameter overrides the analyzer specified in the analyzer parameter.
phrase_slop –
int
Maximum number of positions allowed between matching tokens for phrases. Defaults to 0. If 0, exact phrase matches are required. Transposed terms have a slop of 2.
quote_field_suffix –
Optional[str]
Suffix appended to quoted text in the query string.
You can use this suffix to use a different analysis method for exact matches. See Mixing exact search with stemming.
rewrite –
Optional[str]
Method used to rewrite the query. For valid values and more information, see the rewrite parameter.
time_zone –
Optional[str]
Coordinated Universal Time (UTC) offset or IANA time zone used to convert date values in the query string to UTC.
Valid values are ISO 8601 UTC offsets, such as +01:00 or -08:00, and IANA time zone IDs, such as America/Los_Angeles.
Note
The time_zone parameter does not affect the date math value of now. now is always the current system time in UTC. However, the time_zone parameter does convert dates calculated using now and date math rounding. For example, the time_zone parameter will convert a value of now/d.
- Returns
'QueryInterface'
A new instance is created
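To illustrate only the default_operator behavior described above (real parsing is done by Lucene and handles far more than whitespace splitting):

```python
def interpret(query, default_operator="OR"):
    """Show how a bare query string is split on whitespace and joined
    with the default operator when no explicit operators are given."""
    return f" {default_operator} ".join(query.split())

print(interpret("capital of Hungary"))         # capital OR of OR Hungary
print(interpret("capital of Hungary", "AND"))  # capital AND of AND Hungary
```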
-
range
(field: str, gt: Optional[Union[str, int, float, datetime.date, datetime.datetime]] = None, gte: Optional[Union[str, int, float, datetime.date, datetime.datetime]] = None, lt: Optional[Union[str, int, float, datetime.date, datetime.datetime]] = None, lte: Optional[Union[str, int, float, datetime.date, datetime.datetime]] = None, format: Optional[str] = None, relation: str = 'INTERSECTS', time_zone: Optional[str] = None, boost: Optional[float] = None) → elastipy.query.generated_interface.QueryInterface¶ Returns documents that contain terms within a provided range.
When the <field> parameter is a date field data type, you can use date math with the gt, gte, lt and lte parameters. See date math.
- Parameters
field –
str
Field you wish to search.
gt –
Optional[Union[str, int, float, date, datetime]]
Greater than.
gte –
Optional[Union[str, int, float, date, datetime]]
Greater than or equal to.
lt –
Optional[Union[str, int, float, date, datetime]]
Less than.
lte –
Optional[Union[str, int, float, date, datetime]]
Less than or equal to.
format –
Optional[str]
Date format used to convert date values in the query. By default, Elasticsearch uses the date format provided in the <field>'s mapping. This value overrides that mapping format.
For valid syntax see mapping data format.
relation –
str
Indicates how the range query matches values for range fields. Valid values are:
INTERSECTS (Default) Matches documents with a range field value that intersects the query’s range.
CONTAINS Matches documents with a range field value that entirely contains the query’s range.
WITHIN Matches documents with a range field value entirely within the query’s range.
time_zone –
Optional[str]
Coordinated Universal Time (UTC) offset or IANA time zone used to convert date values in the query to UTC.
Valid values are ISO 8601 UTC offsets, such as +01:00 or -08:00, and IANA time zone IDs, such as America/Los_Angeles.
boost –
Optional[float]
Floating point number used to decrease or increase the relevance scores of a query. Defaults to 1.0. You can use the boost parameter to adjust relevance scores for searches containing two or more queries.
Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.
- Returns
'QueryInterface'
A new instance is created
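The document-side predicate a range query expresses, sketched for a single value (function name hypothetical):

```python
def range_matches(value, gt=None, gte=None, lt=None, lte=None):
    """Return True if `value` satisfies every bound that was given."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True

print(range_matches(15, gte=10, lt=20))  # True
print(range_matches(20, gte=10, lt=20))  # False -- lt bound is exclusive
```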
-
property
response
¶ Access to the response of the search. Raises an exception if accessed before the search has been executed.
- Returns
Response, a dict wrapper with some convenience methods
-
set_response
(response: Mapping)[source]¶ Sets the elasticsearch API response.
Use this if you need other means of passing the API response to the Search instance.
- Parameters
response – Mapping, the complete response from the /search endpoint
- Returns
self
-
size
(size)[source]¶ Replace the maximum document count.
- Parameters
size – int, the number of document hits to return
- Returns
new Search instance
-
sort
(*sort) → elastipy.search.Search[source]¶ Change the order of the returned documents. See sort search results.
The parameter can be:
"field" or "-field" to sort a field ascending or descending
{"field": "asc"} or {"field": "desc"} to sort a field ascending or descending
a list of strings or objects as above to sort by a couple of fields
None to turn off sorting
- Returns
Search
A new Search instance is created
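As a sketch of how the listed forms could map onto the Elasticsearch sort body parameter (a hypothetical helper for illustration, not elastipy's actual implementation):

```python
from typing import Mapping, Optional, Sequence, Union

SortItem = Union[str, Mapping[str, str]]

def normalize_sort(*sort: Optional[SortItem]) -> Optional[Sequence[Mapping[str, str]]]:
    # None turns sorting off entirely
    if len(sort) == 1 and sort[0] is None:
        return None
    result = []
    for item in sort:
        if isinstance(item, str):
            if item.startswith("-"):
                result.append({item[1:]: "desc"})  # "-field" sorts descending
            else:
                result.append({item: "asc"})       # "field" sorts ascending
        else:
            result.append(dict(item))              # mappings are passed through
    return result

# normalize_sort("-timestamp", "name") -> [{"timestamp": "desc"}, {"name": "asc"}]
```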
-
term
(field: str, value: Union[str, int, float, elastipy.query.generated_interface.QueryInterface.bool, datetime.datetime], boost: Optional[float] = None, case_insensitive: Optional[elastipy.query.generated_interface.QueryInterface.bool] = None) → elastipy.query.generated_interface.QueryInterface¶ Returns documents that contain an exact term in a provided field.
You can use the term query to find documents based on a precise value such as a price, a product ID, or a username.
- Parameters
field –
str
Field you wish to search.
value –
Union[str, int, float, bool, datetime]
Term you wish to find in the provided <field>. To return a document, the term must exactly match the field value, including whitespace and capitalization.
boost –
Optional[float]
Floating point number used to decrease or increase the relevance scores of a query. Defaults to 1.0. You can use the boost parameter to adjust relevance scores for searches containing two or more queries.
Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.
case_insensitive –
Optional[bool]
Allows ASCII case insensitive matching of the value with the indexed field values when set to true. Default is false which means the case sensitivity of matching depends on the underlying field’s mapping.
- Returns
'QueryInterface'
A new instance is created
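The parameters above correspond to the Elasticsearch term query request body; a sketch (the field name "user.id" and the values are illustrative assumptions):

```python
# Hypothetical term query body; field name and values are examples only.
term_query = {
    "term": {
        "user.id": {                   # field you wish to search
            "value": "kimchy",         # must match exactly, including case by default
            "boost": 0.5,              # values between 0 and 1.0 decrease the score
            "case_insensitive": True,  # enables ASCII case-insensitive matching
        }
    }
}
```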
-
terms
(field: str, value: Sequence[Union[str, int, float, elastipy.query.generated_interface.QueryInterface.bool, datetime.datetime]], boost: Optional[float] = None) → elastipy.query.generated_interface.QueryInterface¶ Returns documents that contain one or more exact terms in a provided field.
The terms query is the same as the term query, except you can search for multiple values.
- Parameters
field –
str
Field you wish to search.
value –
Sequence[Union[str, int, float, bool, datetime]]
The value of this parameter is an array of terms you wish to find in the provided field. To return a document, one or more terms must exactly match a field value, including whitespace and capitalization.
By default, Elasticsearch limits the terms query to a maximum of 65,536 terms. You can change this limit using the index.max_terms_count setting.
boost –
Optional[float]
Floating point number used to decrease or increase the relevance scores of a query. Defaults to 1.0. You can use the boost parameter to adjust relevance scores for searches containing two or more queries.
Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.
- Returns
'QueryInterface'
A new instance is created
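For comparison with the term query above, the terms query takes an array of values; a sketch with an illustrative field name and values:

```python
# Hypothetical terms query body; field name and values are examples only.
terms_query = {
    "terms": {
        "user.id": ["kimchy", "elkbee"],  # up to 65,536 terms by default (index.max_terms_count)
        "boost": 1.0,
    }
}
```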
-
search parameters¶
-
class
elastipy.generated_search_param.
SearchParameters
(search)[source]¶ Access to this class is through Search.param.
Each method returns a new Search instance.
… CODE:
s = Search()
s = s.param.explain(True).param.size(100)
-
allow_no_indices
(value: bool = True) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
bool
If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.
- Returns
Search
A new Search instance is created
-
allow_partial_search_results
(value: bool = True) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
bool
If true, returns partial results if there are request timeouts or shard failures. If false, returns an error with no partial results. Defaults to true.
To override the default for this field, set the search.default_allow_partial_results cluster setting to false.
- Returns
Search
A new Search instance is created
-
batched_reduce_size
(value: int = 512) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
int
The number of shard results that should be reduced at once on the coordinating node. This value should be used as a protection mechanism to reduce the memory overhead per search request if the potential number of shards in the request can be large. Defaults to 512.
- Returns
Search
A new Search instance is created
-
ccs_minimize_roundtrips
(value: bool = True) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
bool
If true, network round-trips between the coordinating node and the remote clusters are minimized when executing cross-cluster search (CCS) requests. See How cross-cluster search handles network delays. Defaults to true.
- Returns
Search
A new Search instance is created
-
docvalue_fields
(value: Optional[Sequence[Union[Mapping[str, str], str]]] = None) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
Optional[Sequence[Union[str, Mapping[str, str]]]]
Array of wildcard (*) patterns. The request returns doc values for field names matching these patterns in the hits.fields property of the response.
You can specify items in the array as a string or object. See Doc value fields.
Properties of docvalue_fields objects:
field
(Required, string) Wildcard pattern. The request returns doc values for field names matching this pattern.
format
(Optional, string) Format in which the doc values are returned.
For date fields, you can specify a [date format](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html). For numeric fields, you can specify a DecimalFormat pattern.
For other field data types, this parameter is not supported.
- Returns
Search
A new Search instance is created
-
expand_wildcards
(value: str = 'open') → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
str
Controls what kind of indices wildcard expressions can expand to. Multiple values are accepted when separated by a comma, as in open,hidden. Valid values are:
all
Expand to open and closed indices, including hidden indices.
open
Expand only to open indices.
closed
Expand only to closed indices.
hidden
Expansion of wildcards will include hidden indices. Must be combined with open, closed, or both.
none
Wildcard expressions are not accepted.
Defaults to open.
- Returns
Search
A new Search instance is created
-
explain
(value: bool = False) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
bool
If true, returns detailed information about score computation as part of a hit. Defaults to false.
- Returns
Search
A new Search instance is created
-
fields
(value: Optional[Sequence[Union[Mapping[str, str], str]]] = None) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
Optional[Sequence[Union[str, Mapping[str, str]]]]
Array of wildcard (*) patterns. The request returns values for field names matching these patterns in the hits.fields property of the response.
You can specify items in the array as a string or object. See Fields for more details.
Properties of fields objects:
field
(Required, string) Wildcard pattern. The request returns values for field names matching this pattern.
format
(Optional, string) Format in which the values are returned.
The date fields date and date_nanos accept a date format. Spatial fields accept either geojson for GeoJSON (the default) or wkt for Well Known Text.
For other field data types, this parameter is not supported.
- Returns
Search
A new Search instance is created
-
from_
(value: int = 0) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
int
Starting document offset. Defaults to 0.
By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.
- Returns
Search
A new Search instance is created
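A sketch of computing a from/size pair for a zero-based page number, guarding the documented 10,000-hit window (the helper name page_params is an assumption for illustration):

```python
def page_params(page: int, per_page: int = 10) -> dict:
    """Hypothetical helper: compute the 'from'/'size' body parameters
    for a zero-based page number."""
    start = page * per_page
    # Elasticsearch rejects paging beyond 10,000 hits by default;
    # deeper pages should switch to the search_after parameter.
    if start + per_page > 10_000:
        raise ValueError("use search_after for deep pagination")
    return {"from": start, "size": per_page}

# page_params(2, 25) -> {"from": 50, "size": 25}
```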
-
ignore_throttled
(value: bool = True) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
bool
If true, concrete, expanded or aliased indices will be ignored when frozen. Defaults to true.
- Returns
Search
A new Search instance is created
-
ignore_unavailable
(value: bool = False) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
bool
If true, missing or closed indices are not included in the response. Defaults to false.
- Returns
Search
A new Search instance is created
-
indices_boost
(value: Optional[Sequence[Mapping[str, float]]] = None) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
Optional[Sequence[Mapping[str, float]]]
Boosts the _score of documents from specified indices.
Properties of indices_boost objects:
<index>: <boost-value>
<index> is the name of the index or index alias. Wildcard (*) expressions are supported.
<boost-value> is the float factor by which scores are multiplied.
A boost value greater than 1.0 increases the score. A boost value between 0 and 1.0 decreases the score.
- Returns
Search
A new Search instance is created
-
max_concurrent_shard_requests
(value: int = 5) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
int
Defines the number of concurrent shard requests per node this search executes concurrently. This value should be used to limit the impact of the search on the cluster in order to limit the number of concurrent shard requests. Defaults to 5.
- Returns
Search
A new Search instance is created
-
min_score
(value: Optional[float] = None) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
Optional[float]
Minimum _score for matching documents. Documents with a lower _score are not included in the search results.
- Returns
Search
A new Search instance is created
-
pre_filter_shard_size
(value: Optional[int] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[int]
Defines a threshold that enforces a pre-filter roundtrip to prefilter search shards based on query rewriting if the number of shards the search request expands to exceeds the threshold. This filter roundtrip can limit the number of shards significantly if, for instance, a shard can not match any documents based on its rewrite method, i.e. if date filters are mandatory to match but the shard bounds and the query are disjoint. When unspecified, the pre-filter phase is executed if any of these conditions is met:
The request targets more than 128 shards.
The request targets one or more read-only index.
The primary sort of the query targets an indexed field.
- Returns
Search
A new Search instance is created
-
preference
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
Nodes and shards used for the search. By default, Elasticsearch selects from eligible nodes and shards using adaptive replica selection, accounting for allocation awareness.
Valid values:
_only_local
Run the search only on shards on the local node.
_local
If possible, run the search on shards on the local node. If not, select shards using the default method.
_only_nodes:<node-id>,<node-id>
Run the search on only the specified node IDs. If suitable shards exist on more than one selected node, use shards on those nodes using the default method. If none of the specified nodes are available, select shards from any available node using the default method.
_prefer_nodes:<node-id>,<node-id>
If possible, run the search on the specified node IDs. If not, select shards using the default method.
_shards:<shard>,<shard>
Run the search only on the specified shards. This value can be combined with other preference values, but this value must come first. For example: _shards:2,3|_local
<custom-string>
Any string that does not start with _. If the cluster state and selected shards do not change, searches using the same <custom-string> value are routed to the same shards in the same order.
- Returns
Search
A new Search instance is created
-
q
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
Query in the Lucene query string syntax.
You can use the q parameter to run a query parameter search. Query parameter searches do not support the full Elasticsearch Query DSL but are handy for testing.
Important: The q parameter overrides the query parameter in the request body. If both parameters are specified, documents matching the query request body parameter are not returned.
- Returns
Search
A new Search instance is created
-
request_cache
(value: Optional[bool] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[bool]
If true, the caching of search results is enabled for requests where size is 0. See Shard request cache settings. Defaults to index level settings.
- Returns
Search
A new Search instance is created
-
rest_total_hits_as_int
(value: bool = False) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
bool
Indicates whether hits.total should be rendered as an integer or an object in the rest search response. Defaults to false.
- Returns
Search
A new Search instance is created
-
routing
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
Target the specified primary shard.- Returns
Search
A new Search instance is created
-
scroll
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
Period to retain the search context for scrolling. Format is Time units. See Scroll search results.
By default, this value cannot exceed 1d (24 hours). You can change this limit using the search.max_keep_alive cluster-level setting.
- Returns
Search
A new Search instance is created
-
search_type
(value: str = 'query_then_fetch') → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
str
How distributed term frequencies are calculated for relevance scoring.
Valid values:
query_then_fetch
(Default) Distributed term frequencies are calculated locally for each shard running the search. We recommend this option for faster searches with potentially less accurate scoring.
dfs_query_then_fetch
Distributed term frequencies are calculated globally, using information gathered from all shards running the search. While this option increases the accuracy of scoring, it adds a round-trip to each shard, which can result in slower searches.
- Returns
Search
A new Search instance is created
-
seq_no_primary_term
(value: bool = False) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
bool
If true, returns sequence number and primary term of the last modification of each hit. See Optimistic concurrency control.
- Returns
Search
A new Search instance is created
-
size
(value: int = 10) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
int
Defines the number of hits to return. Defaults to 10.
By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.
- Returns
Search
A new Search instance is created
-
sort
(value: Optional[Union[str, Sequence[Union[Mapping[str, str], str]], Mapping[str, str]]] = None) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
Optional[Union[str, Sequence[Union[str, Mapping[str, str]]], Mapping[str, str]]]
Change the order of the returned documents. See sort search results.
The parameter can be:
"field" or "-field" to sort a field ascending or descending
{"field": "asc"} or {"field": "desc"} to sort a field ascending or descending
a list of strings or objects as above to sort by a couple of fields
None to turn off sorting
- Returns
Search
A new Search instance is created
-
source
(value: Union[bool, str, Sequence] = True) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
Union[bool, str, Sequence]
Indicates which source fields are returned for matching documents. These fields are returned in the hits._source property of the search response. Defaults to true.
Valid values:
true
(Boolean) The entire document source is returned.
false
(Boolean) The document source is not returned.
<wildcard_pattern>
(string or array of strings) Wildcard (*) pattern or array of patterns containing source fields to return.
<object>
Object containing a list of source fields to include or exclude. Properties for <object>:
excludes
(string or array of strings) Wildcard (*) pattern or array of patterns containing source fields to exclude from the response. You can also use this property to exclude fields from the subset specified in the includes property.
includes
(string or array of strings) Wildcard (*) pattern or array of patterns containing source fields to return. If this property is specified, only these source fields are returned. You can exclude fields from this subset using the excludes property.
- Returns
Search
A new Search instance is created
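The documented value forms of the _source body parameter, sketched as plain Python values (the field patterns are illustrative assumptions):

```python
# The four documented forms of the "_source" body parameter; patterns are examples.
source_values = [
    True,                              # return the entire document source (the default)
    False,                             # omit the document source
    ["obj.*", "user.name"],            # wildcard patterns selecting source fields
    {
        "includes": ["obj.*"],         # only these source fields are returned
        "excludes": ["obj.secret_*"],  # removed again from the included subset
    },
]
body = {"_source": source_values[3]}
```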
-
source_excludes
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
A comma-separated list of source fields to exclude from the response.
You can also use this parameter to exclude fields from the subset specified in the _source_includes query parameter.
If the _source parameter is false, this parameter is ignored.
- Returns
Search
A new Search instance is created
-
source_includes
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
A comma-separated list of source fields to include in the response.
If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter.
If the _source parameter is false, this parameter is ignored.
- Returns
Search
A new Search instance is created
-
stats
(value: Optional[Sequence[str]] = None) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
Optional[Sequence[str]]
Stats groups to associate with the search. Each group maintains a statistics aggregation for its associated searches. You can retrieve these stats using the indices stats API.- Returns
Search
A new Search instance is created
-
stored_fields
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
A comma-separated list of stored fields to return as part of a hit. If no fields are specified, no stored fields are included in the response.
If this field is specified, the _source parameter defaults to false. You can pass _source: true to return both source fields and stored fields in the search response.
- Returns
Search
A new Search instance is created
-
suggest_field
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
Specifies which field to use for suggestions.- Returns
Search
A new Search instance is created
-
suggest_text
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
The source text for which the suggestions should be returned.- Returns
Search
A new Search instance is created
-
terminate_after
(value: int = 0) → elastipy.search.Search[source]¶ A search body parameter.
- Parameters
value –
int
The maximum number of documents to collect for each shard, upon reaching which the query execution will terminate early.
Defaults to 0, which does not terminate query execution early.
- Returns
Search
A new Search instance is created
-
timeout
(value: Optional[str] = None) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Optional[str]
Specifies the period of time to wait for a response in time units. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.- Returns
Search
A new Search instance is created
-
to_body
() → dict¶ Convert all parameters to the representation in the search request body.
- Returns
dict
-
to_query_params
() → dict¶ Convert all parameters to the representation as search request query parameters.
- Returns
dict
-
track_scores
(value: bool = False) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
bool
If true, calculate and return document scores, even if the scores are not used for sorting. Defaults to false.
- Returns
Search
A new Search instance is created
-
track_total_hits
(value: Union[int, bool] = 10000) → elastipy.search.Search[source]¶ A search query parameter.
- Parameters
value –
Union[int, bool]
Number of hits matching the query to count accurately. Defaults to 10000.
If true, the exact number of hits is returned at the cost of some performance.
If false, the response does not include the total number of hits matching the query.
- Returns
Search
A new Search instance is created
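The three documented value forms, sketched as request bodies:

```python
# The three documented forms of the "track_total_hits" parameter.
track_total_hits_bodies = [
    {"track_total_hits": 10_000},  # default: count hits accurately up to 10,000
    {"track_total_hits": True},    # exact total, at some performance cost
    {"track_total_hits": False},   # omit the total hit count entirely
]
```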
-
printing utilities¶
-
class
elastipy.search_dump.
SearchDump
(search: elastipy.search.Search)[source]¶ -
body
(indent: Optional[Union[int, str]] = 2, file: Optional[TextIO] = None)[source]¶ Print the complete request body.
- Parameters
indent – The json indentation, defaults to 2.
file – Optional output stream.
-
query
(indent: Optional[Union[int, str]] = 2, file: Optional[TextIO] = None)[source]¶ Print the query json.
- Parameters
indent – The json indentation, defaults to 2.
file – Optional output stream.
-
-
class
elastipy.response_dump.
ResponseDump
(response: elastipy.search.Response)[source]¶ -
aggregations
(indent: Optional[Union[int, str]] = 2, file: Optional[TextIO] = None)[source]¶ Print the aggregations part of the response.
- Parameters
indent – The json indentation, defaults to 2.
file – Optional output stream.
-
documents
(indent: Optional[Union[int, str]] = 2, file: Optional[TextIO] = None)[source]¶ Print the list of documents inside the hits.
- Parameters
indent – The json indentation, defaults to 2.
file – Optional output stream.
-
table
(score: bool = True, sort: Optional[str] = None, digits: Optional[int] = None, header: bool = True, bars: bool = True, zero: Union[bool, float] = True, colors: bool = True, ascii: bool = False, max_width: Optional[int] = None, max_bar_width: int = 40, file=None)[source]¶ Print the hit documents as a table.
- Parameters
score –
bool
Include the score for each hit.
sort –
str
Optional sort column name which must match a ‘header’ key. Can be prefixed with - (minus) to reverse order.
digits –
int
Optional number of digits for rounding.
header –
bool
If True, include the names in the first row.
bars –
bool
Enable display of horizontal bars in each number column. The table width will stretch out in size while limited to ‘max_width’ and ‘max_bar_width’.
zero –
If True: the bar axis starts at zero (or at a negative value if appropriate).
If False: the bar starts at the minimum of all values in the column.
If a number is provided, the bar starts there, regardless of the minimum of all values.
colors –
bool
Enable console colors.
ascii –
bool
If True, fall back to ascii characters.
max_width –
int
Will limit the expansion of the table when bars are enabled. If left None, the terminal width is used.
max_bar_width –
int
The maximum size a bar should have.
file – Optional text stream to print to.
-