指标仪表盘(metrics_dashboard)

Milvus指标仪表板

Milvus在运行时输出详细的时间序列指标列表。您可以使用Prometheus (opens in a new tab)Grafana (opens in a new tab)来可视化指标。本主题介绍了Grafana Milvus仪表板中显示的监控指标。

该主题中的时间单位为毫秒。而在该主题中,“99th percentile”意味着统计数据的99%都在某个值内。

我们建议先阅读Milvus监控框架概述,以了解Prometheus指标。

Proxy

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Search Vector Count Rate过去两分钟内每个代理每秒查询的向量平均数。sum(increase(milvus_proxy_search_vectors_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)milvus_proxy_search_vectors_count查询向量累计查询数。
Insert Vector Count Rate过去两分钟内每个代理每秒插入的向量平均数。sum(increase(milvus_proxy_insert_vectors_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)milvus_proxy_insert_vectors_count插入向量累计插入数。
Search Latency过去两分钟内每个代理接收搜索和查询请求的平均延迟和第 99 百分位延迟。p99: histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg:sum(increase(milvus_proxy_sq_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)milvus_proxy_sq_latency搜索和查询请求的延迟。
Collection Search Latency过去两分钟内每个代理接收特定集合的搜索和查询请求的平均延迟和第 99 百分位延迟。p99:histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_collection_sq_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m]))) avg:sum(increase(milvus_proxy_collection_sq_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_collection_sq_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, query_type)milvus_proxy_collection_sq_latency_sum特定集合的搜索和查询请求的延迟。
Mutation Latency在过去两分钟内,每个代理接收对特定集合的变异请求的平均延迟和延迟的第99个百分位数。p99: histogram_quantile(0.99, sum by (le, msg_type, pod, node_id) (rate(milvus_proxy_mutation_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_mutation_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type) / sum(increase(milvus_proxy_mutation_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type)milvus_proxy_mutation_latency_sumThe latency of mutation requests.
Collection Mutation Latency过去两分钟内发送搜索和查询请求与通过代理接收结果之间的平均延迟和延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_collection_sq_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m]))) avg: sum(increase(milvus_proxy_collection_sq_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_collection_sq_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", collection_name=~"$collection"}[2m])) by (pod, node_id, query_type)milvus_proxy_collection_sq_latency_sum过去两分钟内按代理聚合搜索和查询结果的平均延迟和延迟的第99百分位数。
Reduce Search Result Latency过去两分钟内按代理聚合搜索和查询结果的平均延迟和延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_reduce_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_sq_reduce_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_reduce_result_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)milvus_proxy_sq_reduce_result_latencyThe latency of aggregating search and query results returned by each query node.
Decode Search Result Latency在过去两分钟内通过代理解码搜索和查询结果的平均延迟和延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_decode_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_sq_decode_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_decode_resultlatency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)milvus_proxy_sq_decode_result_latencyThe latency of decoding each search and query result.
Msg Stream Object Num过去两分钟内每个代理在其相应物理主题上创建的msgstream对象的平均数、最大数和最小数。avg(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_proxy_msgstream_obj_numThe number of msgstream objects created on each physical topic.
Mutation Send Latency每个代理在过去两分钟内发送插入或删除请求的平均延迟和延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le, msg_type, pod, node_id) (rate(milvus_proxy_mutation_send_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_mutation_send_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type) / sum(increase(milvus_proxy_mutation_send_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type)milvus_proxy_mutation_send_latencyThe latency of sending insertion or deletion requests.
Cache Hit Rate在过去两分钟内每秒包括 GeCollectionID, GetCollectionInfo , and GetCollectionSchema 的操作的平均缓存命中率。sum(increase(milvus_proxy_cache_hit_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", cache_state="hit"}[2m])/120) by(cache_name, pod, node_id) / sum(increase(milvus_proxy_cache_hit_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(cache_name, pod, node_id)milvus_proxy_cache_hit_countThe statistics of hit and failure rate of each cache reading operation.
Cache Update Latency代理在过去两分钟内的平均延迟和缓存更新延迟的第99百分位数p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_cache_update_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_cache_update_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_cache_update_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)milvus_proxy_cache_update_latencyThe latency of updating cache each time.
Sync Time每个代理在其对应的物理信道中同步的历元时间的平均、最大和最小数量。avg(milvus_proxy_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_proxy_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_proxy_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_proxy_sync_epoch_timeEach physical channel's epoch time (Unix time, the milliseconds passed ever since January 1, 1970). There is a default ChannelName apart from the physical channels.
Apply PK Latency每个代理在过去两分钟内的平均延迟和主键应用程序延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_apply_pk_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_apply_pk_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_apply_pk_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)milvus_proxy_apply_pk_latencyThe latency of applying primary key.
Apply Timestamp Latency每个代理在过去两分钟内的平均延迟和主键应用程序延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_apply_timestamp_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_apply_timestamp_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_apply_timestamp_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)milvus_proxy_apply_timestamp_latencyThe latency of applying timestamp.
Request Success Rate每个代理接收的所有类型的成功请求数,并分别显示每种类型的统计信息。可能的请求类型有DescribeCollection、DescribeIndex、GetCollectionStatistics、HasCollection、Search、Query、ShowPartitions、Insert等。sum(increase(milvus_proxy_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", status="success"}[2m])/120) by(function_name, pod, node_id)milvus_proxy_req_countThe number of all types of receiving requests
Request Failed Rate每个代理接收的所有类型的失败请求数,并分别显示每种类型的统计信息。可能的请求类型有DescribeCollection、DescribeIndex、GetCollectionStatistics、HasCollection、Search、Query、ShowPartitions、Insert等。sum(increase(milvus_proxy_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", status="fail"}[2m])/120) by(function_name, pod, node_id)milvus_proxy_req_countThe number of all types of receiving requests
Request Latency每个代理接收所有类型请求的平均延迟和延迟的第99个百分位数p99: histogram_quantile(0.99, sum by (le, pod, node_id, function_name) (rate(milvus_proxy_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_proxy_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, function_name) / sum(increase(milvus_proxy_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, function_name)milvus_proxy_req_latencyThe latency of all types of receiving requests
Insert/Delete Request Byte Rate代理在过去两分钟内每秒接收的插入和删除请求的字节数。sum(increase(milvus_proxy_receive_bytes_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(pod, node_id)milvus_proxy_receive_bytes_countThe count of insert and delete requests.
Send Byte Rate在过去两分钟内,每个代理响应搜索和查询请求时,每秒发送回客户端的字节数。sum(increase(milvus_proxy_send_bytes_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(pod, node_id)milvus_proxy_send_bytes_countThe number of bytes sent back to the client while each proxy is responding to search and query requests.

Root coordinator

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Proxy Node Num创建的代理数。sum(milvus_rootcoord_proxy_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_proxy_numThe number of proxies.
Sync Time由每个物理信道(PChannel)中的每个根坐标同步的历元时间的平均值、最大值和最小值。avg(milvus_rootcoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) max(milvus_rootcoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) min(milvus_rootcoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_sync_epoch_timeEach physical channel's epoch time (Unix time, the milliseconds passed ever since January 1, 1970).
DDL Request Rate过去两分钟内每秒DDL请求的状态和数量。sum(increase(milvus_rootcoord_ddl_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, function_name)milvus_rootcoord_ddl_req_countThe total number of DDL requests including CreateCollection, DescribeCollection, DescribeSegments, HasCollection, ShowCollections, ShowPartitions, and ShowSegments.
DDL Request Latency过去两分钟内的平均延迟和DDL请求延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le, function_name) (rate(milvus_rootcoord_ddl_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_rootcoord_ddl_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name) / sum(increase(milvus_rootcoord_ddl_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name)milvus_rootcoord_ddl_req_latencyThe latency of all types of DDL requests.
Sync Timetick Latency过去两分钟内根坐标将所有时间戳同步到PChannel所用的平均延迟和时间的第99百分位数。p99: histogram_quantile(0.99, sum by (le) (rate(milvus_rootcoord_sync_timetick_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_rootcoord_sync_timetick_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_rootcoord_sync_timetick_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))milvus_rootcoord_sync_timetick_latencythe time used by root coord to sync all timestamp to pchannel.
ID Alloc Rate在过去两分钟内,根坐标每秒分配的ID数。sum(increase(milvus_rootcoord_id_alloc_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120)milvus_rootcoord_id_alloc_countThe accumulated number of IDs assigned by root coord.
Timestamp根坐标的最新时间戳。milvus_rootcoord_timestamp{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}milvus_rootcoord_timestampThe latest timestamp of root coord.
Timestamp Savedroot coord保存在meta存储中的预先分配的时间戳。milvus_rootcoord_timestamp_saved{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}milvus_rootcoord_timestamp_savedThe pre-assigned timestamps that root coord saves in meta storage. The timestamps are assigned 3 seconds earlier. And the timestamp is updated and saved in meta storage every 50 millisecond.
Collection Num集合的总数。sum(milvus_rootcoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_collection_numThe total number of collections existing in Milvus currently.
Partition Num分区的总数sum(milvus_rootcoord_partition_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_partition_numThe total number of partitions existing in Milvus currently.
DML Channel NumDML通道的总数。sum(milvus_rootcoord_dml_channel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_dml_channel_numThe total number of DML channels existing in Milvus currently.
Msgstream Nummsgstream的总数。sum(milvus_rootcoord_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_msgstream_obj_numThe total number of msgstreams in Milvus currently.
Credential Num凭据总数。sum(milvus_rootcoord_credential_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_credential_numThe total number of credentials in Milvus currently.
Time Tick Delay所有DataNode和QueryNode上的流图的最大计时延迟之和。sum(milvus_rootcoord_time_tick_delay{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_rootcoord_time_tick_delayThe maximum time tick delay of the flow graphs on each DataNode and QueryNode.

Query coordinator

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Collection Loaded Num当前加载到内存中的集合数。sum(milvus_querycoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_querycoord_collection_numThe number of collections that are currently loaded by Milvus.
Entity Loaded Num当前加载到内存中的实体数。sum(milvus_querycoord_entity_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_querycoord_entitiy_numThe number of entities that are currently loaded by Milvus.
Load Request Rate过去两分钟内每秒的加载请求数sum(increase(milvus_querycoord_load_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])120) by (status)milvus_querycoord_load_req_countThe accumulated number of load requests.
Release Request RateThe number of release requests per second within the past two minutes.sum(increase(milvus_querycoord_release_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status)milvus_querycoord_release_req_countThe accumulated number of release requests.
Load Request Latency过去两分钟内每秒的释放请求数p99: histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_load_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querycoord_load_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_load_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))milvus_querycoord_load_latencyThe time used to complete a load request.
Release Request Latency过去两分钟内的平均延迟和加载请求延迟的第99百分位数。p99: histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_release_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querycoord_release_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_release_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))milvus_querycoord_release_latencyThe time used to complete a release request.
Sub-Load Task子加载任务的数量。sum(milvus_querycoord_child_task_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_querycoord_child_task_numThe number of sub load tasks. A query coord splits a load request into multiple sub load tasks.
Parent Load Task父加载任务数。sum(milvus_querycoord_parent_task_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_querycoord_parent_task_numThe number of sub load tasks. Each load request corresponds to a parent task in the task queue.
Sub-Load Task Latency子负载任务在过去两分钟内的平均反应时间和反应时间的第99百分位数。p99: histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_child_task_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querycoord_child_task_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_child_task_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) namespace"}[2m])))milvus_querycoord_child_task_latencyThe latency to complete a sub load task.
Query Node Num查询坐标管理的查询节点数。sum(milvus_querycoord_querynode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_querycoord_querynode_numThe number of query nodes managed by query coord.

Query node

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Collection Loaded NumThe number of collections loaded into memory by each query node.sum(milvus_querynode_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_collection_numThe number of collection loaded by each query node.
Partition Loaded NumThe number of partitions loaded into memory by each query node.sum(milvus_querynode_partition_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_partition_numThe number of partitions loaded by each query node.
Segment Loaded NumThe number of segments loaded into memory by each query node.sum(milvus_querynode_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_segment_numThe number of segments loaded by each query node.
Queryable Entity NumThe number of queryable and searchable entities on each query node.sum(milvus_querynode_entity_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_entity_numThe number of queryable and searchable entities on each query node.
DML Virtual ChannelThe number of DML virtual channels watched by each query node.sum(milvus_querynode_dml_vchannel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_dml_vchannel_numThe number of DML virtual channels watched by each query node.
Delta Virtual ChannelThe number of delta channels watched by each query node.sum(milvus_querynode_delta_vchannel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_delta_vchannel_numThe number of delta channels watched by each query node.
Consumer NumThe number of consumers in each query node.sum(milvus_querynode_consumer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_consumer_numThe number of consumers in each query node.
Search Request RateThe total number of search and query requests received per second by each query node and the number of successful search and query requests within the past two minutes.sum(increase(milvus_querynode_sq_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (query_type, status, pod, node_id)milvus_querynode_sq_req_countThe accumulated number of search and query requests.
Search Request LatencyThe average latency and the 99th percentile of the time used in search and query requests by each query node within the past two minutes. This panel displays the latency of search and query requests whose status are "success" or "total".p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_sq_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_sq_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)milvus_querynode_sq_req_latencyThe search request latency of query node.
Search in Queue LatencyThe average latency and the 99th percentile of the latency of search and query requests in queue within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id, query_type) (rate(milvus_querynode_sq_queue_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_sq_queue_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_queue_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)milvus_querynode_sq_queue_latencyThe latency of the search and query requests received by query node.
Search Segment LatencyThe average latency and the 99th percentile of the time each query node takes to search and query a segment within the past two minutes. The status of a segment can be sealed or growing.p99: histogram_quantile(0.99, sum by (le, query_type, segment_state, pod, node_id) (rate(milvus_querynode_sq_segment_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_sq_segment_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type, segment_state) / sum(increase(milvus_querynode_sq_segment_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type, segment_state)milvus_querynode_sq_segment_latencyThe time each query node takes to search and query each segment.
Segcore Request LatencyThe average latency and the 99th percentile of the time each query node takes to search and query in segcore within the past two minutes.p99: histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_querynode_sq_core_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_sq_core_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_core_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)milvus_querynode_sq_core_latencyThe time each query node takes to search and query in segcore.
Search Reduce LatencyThe average latency and the 99th percentile of the time used by each query node during the reduce stage of a search or query within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id, query_type) (rate(milvus_querynode_sq_reduce_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_sq_reduce_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_reduce_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)milvus_querynode_sq_reduce_latencyThe time each query spends during the stage of reduce.
Load Segment LatencyThe average latency and the 99th percentile of the time each query node takes to load a segment in the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_load_segment_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_load_segment_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_load_segment_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_querynode_load_segment_latency_bucketThe time each query node takes to load a segment.
Flowgraph NumThe number of flowgraphs in each query node.sum(milvus_querynode_flowgraph_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_flowgraph_numThe number of flowgraphs in each query node.
Unsolved Read Task LengthThe length of the queue of unsolved read requests in each query node.sum(milvus_querynode_read_task_unsolved_len{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_read_task_unsolved_lenThe length of the queue of unsolved read requests.
Ready Read Task LengthThe length of the queue of read requests to be executed in each query node.sum(milvus_querynode_read_task_ready_len{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_read_task_ready_lenThe length of the queue of read requests to be executed.
Parallel Read Task NumThe number of concurrent read requests currently executed in each query node.sum(milvus_querynode_read_task_concurrency{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_read_task_concurrencyThe number of concurrent read requests currently executed.
Estimate CPU UsageThe CPU usage by each query node estimated by the scheduler.sum(milvus_querynode_estimate_cpu_usage{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_querynode_estimate_cpu_usageThe CPU usage by each query node estimated by the scheduler. When the value is 100, this means a whole virtual CPU (vCPU) is used.
Search Group SizeThe average number and the 99th percentile of the search group size (i.e. The total number of original search requests in the combined search requests executed by each query node) within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_size_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_search_group_size_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_size_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_querynode_load_segment_latency_bucketThe number of original search tasks among the combined search tasks from different buckets (i.e. The search group size).
Search NQThe average number and the 99th percentile of the number of queries (NQ) done while each query node executes search requests within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_size_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_search_group_size_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_size_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_querynode_load_segment_latency_bucketThe number of queries (NQ) of search requests.
Search Group NQThe average number and the 99th percentile of NQ of search requests combined and executed by each query node within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_nq_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_search_group_nq_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_nq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_querynode_load_segment_latency_bucketThe NQ of search requests combined from different buckets.
Search Top_KThe average number and the 99th percentile of the Top_K of search requests executed by each query node within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_topk_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_search_topk_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_topk_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_querynode_load_segment_latency_bucketThe Top_K of search requests.
Search Group Top_KThe average number and the 99th percentile of the Top_K of search requests combined and executed by each query node within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_topk_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_querynode_search_group_topk_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_topk_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_querynode_load_segment_latency_bucketThe Top_K of search requests combined from different buckets .
Evicted Read Requests RateThe number of read requests evicted per second by each query node within the past two minutes.sum(increase(milvus_querynode_read_evicted_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)milvus_querynode_sq_req_countThe accumulated number of read requests evicted by query node due to traffic restriction.

Data coordinator

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Data Node NumThe number of data nodes managed by data coord.sum(milvus_datacoord_datanode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_datacoord_datanode_numThe number of data nodes managed by data coord.
Segment NumThe number of all types of segments recorded in metadata by data coord.sum(milvus_datacoord_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (segment_state)milvus_datacoord_segment_numThe number of all types of segments recorded in metadata by data coord. Types of segment include: dropped, flushed, flushing, growing, and sealed.
Collection NumThe number of collections recorded in metadata by data coord.sum(milvus_datacoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_datacoord_collection_numThe number of collections recorded in metadata by data coord.
Stored RowsThe accumulated number of rows of valid and flushed data in data coord.sum(milvus_datacoord_stored_rows_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_datacoord_stored_rows_numThe accumulated number of rows of valid and flushed data in data coord.
Stored Rows RateThe average number of rows flushed per second within the past two minutes.sum(increase(milvus_datacoord_stored_rows_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id)milvus_datacoord_stored_rows_countThe accumulated number of rows flushed by data coord.
Sync TimeThe average, maximum, and minimum number of epoch time synced by data coord in each physical channel.avg(milvus_datacoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) max(milvus_datacoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) min(milvus_datacoord_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_datacoord_sync_epoch_timeEach physical channel's epoch time (Unix time, the milliseconds passed ever since January 1, 1970).
Stored Binlog SizeThe total size of stored binlog.sum(milvus_datacoord_stored_binlog_size{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_datacoord_stored_binlog_sizeThe total size of binlog stored in Milvus.

Data node

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Flowgraph NumThe number of flowgraph objects that correspond to each data node.sum(milvus_datanode_flowgraph_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_datanode_flowgraph_numThe number of flowgraph objects. Each shard in a collection corresponds to a flowgraph object.
Msg Rows Consume RateThe number of rows of streaming messages consumed per second by each data node within the past two minutes.sum(increase(milvus_datanode_msg_rows_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (msg_type, pod, node_id)milvus_datanode_msg_rows_countThe number of rows of streaming messages consumed. Currently, streaming messages counted by data node only include insertion and deletion messages.
Flush Data Size RateThe size of each flushed message recorded per second by each data node within the past two minutes.sum(increase(milvus_datanode_flushed_data_size{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (msg_type, pod, node_id)milvus_datanode_flushed_data_sizeThe size of each flushed message. Currently, streaming messages counted by data node only include insertion and deletion messages.
Consumer NumThe number of consumers created on each data node.sum(milvus_datanode_consumer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_datanode_consumer_numThe number of consumers created on each data node. Each flowgraph corresponds to a consumer.
Producer NumThe number of producers created on each data node.sum(milvus_datanode_producer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_datanode_producer_numThe number of consumers created on each data node. Each shard in a collection corresponds to a delta channel producer and a timetick channel producer.
Sync TimeThe average, maximum, and minimum number of epoch time synced by each data node in all physical topics.avg(milvus_datanode_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_datanode_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_datanode_sync_epoch_time{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_datanode_sync_epoch_timeThe epoch time (Unix time, the milliseconds passed ever since January 1, 1970.) of each physical topic on a data node.
Unflushed Segment NumThe number of unflushed segments created on each data node.sum(milvus_datanode_unflushed_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)milvus_datanode_unflushed_segment_numThe number of unflushed segments created on each data node.
Encode Buffer LatencyThe average latency and the 99th percentile of the time used to encode a buffer by each data node within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_encode_buffer_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_datanode_encode_buffer_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_encode_buffer_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_datanode_encode_buffer_latencyThe time each data node takes to encode a buffer.
Save Data LatencyThe average latency and the 99th percentile of the time used to write a buffer into the storage layer by each data node within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_save_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_datanode_save_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_save_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_datanode_save_latencyThe time each data node takes to write a buffer into the storage layer.
Flush Operate RateThe number of times each data node flushes a buffer per second within the past two minutes.sum(increase(milvus_datanode_flush_buffer_op_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)milvus_datanode_flush_buffer_op_countThe accumulated number of times a data node flushes a buffer.
Autoflush Operate RateThe number of times each data node auto-flushes a buffer per second within the past two minutes.sum(increase(milvus_datanode_autoflush_buffer_op_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)milvus_datanode_autoflush_buffer_op_countThe accumulated number of times a data node auto-flushes a buffer.
Flush Request RateThe number of times each data node receives a buffer flush request per second within the past two minute.sum(increase(milvus_datanode_flush_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)milvus_datanode_flush_req_countThe accumulated number of times a data node receives a flush request from a data coord.
Compaction LatencyThe average latency and the 99 the percentile of the time each data node takes to execute a compaction task within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_compaction_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_datanode_compaction_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_compaction_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_datanode_compaction_latencyThe time each data node takes to execute a compaction task.

Index coordinator

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Index Request RateThe average number of index building requests received per second by index coord within the past two minutes.sum(increase(milvus_indexcoord_indexreq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status)milvus_indexcoord_indexreq_countThe number of index building requests received by index coord.
Index Task CountThe count of all indexing tasks recorded by index coord in index metadata.sum(milvus_indexcoord_indextask_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (index_task_status)milvus_indexcoord_indextask_countThe count of all indexing tasks recorded by index coord in index metadata.
Index Node NumThe number of index nodes managed by index coord.sum(milvus_indexcoord_indexnode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance)milvus_indexcoord_indexnode_numThe number of index nodes managed by index coord.

Index node

PanelPanel descriptionPromQL (Prometheus query language)The Milvus metrics usedMilvus metrics description
Index Task RateThe average number of index building tasks received by each index node per second within the past two minutes.sum(increase(milvus_indexnode_index_task_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id)milvus_indexnode_index_task_countThe number of index building tasks received.
Load Field LatencyThe average latency and the 99th percentile of the time used by each index node to load segment field data each time within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_load_field_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_indexnode_load_field_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_load_field_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_indexnode_load_field_latencyThe time used by index node to load segment field data.
Decode Field LatencyThe average latency and the 99th percentile of the time used by each index node to encode field data each time within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_decode_field_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_indexnode_decode_field_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_decode_field_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_indexnode_decode_field_latencyThe time used to decode field data.
Build Index LatencyThe average latency and the 99th percentile of the time used by each index node to build indexes within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_build_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_indexnode_build_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_build_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_indexnode_build_index_latencyThe time used to build indexes.
Encode Index LatencyThe average latency and the 99th percentile of the time used by each index node to encode index files within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_encode_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_indexnode_encode_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_encode_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_indexnode_encode_index_latencyThe time used to encode index files.
Save Index LatencyThe average latency and the 99th percentile of the time used by each index node to save index files within the past two minutes.p99: histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_save_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))) avg: sum(increase(milvus_indexnode_save_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_save_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)milvus_indexnode_save_index_latencyThe time used to save index files.