fix: field metaqueries take fast path if predicate is only on `_measurement` #21962

williamhbaker · 2021-07-28T19:27:07Z

Closes #21961

This PR updates the logic that detects whether a query for `_field` values contains a predicate on anything other than `_measurement`.

Since the influxql expression will have had references to _measurement or the bytes equivalent \x00 replaced with _name by reads.NodeToExpr:

influxdb/storage/reads/influxql_predicate.go

Lines 137 to 143 in 5d84c60

    
    case datatypes.NodeTypeTagRef:
    	ref := n.GetTagRefValue()
    	if v.remap != nil {
    		if nk, ok := v.remap[ref]; ok {
    			ref = nk
    		}
    	}

...we need to check for the remapped value (`_name`) when determining whether an expression node references anything other than the measurement.
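To illustrate the idea, here is a minimal sketch of such a check. It is not the code from this PR: the `Expr`, `VarRef`, `StringLit`, and `BinaryExpr` types are hypothetical stand-ins for the influxql AST, and `refsOnlyMeasurement` is an invented name. The only detail taken from the description above is that references to `_measurement` have already been rewritten to `_name` by the time the check runs.

```go
package main

// Expr is a hypothetical stand-in for an influxql expression node.
// The real code walks influxql expressions; these types exist only to
// keep the sketch self-contained.
type Expr interface{}

// VarRef is a reference to a key, e.g. "_name" or "host".
type VarRef struct{ Val string }

// StringLit is a literal string value, e.g. "cpu".
type StringLit struct{ Val string }

// BinaryExpr joins two sub-expressions with an operator such as "=" or "AND".
type BinaryExpr struct {
	Op       string
	LHS, RHS Expr
}

// measurementRemapped is the key that `_measurement` (or \x00) is assumed
// to have been rewritten to before this check is applied.
const measurementRemapped = "_name"

// refsOnlyMeasurement reports whether every key reference in expr is the
// remapped measurement key. Literals and empty predicates reference nothing,
// so they do not disqualify the expression.
func refsOnlyMeasurement(expr Expr) bool {
	switch e := expr.(type) {
	case nil:
		return true
	case *BinaryExpr:
		return refsOnlyMeasurement(e.LHS) && refsOnlyMeasurement(e.RHS)
	case *VarRef:
		return e.Val == measurementRemapped
	default:
		return true
	}
}
```

With these types, a predicate equivalent to `_name = 'cpu'` passes the check and could take the fast path, while `_name = 'cpu' AND host = 'a'` does not and would fall back to the slower scan.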

This will allow a query like the one in #21961 to avoid a performance-intensive block scan. The performance increase can be dramatic when querying a large number of series, and this kind of query is very common when exploring data via the UI.

A relevant existing test is https://github.com/influxdata/flux/blob/master/stdlib/influxdata/influxdb/schema/show_fields_with_pred_test.flux, which ensures that the correct result is produced when a query for fields does include a non-measurement predicate. I've also verified locally that a query for fields with a predicate of only _measurement produces the correct result, and adding that test case would probably be a good idea as well.

I loaded a database with ~1 million series and ran the query listed in #21961 locally using query_benchmarker_influxdb. With this change, the average query time is ~10ms over 100 queries:

{
   "InfluxDB (Flux) field keys":{
      "count":100,
      "max":12.232133,
      "maxRate":81.75189069641411,
      "mean":9.265909370000003,
      "meanRate":107.92248877780678,
      "min":8.842354,
      "minRate":113.09205670797617,
      "sum":0.9265909370000003
   }
}

Running the same benchmark against the build from the commit just prior to this one results in a query time of ~4 seconds (note: the benchmark time was limited to 30 seconds, so only 9 queries ran):

{
   "InfluxDB (Flux) field keys":{
      "count":9,
      "max":4275.679762,
      "maxRate":0.23388093956134784,
      "mean":4110.049170444445,
      "meanRate":0.24330609161346453,
      "min":3619.381888,
      "minRate":0.27629027025732855,
      "sum":36.990442534
   }
}
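For a rough sense of scale, the two `mean` values above can be compared directly. This is only a back-of-the-envelope sketch: `speedup` is a hypothetical helper, and it assumes both means are reported in the same unit (milliseconds, which matches the ~10ms and ~4 second figures quoted above).

```go
package main

// speedup returns how many times faster the "after" run is than the
// "before" run, given mean query times in the same unit.
func speedup(beforeMean, afterMean float64) float64 {
	return beforeMean / afterMean
}
```

Calling `speedup(4110.049170444445, 9.265909370000003)` with the means from the two runs works out to a little over 440x on this data set.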


wolffcm · 2021-07-28T20:23:37Z

This seems right to me. Neither measurement nor field (or whatever synonyms are actually used in any given layer) is a "tag key" in the proper sense in TSM's data model. Wish that I had thought of this before!

danxmoran · 2021-07-28T22:18:13Z

@wbaker85 do you still plan on adding a test case for a query with a _measurement-only predicate?

williamhbaker · 2021-07-28T22:51:43Z

> @wbaker85 do you still plan on adding a test case for a query with a _measurement-only predicate?

I've opened a PR in the flux repo with the additional test: influxdata/flux#3900

williamhbaker · 2021-07-29T18:06:53Z

I duplicated the test that was merged to flux with this PR so that it will run in our CI and via make test-flux until we upgrade to the latest flux. Also created an issue to remove the duplicated test when we do upgrade: #21970

This will allow us to get this performance improvement merged to master and backported to 2.0 with test coverage even if we don't upgrade to the latest flux on 2.0.

fix: field metaqueries take fast path if predicate is only on `_measurement`

2a953f5

chore: update CHANGELOG

5978cab

williamhbaker force-pushed the wb-faster-field-metaquery-21961 branch from b420275 to 5978cab Compare July 28, 2021 20:53

williamhbaker marked this pull request as ready for review July 28, 2021 20:54

williamhbaker mentioned this pull request Jul 28, 2021

test: test query fields with only a measurement predicate influxdata/flux#3900

Merged

test: add test for fields with measurement predicate

9427ba0

williamhbaker force-pushed the wb-faster-field-metaquery-21961 branch from 610eb28 to 9427ba0 Compare July 29, 2021 17:52

williamhbaker mentioned this pull request Jul 29, 2021

Remove duplicate test for field metaqueries when we update flux #21970

Closed

williamhbaker requested a review from danxmoran July 29, 2021 18:07

danxmoran approved these changes Jul 29, 2021


williamhbaker merged commit 8e80798 into master Jul 29, 2021

williamhbaker deleted the wb-faster-field-metaquery-21961 branch July 29, 2021 19:57

williamhbaker mentioned this pull request Aug 3, 2021

chore: remove duplicate flux test #22035

Merged

williamhbaker mentioned this pull request Aug 10, 2021

Benchmark a "fast" query to compare flux vs influxql #22147

Closed
