Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add aggregate filter support for groups only #426

Merged
merged 28 commits into from
May 16, 2022

Conversation

AndrewSisley
Copy link
Contributor

@AndrewSisley AndrewSisley commented May 9, 2022

Closes #99

Add aggregate filter support for groups only. Types/intelisense will be visible/available in the clients on child relationships but using them will do nothing. I have put this PR up for review and would like to merge before sorting out child relation (and inline array) filters, as this is already a large and fiddly code change, and the child relations aggregate filter is blocked by #390 and inline array aggregate filters are blocked by #392 . I will likely do #390 and then the child relations aggregate filter stuff next, so this is unlikely to impact users (or you guys).

Also fixes a bug (no ticket) similar to #390 but for _group, where if different filters were added for _group fields with different aliases they would previously both use the last defined filter. This is why _group now uses the virtual/internal field mechanic that aggregates rely on (and child relations + inline arrays will likely also need to use before migrating our document structure from map[string]interface{} to []interface{})

@AndrewSisley AndrewSisley added feature New feature or request area/query Related to the query component labels May 9, 2022
@AndrewSisley AndrewSisley added this to the DefraDB v0.3 milestone May 9, 2022
@AndrewSisley AndrewSisley self-assigned this May 9, 2022
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I99-aggregate-filter branch from c9bb032 to 6c3edc5 Compare May 9, 2022 22:07
@codecov
Copy link

codecov bot commented May 9, 2022

Codecov Report

Merging #426 (e1667ae) into develop (dbbe282) will increase coverage by 0.33%.
The diff coverage is 84.73%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #426      +/-   ##
===========================================
+ Coverage    65.66%   66.00%   +0.33%     
===========================================
  Files           80       80              
  Lines         9481     9593     +112     
===========================================
+ Hits          6226     6332     +106     
  Misses        2624     2624              
- Partials       631      637       +6     
Impacted Files Coverage Δ
query/graphql/planner/average.go 78.26% <ø> (ø)
query/graphql/planner/planner.go 71.02% <56.00%> (-1.57%) ⬇️
query/graphql/planner/group.go 76.81% <69.76%> (-13.19%) ⬇️
query/graphql/planner/sum.go 75.73% <76.92%> (-2.66%) ⬇️
query/graphql/planner/render.go 86.95% <80.00%> (-1.94%) ⬇️
query/graphql/planner/select.go 75.41% <86.79%> (+4.30%) ⬆️
query/graphql/schema/generate.go 82.13% <88.00%> (+0.48%) ⬆️
query/graphql/planner/arbitrary_join.go 77.58% <89.47%> (+1.22%) ⬆️
query/graphql/parser/query.go 80.00% <93.69%> (+4.79%) ⬆️
query/graphql/planner/count.go 83.67% <100.00%> (-5.91%) ⬇️
... and 8 more

@source-devs
Copy link

Benchmark Results

Summary

  • 113 Benchmarks successfully compared.
  • 103 Benchmarks were ✅ Better.
  • 10 Benchmarks were ❌ Worse .
  • 0 Benchmarks were ✨ Unchanged.
✅ See Better Results...
time/opdelta
_Collection_UserSimple_CreateMany_Sync_0_100-4303ms ± 0%221ms ± 0%−27.24%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Sync_0_10-414.2ms ± 0%10.2ms ± 0%−28.04%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Sync_0_100-4108ms ± 0%101ms ± 0%−6.75%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Sync_0_1000-41.19s ± 0%1.03s ± 0%−13.78%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Async_0_100-449.9ms ± 0%46.7ms ± 0%−6.46%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Async_0_1000-4470ms ± 0%469ms ± 0%−0.24%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Async_0_10000-44.70s ± 0%4.65s ± 0%−1.04%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_10_10-4420µs ± 0%351µs ± 0%−16.51%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_100_100-44.55ms ± 0%3.58ms ± 0%−21.20%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_1000_1000-444.8ms ± 0%38.7ms ± 0%−13.47%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_1000_10-4407µs ± 0%360µs ± 0%−11.69%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_1000_100-44.40ms ± 0%3.78ms ± 0%−14.13%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_10_10-4258µs ± 0%255µs ± 0%−1.29%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_100_100-41.87ms ± 0%1.81ms ± 0%−3.16%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_1000_1000-424.7ms ± 0%24.5ms ± 0%−1.04%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_1000_10-4281µs ± 0%261µs ± 0%−7.22%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_1000_100-41.95ms ± 0%1.85ms ± 0%−5.39%(p=1.000 n=1+1)
_Query_UserSimple_Query_Sync_100-41.22ms ± 0%1.16ms ± 0%−4.62%(p=1.000 n=1+1)
_Query_UserSimple_Query_Sync_1000-410.3ms ± 0%9.4ms ± 0%−8.44%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithFilter_Sync_10-4528µs ± 0%418µs ± 0%−20.80%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithFilter_Sync_100-41.63ms ± 0%1.31ms ± 0%−19.62%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithFilter_Sync_1000-411.1ms ± 0%10.0ms ± 0%−9.79%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithLimitOffset_Sync_10-4464µs ± 0%398µs ± 0%−14.31%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithMultiLookup_Sync_10-4736µs ± 0%707µs ± 0%−3.96%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithMultiLookup_Sync_100-4726µs ± 0%721µs ± 0%−0.67%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithMultiLookup_Sync_1000-4711µs ± 0%679µs ± 0%−4.58%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSingleLookup_Sync_10-4325µs ± 0%288µs ± 0%−11.53%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSingleLookup_Sync_100-4378µs ± 0%287µs ± 0%−24.05%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSingleLookup_Sync_1000-4285µs ± 0%272µs ± 0%−4.73%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSort_Sync_10-4462µs ± 0%399µs ± 0%−13.60%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSort_Sync_100-41.64ms ± 0%1.39ms ± 0%−14.96%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSort_Sync_1000-412.4ms ± 0%12.3ms ± 0%−0.89%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:0128-416.7µs ± 0%13.4µs ± 0%−19.59%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:0256-416.2µs ± 0%14.0µs ± 0%−13.54%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:0512-418.8µs ± 0%16.5µs ± 0%−11.97%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:1024-423.7µs ± 0%19.5µs ± 0%−17.61%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0064-4149µs ± 0%131µs ± 0%−11.93%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0128-4155µs ± 0%141µs ± 0%−9.02%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0256-4155µs ± 0%143µs ± 0%−7.68%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0512-4160µs ± 0%159µs ± 0%−0.83%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:1024-4203µs ± 0%198µs ± 0%−2.73%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0064-415.7µs ± 0%15.1µs ± 0%−3.73%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0128-415.6µs ± 0%14.9µs ± 0%−4.35%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0256-416.9µs ± 0%15.5µs ± 0%−8.16%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0512-418.4µs ± 0%17.4µs ± 0%−5.46%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:1024-425.4µs ± 0%23.9µs ± 0%−5.92%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0064-4153µs ± 0%144µs ± 0%−6.46%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0256-4164µs ± 0%157µs ± 0%−4.01%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0512-4173µs ± 0%165µs ± 0%−4.45%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:1024-4244µs ± 0%203µs ± 0%−16.52%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0064-451.7µs ± 0%48.2µs ± 0%−6.79%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0128-457.8µs ± 0%48.1µs ± 0%−16.89%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0256-457.4µs ± 0%50.1µs ± 0%−12.75%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0512-459.5µs ± 0%53.5µs ± 0%−10.03%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:1024-469.0µs ± 0%67.0µs ± 0%−2.79%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0064-451.5µs ± 0%49.9µs ± 0%−3.15%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0128-465.7µs ± 0%52.6µs ± 0%−19.95%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0256-468.7µs ± 0%52.7µs ± 0%−23.34%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0512-463.3µs ± 0%59.6µs ± 0%−5.85%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:1024-484.8µs ± 0%71.8µs ± 0%−15.31%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0064-4410µs ± 0%398µs ± 0%−2.89%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0128-4497µs ± 0%349µs ± 0%−29.80%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0256-4466µs ± 0%407µs ± 0%−12.61%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:1024-4593µs ± 0%544µs ± 0%−8.32%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0064-4136µs ± 0%123µs ± 0%−9.24%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0128-4190µs ± 0%131µs ± 0%−30.67%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0256-4173µs ± 0%128µs ± 0%−25.80%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0512-4140µs ± 0%128µs ± 0%−8.52%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:1024-4161µs ± 0%152µs ± 0%−6.00%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0064-41.38ms ± 0%1.29ms ± 0%−6.58%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0128-41.51ms ± 0%1.30ms ± 0%−13.75%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0256-41.55ms ± 0%1.36ms ± 0%−12.10%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0512-41.51ms ± 0%1.33ms ± 0%−11.96%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:1024-41.72ms ± 0%1.48ms ± 0%−13.89%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0064-4123µs ± 0%113µs ± 0%−8.52%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0128-4142µs ± 0%115µs ± 0%−19.11%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0256-4157µs ± 0%125µs ± 0%−20.70%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0512-4161µs ± 0%125µs ± 0%−21.84%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:1024-4172µs ± 0%140µs ± 0%−18.66%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0064-41.36ms ± 0%1.18ms ± 0%−13.41%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0128-41.56ms ± 0%1.18ms ± 0%−24.02%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0256-41.62ms ± 0%1.18ms ± 0%−27.21%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0512-41.50ms ± 0%1.26ms ± 0%−16.04%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:1024-41.76ms ± 0%1.33ms ± 0%−24.48%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0064-412.3µs ± 0%8.6µs ± 0%−29.57%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0128-411.8µs ± 0%10.1µs ± 0%−14.26%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0256-416.5µs ± 0%11.6µs ± 0%−29.36%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0512-420.4µs ± 0%12.9µs ± 0%−36.58%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:1024-425.5µs ± 0%16.1µs ± 0%−36.76%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0064-4147µs ± 0%102µs ± 0%−30.90%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0128-4131µs ± 0%104µs ± 0%−20.45%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0512-4147µs ± 0%130µs ± 0%−11.86%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:1024-4187µs ± 0%176µs ± 0%−5.79%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0064-4190µs ± 0%141µs ± 0%−25.60%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0128-4147µs ± 0%134µs ± 0%−9.04%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0256-4132µs ± 0%128µs ± 0%−3.60%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0512-4143µs ± 0%123µs ± 0%−13.78%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:1024-4154µs ± 0%135µs ± 0%−12.22%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0064-41.44ms ± 0%1.22ms ± 0%−14.79%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0128-41.54ms ± 0%1.23ms ± 0%−20.21%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0256-41.46ms ± 0%1.23ms ± 0%−15.75%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0512-41.43ms ± 0%1.22ms ± 0%−14.50%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:1024-41.58ms ± 0%1.24ms ± 0%−21.28%(p=1.000 n=1+1)
 
❌ See Worse Results...
time/opdelta
_Query_UserSimple_Query_WithLimitOffset_Sync_100-4495µs ± 0%513µs ± 0%+3.72%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithLimitOffset_Sync_1000-4448µs ± 0%454µs ± 0%+1.50%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0128-4154µs ± 0%158µs ± 0%+2.81%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0064-4354µs ± 0%385µs ± 0%+8.92%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0128-4381µs ± 0%386µs ± 0%+1.12%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0256-4366µs ± 0%405µs ± 0%+10.70%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0512-4413µs ± 0%467µs ± 0%+13.14%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:1024-4495µs ± 0%503µs ± 0%+1.66%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0512-4441µs ± 0%468µs ± 0%+6.17%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0256-4119µs ± 0%123µs ± 0%+2.81%(p=1.000 n=1+1)
 
✨ See Unchanged Results...
time/opdelta
 
🐋 See Full Results...
develop.txtcurrent.txt
time/opdelta
pkg:github.com/sourcenetwork/defradb/bench/collection goos:linux goarch:amd64
_Collection_UserSimple_CreateMany_Sync_0_10-411.6ms ± 0%10.6ms ± 0%−8.75%(p=1.000 n=1+1)
_Collection_UserSimple_CreateMany_Sync_0_100-4303ms ± 0%221ms ± 0%−27.24%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Sync_0_10-414.2ms ± 0%10.2ms ± 0%−28.04%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Sync_0_100-4108ms ± 0%101ms ± 0%−6.75%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Sync_0_1000-41.19s ± 0%1.03s ± 0%−13.78%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Async_0_100-449.9ms ± 0%46.7ms ± 0%−6.46%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Async_0_1000-4470ms ± 0%469ms ± 0%−0.24%(p=1.000 n=1+1)
_Collection_UserSimple_Create_Async_0_10000-44.70s ± 0%4.65s ± 0%−1.04%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_10_10-4420µs ± 0%351µs ± 0%−16.51%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_100_100-44.55ms ± 0%3.58ms ± 0%−21.20%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_1000_1000-444.8ms ± 0%38.7ms ± 0%−13.47%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_1000_10-4407µs ± 0%360µs ± 0%−11.69%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Sync_1000_100-44.40ms ± 0%3.78ms ± 0%−14.13%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_10_10-4258µs ± 0%255µs ± 0%−1.29%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_100_100-41.87ms ± 0%1.81ms ± 0%−3.16%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_1000_1000-424.7ms ± 0%24.5ms ± 0%−1.04%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_1000_10-4281µs ± 0%261µs ± 0%−7.22%(p=1.000 n=1+1)
_Collection_UserSimple_Read_Async_1000_100-41.95ms ± 0%1.85ms ± 0%−5.39%(p=1.000 n=1+1)
pkg:github.com/sourcenetwork/defradb/bench/query/simple goos:linux goarch:amd64
_Query_UserSimple_Query_Sync_10-4399µs ± 0%338µs ± 0%−15.35%(p=1.000 n=1+1)
_Query_UserSimple_Query_Sync_100-41.22ms ± 0%1.16ms ± 0%−4.62%(p=1.000 n=1+1)
_Query_UserSimple_Query_Sync_1000-410.3ms ± 0%9.4ms ± 0%−8.44%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithFilter_Sync_10-4528µs ± 0%418µs ± 0%−20.80%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithFilter_Sync_100-41.63ms ± 0%1.31ms ± 0%−19.62%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithFilter_Sync_1000-411.1ms ± 0%10.0ms ± 0%−9.79%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithLimitOffset_Sync_10-4464µs ± 0%398µs ± 0%−14.31%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithLimitOffset_Sync_100-4495µs ± 0%513µs ± 0%+3.72%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithLimitOffset_Sync_1000-4448µs ± 0%454µs ± 0%+1.50%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithMultiLookup_Sync_10-4736µs ± 0%707µs ± 0%−3.96%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithMultiLookup_Sync_100-4726µs ± 0%721µs ± 0%−0.67%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithMultiLookup_Sync_1000-4711µs ± 0%679µs ± 0%−4.58%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSingleLookup_Sync_10-4325µs ± 0%288µs ± 0%−11.53%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSingleLookup_Sync_100-4378µs ± 0%287µs ± 0%−24.05%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSingleLookup_Sync_1000-4285µs ± 0%272µs ± 0%−4.73%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSort_Sync_10-4462µs ± 0%399µs ± 0%−13.60%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSort_Sync_100-41.64ms ± 0%1.39ms ± 0%−14.96%(p=1.000 n=1+1)
_Query_UserSimple_Query_WithSort_Sync_1000-412.4ms ± 0%12.3ms ± 0%−0.89%(p=1.000 n=1+1)
pkg:github.com/sourcenetwork/defradb/bench/storage goos:linux goarch:amd64
_Storage_Simple_Read_Sync_1_10/ValueSize:0064-415.0µs ± 0%14.1µs ± 0%−5.69%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:0128-416.7µs ± 0%13.4µs ± 0%−19.59%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:0256-416.2µs ± 0%14.0µs ± 0%−13.54%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:0512-418.8µs ± 0%16.5µs ± 0%−11.97%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_10/ValueSize:1024-423.7µs ± 0%19.5µs ± 0%−17.61%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0064-4149µs ± 0%131µs ± 0%−11.93%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0128-4155µs ± 0%141µs ± 0%−9.02%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0256-4155µs ± 0%143µs ± 0%−7.68%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:0512-4160µs ± 0%159µs ± 0%−0.83%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_1_100/ValueSize:1024-4203µs ± 0%198µs ± 0%−2.73%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0064-415.7µs ± 0%15.1µs ± 0%−3.73%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0128-415.6µs ± 0%14.9µs ± 0%−4.35%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0256-416.9µs ± 0%15.5µs ± 0%−8.16%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:0512-418.4µs ± 0%17.4µs ± 0%−5.46%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_10/ValueSize:1024-425.4µs ± 0%23.9µs ± 0%−5.92%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0064-4153µs ± 0%144µs ± 0%−6.46%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0128-4154µs ± 0%158µs ± 0%+2.81%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0256-4164µs ± 0%157µs ± 0%−4.01%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:0512-4173µs ± 0%165µs ± 0%−4.45%(p=1.000 n=1+1)
_Storage_Simple_Read_Sync_100_100/ValueSize:1024-4244µs ± 0%203µs ± 0%−16.52%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0064-451.7µs ± 0%48.2µs ± 0%−6.79%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0128-457.8µs ± 0%48.1µs ± 0%−16.89%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0256-457.4µs ± 0%50.1µs ± 0%−12.75%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:0512-459.5µs ± 0%53.5µs ± 0%−10.03%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_10/ValueSize:1024-469.0µs ± 0%67.0µs ± 0%−2.79%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0064-4354µs ± 0%385µs ± 0%+8.92%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0128-4381µs ± 0%386µs ± 0%+1.12%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0256-4366µs ± 0%405µs ± 0%+10.70%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:0512-4413µs ± 0%467µs ± 0%+13.14%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_0_100/ValueSize:1024-4495µs ± 0%503µs ± 0%+1.66%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0064-451.5µs ± 0%49.9µs ± 0%−3.15%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0128-465.7µs ± 0%52.6µs ± 0%−19.95%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0256-468.7µs ± 0%52.7µs ± 0%−23.34%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:0512-463.3µs ± 0%59.6µs ± 0%−5.85%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_10/ValueSize:1024-484.8µs ± 0%71.8µs ± 0%−15.31%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0064-4410µs ± 0%398µs ± 0%−2.89%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0128-4497µs ± 0%349µs ± 0%−29.80%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0256-4466µs ± 0%407µs ± 0%−12.61%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:0512-4441µs ± 0%468µs ± 0%+6.17%(p=1.000 n=1+1)
_Storage_Simple_WriteMany_Sync_100_100/ValueSize:1024-4593µs ± 0%544µs ± 0%−8.32%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0064-4136µs ± 0%123µs ± 0%−9.24%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0128-4190µs ± 0%131µs ± 0%−30.67%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0256-4173µs ± 0%128µs ± 0%−25.80%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:0512-4140µs ± 0%128µs ± 0%−8.52%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_10/ValueSize:1024-4161µs ± 0%152µs ± 0%−6.00%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0064-41.38ms ± 0%1.29ms ± 0%−6.58%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0128-41.51ms ± 0%1.30ms ± 0%−13.75%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0256-41.55ms ± 0%1.36ms ± 0%−12.10%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:0512-41.51ms ± 0%1.33ms ± 0%−11.96%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_0_100/ValueSize:1024-41.72ms ± 0%1.48ms ± 0%−13.89%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0064-4123µs ± 0%113µs ± 0%−8.52%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0128-4142µs ± 0%115µs ± 0%−19.11%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0256-4157µs ± 0%125µs ± 0%−20.70%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:0512-4161µs ± 0%125µs ± 0%−21.84%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_10/ValueSize:1024-4172µs ± 0%140µs ± 0%−18.66%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0064-41.36ms ± 0%1.18ms ± 0%−13.41%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0128-41.56ms ± 0%1.18ms ± 0%−24.02%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0256-41.62ms ± 0%1.18ms ± 0%−27.21%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:0512-41.50ms ± 0%1.26ms ± 0%−16.04%(p=1.000 n=1+1)
_Storage_Simple_Write_Sync_100_100/ValueSize:1024-41.76ms ± 0%1.33ms ± 0%−24.48%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0064-412.3µs ± 0%8.6µs ± 0%−29.57%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0128-411.8µs ± 0%10.1µs ± 0%−14.26%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0256-416.5µs ± 0%11.6µs ± 0%−29.36%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0512-420.4µs ± 0%12.9µs ± 0%−36.58%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:1024-425.5µs ± 0%16.1µs ± 0%−36.76%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0064-4147µs ± 0%102µs ± 0%−30.90%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0128-4131µs ± 0%104µs ± 0%−20.45%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0256-4119µs ± 0%123µs ± 0%+2.81%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0512-4147µs ± 0%130µs ± 0%−11.86%(p=1.000 n=1+1)
_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:1024-4187µs ± 0%176µs ± 0%−5.79%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0064-4190µs ± 0%141µs ± 0%−25.60%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0128-4147µs ± 0%134µs ± 0%−9.04%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0256-4132µs ± 0%128µs ± 0%−3.60%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0512-4143µs ± 0%123µs ± 0%−13.78%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:1024-4154µs ± 0%135µs ± 0%−12.22%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0064-41.44ms ± 0%1.22ms ± 0%−14.79%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0128-41.54ms ± 0%1.23ms ± 0%−20.21%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0256-41.46ms ± 0%1.23ms ± 0%−15.75%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0512-41.43ms ± 0%1.22ms ± 0%−14.50%(p=1.000 n=1+1)
_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:1024-41.58ms ± 0%1.24ms ± 0%−21.28%(p=1.000 n=1+1)
 

@@ -221,7 +221,7 @@ replace (
github.com/SierraSoftworks/connor => github.com/sourcenetwork/connor v1.0.3-0.20210312091030-4823d0411a12

// SourceNetwork fork og graphql-go
github.com/graphql-go/graphql => github.com/sourcenetwork/graphql v0.7.10-0.20220122211559-2fe60b2360cc
github.com/graphql-go/graphql => github.com/sourcenetwork/graphql-go v0.7.10-0.20220509213405-bc95f52e34d9
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the name of the forked repo changed? I thought the only change was that PR #4 that got merged. I do see that github.com/sourcenetwork/graphql resolves to the same remote repo. Just curious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol - that is really bad of me - I didnt spot it had changed. Looks like there is a redirect from github.com/sourcenetwork/graphql to github.com/sourcenetwork/graphql-go, maybe it got renamed at some point.

Comment on lines +22 to +27
Query: `query {
users(groupBy: [Name]) {
Name
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}})
}
}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not:

Suggested change
Query: `query {
users(groupBy: [Name]) {
Name
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}})
}
}`,
Query: `query {
users(groupBy: [Name]) {
Name
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}})
}
}`,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cant tell why this is rendered like this in github, I appear to have 5 tabs locally infront of users... on ln23. Does it all look normal to you locally?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better locally yea, strange rendering.

}
}
}

return aggregates, nil
}

func appendNotNilFilter(field *parser.Select, childField string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment for this function would be sweet.

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it would! Thanks

  • document appendNotNilFilter

}

typedChildBlock := childBlock.(map[string]interface{})
typedChildBlock["$ne"] = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this?

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was documentation in the original and I accidentally removed it. Will add some, thanks for flagging.

We have to filter out values that equal nil otherwise it messes the numbers up, e.g. [2, 4, nil] needs to average to 3, but the without the filter the avg would be 2 (sum: 6, count: 3). TestQuerySimpleWithGroupByStringWithoutRenderedGroupAndChildNilAverage covers this

  • document ne nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasoning is documented on the the function call, not within the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

if allArguementsMatch {
return f, true
}
if f.Equal(*targetField) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible to test this case ? or do you think it's overkill?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be possible, but is blocked by #410 - tests should definately be expanded as part of that ticket and thanks for reminding me to make that clear in that ticket

for i, selection := range childSelections {
info.children[i] = buildRenderInfo(selection)
for _, selection := range childSelections {
if slct, isSelect := selection.(*parser.Select); isSelect && slct.Hidden {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you don't really know the size of the children anymore. Feel free to ignore the comment above if that is the reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be looking into it either way, this is sensitive code and I didn't review after making it work

info.children[i] = buildRenderInfo(selection)
for _, selection := range childSelections {
if slct, isSelect := selection.(*parser.Select); isSelect && slct.Hidden {
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this possible to hit with a test easily?

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good spot, I scanned the test coverage and missed this hole. I really dont know why it is not hit, it should be, and besides this being a hangover from an old implementation that is no longer useful nothing comes to mind as to why it is not covered. Will look into

  • test gap in render.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the planner/group stuff might be being cunning and optimizes away the need for this with groups, for example I would have expected the below query to hit this path but it doesnt. I expect the child relationship stuff to hit here and I'll make a note of in the issue, but otherwise will leave as is.

	users(groupBy: [Name]) {
		Name
		_avg(_group: {field: _avg})
		_group (groupBy: [Verified]){
			Verified
			_avg(_group: {field: Age}) // I expected this inner `_group` to hit this line
		}
	}


testUtils "github.com/sourcenetwork/defradb/db/tests"
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a test case where there are no documents but the query is made (i.e. empty collection).

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, will add

  • test empty

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to the group-aggregate tests as the filters will not impact this. Will likely add empty inner collection tests for inline-arrays and child collections, as it is more plausible that they could take a path hitting the filter code.

Results: []map[string]interface{}{
{
"Name": "Alice",
"G1": []map[string]interface{}{},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"G1": []map[string]interface{}{},
"G1": []map[string]interface{}{},

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gofmt aligns these so that the right hand side all aligns

Query: `query {
users(groupBy: [Age]) {
Age
_sum(_group: {field: Age, filter: {Age: {_gt: 26}}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the line 64 (_sum(_group: {field: Age, filter: {Age: {_gt: 26}}})) and 25 the same yet one says sum on rendered and the other says sum on non-rendered. Did you mean here that we have _sum(...) with rendered (i.e. _group{Name})?

Just trying to understand what on rendered is meant haha.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) The _group is non-rendered - the 'non-rendered' tests basically test select.joinAggregatedChild. Is easier to visualise if you consider aggregating a child relationship perhaps.

author {
  name
  _count(published: {}) // here we join 'published' even though it is not rendered in the query, else we would have nothing to count
}

vs

author {
  name
  _count(published: {})
  published {
     name
  }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotchu, appreciate the explaination.

@source-devs

This comment was marked as off-topic.

@source-devs

This comment was marked as off-topic.

@source-devs

This comment was marked as off-topic.

Copy link
Member

@shahzadlone shahzadlone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome stuff, thanks for the changes!

Made some more comments and suggestions, most are just questions and small suggestions (mostly non-blocking nitpicks).

This is my first try with conventional comments, any feedback would be highly appreciated.

Comment on lines +22 to +27
Query: `query {
users(groupBy: [Name]) {
Name
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}})
}
}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better locally yea, strange rendering.

Query: `query {
users(groupBy: [Age]) {
Age
_sum(_group: {field: Age, filter: {Age: {_gt: 26}}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotchu, appreciate the explaination.

}

typedChildBlock := childBlock.(map[string]interface{})
typedChildBlock["$ne"] = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -104,7 +105,9 @@ type Selection interface {
type Select struct {
Name string
Alias string
ExternalName string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: What are these two new fields added to the Select ?
nitpick: Perhaps a short comment on what ExternalName and Hidden is.

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will document.
ExternalName: The name by which the user refers to the field (e.g. _group)
Hidden: Wont be rendered

  • document new props on select

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -147,6 +150,28 @@ func (s Select) GetAlias() string {
return s.Alias
}

func (s Select) Equal(other Select) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Equal method isn't checking if all fields / attributes of Select type are equal to the other instance of Select object. Is that the intended behavior? As it only checks if the Name, ExternalName and Fliter (with it's condition as the same).

nitpick: perhaps a rename to EqualFilter, or something to signal only partial comparison not a full comparison.

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is intended - will grow too as more options get added to aggregates.

Not keen on changing the name due to the future expansion of this function. But it should definitely be documented 😅 as it has the capacity to catch people off-guard.

  • Document Select.Equals

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this function might need to be completed. There might be a bug here if comparing an aggregate to a child list that contains a limit (I think other items might be fine e.g. sort).

  • expand tests to cover other 'stuff' on the target property

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another bug testing this (test fails atm): 6fa9cd2

Opened a new ticket for it and will look at fixing outside of this PR (not aggregate related), but I will see as there have been changes here relating to the group-alias bug that might clash.
#435

If this PR is approved and ready to merge before that ticket is merged, I'll expand the tests outside of this review too (not worth blocking the PR for).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another bug testing this (test fails atm): 6fa9cd2

Opened a new ticket for it and will look at fixing outside of this PR (not aggregate related), but I will see as there have been changes here relating to the group-alias bug that might clash. #435

If this PR is approved and ready to merge before that ticket is merged, I'll expand the tests outside of this review too (not worth blocking the PR for).

Totes okay with the separate PR resolving that bug. Thanks for documenting and making an issue!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not actually possible to hit this issue until either aggregates get other args (e.g. limit), or we add support nil filters. Will add to those tickets.

Also - the limit on group bug fix is based off of this branch, as I wanted/needed some of the group related changes, so this PR will be merged first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, then will review the other PR after this is merged?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other PR can be reviewed now, just only review the last commit

Comment on lines +315 to +318
dataSource := plan.group.dataSources[i]
dataSource.childSource = childSelectNode
dataSource.parentSource = plan.plan
dataSource.pipeNode = pipe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Are the following supposed to still be set to -1 at this point?

  • dataSource.lastParentDocIndex
  • dataSource.lastChildDocIndex

and dataSource.childName is expected to be already expected to be set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the planner is not responsible for these things

Comment on lines +52 to 57
for _, selection := range childSelections {
if slct, isSelect := selection.(*parser.Select); isSelect && slct.Hidden {
continue
}
info.children = append(info.children, buildRenderInfo(selection))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: I feel like this can be a function as the same exact logic as these lines (52-57) is also on the lines 80-85.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the merit in pulling them out to a function, but think I prefer the visibility duplication gives here. Leaving as is unless someone can talk me round (including future me lol)

Copy link
Member

@shahzadlone shahzadlone May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why leave duplicate code within the same file? or even the same package. I understand to avoid multiple packages depending on each other it's better to leave duplicated code within different packages, but here personally I think it's more readable if a set of instructions that do the same tasks can be a function that you have to understand once and then wherever it is called you know it is the same set of instructions being called. Also makes testing easier and using tested code possible, if say you were to test that function, now you are using tested code in all the references where the same set of instructions are called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the current more readable, and your tests shouldn't care (test public behaviour, not the code itself).

Consider why you would be reading this - you might be reading the renderNode constructor, or how the renderInfo is constructed. I can read downward from render into buildTopLevelRenderInfo and then into buildRenderInfo with almost no bouncing around to 'helper' methods (is just the Selection getters). The only mutations append(info.children are super obvious and wont come as a surprise.

If you extract the for loop to a function you break this linearity and have to jump into helper functions that mutate objects owned by the parent functions. You'd also hide the recursion within the helper function transforming A=>A=>A into A=>B=>A=>B=>A=>B which I find much harder to follow. And the duplication here is super limited/simple code.

Copy link
Member

@shahzadlone shahzadlone May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand the thought process now. However consider this, if you came to this file and this was lets say already split up into a function that was being called in these two places. Would you think it was better code to remove the function and copy paste the functions body into both these places (or where ever the function was called) just so you can have a A=>A=>A logic ? Or would you leave it as is? Where do we draw the line for code duplication? I am actually curious lol for my learning as I have a habit of splitting things into smaller and smaller chunks. Though process is to control the number of linearly independent paths to avoid inflating cyclomatic complexity later on when new code is added as well.

Also btw super non-blocking and minor thing, perhaps I should have used nitpick instead of suggestion.

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think that the opposite would happen (more code will be added).

In the short term yes, usually lines are saved by deduplication. But after a while when more places start calling and the functions grow to accommodate all the new edge cases and feature creep you'll find a lot of this coupled mess of a function (and child functions) is worthless/dead to most of the parent functions.

A common first step when I start to clean up spaghetti like that is to (re)duplicate stuff so I can actually see what gets used by what, and then deletion/re-structuring becomes much easier and safer.

Personally if I do change some code for feature A that is being used else where I try to ensure the references of that common aspects don't break the rest.

This is error prone in a small and new codebase like defra, and a massive time-sink and risk in a larger older codebase, unless you are lucky enough to have really awesome integration test coverage (is easy enough with something like defra, but most dont have such easy testable/interfaces e.g. web apps, let alone legacy stuff that got large without int. tests).

he said unit tests are a design smell and they couple the code

From memory (I've not watch it in a few years) he was distinguishing between unit and integration tests - not automated testing overall. Unit testing of private functionality is very much tied into the internal structure of the code and makes refactoring more costly in that you either delete the overly specific tests or you re-write them to test your new implementation. This also can discourage refactoring and ends up making shared and messy code more persistent and ever more horrible as no one can stomach cleaning it up.

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit vs integration test....

Okay, I showered and wanted to write more on this as I consider it to be one of the most important things to get right in software development.

In my opinion:

  1. Tests (automated or otherwise) assert that the production code conforms to one's perception of reality. In the case of unit and integration tests, this is typically the author's reality.
  2. The reality in (1) will almost always differ slightly from the reality of the desired functionality (e.g. we did not assert that child filters behaved correctly when the same underlying property is requested multiple times with different filters). Over time if developers do their job, the gap between these realities will shrink as individual differences are discovered and corrected/asserted.
  3. In integration tests the desired reality in (2) is that of the users/product-owners/etc. These assert that the code does what the users/product-owners want it to do.
  4. In unit tests the desired reality in (2) is the author(s) of the code. These assert that the code does what the author(s) thinks other code will need it to do. Similar to (1) and (2), there will be a gap here between what the author thinks the code needs to do, and what the other code needs it to do.
  5. Modifying existing tests is really risky, typically tests are untested - the changes are the perceived reality of the author. The more complex the change, the greater the risk that the realities in (1) and (2) will diverge and gaps in tests will appear.
  6. Modifying the interfaces of code will require modifying the tests. This can only ever widen or maintain the current gap in realities between (1) and (2), it can never reduce it (improvements to tests are out-of-scope here and are viewed as independent changes).
  7. Refactoring often involves complex changes to the structure of code. If present, this typically means complex changes to the unit tests, which can only ever widen or maintain the gap between (1) and (2), and can only ever widen, maintain, or replace (in the case of rip and replace) the reality gap described in (4).
  8. Modifying external interfaces are almost always much rarer than the modification of internal interfaces.
  9. Because of (8), integration tests are going to degrade at a slower rate than unit tests. If desired external behaviour is well designed and developers do their job, the reality gap should decrease over time.
  10. Production code will change it's behaviour.
  11. Because of (10) production code will degrade.
  12. Because of (11) production code will need to be refactored (or replaced).
  13. Refactoring adds no value to the users. No user-visible behaviour is changed (besides sometimes performance as a side effect).
  14. Because of (7), refactoring (12) will expose a code base dependent on unit tests to the risks in (5) whilst adding zero value to the users (13) and consuming development resources (reducing the time that can be spent deliberately changing external, paid for, behaviour). This typically results in a reluctance to refactor, and the code base continues to degrade. This continued degradation makes the eventual refactor much more complex (7) and risky (5,6,7), and slows down deliberate change in the meantime. Because the interfaces tested by integration tests do not change during a refactor, there is much less risk involved (5 is not applicable), and thus much less reluctance to refactor - this means refactoring will be more frequent and less complex resulting in a cleaner code base. Coupled with (9), relying on integration tests over unit tests can result in a far smaller gap between the realities of the developers and the realities of the users/product-owners, and a far cleaner code base.
  15. Considering (14) unit tests primary use-case is for the short-term debugging of code and they are a poor tool for asserting that the code does what the users/product-owners want it to do (which is what developers are paid to do).

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit vs integration test.... 1..15

A defra example of this is that I could rewrite/replace the entire planner package without having to change the tests. I could spend days messing around with it and then still be exactly as confident as I was before the refactor in the behaviour of the code (is 5 key-presses for me in the terminal and couple of seconds to wait for our non-benchmark tests). If we were relying on unit tests it would take me much longer, and the tests will have diverged (risking numerous regressions).

Copy link
Member

@shahzadlone shahzadlone May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion:

  1. Tests (automated or otherwise) assert that the production code conforms to one's perception of reality. In the case of unit and integration tests, this is typically the author's reality.
  2. The reality in (1) will almost always differ slightly from the reality of the desired functionality (e.g. we did not assert that child filters behaved correctly when the same underlying property is requested multiple times with different filters). Over time if developers do their job, the gap between these realities will shrink as individual differences are discovered and corrected/asserted.
  3. In integration tests the desired reality in (2) is that of the users/product-owners/etc. These assert that the code does what the users/product-owners want it to do.
  4. In unit tests the desired reality in (2) is the author(s) of the code. These assert that the code does what the author(s) thinks other code will need it to do. Similar to (1) and (2), there will be a gap here between what the author thinks the code needs to do, and what the other code needs it to do.
  5. Modifying existing tests is really risky, typically tests are untested - the changes are the perceived reality of the author. The more complex the change, the greater the risk that the realities in (1) and (2) will diverge and gaps in tests will appear.
  6. Modifying the interfaces of code will require modifying the tests. This can only ever widen or maintain the current gap in realities between (1) and (2), it can never reduce it (improvements to tests are out-of-scope here and are viewed as independent changes).
  7. Refactoring often involves complex changes to the structure of code. If present, this typically means complex changes to the unit tests, which can only ever widen or maintain the gap between (1) and (2), and can only ever widen, maintain, or replace (in the case of rip and replace) the reality gap described in (4).
  8. Modifying external interfaces are almost always much rarer than the modification of internal interfaces.
  9. Because of (8), integration tests are going to degrade at a slower rate than unit tests. If desired external behaviour is well designed and developers do their job, the reality gap should decrease over time.
  10. Production code will change it's behaviour.
  11. Because of (10) production code will degrade.
  12. Because of (11) production code will need to be refactored (or replaced).
  13. Refactoring adds no value to the users. No user-visible behavior is changed (besides sometimes performance as a side effect).
  14. Because of (7), refactoring (12) will expose a code base dependent on unit tests to the risks in (5) whilst adding zero value to the users (13) and consuming development resources (reducing the time that can be spent deliberately changing external, paid for, behavior). This typically results in a reluctance to refactor, and the code base continues to degrade. This continued degradation makes the eventual refactor much more complex (7) and risky (5,6,7), and slows down deliberate change in the meantime. Because the interfaces tested by integration tests do not change during a refactor, there is much less risk involved (5 is not applicable), and thus much less reluctance to refactor - this means refactoring will be more frequent and less complex resulting in a cleaner code base. Coupled with (9), relying on integration tests over unit tests can result in a far smaller gap between the realities of the developers and the realities of the users/product-owners, and a far cleaner code base.
  15. Considering (14) unit tests primary use-case is for the short-term debugging of code and they are a poor tool for asserting that the code does what the users/product-owners want it to do (which is what developers are paid to do).

Sorry for the late response! This is a great encapsulation of what happens over time if you start relying on unit tests. Also always in favor for pushing for having more integration tests, then unit tests and etc. You mentioned the full story on what's wrong with unit tests overtime which is also why I can't just agree with the only reason the guy gave which was that 'since they couple the code, then they are bad'. One might say the same for integration tests, that they couple the code. They by definition integrate more modules together to test their behavior together, but as you mentioned with integration tests you can easily swap out an old module with the new implementation and be assured that it will work (considering that tests would pass).

Btw, this was such a nice writeup on divergence between Unit Tests vs Integration Tests that I would like to put this somewhere we can come back to in future so I made an outline doc: https://source.getoutline.com/doc/fruitful-engineering-writeups-discussions-RU1f6RzGph

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 Cheers Shahzad, was really nice of you to write that! Thanks for creating the doc - hopefully we'll fill it full of fun chats soon enough 😁

Does sound like I should probs watch that video again though lol, hopefully I'm not overrating it 😆 And I'll have probs forgot a few of his better phrased points lol

countField, countExists := tryGetAggregateField(parsed.Fields, f, parser.CountFieldName, parser.CountFieldName)
// Note: sumExists will always be false until we support filtering by nil in the query
if !countExists {
countField = f.CopyWithName(fmt.Sprintf("%s_count", f.Name), parser.CountFieldName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
countField = f.CopyWithName(fmt.Sprintf("%s_count", f.Name), parser.CountFieldName)
countField = f.CopyWithName(fmt.Sprintf("%s%s", f.Name, parser.CountFieldName), parser.CountFieldName)

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this, using parser.CountFieldName implies that the suffix must be that value, when it really doesnt matter much what it is - so long as it is unique. Might be worth a code-comment noting the unimportance here though.

  • consider documenting lack of importance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow the reasoning. The suffix can be anything? as is doesn't need to be "_count" or "_sum"?

In that case why not use some other naming scheme that is different say: "_virtualSum", "_virtualCount".

question: Are these the internal virtual fields that do the register like magic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suffix can be anything? as is doesn't need to be "_count" or "_sum"?

correct, if you prefer it to be something else I can change it (virtual is arguably incorrect though, these are more 'internal' fields used as dependency of _avg)

Are these the internal virtual fields that do the register like magic?

No, these are different magic fields (:sweat_smile:) that allow average to do nothing besides divide _sum by _count instead of having to also sum and count.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suffix can be anything? as is doesn't need to be "_count" or "_sum"?

correct, if you prefer it to be something else I can change it (virtual is arguably incorrect though, these are more 'internal' fields used as dependency of _avg)

Yes please, that seems like it would be more clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sumField, sumExists := tryGetAggregateField(parsed.Fields, f, parser.SumFieldName, parser.SumFieldName)
// Note: sumExists will always be false until we support filtering by nil in the query
if !sumExists {
sumField = f.CopyWithName(fmt.Sprintf("%s_sum", f.Name), parser.SumFieldName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sumField = f.CopyWithName(fmt.Sprintf("%s_sum", f.Name), parser.SumFieldName)
sumField = f.CopyWithName(fmt.Sprintf("%s%s", f.Name, parser.SumFieldName), parser.SumFieldName)

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as other comment.

  • consider documenting lack of importance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better! Thanks.

// tryGetAggregateField attempts to find an existing aggregate field that matches the given
// name and arguements. Will return the match field and true if one is found, false otherwise.
// name and template field. Will return the match field and true if one is found, false otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: What is meant by template field in this context? The match here only checks Equality of the Name, ExternalName and Fliter (including it's conditions) for every Select Field right?

Copy link
Contributor Author

@AndrewSisley AndrewSisley May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template is the name of an input param, the comment refers to that (I might add ticks round it though). Although on re-read, this looks like the comment will have made more sense under a previous version of the code, will look at revising.

  • review tryGetAggregateField documentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function reworked slightly and documentation re-written: 9eab58f

@source-devs

This comment was marked as outdated.

@source-devs

This comment was marked as off-topic.

@source-devs

This comment was marked as off-topic.

@source-devs

This comment was marked as spam.

@source-devs

This comment was marked as spam.

@source-devs

This comment was marked as spam.

@source-devs

This comment was marked as spam.

@AndrewSisley AndrewSisley added the action/no-benchmark Skips the action that runs the benchmark. label May 11, 2022
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I99-aggregate-filter branch from 6fa9cd2 to 9eab58f Compare May 11, 2022 18:56
@AndrewSisley AndrewSisley requested a review from shahzadlone May 11, 2022 19:09
@AndrewSisley AndrewSisley merged commit d36ffc8 into develop May 16, 2022
@AndrewSisley AndrewSisley deleted the sisley/feat/I99-aggregate-filter branch May 16, 2022 16:16
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
* Remove extra whitespace

* Make aggregate a select not a field

As more functionality is added to the aggregate system it will use more and more of the select properties until there is little-no difference between them.

* Alias astValue

Later commit(s) will make use of this variable

* Handle multiple aliased _group properties with different arguments

* Document field renaming

* Update gql-go package to work with thunks

* Check for type already being defined

Thunk may be run multiple times and we we error if we try and add the inner type more than once

* Add count filter support

* Add sum filter support

* Add average filter support
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/query Related to the query component feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregate: Allow filter defined within aggregate for group
4 participants