-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add aggregate filter support for groups only #426
Conversation
c9bb032
to
6c3edc5
Compare
Codecov Report
@@ Coverage Diff @@
## develop #426 +/- ##
===========================================
+ Coverage 65.66% 66.00% +0.33%
===========================================
Files 80 80
Lines 9481 9593 +112
===========================================
+ Hits 6226 6332 +106
Misses 2624 2624
- Partials 631 637 +6
|
Benchmark ResultsSummary
✅ See Better Results...
❌ See Worse Results...
✨ See Unchanged Results...
🐋 See Full Results...
|
@@ -221,7 +221,7 @@ replace ( | |||
github.com/SierraSoftworks/connor => github.com/sourcenetwork/connor v1.0.3-0.20210312091030-4823d0411a12 | |||
|
|||
// SourceNetwork fork og graphql-go | |||
github.com/graphql-go/graphql => github.com/sourcenetwork/graphql v0.7.10-0.20220122211559-2fe60b2360cc | |||
github.com/graphql-go/graphql => github.com/sourcenetwork/graphql-go v0.7.10-0.20220509213405-bc95f52e34d9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the name of the forked repo changed? I thought the only change was that PR #4
that got merged. I do see that github.com/sourcenetwork/graphql
resolves to the same remote repo. Just curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol - that is really bad of me - I didnt spot it had changed. Looks like there is a redirect from github.com/sourcenetwork/graphql
to github.com/sourcenetwork/graphql-go
, maybe it got renamed at some point.
Query: `query { | ||
users(groupBy: [Name]) { | ||
Name | ||
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}}) | ||
} | ||
}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not:
Query: `query { | |
users(groupBy: [Name]) { | |
Name | |
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}}) | |
} | |
}`, | |
Query: `query { | |
users(groupBy: [Name]) { | |
Name | |
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}}) | |
} | |
}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cant tell why this is rendered like this in github, I appear to have 5 tabs locally infront of users...
on ln23. Does it all look normal to you locally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better locally yea, strange rendering.
} | ||
} | ||
} | ||
|
||
return aggregates, nil | ||
} | ||
|
||
func appendNotNilFilter(field *parser.Select, childField string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment for this function would be sweet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it would! Thanks
- document appendNotNilFilter
} | ||
|
||
typedChildBlock := childBlock.(map[string]interface{}) | ||
typedChildBlock["$ne"] = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there was documentation in the original and I accidentally removed it. Will add some, thanks for flagging.
We have to filter out values that equal nil otherwise it messes the numbers up, e.g. [2, 4, nil] needs to average to 3, but the without the filter the avg would be 2 (sum: 6, count: 3). TestQuerySimpleWithGroupByStringWithoutRenderedGroupAndChildNilAverage covers this
- document ne nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasoning is documented on the the function call, not within the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
query/graphql/planner/select.go
Outdated
if allArguementsMatch { | ||
return f, true | ||
} | ||
if f.Equal(*targetField) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible to test this case ? or do you think it's overkill?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be possible, but is blocked by #410 - tests should definately be expanded as part of that ticket and thanks for reminding me to make that clear in that ticket
for i, selection := range childSelections { | ||
info.children[i] = buildRenderInfo(selection) | ||
for _, selection := range childSelections { | ||
if slct, isSelect := selection.(*parser.Select); isSelect && slct.Hidden { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you don't really know the size of the children anymore. Feel free to ignore the comment above if that is the reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be looking into it either way, this is sensitive code and I didn't review after making it work
info.children[i] = buildRenderInfo(selection) | ||
for _, selection := range childSelections { | ||
if slct, isSelect := selection.(*parser.Select); isSelect && slct.Hidden { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this possible to hit with a test easily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good spot, I scanned the test coverage and missed this hole. I really dont know why it is not hit, it should be, and besides this being a hangover from an old implementation that is no longer useful nothing comes to mind as to why it is not covered. Will look into
- test gap in render.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the planner/group stuff might be being cunning and optimizes away the need for this with groups, for example I would have expected the below query to hit this path but it doesnt. I expect the child relationship stuff to hit here and I'll make a note of in the issue, but otherwise will leave as is.
users(groupBy: [Name]) {
Name
_avg(_group: {field: _avg})
_group (groupBy: [Verified]){
Verified
_avg(_group: {field: Age}) // I expected this inner `_group` to hit this line
}
}
|
||
testUtils "github.com/sourcenetwork/defradb/db/tests" | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a test case where there are no documents but the query is made (i.e. empty collection).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, will add
- test empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the group-aggregate tests as the filters will not impact this. Will likely add empty inner collection tests for inline-arrays and child collections, as it is more plausible that they could take a path hitting the filter code.
Results: []map[string]interface{}{ | ||
{ | ||
"Name": "Alice", | ||
"G1": []map[string]interface{}{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"G1": []map[string]interface{}{}, | |
"G1": []map[string]interface{}{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gofmt
aligns these so that the right hand side all aligns
Query: `query { | ||
users(groupBy: [Age]) { | ||
Age | ||
_sum(_group: {field: Age, filter: {Age: {_gt: 26}}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the line 64 (_sum(_group: {field: Age, filter: {Age: {_gt: 26}}})
) and 25 the same yet one says sum on rendered
and the other says sum on non-rendered
. Did you mean here that we have _sum(...)
with rendered (i.e. _group{Name}
)?
Just trying to understand what on rendered is meant haha.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:) The _group
is non-rendered - the 'non-rendered' tests basically test select.joinAggregatedChild
. Is easier to visualise if you consider aggregating a child relationship perhaps.
author {
name
_count(published: {}) // here we join 'published' even though it is not rendered in the query, else we would have nothing to count
}
vs
author {
name
_count(published: {})
published {
name
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotchu, appreciate the explaination.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome stuff, thanks for the changes!
Made some more comments and suggestions, most are just questions and small suggestions (mostly non-blocking nitpicks).
This is my first try with conventional comments, any feedback would be highly appreciated.
Query: `query { | ||
users(groupBy: [Name]) { | ||
Name | ||
_avg(_group: {field: Age, filter: {Age: {_gt: 26}}}) | ||
} | ||
}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better locally yea, strange rendering.
Query: `query { | ||
users(groupBy: [Age]) { | ||
Age | ||
_sum(_group: {field: Age, filter: {Age: {_gt: 26}}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotchu, appreciate the explaination.
} | ||
|
||
typedChildBlock := childBlock.(map[string]interface{}) | ||
typedChildBlock["$ne"] = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@@ -104,7 +105,9 @@ type Selection interface { | |||
type Select struct { | |||
Name string | |||
Alias string | |||
ExternalName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: What are these two new fields added to the Select ?
nitpick: Perhaps a short comment on what ExternalName
and Hidden
is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, will document.
ExternalName: The name by which the user refers to the field (e.g. _group
)
Hidden: Wont be rendered
- document new props on select
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -147,6 +150,28 @@ func (s Select) GetAlias() string { | |||
return s.Alias | |||
} | |||
|
|||
func (s Select) Equal(other Select) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Equal
method isn't checking if all fields / attributes of Select
type are equal to the other instance of Select
object. Is that the intended behavior? As it only checks if the Name
, ExternalName
and Fliter
(with it's condition as the same).
nitpick: perhaps a rename to EqualFilter
, or something to signal only partial comparison not a full comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is intended - will grow too as more options get added to aggregates.
Not keen on changing the name due to the future expansion of this function. But it should definitely be documented 😅 as it has the capacity to catch people off-guard.
- Document Select.Equals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this function might need to be completed. There might be a bug here if comparing an aggregate to a child list that contains a limit (I think other items might be fine e.g. sort).
- expand tests to cover other 'stuff' on the target property
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found another bug testing this (test fails atm): 6fa9cd2
Opened a new ticket for it and will look at fixing outside of this PR (not aggregate related), but I will see as there have been changes here relating to the group-alias bug that might clash.
#435
If this PR is approved and ready to merge before that ticket is merged, I'll expand the tests outside of this review too (not worth blocking the PR for).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found another bug testing this (test fails atm): 6fa9cd2
Opened a new ticket for it and will look at fixing outside of this PR (not aggregate related), but I will see as there have been changes here relating to the group-alias bug that might clash. #435
If this PR is approved and ready to merge before that ticket is merged, I'll expand the tests outside of this review too (not worth blocking the PR for).
Totes okay with the separate PR resolving that bug. Thanks for documenting and making an issue!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is not actually possible to hit this issue until either aggregates get other args (e.g. limit), or we add support nil filters. Will add to those tickets.
Also - the limit on group bug fix is based off of this branch, as I wanted/needed some of the group related changes, so this PR will be merged first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, then will review the other PR after this is merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other PR can be reviewed now, just only review the last commit
dataSource := plan.group.dataSources[i] | ||
dataSource.childSource = childSelectNode | ||
dataSource.parentSource = plan.plan | ||
dataSource.pipeNode = pipe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Are the following supposed to still be set to -1
at this point?
dataSource.lastParentDocIndex
dataSource.lastChildDocIndex
and dataSource.childName
is expected to be already expected to be set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the planner is not responsible for these things
for _, selection := range childSelections { | ||
if slct, isSelect := selection.(*parser.Select); isSelect && slct.Hidden { | ||
continue | ||
} | ||
info.children = append(info.children, buildRenderInfo(selection)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: I feel like this can be a function as the same exact logic as these lines (52-57) is also on the lines 80-85.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see the merit in pulling them out to a function, but think I prefer the visibility duplication gives here. Leaving as is unless someone can talk me round (including future me lol)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why leave duplicate code within the same file? or even the same package. I understand to avoid multiple packages depending on each other it's better to leave duplicated code within different packages, but here personally I think it's more readable if a set of instructions that do the same tasks can be a function that you have to understand once and then wherever it is called you know it is the same set of instructions being called. Also makes testing easier and using tested code possible, if say you were to test that function, now you are using tested code in all the references where the same set of instructions are called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the current more readable, and your tests shouldn't care (test public behaviour, not the code itself).
Consider why you would be reading this - you might be reading the renderNode constructor, or how the renderInfo is constructed. I can read downward from render
into buildTopLevelRenderInfo
and then into buildRenderInfo
with almost no bouncing around to 'helper' methods (is just the Selection
getters). The only mutations append(info.children
are super obvious and wont come as a surprise.
If you extract the for loop to a function you break this linearity and have to jump into helper functions that mutate objects owned by the parent functions. You'd also hide the recursion within the helper function transforming A=>A=>A into A=>B=>A=>B=>A=>B which I find much harder to follow. And the duplication here is super limited/simple code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can understand the thought process now. However consider this, if you came to this file and this was lets say already split up into a function that was being called in these two places. Would you think it was better code to remove the function and copy paste the functions body into both these places (or where ever the function was called) just so you can have a A=>A=>A logic ? Or would you leave it as is? Where do we draw the line for code duplication? I am actually curious lol for my learning as I have a habit of splitting things into smaller and smaller chunks. Though process is to control the number of linearly independent paths to avoid inflating cyclomatic complexity later on when new code is added as well.
Also btw super non-blocking and minor thing, perhaps I should have used nitpick
instead of suggestion
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think that the opposite would happen (more code will be added).
In the short term yes, usually lines are saved by deduplication. But after a while when more places start calling and the functions grow to accommodate all the new edge cases and feature creep you'll find a lot of this coupled mess of a function (and child functions) is worthless/dead to most of the parent functions.
A common first step when I start to clean up spaghetti like that is to (re)duplicate stuff so I can actually see what gets used by what, and then deletion/re-structuring becomes much easier and safer.
Personally if I do change some code for feature A that is being used else where I try to ensure the references of that common aspects don't break the rest.
This is error prone in a small and new codebase like defra, and a massive time-sink and risk in a larger older codebase, unless you are lucky enough to have really awesome integration test coverage (is easy enough with something like defra, but most dont have such easy testable/interfaces e.g. web apps, let alone legacy stuff that got large without int. tests).
he said unit tests are a design smell and they couple the code
From memory (I've not watch it in a few years) he was distinguishing between unit and integration tests - not automated testing overall. Unit testing of private functionality is very much tied into the internal structure of the code and makes refactoring more costly in that you either delete the overly specific tests or you re-write them to test your new implementation. This also can discourage refactoring and ends up making shared and messy code more persistent and ever more horrible as no one can stomach cleaning it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unit vs integration test....
Okay, I showered and wanted to write more on this as I consider it to be one of the most important things to get right in software development.
In my opinion:
- Tests (automated or otherwise) assert that the production code conforms to one's perception of reality. In the case of unit and integration tests, this is typically the author's reality.
- The reality in (1) will almost always differ slightly from the reality of the desired functionality (e.g. we did not assert that child filters behaved correctly when the same underlying property is requested multiple times with different filters). Over time if developers do their job, the gap between these realities will shrink as individual differences are discovered and corrected/asserted.
- In integration tests the desired reality in (2) is that of the users/product-owners/etc. These assert that the code does what the users/product-owners want it to do.
- In unit tests the desired reality in (2) is the author(s) of the code. These assert that the code does what the author(s) thinks other code will need it to do. Similar to (1) and (2), there will be a gap here between what the author thinks the code needs to do, and what the other code needs it to do.
- Modifying existing tests is really risky, typically tests are untested - the changes are the perceived reality of the author. The more complex the change, the greater the risk that the realities in (1) and (2) will diverge and gaps in tests will appear.
- Modifying the interfaces of code will require modifying the tests. This can only ever widen or maintain the current gap in realities between (1) and (2), it can never reduce it (improvements to tests are out-of-scope here and are viewed as independent changes).
- Refactoring often involves complex changes to the structure of code. If present, this typically means complex changes to the unit tests, which can only ever widen or maintain the gap between (1) and (2), and can only ever widen, maintain, or replace (in the case of rip and replace) the reality gap described in (4).
- Modifying external interfaces are almost always much rarer than the modification of internal interfaces.
- Because of (8), integration tests are going to degrade at a slower rate than unit tests. If desired external behaviour is well designed and developers do their job, the reality gap should decrease over time.
- Production code will change it's behaviour.
- Because of (10) production code will degrade.
- Because of (11) production code will need to be refactored (or replaced).
- Refactoring adds no value to the users. No user-visible behaviour is changed (besides sometimes performance as a side effect).
- Because of (7), refactoring (12) will expose a code base dependent on unit tests to the risks in (5) whilst adding zero value to the users (13) and consuming development resources (reducing the time that can be spent deliberately changing external, paid for, behaviour). This typically results in a reluctance to refactor, and the code base continues to degrade. This continued degradation makes the eventual refactor much more complex (7) and risky (5,6,7), and slows down deliberate change in the meantime. Because the interfaces tested by integration tests do not change during a refactor, there is much less risk involved (5 is not applicable), and thus much less reluctance to refactor - this means refactoring will be more frequent and less complex resulting in a cleaner code base. Coupled with (9), relying on integration tests over unit tests can result in a far smaller gap between the realities of the developers and the realities of the users/product-owners, and a far cleaner code base.
- Considering (14) unit tests primary use-case is for the short-term debugging of code and they are a poor tool for asserting that the code does what the users/product-owners want it to do (which is what developers are paid to do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unit vs integration test.... 1..15
A defra example of this is that I could rewrite/replace the entire planner package without having to change the tests. I could spend days messing around with it and then still be exactly as confident as I was before the refactor in the behaviour of the code (is 5 key-presses for me in the terminal and couple of seconds to wait for our non-benchmark tests). If we were relying on unit tests it would take me much longer, and the tests will have diverged (risking numerous regressions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion:
- Tests (automated or otherwise) assert that the production code conforms to one's perception of reality. In the case of unit and integration tests, this is typically the author's reality.
- The reality in (1) will almost always differ slightly from the reality of the desired functionality (e.g. we did not assert that child filters behaved correctly when the same underlying property is requested multiple times with different filters). Over time if developers do their job, the gap between these realities will shrink as individual differences are discovered and corrected/asserted.
- In integration tests the desired reality in (2) is that of the users/product-owners/etc. These assert that the code does what the users/product-owners want it to do.
- In unit tests the desired reality in (2) is the author(s) of the code. These assert that the code does what the author(s) thinks other code will need it to do. Similar to (1) and (2), there will be a gap here between what the author thinks the code needs to do, and what the other code needs it to do.
- Modifying existing tests is really risky, typically tests are untested - the changes are the perceived reality of the author. The more complex the change, the greater the risk that the realities in (1) and (2) will diverge and gaps in tests will appear.
- Modifying the interfaces of code will require modifying the tests. This can only ever widen or maintain the current gap in realities between (1) and (2), it can never reduce it (improvements to tests are out-of-scope here and are viewed as independent changes).
- Refactoring often involves complex changes to the structure of code. If present, this typically means complex changes to the unit tests, which can only ever widen or maintain the gap between (1) and (2), and can only ever widen, maintain, or replace (in the case of rip and replace) the reality gap described in (4).
- Modifying external interfaces are almost always much rarer than the modification of internal interfaces.
- Because of (8), integration tests are going to degrade at a slower rate than unit tests. If desired external behaviour is well designed and developers do their job, the reality gap should decrease over time.
- Production code will change it's behaviour.
- Because of (10) production code will degrade.
- Because of (11) production code will need to be refactored (or replaced).
- Refactoring adds no value to the users. No user-visible behavior is changed (besides sometimes performance as a side effect).
- Because of (7), refactoring (12) will expose a code base dependent on unit tests to the risks in (5) whilst adding zero value to the users (13) and consuming development resources (reducing the time that can be spent deliberately changing external, paid for, behavior). This typically results in a reluctance to refactor, and the code base continues to degrade. This continued degradation makes the eventual refactor much more complex (7) and risky (5,6,7), and slows down deliberate change in the meantime. Because the interfaces tested by integration tests do not change during a refactor, there is much less risk involved (5 is not applicable), and thus much less reluctance to refactor - this means refactoring will be more frequent and less complex resulting in a cleaner code base. Coupled with (9), relying on integration tests over unit tests can result in a far smaller gap between the realities of the developers and the realities of the users/product-owners, and a far cleaner code base.
- Considering (14) unit tests primary use-case is for the short-term debugging of code and they are a poor tool for asserting that the code does what the users/product-owners want it to do (which is what developers are paid to do).
Sorry for the late response! This is a great encapsulation of what happens over time if you start relying on unit tests. Also always in favor for pushing for having more integration tests, then unit tests and etc. You mentioned the full story on what's wrong with unit tests overtime which is also why I can't just agree with the only reason the guy gave which was that 'since they couple the code, then they are bad'. One might say the same for integration tests, that they couple the code. They by definition integrate more modules together to test their behavior together, but as you mentioned with integration tests you can easily swap out an old module with the new implementation and be assured that it will work (considering that tests would pass).
Btw, this was such a nice writeup on divergence between Unit Tests vs Integration Tests that I would like to put this somewhere we can come back to in future so I made an outline doc: https://source.getoutline.com/doc/fruitful-engineering-writeups-discussions-RU1f6RzGph
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😅 Cheers Shahzad, was really nice of you to write that! Thanks for creating the doc - hopefully we'll fill it full of fun chats soon enough 😁
Does sound like I should probs watch that video again though lol, hopefully I'm not overrating it 😆 And I'll have probs forgot a few of his better phrased points lol
query/graphql/planner/select.go
Outdated
countField, countExists := tryGetAggregateField(parsed.Fields, f, parser.CountFieldName, parser.CountFieldName) | ||
// Note: sumExists will always be false until we support filtering by nil in the query | ||
if !countExists { | ||
countField = f.CopyWithName(fmt.Sprintf("%s_count", f.Name), parser.CountFieldName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
countField = f.CopyWithName(fmt.Sprintf("%s_count", f.Name), parser.CountFieldName) | |
countField = f.CopyWithName(fmt.Sprintf("%s%s", f.Name, parser.CountFieldName), parser.CountFieldName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree with this, using parser.CountFieldName
implies that the suffix must be that value, when it really doesnt matter much what it is - so long as it is unique. Might be worth a code-comment noting the unimportance here though.
- consider documenting lack of importance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow the reasoning. The suffix can be anything? as is doesn't need to be "_count" or "_sum"?
In that case why not use some other naming scheme that is different say: "_virtualSum", "_virtualCount".
question: Are these the internal virtual fields that do the register like magic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suffix can be anything? as is doesn't need to be "_count" or "_sum"?
correct, if you prefer it to be something else I can change it (virtual is arguably incorrect though, these are more 'internal' fields used as dependency of _avg
)
Are these the internal virtual fields that do the register like magic?
No, these are different magic fields (:sweat_smile:) that allow average to do nothing besides divide _sum
by _count
instead of having to also sum and count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suffix can be anything? as is doesn't need to be "_count" or "_sum"?
correct, if you prefer it to be something else I can change it (virtual is arguably incorrect though, these are more 'internal' fields used as dependency of
_avg
)
Yes please, that seems like it would be more clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
query/graphql/planner/select.go
Outdated
sumField, sumExists := tryGetAggregateField(parsed.Fields, f, parser.SumFieldName, parser.SumFieldName) | ||
// Note: sumExists will always be false until we support filtering by nil in the query | ||
if !sumExists { | ||
sumField = f.CopyWithName(fmt.Sprintf("%s_sum", f.Name), parser.SumFieldName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sumField = f.CopyWithName(fmt.Sprintf("%s_sum", f.Name), parser.SumFieldName) | |
sumField = f.CopyWithName(fmt.Sprintf("%s%s", f.Name, parser.SumFieldName), parser.SumFieldName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as other comment.
- consider documenting lack of importance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even better! Thanks.
query/graphql/planner/select.go
Outdated
// tryGetAggregateField attempts to find an existing aggregate field that matches the given | ||
// name and arguements. Will return the match field and true if one is found, false otherwise. | ||
// name and template field. Will return the match field and true if one is found, false otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: What is meant by template field
in this context? The match here only checks Equality of the Name
, ExternalName
and Fliter
(including it's conditions) for every Select Field right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
template
is the name of an input param, the comment refers to that (I might add ticks round it though). Although on re-read, this looks like the comment will have made more sense under a previous version of the code, will look at revising.
- review tryGetAggregateField documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function reworked slightly and documentation re-written: 9eab58f
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
6fa9cd2
to
9eab58f
Compare
46cf62d
to
e1667ae
Compare
* Remove extra whitespace * Make aggregate a select not a field As more functionality is added to the aggregate system it will use more and more of the select properties until there is little-no difference between them. * Alias astValue Later commit(s) will make use of this variable * Handle multiple aliased _group properties with different arguments * Document field renaming * Update gql-go package to work with thunks * Check for type already being defined Thunk may be run multiple times and we we error if we try and add the inner type more than once * Add count filter support * Add sum filter support * Add average filter support
Closes #99
Add aggregate filter support for groups only. Types/intelisense will be visible/available in the clients on child relationships but using them will do nothing. I have put this PR up for review and would like to merge before sorting out child relation (and inline array) filters, as this is already a large and fiddly code change, and the child relations aggregate filter is blocked by #390 and inline array aggregate filters are blocked by #392 . I will likely do #390 and then the child relations aggregate filter stuff next, so this is unlikely to impact users (or you guys).
Also fixes a bug (no ticket) similar to #390 but for
_group
, where if different filters were added for_group
fields with different aliases they would previously both use the last defined filter. This is why_group
now uses the virtual/internal field mechanic that aggregates rely on (and child relations + inline arrays will likely also need to use before migrating our document structure from map[string]interface{} to []interface{})