Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StatsModule: Optimize data selection and render #12712

Closed
wants to merge 3 commits into from

Conversation

edo-ran
Copy link
Contributor

@edo-ran edo-ran commented Apr 2, 2017

This is performance improvement to StatsModule as part of performance improvement to My Sites.

getSiteStatsNormalizedData uses createSelector, which memoize the result, but because StatModule is generic and uses this selector for all types of stats the result end up never been cached because the dependent data for cache invalidation was the data of the specific stats type. This, in turn, caused the StatsModule component to render over and over even though the data didn't change.

Using the new createSelectorPerKey now invalidate the cache on a per-key basis which eliminates most of the wasted render calls.

There is also a change in the way state/stats/list/reducer.js handle received stats. Before all of the items subtree would have been recreated because of the use of merge( {}, existingItems, ... ). Now only the path that changed in site -> statType -> queryKey is recreated, the rest of the state remain exactly the same which allows connect HOC to detect nothing has changed and not re-render StatsModule.

The result is about 40ms of wasted render time instead of about 800ms of wasted render time.

To Test

  1. Visit the live branch or pull the branch
  2. Transition between Reader and My Sites - use site with lots of stats

I've done the test #10326 even though we're not using merge any more but just to be on the safe side.

@matticbot matticbot added [Size] M Medium sized issue OSS Citizen labels Apr 2, 2017
@edo-ran edo-ran added [Type] Performance [Status] In Progress [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it labels Apr 2, 2017
getSiteStatsNormalizedData uses `createSelector`, which memoize the result, but because StatModule is generic and uses this selector for all types of stats the result end up never been cached because the dependent data for cache invalidation was the data of the specific stats type. This, in turn, caused the StatsModule component to render over and over even though the data didn't change.

Using the new `createSelectorPerKey` now invalidate the cache on a per-key basis which eliminates most of the wasted render calls.

Because the dependent data still been changed, but the content stays the same, I've also implement `shouldComponentUpdate` and measure it to ensure it takes less time checking if the data changed than render the component even though nothing changed.

The result is about 120ms of wasted render time instead of about 800ms of wasted render time.
@edo-ran edo-ran force-pushed the update/stats-module-selector-by-key branch from 9a9fafb to 377240b Compare April 3, 2017 18:29
}

// Special case for data prop, do deep equality check.
const isDeepEqual = isEqual( this.props[ key ], nextProps[ key ] );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some data payloads include nested child objects like Clicks, Referrers, and Authors - any concerns with using isEqual knowing that? Specifically Referrers can potentially be 3 levels deep:

stats_ the_wordpress_com_blog _wordpress_com

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've measured the time it takes for the whole shouldComponentUpdate to run, not just the isEqual and got 70ms for the entire interaction of transition from Reader to My Sites which reduce the wasted render from at least 600ms down to 40ms so it still worth it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - curious was that on a site with a large amount of stats data... specifically referrers?

@@ -105,3 +106,28 @@ export default function createSelector( selector, getDependants = DEFAULT_GET_DE
}, { memoizedSelector } );
}

export function createSelectorPerKey( selector, getDependants = DEFAULT_GET_DEPENDANTS, getCacheKey = DEFAULT_GET_CACHE_KEY ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we possibly add some test coverage for this new function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You right, I've pushed commit with tests for the new method. I'm using the existing test suite of createSelector and added test specific for the new method.

The new createSelectorPerKey is different from createSelector by the fact is store dependent data and invalidate cache on a per-key basis.

This commit adds tests to verify it. Most of the test are exactly the same, that's why we use the exact same tests only run them with each of the functions.
@matticbot matticbot added [Size] XL Probably needs to be broken down into multiple smaller issues and removed [Size] M Medium sized issue labels Apr 3, 2017
The stats.list.items reducer keep the state tree hierachy by site -> statType -> queryKey -> data.
When new stats is recieved the state tree needs to update to include the new stats.
Until now the update were done using `merge` function by passing empty object as first parameter which resulted in re-building the whole state tree. Even though the values looked the same the object instances were different ( !== ).

This resulted in over-render of all the `StatsModules`. Every time any of the statTypes were received _all_ of the `StatModule`s instances were rendering, not just the one that actually changed.

The fix was to rebuild the state tree from the bottom-up, preserving all properties that were not changed. That is any `queryKey`s that were already received were left with the exact same object instance, any other `statType` were left the same and any other sites were left the same.

This allowed the `createSelectorPerKey` to invalidate the cache only to the stats that was actually change and serve all the others from the cache, shortcutting render to all but the `StatsModule` that was actually needed it.
@edo-ran edo-ran added [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. and removed [Status] In Progress labels Apr 5, 2017
@edo-ran
Copy link
Contributor Author

edo-ran commented Apr 5, 2017

I have pushed a commit that fixes the way the reducer handle received stats to preserve unchanged values in items state tree. This allows to completely remove the shouldComponentUpdate.

@edo-ran
Copy link
Contributor Author

edo-ran commented Apr 5, 2017

@aduth I've added the createSelectorPerKey function as we discussed in slack. I've added it as a new function instead of changing createSelector. I think it will be safer to change the original after I'll add statistics collection so we can see what the effect of the change is.

Do you think this is the right way or should I change the implementation of createSelector to work on a per key basis?

@aduth
Copy link
Contributor

aduth commented Apr 5, 2017

A worry I have with something like this is that it encourages developers to avoid tuning their code and instead apply heavier and more granular caching. I'd be curious to measure, particularly on mobile devices, just how much we're exhausting memory versus raw CPU execution. The size of global state and memoize cache alone must be intense for limited memory devices.

To the specific example, the getDependents (second argument) of getSiteStatsNormalizedData performs an expensive object serialization and seems to me as though this might defeat the purpose of memoizing at all, because the selector itself is barely more expensive than the dependents getter which is invoked every time the selector is called.

Minor points to the implementation itself:

  • Can we do a better job of reducing code duplication between the two variants?
  • How can we make it less awkward that one of the two variants is the default export and the other merely a named export? Could we standardize on named exports?
  • Should createSelectorPerKey be a separate function or defined as option to the base createSelector (fourth argument)?

@timmyc
Copy link
Contributor

timmyc commented Apr 5, 2017

Can we do a better job of reducing code duplication between the two variants?

➕ on this - I noticed that as well in re-reading the code just now.

@matticbot
Copy link
Contributor

@edo-ran This PR needs a rebase

@samouri
Copy link
Contributor

samouri commented Aug 29, 2017

I just ran into this same issue: createSelector invalidating the cache more frequently than ideally would be happening.

I was able to get most of the way there without making a new helper function by doing something slightly counterintuitive in this pr. Essentially instead of making your dependent look like this: get(state.stats.lists.items, [ keys, go, here ], what if you made it: state.stats.lists.items.

@alisterscott
Copy link
Contributor

Closing as this PR is stale - please reopen if necessary

@alisterscott alisterscott deleted the update/stats-module-selector-by-key branch October 24, 2017 05:14
@matticbot matticbot removed the [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. label Oct 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OSS Citizen [Size] XL Probably needs to be broken down into multiple smaller issues [Status] Needs Rebase [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it [Type] Performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants