Feature: improved diffing #103

ramezrafla · 2016-09-19T04:54:59Z

Summary: the diff-ing engine used by Meteor lets through content that has not changed, especially relevant when we are display arrays. We added a fast comparison function and used it in two key locations to prevent needless changes to the DOM.

Added check of params for template's methods events and helpers

… from 1.3 onwards.

ramezrafla · 2016-09-21T20:40:23Z

Thanks @mitar, I was only asking as it was left in limbo on the forum https://forums.meteor.com/t/community-involvement-setup/29538/15

Essentially our goal was not incremental DOM per se, it was to improve rendering and avoid wasted calls to DOM. I don't think our solution is optimal, but it is very good with the constraints around it.

Look forward to you completing your review, in the mean time, I'll try to see if I can invest time in this:

If you want to help, you can in meantime investigate (maybe you already did) why looping in Minimongo does not work as expected. So we all agree that if one document is inserted in the middle, we would want only that DOM element to be added there and this is it. Why is this not happening? Could #each ... in help here? So instead of trying to mask a problem by diffing, I would like that we do not even trigger a change notification.

It's likely a Tracker / diff issue (something MDG seems to have known about, and so have you)

As far as urgency, there is a message to the community: "Blaze is getting love and it's getting better!"

This could be a temporary fix.

mitar · 2016-09-21T20:50:40Z

It's likely a Tracker / diff issue (something MDG seems to have known about, and so have you)

No, this is not what is known. It is known that if a document for a particular entry changes at all, that one will be rerun completely, because default equality for data context is conservative. But this for insertion is something new, at least to me. And is not how it was documented in the past. So if this is happening, I would see this as a bug, maybe even regression. And I would like to know where it is coming from. It should work both for arrays with _id and Minimongo.

What I think we have to get to is:

rerendering is triggered anytime anything in data context changes (then, we can optimize how data context changes are detected, with similar approach to yours)
but not if some other documents in Minimongo are changed, or some other ones are inserted somewhere else, that should be fixed on Minimongo/array tracking level

As far as urgency, there is a message to the community: "Blaze is getting love and it's getting better!"

I think nobody in Meteor community expects "be quick and break things" approach to development. In any case, I do not see that as a good reason to not do things properly. I have also not seen 1000 upvotes or comments to quickly merge on this pull request, so I do not see any urgency from others either.

I know that when you make a pull request you feel amazing and would love to get things merged. But again, when you make a pull request which was not discussed before, then you have to take responsibility to hold your own horses. This is on you.

mitar · 2016-09-22T06:20:21Z

With the current approach, the 'changed' objects trickle down to be compared with what is actually in the DOM, field by field. The comparison is far from perfect and converts content to DOM elements to compare them.

Yes, this is why I am proposing that we do comparison with previous value of HTMLJS for that object. So instead of converting to DOM elements, we compare HTMLJS itself. And only apply those changes which we find different between HTMLJS values.

Again, this is what I am saying that could be an interesting optimization which would work similar to virtual DOM, now that you investigated incremental DOM and are saying that it is not a good match.

And I agree that we should also optimize the whole pipeline for Tracker recomputations as well. But now imagine that we can make DOM rendering faster without Tracker recomputations optimization. Then when we have both, so great!

You have very ambitious plans for Blaze (e.g. different materializers).

No, this is a very simple change, for this particular approach to materializing with comparing HTMLJS, because you are interested in this. Or was.

Yes observe-sequence is where we need to work, except it looks like a big task at the moment (diff-ing engine). Will require more dedication. Not to mention it's a core package and we have to work with MDG developers who are super busy. Won't happen right away.

It would still be great if we can solve the issue there as well. I do not think we should worry too much about pull requests not getting in, they are pretty efficient on merging pull requests recently. And even if it goes late on, it goes. But somebody has to make, to even go in at all.

mitar · 2016-09-22T06:27:01Z

packages/blaze/view.js

@@ -352,13 +352,16 @@ Blaze._materializeView = function (view, parentView, _workStack, _intoArray) {
      view._isInRender = false;

      if (! c.firstRun) {
+        // added by ramezrafla@zegenie to skip affecting DOM for unchanged content
+        if (Blaze._deepCompare(lastHtmljs,htmljs)) {


Oh, you are comparing last and new HTMLJS already. Why you wrote (or I understood) that we are talking about data layer comparison.

So yea, this is similar to what I had in mind with virtual DOM. To compare old and new HTMLJS.

So this part of the change is something I like.

I do think we can do better though. :-)

First, why are you not simply using EJSON.equals? If you think that one can be optimized, than I would suggest make pull requests to it. Hm, but do we want to introduce a new dependency for Blaze?

So, one thing I think we can improve is that it seems that HTMLJS can contain not-rendered templates. See this code snippet. This means that if you are comparing any such lastHtmljs and htmljs they will not be equal, yes? So maybe it would be better if we make one pass over the HTMLJS tree, converting it into a pure HTMLJS tree. And then compare. The issue with that is that this might confuse how rendered callback is called. We would have to mark which segments of HTMLJS require their view 'rendered' to be called. But my worry is that now with any more complicated HTMLJS segment we are not improving much. What do you think?

Next idea: now we are all or nothing. We compare HTMLJS, if it is equal, great. But if there is one attribute difference, we still go in and then create DOM elements for whole HTMLJS tree and compare them, no, inserting/updating them? Can you point me to where DOM elements are compared?

So currently I worry that this prevents only the innermost HTMLJS trees from being rerendered.

Can you point me to where DOM elements are compared?

In the past, the _isContentEqual did just that. We are no longer doing that now, we are comparing data before it becomes DOM.

So currently I worry that this prevents only the innermost HTMLJS trees from being rerendered.

If the deepcompare finds the objects to be the same, it makes sense to skip re-rendering, unless I missed something.

In the past, the _isContentEqual did just that. We are no longer doing that now, we are comparing data before it becomes DOM.

I mean, where it is comparing does it have to replace a DOM element with a new one. It seems to me that it does not do a new DOM element and then compares with current DOM, if necessary. But it removes a DOM element if previous view is invalided, it then creates a DOM element for new state of the view, and adds it. Even if comparing previous and new DOM element would show that it is almost the same, or even exactly the same.

If the deepcompare finds the objects to be the same, it makes sense to skip re-rendering, unless I missed something.

Yes, but I see those checks to be run only for very leaf rendering, nothing before. :-(

mitar · 2016-09-22T06:27:11Z

packages/blaze/view.js

-// Are the HTMLjs entities `a` and `b` the same?  We could be
-// more elaborate here but the point is to catch the most basic
-// cases.
-Blaze._isContentEqual = function (a, b) {


Why you didn't just improve this method?

I could have simply renamed the new function as ._isContentEqual, except this function is meant for DOM comparison, and we are comparing objects / primitives, so to avoid confusion I broke with the past.

mitar · 2016-09-22T06:29:24Z

packages/blaze/builtins.js

@@ -226,6 +226,11 @@ Blaze.Each = function (argFunc, contentFunc, elseFunc) {
        });
      },
      changedAt: function (id, newItem, oldItem, index) {


But this is at the data level, no?

Yes, we have two additional diff levels right (see my original PR comment)

Data level

Field level in htmljs

I think 1 will being gone once we improve the original diff-ing which calls the observe handlers

mitar · 2016-09-22T07:14:30Z

Can you check difference for the 01-dbmon example between current Blaze version and one with this pull request?

mitar · 2016-09-22T08:57:41Z

packages/blaze/view.js

@@ -352,13 +352,16 @@ Blaze._materializeView = function (view, parentView, _workStack, _intoArray) {
      view._isInRender = false;

      if (! c.firstRun) {
+        // added by ramezrafla@zegenie to skip affecting DOM for unchanged content
+        if (Blaze._deepCompare(lastHtmljs,htmljs)) {


So I looked into this change using this example and I could not get it to in fact come to this if (! c.firstRun) { condition with any other view than lookup:something.

If I did:

Collection.update({}, {$set: {value: 'a'}})

Change was from aa to a0.

And then if I did:

Collection.update({}, {$set: {value: 'a', test: 'b'}})

Change was from aa to aa.

So, I think that Blaze._isContentEqual moving up is a great idea for cases like above, where I am not minimizing my fields selector well in Minimongo. But I am not sure if there is any benefit of changing the function to deep-comparison? Can you please adapt my example to the case where you get something else than those simple values Blaze._isContentEqual is already checking? I could imagine only very extreme cases where you would be using {{value}} where value is not a simple variable, but an object, and you are using its stringified value to be shown in the template. But is anyone really doing that?

Right, we had deep variables. I think for a complex App, it is to be expected.

Here is your example adapted to our test case:
var obj = { a: { b : 1}}; Collection.update({}, {$set: obj});

No, it does not happen. So what I am saying is that you will never really do in your app:

{{value}}

Where:

value = {a: {b: 1}};

You might do:

{{value.a.b}}

But what I am saying is that in the comparison, Blaze just compares then values of bs, so primitive values.

So can you give me a realistic example where the value of lastHtmljs and htmljs is something else than what it is checked by Blaze._isContentEqual? I have updated my example with complex objects to test this.

Oh, I just used deepCompare for the sake of convenience and because I haven't gone through all the options. deepCompare short circuits comparison if it's a primitive so I am reusing the same function for risk mitigation as the data level.

But I did see somewhere in the code that htmljs can be views. Also, we skipped DOM comparison altogether that was there before, which was my biggest concern. If I was too cautious, then we can simply simplify the compare function. But, why would the original developers convert to DOM if it was always primitives?

But I did see somewhere in the code that htmljs can be views. Also, we skipped DOM comparison altogether that was there before, which was my biggest concern. If I was too cautious, then we can simply simplify the compare function.

Yes, but at a higher level. These views which have !computation.firstRun seems to be very simple ones.

Which is also a problem probably, that higher-levels are recreated.

So can you make in your codebase (on which you are doing these optimizations here) a check to print out when computation.firstRun is false and htmljs is something complicated.

But, why would the original developers convert to DOM if it was always primitives?

I have no idea why they convert at all before the if. The fact that you moved if out is great, but I have no idea why they did it like this. Does converting to DOM has some side-effects we do not know?

mitar · 2016-09-22T09:01:52Z

packages/blaze/builtins.js

@@ -226,6 +226,11 @@ Blaze.Each = function (argFunc, contentFunc, elseFunc) {
        });
      },
      changedAt: function (id, newItem, oldItem, index) {


In my example I could not get to this issue. If I instrument changedAt method to print out every time is called, and then I click "add 5" button, this is not called at all, not for 5 or any other value in the Minimongo database.

I tried even to limit my fields selector to {value: 1, test: 1}. Then:

Collection.update({}, {$set: {value: 'a'}}) newItem, oldItem: Object {value: "a", _id: "s84f5D5bwFgPzwvmk"} Object {value: 0, _id: "s84f5D5bwFgPzwvmk"} Collection.update({}, {$set: {value: 'a', test: 'b'}}) newItem, oldItem: Object {value: "a", _id: "s84f5D5bwFgPzwvmk", test: "b"} Object {value: "a", _id: "s84f5D5bwFgPzwvmk"} Collection.update({}, {$set: {value: 'a', test: 'b', test2: 'c'}})

The last change changes a field outside of the projection. And changedAt is not called.

So, I could not reproduce that changedAt would be called unnecessary if a value was not really changed.

Right, our test case was as per above comment:
var obj = { a: { b : 1}}; Collection.update({}, {$set: obj});

So I could not reproduce that changedAt when data projected using fields didn't really change.

For example, if I have the following cursor over which I am doing #each:

Collection.find({}, {sort: {'value.a': 1}, fields: {'value.a': 1, test: 1}});

And values are inserted like:

Collection.insert({value: {a: 1}});

When I do:

Collection.update({}, {$set: {'value.b': 'b'}})

changedAt is not called at all.

And if I insert one one document in the middle, only Blaze for that one document is called, and no changedAt at all, for nothing.

The only difference with what I did is that it was server side update with method. If you can replicate it, it means that the problem is in the publish / diff-ing.

Could you please make an example showing what you observed?

We need tests for these changes anyway.

mitar · 2016-09-22T19:03:10Z

(I wanted to make this an inline reply to a inline comment, but I am realizing it is pretty valuable, so I am moving it top-level.)

Yes, sorry about that. I understood something else under "data level". For me this is more data context level, which has its own issues. Sorry for misunderstanding.

Maybe we should define terminology here better. To my knowledge we have issues at:

data context level, equality being what it is, which triggers reruns on any non-primitive value change:
- possible solutions:
  - get rid of data context
  - allow custom data context with custom equality function
tracker level, which is similar to data context level, but means that in general any reactive dependency might rerun more often than necessary
- possible solutions:
  - using something like computed field to be precise about dependency changes
no tracker caching level: if you have autorun with multiple dependencies, when one of those change, also other are recomputed, instead of previously returned value just cached and returned
minimongo and array processing level: we want changes to be triggered only when data really changes, and only for those changes
blaze data level (what you are talking about here, I think): that propagation of data through blaze might be ineffective and be done when data has not really changed
HTMLJS materialization level: when HTMLJS does not really change, but Blaze still builds DOM elements and does diff against DOM
interacting with DOM level: querying DOM for state is often expensive
- for more information:
- possible solutions:
  - ignore 3rd party changes in DOM and just trust the previous HTMLJS state and apply it blindly (should probably be configurable per template)
  - do rendering only inside the requestAnimationFrame when DOM is frozen and do not do layout trashing, see example

Did I miss anything?

ramezrafla · 2016-09-22T19:44:55Z

@mitar
You obviously know a lot more of the Blaze history than I do, I just jumped recently to resolve the needless re-rendering when no data has changed. I am sure I will learn all these issues as I go carefully through the code and the issues.

ignore 3rd party changes in DOM and just trust the previous HTMLJS state and apply it blindly (should probably be configurable per template)

We have an App that affects the DOM directly and had no issues. I think it has to do with carefully managing collisions. We do not with JS affect anything that is handled by Blaze. If that separation is done, we're good

blaze data level (what you are talking about here, I think): that propagation of data through blaze might be ineffective and be done when data has not really changed
HTMLJS materialization level: when HTMLJS does not really change, but Blaze still builds DOM elements and does diff against DOM

Right! So the first one has to do with the diff-ing algorithm in the compare-sequence lib you had talked about before. This is where the real fix should be applied, unless it breaks something else (is this used by Tracker?)

The second is where I believe we have a lot of savings. We don't build DOM elements to diff against DOM. This is bad design. We should always diff raw data, not display layer. This is what makes Blaze fast (like VirtualDOM) and is the reason Incremental DOM is slower than Virtual DOM.

Edit: And given so many unchanged objects can be fed through, that DOM element creation for diffing is really problematic

mitar · 2016-09-26T02:56:15Z

You obviously know a lot more of the Blaze history than I do, I just jumped recently to resolve the needless re-rendering when no data has changed. I am sure I will learn all these issues as I go carefully through the code and the issues.

Oh, maybe I do know about history, but about internals we are all learning. I think you looked much deeper than I ever did in details of rendering itself. So we are all learning here.

We have an App that affects the DOM directly and had no issues. I think it has to do with carefully managing collisions. We do not with JS affect anything that is handled by Blaze. If that separation is done, we're good

Yes, me too. Or you leave to Blaze, or you use something else. But do not mix. :-)

So the first one has to do with the diff-ing algorithm in the compare-sequence lib you had talked about before. This is where the real fix should be applied, unless it breaks something else (is this used by Tracker?)

But I have not been able to reproduce it so that I could look what is happening.

But I was using Minimongo and not arrays directly.

The second is where I believe we have a lot of savings. We don't build DOM elements to diff against DOM. This is bad design. We should always diff raw data, not display layer. This is what makes Blaze fast (like VirtualDOM) and is the reason Incremental DOM is slower than Virtual DOM.

Yes, but I could not really find where diffing is done. I have found diffing for attributes, but for DOM it seems like Blaze simply removes old element and adds a new one, when something changes. It does not try to diff and patch the old one?

ramezrafla · 2016-09-26T23:47:45Z

From view.js

var rangesAndNodes = Blaze._materializeDOM(htmljs, [], view);
if (! Blaze._isContentEqual(lastHtmljs, htmljs)) {
domrange.setMembers(rangesAndNodes);
Blaze._fireCallbacks(view, 'rendered');
}

So this creates the DOM elements before comparing

mitar · 2016-09-27T20:55:33Z

So this creates the DOM elements before comparing

Yes, but it compares HTMLJS, not DOM.

For DOM it seems (inside setMembers) that it simply removes old and adds new, I do not see any logic to really compare old DOM and new DOM to see the minimal change. Hm.

mitar · 2016-10-04T09:11:10Z

BTW, this is waiting for tests/reproducible example for the changedAt issue.

The second change I like and I think we should add it to Blaze, but I will maybe add it with _isContentEqual, but also tests would be needed. This one seems pretty easy to do.

mitar · 2016-11-15T19:26:56Z

So I think this should be split into two pull requests. One for moving _isContentEqual, and one for data. And we need tests for both, to have a case when that code is match at all.

mitar · 2016-12-30T08:41:58Z

I cleaned master branch to remove all Meteor history (#112). This means the pull request should be also updated to new history to not base it on old commits. Sorry for this extra work.

Based on #103.

mitar · 2016-12-31T19:25:54Z

Based on this pull request, I made a change in e17991a, which prevents unnecessary materialization of DOM. This is what I have able to make a reproduction myself. For the other part, checking a change in data layer I have not been able to make a reproduction.

I am closing this pull request. If we find a good reproduction for that other case, we could look into optimizing that as well.

For more ideas how to optimize Blaze see #111.

arggh · 2016-12-31T19:37:05Z

@mitar Did you benchmark this change on any of your test cases?

mitar · 2016-12-31T19:44:14Z

Sadly not, I do not have any real case where this is happening. I am relaying on the report in this pull request. Feel free to make a test case/benchmark and help here.

arggh · 2016-12-31T19:56:53Z

I might have something useful in one of my projects for a benchmark.

ramezrafla · 2016-12-31T20:43:42Z

Beautiful @mitar! Thanks!

@arggh, we caught this as we outputted to the console every re-render. This can improve performance dramatically by skipping DOM manipulation if we have a very dynamic page. I am guessing this change is going to be very noticeable for a lot of people. Look forward to your benchmark.

There is also room for improvement in the actual diffing but that's a bigger job.

mitar · 2017-01-01T08:37:50Z

One more thing. I was unable to get to not c.firstRun for anything else besides primitives for which Blaze._isContentEqual is good at checking equality. If anyone can get there with something which Blaze._isContentEqual is not comparing well, we might want to upgrade Blaze._isContentEqual to something which checks values in depth.

martijnwalraven and others added 30 commits February 9, 2016 12:43

Merge branch 'release-1.3' into cordova-improvements

3c38adc

Merge pull request #4520 from dolgarev/devel

b597e60

Added check of params for template's methods events and helpers

Cleaner code.

1cf7e85

Rename separated minifiers to singular name and use separate versions…

c44ce7d

… from 1.3 onwards.

Rename separated minifiers to singular name and use separate versions…

68fd857

… from 1.3 onwards.

Rename separated minifiers to singular name and use separate versions…

e3b1508

… from 1.3 onwards.

Bump package versions for 1.3-cordova-beta.4 release

269202d

Bump package versions for 1.3-cordova-beta.4 release

9c3ee62

Bump package versions for 1.3-cordova-beta.4 release

cbe3abf

Bump package versions for 1.3-cordova-beta.4 release

7bbecc3

Bump package versions for 1.3-cordova-beta.4 release

ae4eac8

Bump package versions for 1.3-cordova-beta.4 release

f2f2308

Bump package versions for 1.3-cordova-beta.4 release

89b2b5b

Bump package versions for 1.3-cordova-beta.4 release

22df8af

Bump package versions for 1.3-cordova-beta.4 release

effcd94

Bump package versions for 1.3-cordova-beta.4 release

8412d84

Bump package versions for 1.3-cordova-beta.4 release

3a3cf22

Bump package versions for 1.3-cordova-beta.5 release

4bf96c9

Bump package versions for 1.3-cordova-beta.5 release

c12fd5e

Bump package versions for 1.3-cordova-beta.5 release

eeeed2a

Bump package versions for 1.3-cordova-beta.5 release

9e59b6e

Bump package versions for 1.3-cordova-beta.5 release

b6900a5

Bump package versions for 1.3-cordova-beta.5 release

edc915e

Bump package versions for 1.3-cordova-beta.5 release

b9af261

Bump package versions for 1.3-cordova-beta.5 release

1f1cd8d

Bump package versions for 1.3-cordova-beta.5 release

6478d98

Bump package versions for 1.3-cordova-beta.5 release

aa2d5ff

Bump package versions for 1.3-cordova-beta.5 release

af4cdb7

Bump package versions for 1.3-beta.9 release

64e8f0e

Bump package versions for 1.3-beta.9 release

db859ab

mitar reviewed Sep 22, 2016

View reviewed changes

mitar mentioned this pull request Sep 24, 2016

Ideas for optimizing Blaze #111

Closed

mitar added pr missing history entry pr missing tests pr should be split into multiple labels Nov 15, 2016

mitar force-pushed the master branch from 239be86 to d95e9f4 Compare December 30, 2016 08:35

mitar added a commit that referenced this pull request Dec 31, 2016

Prevent unnecessary materialization of DOM.

e17991a

Based on #103.

mitar closed this Dec 31, 2016

Feature: improved diffing #103

Feature: improved diffing #103

Conversation

ramezrafla commented Sep 19, 2016

ramezrafla commented Sep 21, 2016

mitar commented Sep 21, 2016

mitar commented Sep 22, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mitar commented Sep 22, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mitar commented Sep 22, 2016 • edited Loading

ramezrafla commented Sep 22, 2016 • edited Loading

mitar commented Sep 26, 2016

ramezrafla commented Sep 26, 2016 • edited by mitar Loading

mitar commented Sep 27, 2016

mitar commented Oct 4, 2016

mitar commented Nov 15, 2016

mitar commented Dec 30, 2016

mitar commented Dec 31, 2016 • edited Loading

arggh commented Dec 31, 2016

mitar commented Dec 31, 2016

arggh commented Dec 31, 2016

ramezrafla commented Dec 31, 2016

mitar commented Jan 1, 2017

mitar commented Sep 22, 2016 •

edited

Loading

ramezrafla commented Sep 22, 2016 •

edited

Loading

ramezrafla commented Sep 26, 2016 •

edited by mitar

Loading

mitar commented Dec 31, 2016 •

edited

Loading