Skip to content
This repository has been archived by the owner on Dec 2, 2024. It is now read-only.

Raduom/marconi initiative #585

Merged
merged 5 commits into from
Jul 27, 2022
Merged

Conversation

raduom
Copy link
Contributor

@raduom raduom commented Jul 12, 2022

Basically put down some thoughts on why Marconi makes sense.

PR: https://input-output.atlassian.net/jira/software/c/projects/PLT/boards/788?modal=detail&selectedIssue=PLT-503

This is an ADR.

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Formatting, PNG optimization, etc. are updated
  • PR
    • Self-reviewed the diff
    • Useful pull request description
    • Reference the ADR in the PR and reference the PR in the ADR (if revelant)
    • Reviewer requested

@raduom raduom self-assigned this Jul 12, 2022
Copy link
Contributor

@andreabedini andreabedini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good. I allowed myself to reword some parts but feel free to take what you like and discard the rest. I commented on a couple of points that are unclear to me.


B. The monolithic architecture lacks the features necessary to customise the indexed set of data, the Chain Index provides only all-or-nothing indexing. While this can be addressed the monolithic architecture makes it an uphill battle.

C. Also due to the monolithic architecture that assumes that there is only one index running we ran into trouble when we made the Plutus Application Backend collect and index information requested by smart contracts. Now we have two components that index information from the blockchain, but they are not synchronised. Querying the Chain Index about transactions received from the Plutus Application Backend often results in no data returned, since the Chain Index indexes data slower than the PAB.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me this sounds just like a PAB bug which can be solved without a re-architecture. Like point B above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion to add a couple of commas:
Line 13: "While this can be addressed, the monolithic..."
Line 15: "...that assumes that there is only one index running, we ran into..."

Copy link
Contributor

@koslambrou koslambrou Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To continue @andreabedini's point, this is a PAB architectural bug which is addresses here: #550 . Once the PR is merged, I'll add a reference.

Copy link
Contributor Author

@raduom raduom Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreabedini I am not really convinced about C not being an architectural bug. You can sort of solve it if your indexing solution fits into one machine (and note, that it may need to fit into a developer's machine, not necessarily a server), but if it does not, then you will need some kind of distribution anyway, in which case life gets complicated for chain-index.

In general, I think that everything can be solved without re-architecting, but I think that the solutions would be rather complex, that was my point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move the discussion to #550.


C. Also due to the monolithic architecture that assumes that there is only one index running we ran into trouble when we made the Plutus Application Backend collect and index information requested by smart contracts. Now we have two components that index information from the blockchain, but they are not synchronised. Querying the Chain Index about transactions received from the Plutus Application Backend often results in no data returned, since the Chain Index indexes data slower than the PAB.

D. The lack of non-functional requirements resulted in software that uses an unreasonable amount of resources and results in slow synchronisation speeds. And since everything is monolithic it is difficult to turn off indexing of data which is not required by our customers there is no way to limit the required resources.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds a bit like point B but lack of non-functional requirements is importan on its own. Perhaps merge with point E below?

e.g.

B. The monolithic architecture lacks the features necessary to customise the indexed set of data, the Chain Index provides only all-or-nothing indexing. Without changing the code, it impossible to turn off indexing of data that is not required by our customers. While this can be addressed in principle, the monolithic architecture makes it an uphill battle.

E. The lack of non-functional requirements resulted in software that uses an unreasonable amount of resources and results in slow synchronisation speeds. Combined with the lack of a specification this makes the testing feel ad-hoc and like an afterthought.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

B is about the inability of splitting up indexing into several components that can be optionally turned off, whereas E (which is the point I suppose you mean) is about the lack of a testing strategy.


1. Managing rollbacks is very simple and fast. We drop the events that were rolled back. (no complicated logic is required to undo the projection of the list of events on disk, which we would need if we would want to store everything on disk as fast as possible).

2. Making 'K' configurable makes the design already quite scalable. Developers do not usually need to guard themselves against rollbacks by K blocks so they can choose to only store 10 events in memory thus sacrificing chain integrity for RAM.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thus sacrificing chain integrity for RAM

I think this point is very important but "sacrificing chain integrity" with no extra detail sounds a bit scary. Can you expand?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: reverse order of "only" and "store" -->
"...so they can choose to store only 10 events in memory..."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, you're not exactly sacrificing chain integrity, you're sacrificing the speed in which Marconi will handle rollbacks (since rollbacks on data that is persisted on disk is more expensive than data that is in memory).


The indexed data is accessible through queries. There are no constraints on the format of queries or results. Both are identified by a type variable that the indexer exposes and the code for both of them has to be provided by the user.

This is currently the only area where the users can customise the indexer by providing functions that handle queries and storage. Note that the queries have to run on both the in-memory data and the data that is stored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely need a specification for this API.

Copy link
Contributor

@koslambrou koslambrou Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an ADR coming for this API in the next sprint probably. Right @raduom ?
In that case, the two ADRs should be linked together.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question about line 69: "...the code for both of them has to be provided by the user." Can you clarify what "both of them" is referring to? Is it the queries and the type variable? Or the results?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely need a specification for this API

Is the type signature of the required functions what you had in mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@koslambrou yes. I am not sure how I am supposed to link them together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question about line 69: "...the code for both of them has to be provided by the user." Can you clarify what "both of them" is referring to? Is it the queries and the type variable? Or the results?

I clarified (I hope) the meaning.

Versioned contents
==================

We need some sort of versioning or sequencing of events, notifications and queries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you mean by this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote it. Hopefully it's a little more clear now.


E. The same lack of a specification and non-functional requirements makes the testing feel ad-hoc and like an afterthought.

The Chain Index was meant to be a software application that supports the execution of smart contracts. And, in that, it succeeded. However, we found out that our customers would rather have a library of functionality that they can customise to build their indexers, only for data they care about for their application, using whatever storage engine that they prefer and supporting only the queries that they need to support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Chain Index was meant to be a software application that supports the execution of smart contracts. And, in that, it succeeded. However, we found that our customers would rather have a library of functionality that they can customize to do the following:

  • to build their own indexers,
  • to work only with the data that they care about for their application,
  • to use whatever storage engine they prefer, and
  • to support only the queries that they need to support.


1. We must keep K events in memory, which (depending on how large events are) can waste some memory. Our educated guess is that this is a reasonable compromise, but depending on how large events can get that may not be the case for your use case.

2. Queries are more complicated as we need to query both events in memory and the state persisted on disk.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:
"2. Queries are more complicated as we need to query events both in memory and in the state persisted on disk."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed both and replaced the word query with scan.

Event streams
=============

To support PAB functionality which subscribes to a source for a set of event types we need a way to produce events from indexers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding a comma:
"...for a set of event types, we need a way..."

Similar work
============

One of the advantages that Marconi has over Oura and Scrolls is the fact that they are both a streaming solution and an indexer. So you can listen to a stream of events and know that those events are reflected in the index.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly unclear on the meaning. Can you clarify a little bit? Is this saying that Marconi is both a streaming solution and an indexer? I'm not sure what the word "they" is referring to --> "they are both a streaming solution and an indexer"

I like this document! Very informative.

Copy link
Contributor Author

@raduom raduom Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to rewrite this, but I think I need a little bit of time to look over the competing solutions to make sure I am not assuming too many things about them. Leave this conversation open for a little bit in the mean time.

Copy link
Contributor

@joseph-fajen joseph-fajen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Radu, I left some comments. I found this to be a very informative document!

Making a case for Marconi
=========================

Plutus off-chain code oftentimes needs access to indexed portions of the blockchain. The Chain Index project is the initial solution meant to deliver access to this kind of data. However, after release, a couple of shortcomings were identified which prompted the development of an indexing solution that is based on a different set of architectural and functional constraints.
Copy link
Contributor

@koslambrou koslambrou Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Plutus off-chain code oftentimes needs access to indexed portions of the blockchain. The Chain Index project is the initial solution meant to deliver access to this kind of data. However, after release, a couple of shortcomings were identified which prompted the development of an indexing solution that is based on a different set of architectural and functional constraints.
Plutus off-chain code oftentimes needs access to indexed portions of the blockchain. The Chain Index (or `plutus-chain-index`) project is the initial solution meant to deliver access to this kind of data. However, after release, a couple of shortcomings were identified which prompted the development of an indexing solution that is based on a different set of architectural and functional constraints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed all the Chain Index strings to plutus-chain-index. I think it looks better.


Some of the problems we identified due to the above-mentioned approaches are:

A. The use of free algebras (the `freer-simple` package) makes the code fairly complex and difficult to understand (quite a few type-level computations are happening). The separation between syntax and semantics imposed by the library also complicates matters for no clear reason (for example, if we write two semantics, one for pure code used for testing and one for production code, then there would be a lot of production code that would not be tested).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the point, but I don't think I'm in agreement. The use of an effect system in plutus-chain-index was to, like you said, have two implementations: one for the emulator and one for production use. While I agree with the point that "there would be a lot of production code that would not be tested", saying that the effect system "makes the code fairly complex and difficult to understand" is too opinionated. I believe we could have a similar problem (havingcomplex code) even if we tried to reproduce the same behavior with mtl instead of freer-simple (there's even more boilerplate with mtl IMO).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not the only person that said that our code is "fairly complex and difficult to understand". At least one of our clients pointed that to us in a previous occasion. I was referring to the type-level programming here, which is not so prevalent in a MTL based solution. It was not about boilerplate as much as complexity.


E. The same lack of a specification and non-functional requirements makes the testing feel ad-hoc and like an afterthought.

The Chain Index was meant to be a software application that supports the execution of smart contracts. And, in that, it succeeded. However, we found out that our customers would rather have a library of functionality that they can customise to build their indexers, only for data they care about for their application, using whatever storage engine that they prefer and supporting only the queries that they need to support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well. "Succeeded" is probably not the right term. I would say that functional requirement were met, but non-functional requirements were not adequately thought about, which resulted in a bad experience for users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant here is that it was usable. And it still is usable. It can be faster, more modular, etc. But it works, which is a great achievement given the lack of specification and complexity of the problem, which is something I wanted to give props for.

Design principles of Marconi
============================

I follow the Algebra Driven Design approach for Marconi components, so from the get-go, we will have a checked specification for the software that we develop.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I -> We?


The indexing solution has the following basic requirements: it needs to deal with rollbacks as elegantly as possible and provide a way to compromise between memory, disk and CPU usage.

On the Cardano blockchain, there are frequent rollbacks, but they can only span a maximum of 2160 blocks (and most of them are < 10 blocks). We call the 2160 number the security parameter K (and we denote it by 'K' henceforth).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to reconfirm this fact. I read an important detail about the K parameter. I always though that it meant that once a block is K blocks deep in the blockchain, then it is immutable. However, that is not the case. The real formula to determine how deep a block should be in order for it to be immutable is something like (I don't remember it exactly): 2 * K / activeSlotCoef. I'll need to research this more.

Copy link
Contributor Author

@raduom raduom Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as there is a formula, there is a K which, I think, should be enough for this document. But we should investigate the details here.


2. Making 'K' configurable makes the design already quite scalable. Developers do not usually need to guard themselves against rollbacks by K blocks so they can choose to only store 10 events in memory thus sacrificing chain integrity for RAM.

3. In case of a restart recovery is very simple. We only store fully confirmed transactions so there is nothing to do other than resume operation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rephrase this. We only store fully confirmed transactions IF the users decided on storing K blocks in memory and the rest on disk. It's a user configuration option right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated no. 3.


This architectural decision has some desirable effects:

1. Managing rollbacks is very simple and fast. We drop the events that were rolled back. (no complicated logic is required to undo the projection of the list of events on disk, which we would need if we would want to store everything on disk as fast as possible).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's only simple and fast IF the data is memory. If it's on disk, then the rollback is an expensive operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You never rollback things that are on-disk. That is why the indexers are fairly simple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Then with my current understand, I would heavily object to this decision. Here's my perspective:

If you never rollback things that are on-disk, then it makes no sense to give the user the option of defining how many blocks are kept in memory. You really don't want to affect the integrity of the on-disk data.

IF we want to give the user the option to specify the amount of blocks in memory (like 100 for example), then we definitely need to handle rollbacks on data that is on disk. Customizability is one of the main key points of Marconi. Doesn't made sense to break chain data integrity we need to rollback, let's say, 500 blocks deep. Plus, I though the indexer (hysterical-screams) was meant to make the process of implementing rollbacks simple to handle between data that is on disk and data that is on memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you never rollback things that are on-disk, then it makes no sense to give the user the option of defining how many blocks are kept in memory. You really don't want to affect the integrity of the on-disk data.

I would not mind giving this option to developers working on machines that have very little ram, however for production-level indexers it would be pretty bad.

The way we did it before is that we hid this option under a 'development options' header. We could have a development-mode run configuration which allows setting K < 2160 and a production-mode run configuration that only allows for K > 2160.

As a different alternative, we can always complicate the indexers to handle on-disk rollbacks and provide that as an option, in case users don't mind a bit more complexity.


We need some sort of versioning or sequencing of events, notifications and queries.

1. Synchronisation of multiple indexers (queries have a validity interval)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to the other ADR once available.

@koslambrou
Copy link
Contributor

@raduom This ADR will probably get a lot of feedback/comments. I would suggest:

  • to keep this PR open for some sufficient amount of time in order to allow everyone time to put in their comments
  • mark it as draft in order to not merge it yet

What do you think?

Similar work
============

One of the advantages that Marconi has over Oura and Scrolls is the fact that they are both a streaming solution and an indexer. So you can listen to a stream of events and know that those events are reflected in the index.
Copy link
Contributor

@koslambrou koslambrou Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section should (maybe? probably?) be expanded. Or maybe wait a bit? @joseph-fajen is working on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have this slight problem which is the slot number changes which may add some updates to this document and I also don't know very well how oura/scrolls work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, we can wait for Joseph's work which compares the different tools.

Similar work
============

One of the advantages that Marconi has over Oura and Scrolls is the fact that they are both a streaming solution and an indexer. So you can listen to a stream of events and know that those events are reflected in the index.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Links to Oura and Scrolls would be helpful.

@raduom raduom marked this pull request as draft July 13, 2022 12:44
@raduom
Copy link
Contributor Author

raduom commented Jul 13, 2022

Done.

@raduom
Copy link
Contributor Author

raduom commented Jul 13, 2022

  • mark it as draft in order to not merge it yet

I have marked it as draft.

@raduom raduom force-pushed the raduom/marconi-initiative branch 5 times, most recently from 764d182 to 849f2e0 Compare July 15, 2022 08:41
@raduom raduom requested review from koslambrou, andreabedini and joseph-fajen and removed request for andreabedini July 15, 2022 12:05
Copy link
Contributor

@joseph-fajen joseph-fajen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some helpful clarifications here.

@raduom raduom force-pushed the raduom/marconi-initiative branch from d071ed5 to 79b0182 Compare July 16, 2022 06:32
@asutherlandus
Copy link

I think this is a really good start on this doc. Its very informative with just the right level of detail. I'm waiting to see the final version of the similar work section, but we could consider just omitting it. I do think we need further discussion on handling rollbacks greater than K. One minor formatting concern: The index tree in the left hand column of the RTD doesnt have ADR 4 with the subheadings nested underneath. https://plutus-apps--585.org.readthedocs.build/en/585/

@koslambrou
Copy link
Contributor

I do think we need further discussion on handling rollbacks greater than K

@asutherlandus After our discussion with Mamba, it seems like there's no need to handle rollbacks greater than K, because there won't be any. Once the block is K (2160 on the current mainnet) blocks deep (with an average of 1 block every 20s), the block should be immutable. That's what I understood from the conversation.

@raduom raduom marked this pull request as ready for review July 26, 2022 19:57
@raduom raduom force-pushed the raduom/marconi-initiative branch 2 times, most recently from e8d7374 to 1917bbc Compare July 27, 2022 04:09
Copy link
Contributor

@andreabedini andreabedini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@raduom raduom force-pushed the raduom/marconi-initiative branch from 1917bbc to 38e0a30 Compare July 27, 2022 07:03
@raduom raduom merged commit e0f2ab8 into IntersectMBO:main Jul 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants