Iterate on the new `verification_jobs` table #1826
I mostly agree with the above fields. Errors need multiple fields due to the API design in Stoplight. I propose to use exactly these names for the columns: …

Small comment on …
Additionally, this is what Claude recommended. I don't know to what extent this is useful, but I see we can go beyond the minimal CPU + memory info:

```js
const os = require('os');
const v8 = require('v8');

const getHardwareInfo = () => {
  const memoryUsage = process.memoryUsage();
  const heapStats = v8.getHeapStatistics(); // note: unused below
  return {
    cpu: {
      model: os.cpus()[0].model,
      cores: os.cpus().length,
      speed: os.cpus()[0].speed,
      allocated_cores: process.env.CPU_LIMIT || os.cpus().length,
      // note: process.resourceUsage() has no maxCPU field in Node,
      // so this always evaluates to null
      cpu_quota: process.resourceUsage().maxCPU || null
    },
    memory: {
      total_allocated: process.env.MEMORY_LIMIT || os.totalmem(),
      available: os.freemem(),
      process_usage: {
        heapTotal: memoryUsage.heapTotal,
        heapUsed: memoryUsage.heapUsed,
        external: memoryUsage.external,
        rss: memoryUsage.rss
      }
    },
    environment: {
      type: process.env.K_SERVICE ? 'cloud-run' : 'local',
      node_version: process.version,
      container_id: process.env.HOSTNAME || null
    }
  };
};
```

Edit: When I asked for only the essential info, Claude gave me this:

```js
const os = require('os');

const getEssentialHardwareInfo = () => {
  return {
    cpu: {
      allocated_cores: process.env.CPU_LIMIT || os.cpus().length,
      model: os.cpus()[0].model
    },
    memory: {
      allocated_mb: Math.floor((process.env.MEMORY_LIMIT || os.totalmem()) / (1024 * 1024))
    },
    environment: {
      type: process.env.K_SERVICE ? 'cloud-run' : 'local'
    }
  };
};
```
Regarding storing the bytecodes, I'd really like to show users a nice UI in the future of how these bytecodes differ for failed jobs. We find ourselves debugging people's failed verifications a lot of the time, and it would be useful to have this information without doing everything locally; even better if we can convey this information to the user so they don't have to contact us at all. IMO this is an essential part of building an overall better contract verification experience. However, this is not a prerequisite from my side, and it can be added later by extending the table. We can even add another table like …
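A minimal sketch of how such a diff could be computed, assuming the onchain and recompiled bytecodes are stored as hex strings (the function name and return shape are illustrative, not part of the proposal):

```js
// Sketch: locate the first byte where the onchain and recompiled
// bytecodes diverge, so a UI could highlight the mismatch.
const diffBytecodes = (onchainHex, recompiledHex) => {
  const a = onchainHex.replace(/^0x/, '');
  const b = recompiledHex.replace(/^0x/, '');
  const len = Math.min(a.length, b.length);
  for (let i = 0; i < len; i += 2) {
    if (a.slice(i, i + 2) !== b.slice(i, i + 2)) {
      return {
        match: false,
        byteOffset: i / 2, // first differing byte
        onchain: a.slice(i, i + 2),
        recompiled: b.slice(i, i + 2)
      };
    }
  }
  // identical prefix; a length difference still means a mismatch
  return { match: a.length === b.length, lengthDiffBytes: (b.length - a.length) / 2 };
};
```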
Updated my proposal regarding this.
I would do a mixture of the extended and the minimal information. Not everything from the extended version seems useful to me. How about this?

```js
const os = require('os'); // note: this import was missing in the snippet

const getHardwareInfo = () => {
  return {
    cpu: {
      allocated_cores: process.env.CPU_LIMIT || os.cpus().length,
      model: os.cpus()[0].model,
      speed: os.cpus()[0].speed
    },
    memory: {
      total_allocated: process.env.MEMORY_LIMIT || os.totalmem()
    },
    environment: {
      type: process.env.K_SERVICE ? 'cloud-run' : 'local',
      node_version: process.version,
      cloud_run_revision: process.env.K_REVISION
    }
  };
};
```

Where do these environment variables actually come from? The other question here is: do we really want to store this information for every job? If you exclude the current resource usage, this will be a lot of redundant information. We could store it once per revision in a separate table and just reference it by some identifier.
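One way to get such an identifier could be to hash the hardware JSON itself, so identical snapshots collapse to one row; a sketch under that assumption (the function name and id scheme are illustrative):

```js
const crypto = require('crypto');

// Sketch: derive a stable id for a hardware snapshot so verification_jobs
// rows can reference one deduplicated hardware row instead of repeating
// the JSON. Key order is stable because getHardwareInfo (above) builds
// the object literal the same way on every call.
const getHardwareId = (hardwareInfo) =>
  crypto
    .createHash('sha256')
    .update(JSON.stringify(hardwareInfo))
    .digest('hex')
    .slice(0, 16);
```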
Alright, makes sense to me. From my point of view, we can add it now, but it should reference entries in the separate hardware table.

Now, I also wonder if we need to store the …
Yes, we can remove the … I also agree about the redundancy of the hardware info. And regarding storing the bytecodes in …
Right, they should only be stored for failed jobs. I expected the `verification_jobs` table to never get deleted. It's hard to determine what a good time frame for deleting would be. Also consider this scenario: someone verifies a contract through the Remix plugin and closes the browser immediately after, then opens Remix again only two months later. The verification would show as failed in the IDE, because we deleted the information, but actually the verification succeeded.
If we don't change the Cloud Run service, I would expect this, yes. In my proposal, I excluded the resource usage for this reason.
So maybe our idea for the table was a bit different here. I tried to keep it minimal, because I would like to keep the data forever. What if we separate the table into two: one for the minimal job information that should be kept forever, and one with additional information used for debugging and statistics (maybe the bytecodes and `server_hardware`) that can be deleted from time to time.
This is rather an edge case and I don't think we need to cover it. It's reasonable that users can't find out why their job failed after 30–60 days. But if their verification went through, they can always check with `chainId` + `address`.
Bytecodes, yes, but the server hardware was also meant for the longer term. The goal of that is to provide context for the compilation: hardware + `compilation_time` gives relative resource-use information that can be useful when, for example, someone is looking for resource-intensive contracts. We can easily dedupe the server hardware with a separate table, so it can stay long term. For the bytecodes, we can look in the future at how much space they take and only then start deleting. We can also explore some compression options, as I don't expect the older bytecodes to be accessed frequently.
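As a rough illustration of the compression idea, a sketch using Node's built-in zlib (function names are illustrative; decoding the hex to raw bytes before compressing already halves the size):

```js
const zlib = require('zlib');

// Sketch: compress a 0x-prefixed bytecode hex string before storing it,
// and restore it on read. Old, rarely accessed bytecodes tolerate the
// extra decompression cost well.
const compressBytecode = (hex) =>
  zlib.gzipSync(Buffer.from(hex.replace(/^0x/, ''), 'hex'));

const decompressBytecode = (buf) =>
  '0x' + zlib.gunzipSync(buf).toString('hex');
```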
Regarding hardware, I think it's best to just use an identifier that we provide in the `.env`; we just need to store the identifier in the …

Regarding deleting the rows, the Remix case @manuelwedler described can be fixed on the client side, just by calling …

Regarding the `verification_jobs` bytecodes, I would not store them in …
I mostly agree with what Marco said. I think we can indeed just store an identifier for the server hardware. The identifier for the hardware could be the …

Here is my new proposal composing the above ideas: …
Regarding that, I'm not sure if that would be the right thing to do. It could be that verification failed because the contract is already verified. In that case, it would be weird to show a successful receipt in Remix. I would say we should just move forward with not deleting the `verification_jobs` rows.
Nice!
This wouldn't show an outside observer what the server hardware was like (CPU and memory), which was the whole point. Should we keep that info somewhere, like in another table? We can still keep this for debugging purposes, though.
I also agree with this point. You would then need access to GCP to gather this information. The other question is: who will actually have access to our database directly? This table should just be for internal use and not be included in the Parquet export. If we decide to store the hardware info, I think it should be in a separate table. The JSON properties from above could be its columns.
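For illustration, a sketch of what writing such a deduplicated row could look like with node-postgres; the `server_hardware` table and its columns are hypothetical, mapped from the JSON properties above:

```js
const { Pool } = require('pg');
const pool = new Pool(); // connection settings come from the usual PG* env vars

// Hypothetical table:
//   server_hardware(id, cpu_model, allocated_cores, cpu_speed,
//                   memory_total_allocated, node_version)
// ON CONFLICT keeps exactly one row per distinct hardware snapshot.
const upsertHardware = async (id, info) => {
  await pool.query(
    `INSERT INTO server_hardware
       (id, cpu_model, allocated_cores, cpu_speed, memory_total_allocated, node_version)
     VALUES ($1, $2, $3, $4, $5, $6)
     ON CONFLICT (id) DO NOTHING`,
    [
      id,
      info.cpu.model,
      info.cpu.allocated_cores,
      info.cpu.speed,
      info.memory.total_allocated,
      info.environment.node_version
    ]
  );
};
```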
I think we can just list all the details of the server somewhere else, e.g. in our docs. I would not create a PostgreSQL table for this.
Implemented the last proposal in #1902.
Iterate on the new table definition, making sure that it contains all the needed information. In particular, focus on the foreign key `verification_id` in the case of a successful verification, and understand what information to include in the case of a failed verification: `/v2/verify/etherscan/`