Optimize fd_write for Performance Improvements #205

Open
wants to merge 1 commit into
base: main

HingchungChu

Problem Statement:

File IO benchmarks show low throughput in the fd_write function. Part of the cause is the buffer allocation and data copying that happens when data is appended to a file at vscode-wasm/wasm-wasi-core/src/common/vscodeFileSystemDriver.ts:544. Each write allocates a new buffer and copies the entire existing content into it, so the total copying cost grows quadratically with file size when a file is written in many small appends.
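To make the cost concrete, here is a minimal sketch of the copy-on-every-append pattern. It is an illustration only and not the code at the line referenced above; `naiveAppend` and `bytesCopied` are hypothetical names.

```typescript
// Hypothetical illustration of the copy-on-every-append pattern.
// This is NOT the code from vscodeFileSystemDriver.ts.
let bytesCopied = 0;

function naiveAppend(content: Uint8Array, chunk: Uint8Array): Uint8Array {
    // A fresh buffer is allocated for every append ...
    const result = new Uint8Array(content.byteLength + chunk.byteLength);
    // ... and the whole existing content is copied before the new chunk.
    result.set(content, 0);
    result.set(chunk, content.byteLength);
    bytesCopied += content.byteLength + chunk.byteLength;
    return result;
}

// Append 1000 single bytes, like the one-byte fwrite loop in the benchmark.
let content: Uint8Array = new Uint8Array(0);
for (let i = 0; i < 1000; i++) {
    content = naiveAppend(content, new Uint8Array([i % 256]));
}
console.log(`file size: ${content.byteLength}, bytes copied: ${bytesCopied}`);
```

For n single-byte appends this pattern copies n(n+1)/2 bytes in total, which is why small writes become slower as the file grows.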

Proposed Solution:

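Use ECMAScript 2024 in-place resizable ArrayBuffers. Instead of allocating a new buffer on every append, we allocate double the necessary space whenever the buffer has to grow, similar to the growth strategy of C++ std::vector. Most appends then fit into the existing capacity and need neither a reallocation nor a full copy.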
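The idea can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this pull request: `GrowableFileContent`, its members, and the capacity values are hypothetical. The local type declarations exist only so the sketch compiles when the TypeScript `lib` setting predates ES2024.

```typescript
// Local typings for the ES2024 resizable ArrayBuffer API.
type ResizableArrayBuffer = ArrayBuffer & {
    readonly maxByteLength: number;
    resize(newByteLength: number): void;
};
const ResizableArrayBufferCtor = ArrayBuffer as unknown as new (
    byteLength: number,
    options: { maxByteLength: number }
) => ResizableArrayBuffer;

// Hypothetical helper, not the class used in the pull request.
class GrowableFileContent {
    private readonly buffer: ResizableArrayBuffer;
    // Created without an explicit length, so it tracks the buffer's size.
    private readonly view: Uint8Array;
    private used = 0;
    public resizeCount = 0;

    constructor(initialCapacity: number, maxByteLength: number) {
        this.buffer = new ResizableArrayBufferCtor(initialCapacity, { maxByteLength });
        this.view = new Uint8Array(this.buffer);
    }

    append(chunk: Uint8Array): void {
        const needed = this.used + chunk.byteLength;
        if (needed > this.buffer.maxByteLength) {
            throw new RangeError(`content would exceed ${this.buffer.maxByteLength} bytes`);
        }
        if (needed > this.buffer.byteLength) {
            // Grow in place to double the required size, capped at the maximum.
            this.buffer.resize(Math.min(needed * 2, this.buffer.maxByteLength));
            this.resizeCount++;
        }
        // Only the new chunk is copied; existing bytes stay where they are.
        this.view.set(chunk, this.used);
        this.used = needed;
    }

    get size(): number { return this.used; }
    get capacity(): number { return this.buffer.byteLength; }

    // View over the bytes written so far (shares memory with the buffer).
    bytes(): Uint8Array { return this.view.subarray(0, this.used); }
}

const file = new GrowableFileContent(16, 16 * 1024 * 1024);
for (let i = 0; i < 1000; i++) {
    file.append(new Uint8Array([i % 256]));
}
console.log(`size: ${file.size}, capacity: ${file.capacity}, resizes: ${file.resizeCount}`);
```

One design constraint to keep in mind: a resizable ArrayBuffer needs its `maxByteLength` fixed at construction time, so the upper bound on file size has to be chosen up front.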

Outcome:

The following C program, compiled with wasi-sdk, sequentially writes random data to a file one byte at a time. With this change, the total runtime for creating an 8 MB file dropped from 60 seconds to 15 seconds on an older platform.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    size_t size_in_kb;

    const char *filename = "output.bin";

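    // Read the target file size in KB from stdin.
    if (scanf("%zu", &size_in_kb) != 1) {
        fprintf(stderr, "Expected a file size in KB\n");
        return 1;
    }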

    FILE *file = fopen(filename, "wb");
    if (file == NULL) {
        perror("Error opening file");
        return 1;
    }

    srand((unsigned int)time(NULL));

    clock_t start_time = clock();
    
    size_t total_bytes = size_in_kb * 1024;

    for (size_t i = 0; i < total_bytes; i++) {
        unsigned char random_byte = rand() % 256;
        fwrite(&random_byte, 1, 1, file);
    }

    fclose(file);

    clock_t end_time = clock();
    double time_taken = (double)(end_time - start_time) / CLOCKS_PER_SEC;

    printf("Time taken to generate %zu KB file: %.4f seconds\n", size_in_kb, time_taken);

    return 0;
}

Future Improvements:

There is still an IO bottleneck due to non-negligible thread switching overhead. Further improvements can be made by implementing caching for read operations and batching write requests.
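As a rough sketch of what batching write requests could look like (not part of this pull request): small writes are collected in memory and handed to the underlying writer in one call once a threshold is reached. `WriteBatcher`, the threshold value, and the `sink` callback are all hypothetical; the sink stands in for whatever performs the real write, and the assumption is that each sink call pays the thread switching cost once.

```typescript
// Hypothetical write batcher; not an API of vscode-wasm.
class WriteBatcher {
    private readonly threshold: number;
    private readonly sink: (data: Uint8Array) => void;
    private pending: Uint8Array[] = [];
    private pendingBytes = 0;

    constructor(threshold: number, sink: (data: Uint8Array) => void) {
        this.threshold = threshold;
        this.sink = sink;
    }

    write(chunk: Uint8Array): void {
        // Copy the chunk because the caller may reuse its buffer.
        this.pending.push(chunk.slice());
        this.pendingBytes += chunk.byteLength;
        if (this.pendingBytes >= this.threshold) {
            this.flush();
        }
    }

    // Must also be called before close so no buffered data is lost.
    flush(): void {
        if (this.pendingBytes === 0) {
            return;
        }
        const merged = new Uint8Array(this.pendingBytes);
        let offset = 0;
        for (const chunk of this.pending) {
            merged.set(chunk, offset);
            offset += chunk.byteLength;
        }
        this.pending = [];
        this.pendingBytes = 0;
        this.sink(merged);
    }
}

// Count how often the (simulated) expensive write is invoked.
let sinkCalls = 0;
let sinkBytes = 0;
const batcher = new WriteBatcher(256, (data) => {
    sinkCalls++;
    sinkBytes += data.byteLength;
});
for (let i = 0; i < 1000; i++) {
    batcher.write(new Uint8Array([i % 256]));
}
batcher.flush();
console.log(`sink calls: ${sinkCalls}, bytes written: ${sinkBytes}`);
```

The trade-off is that data sits in memory until the next flush, so a real implementation would need to flush on close and on sync requests, and reads of the same file would have to see the pending data.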

@HingchungChu
Author

@microsoft-github-policy-service agree
