
Usage demo about how to upload file/image by grpc-web? #517

Closed
jmzwcn opened this issue Mar 23, 2019 · 11 comments

Comments

@jmzwcn

jmzwcn commented Mar 23, 2019

Anybody know how to upload file/image by grpc-web?

@glerchundi
Collaborator

glerchundi commented Mar 23, 2019 via email

@jmzwcn
Author

jmzwcn commented Mar 23, 2019

A bytes field seems like a workaround, but it only works for small files.

@jonahbron
Collaborator

The closest you could get to a pure GRPC solution would be a client-streaming API where you send up chunks of the image file at a time. Unfortunately that's not possible since GRPC-Web doesn't yet support client streaming. However, the https://github.com/improbable-eng/grpc-web/ implementation does support it, by using WebSockets. That's one option to pursue.
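For reference, a client-streaming upload of the shape described above would look something like the following sketch. All service and message names here are illustrative, not from any existing API:

```proto
// Hypothetical client-streaming upload: the client sends file metadata first,
// then a sequence of chunks; the server replies once with the stored file's id.
syntax = "proto3";

service FileService {
  // Not currently usable from the official grpc-web, which lacks client
  // streaming; the improbable-eng implementation supports it over WebSockets.
  rpc Upload(stream UploadRequest) returns (UploadResponse);
}

message UploadRequest {
  oneof data {
    FileMetadata metadata = 1; // first message
    bytes chunk = 2;           // all subsequent messages
  }
}

message FileMetadata {
  string name = 1;
  string mime_type = 2;
}

message UploadResponse {
  string id = 1;
}
```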

@stanley-cheung
Collaborator

stanley-cheung commented Mar 25, 2019

File upload is not a main use case for gRPC or gRPC-Web. There are standard HTTP way of doing file uploads.

@mmohaveri

Using plain HTTP in a system whose APIs are all gRPC-based makes things a little messy.

Is there a fundamental reason you're suggesting plain HTTP over gRPC?

I think for small files (e.g. image files) it's possible to put the whole file inside a bytes field and send it to the server using a simple RPC. Do you think that's a bad idea?

@jonahbron
Collaborator

My app uses the HTML FileReader API to load a file into a string, which is then wrapped in a gRPC message. It works, but only for smaller files (fine for my case).
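A minimal sketch of that whole-file approach: reading the file into a Uint8Array (which a protobuf bytes field accepts directly) instead of a string avoids corrupting binary data. The setData setter in the usage note is a hypothetical generated setter, not from a real client:

```javascript
// Read a File/Blob into a Uint8Array suitable for a protobuf `bytes` field.
// Only viable for small files: the whole file is held in memory, and gRPC
// servers enforce a maximum message size.
async function fileToBytes(blob) {
  const buf = await blob.arrayBuffer();
  return new Uint8Array(buf);
}

// Usage (setData is a hypothetical generated setter):
//   const msg = new UploadRequest();
//   msg.setData(await fileToBytes(fileInput.files[0]));
```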

@oreza

oreza commented Aug 8, 2019

My app uses the HTML FileReader API to load a file into a string, which is then wrapped in a gRPC message. It works, but only for smaller files (fine for my case).

How small a file? My understanding is that string size is 64 bits, but GRPC puts a 64MB limit on the message itself.

@sh0umik

sh0umik commented Dec 31, 2019

grpc-web now supports streaming, so you can try sending files in chunks. I have done something similar in Dart grpc-web.

@maltegrosse

maltegrosse commented May 16, 2020

I came across this post multiple times while trying to implement a simple file uploader via grpc-web.
@sh0umik is right that it supports streaming, but only server-side, which is not helpful in my case.
@oreza the message limit of 64MB was a no-go for my upload case (files between 1 and 200 MB).
@jonahbron's idea is great and was a starting point for me, but the file reader in string mode can corrupt binary files like images or PDFs; it only works with plain text files.

And I didn't want to implement another endpoint for uploading data "the standard way" @stanley-cheung.

So yesterday I implemented the following idea
(please keep in mind that, first, JavaScript is not my strength, and second, this should be seen as a proof of concept; any feedback, improvements, or suggestions are more than welcome...)

General idea: 1. chunk the files in browser, 2. upload them in pieces, 3. merge them in the backend

[.proto file]
I already have a streaming service for uploading (used by other microservices) running, so I want to extend my
rpc UploadFile(stream UploadFileRequest) returns (UploadFileResponse);
in a similar way, by adding another method:
rpc UploadFileChunk(FileUploadChunkRequest) returns (FileUploadChunkResponse);
and creating these messages:

// used for plain gRPC service-to-service communication
message UploadFileRequest{
    oneof data {
        FileInfo info = 1;
        bytes chunk_data = 2;
    };
}
// used for plain gRPC service-to-service communication
message FileInfo{
    string name = 1;
    int64 size = 2;
    string type = 3;

}
// used for plain gRPC service-to-service communication and for web gRPC
message UploadFileResponse {
    string url = 1;
    string id = 2;
    uint64 size = 3;
}
// used for web gRPC; cannot use a oneof because the chunks are uploaded asynchronously
message FileUploadChunkRequest{
    string uuid = 1;
    bytes chunk = 2;
    uint64 offset = 3;
    uint64 size = 4;
    string name = 5;
    string type = 6;
    bool finished = 7;
}
// used for web gRPC
message FileUploadChunkResponse{
    oneof data {
        UploadFileResponse info = 1;
        Empty empty = 2;
    };
}

[.js file]
I am using Vue.js (I am sure it's easily adaptable to other frameworks), but my callback skills are limited :)

First, some "global" component variables (referenced later in the script via "this"):

// counts the 100 ms waits elapsed after all chunk uploads are started
waitCounter: 0,
// the maximum number of waits after all chunks are read until all data must be uploaded
maxWait: 0,
// current upload progress shown in the front end: 0-100 (%)
uploadProgress: 0,
// used to avoid reporting chunk upload errors more than once
chunkUploadError: false,
// store the file size globally for this component
fileSize: 0,
// define the chunk size
chunkSize: 256 * 1024,
// an array of all chunks, marking each upload as either active (true) or done (false)
currentUploads: [],

and now the different function to handle all the pieces:

// returns the uploadProgress 0-100 in %
calculateUploadStatus() {
                let sum = 0;
                for (let i = 0; i < this.currentUploads.length; i++) {
                    if (!this.currentUploads[i]) {
                        sum++
                    }
                }
                this.uploadProgress = Math.floor(sum / this.currentUploads.length * 100)
            },

// main upload function; needs the file, e.g. from the file upload form field (files[0])
            upload(file) {
   
                // create a UUID; perhaps plain numbers would save bandwidth...
                let uuid = this.uuidv4();
                // chunk the file
                let ch = new FileUploadChunkRequest();
               
                let amountChunks = Math.ceil(file.size / this.chunkSize);
                // (name and type don't have to be set every time; skipping them would save bandwidth...)
                ch.setName(file.name);
                ch.setSize(file.size);
                this.fileSize = file.size
                ch.setType(file.type);
                ch.setUuid(uuid);
                ch.setFinished(false);

                this.currentUploads = new Array(amountChunks);
                for (let i = 0; i < amountChunks; i++) { 
                    // set all values true
                    this.currentUploads[i] = true;
                }
                // after the file reader finishes, wait up to amountChunks * 100 ms, depending a bit on the upload speed
                this.maxWait = amountChunks;
                this.waitCounter = 0;
                this.chunkUploadError = false;
                // define the options for the file reader, 
                let options = {
                    binary: true,
                    chunkSize: this.chunkSize,
                    chunkErrorCallback: () => {
                        this.uploadFailed("file reader error")
                    },
                    chunkReadCallback: (chunk, os, ctn) => {
                        // once a chunk is read
                        if (typeof chunk === "string") {
                            ch.setChunk(btoa(chunk))
                        } else {
                            ch.setChunk(new Uint8Array(chunk))
                        }
                        ch.setOffset(os);
                        somegrpcwebClient.uploadFileChunk(ch).then(() => {
                            // notify when finished upload
                            this.currentUploads[ctn] = false
                            this.calculateUploadStatus();
                        }).catch(error => {
                            // only report a chunk upload error once, otherwise notifications get spammed
                            if (!this.chunkUploadError) {
                                this.uploadFailed("chunk", ctn, "error", error)
                            }
                            this.chunkUploadError = true;
                        })

                    },
                    successCallback: (c) => {
                        // the success callback doesn't mean all chunks are uploaded yet, only that the file reader has read them

                        //tell the backend to merge the chunks together..
                        ch.setFinished(true);
                        // but first wait for all upload chunks callbacks
                        this.waitFor(() => this.allUploadsDone(), () => {
                            somegrpcwebClient.uploadFileChunk(ch).then((res) => {
                                // res is a oneof; an UploadFileResponse should arrive here, not Empty
                                let url= res.getInfo().toObject().url;
                                console.log(url)
                            }).catch(error => {
                                this.uploadFailed("error final upload piece" + error);

                            })
                        })

                    }
                }
                this.readFileInChunks(file, ch, options)
            },
// check whether any chunk in the array is still actively uploading
            allUploadsDone() {
                for (let i = 0; i < this.currentUploads.length; i++) {
                    if (this.currentUploads[i]) {
                        return false
                    }
                }
                return true
            },
            waitFor(condition, callback) {
                // give up after maxWait * 100 ms
                if (this.waitCounter > this.maxWait) {
                    this.uploadFailed("timeout waiting for upload completion");
                    return
                }
                if (!condition()) {
                    this.waitCounter++;
                    // check again in 100 ms
                    window.setTimeout(this.waitFor.bind(null, condition, callback), 100);
                } else {
                    callback();
                }
            },
            // taken from https://gist.github.com/alediaferia/cfb3a7503039f9278381#file-tiny_uploader-js-L29
            readFileInChunks(file, ch, options = {}) {
                let counter = 0;
                const defaults = {
                    chunkSize: 128 * 1024, // bytes
                    binary: false,
                    chunkReadCallback: () => {
                    },
                    chunkErrorCallback: () => {
                    },
                    successCallback: () => {
                    }
                };

                options = {
                    ...defaults,
                    ...options
                };

                const {binary, chunkSize, chunkReadCallback, chunkErrorCallback, successCallback} = options;
                const fileSize = file.size;
                let offset = 0;

                const onLoadHandler = evt => {
                    if (evt.target.error == null) {
                        offset += binary ? evt.target.result.byteLength : evt.target.result.length;
                        chunkReadCallback(evt.target.result, offset, counter);
                        counter++
                    } else {
                        return chunkErrorCallback(evt.target.error);
                    }

                    if (offset >= fileSize) {
                        return successCallback(ch);
                    }

                    readBlock(offset, chunkSize, file);
                };

                const readBlock = (_offset, length, _file) => {
                    const reader = new FileReader();
                    const blob = _file.slice(_offset, length + _offset);
                    reader.onload = onLoadHandler;
                    // perhaps Blob.arrayBuffer() should be used: https://developer.mozilla.org/en-US/docs/Web/API/Blob/arrayBuffer
                    if (binary) {
                        reader.readAsArrayBuffer(blob);
                    } else {
                        reader.readAsText(blob);
                    }
                };

                readBlock(offset, chunkSize, file);
            },
            uuidv4() {
                return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function (c) {
                    let r = Math.random() * 16 | 0, v = c == 'x' ? r : (r & 0x3 | 0x8);
                    return v.toString(16);
                });
            },
            uploadFailed(error) {
                // handle errors
                console.log("error in upload:", error);
                this.uploadProgress = 0;
                this.currentUploads = []
            },

[.go file]
Then buffer the chunks in the backend (here in Go) and merge them together...
(If you just write to the local hard drive anyway, writing each chunk directly to disk would probably save memory...)

//... skip the stream implementation part

// define a file struct; t is the file type
type file struct {
	size       uint64
	data       []byte
	name       string
	t          string
	lastAccess time.Time
	lock       bool
}
// the key of each file is its uuid
// NOTE: this map is accessed from multiple goroutines (handlers and watcher);
// Go maps are not safe for concurrent use, so production code needs a sync.Mutex around it
var fileBuffer = make(map[string]*file)
// a watcher function that cleans up the buffer if an upload fails; could be improved with signals.
// Run it when starting the gRPC server via: go watchBuffer(5, 10, 2)
func watchBuffer(staticDelaySec int, dynamicDelaySec int, sleepDelaySec int) {
	var diff time.Duration
	for {
		for k := range fileBuffer {
			// ignore locked files
			if !fileBuffer[k].lock {
				diff = time.Now().Sub(fileBuffer[k].lastAccess)
				// allowed age grows with file size (~1 ms per KB) plus staticDelaySec, all multiplied by dynamicDelaySec
				if diff > ((time.Duration(fileBuffer[k].size/1024)*time.Millisecond + (time.Duration(staticDelaySec) * time.Second)) * time.Duration(dynamicDelaySec)) {
					delete(fileBuffer, k)
				}
			}

		}
		time.Sleep(time.Duration(sleepDelaySec) * time.Second)
	}
}
// grpc func implementation
func (as *aFileUploadServiceImp) UploadFileChunk(ctx context.Context, c *FileUploadChunkRequest) (*FileUploadChunkResponse, error) {
	// until the last chunk has arrived...
	if !c.Finished {
		if _, ok := fileBuffer[c.Uuid]; !ok {
			// first time the uuid arrives at the back-end
			as.initFile(c)
		}
		// write the bytes
		f := fileBuffer[c.Uuid]
		as.addData(c.Chunk, c.Offset, *f)
		// update last access for timeout watcher
		f.lastAccess = time.Now()
		return &FileUploadChunkResponse{Data: &FileUploadChunkResponse_Empty{Empty: &Empty{}}}, nil
	} else {
		// lock the file so the watcher won't delete it during slow post-processing/file writing
		fileBuffer[c.Uuid].lock = true
		// test write to the local file system; I personally use an S3 bucket/minio
		err := ioutil.WriteFile(fileBuffer[c.Uuid].name, fileBuffer[c.Uuid].data, 0644)
		// always delete the buffer entry, regardless of whether the write succeeded...
		delete(fileBuffer, c.Uuid)
		if err != nil {
			return nil, status.Errorf(codes.Internal, "error during file writing: %v", err)
		}
		return &FileUploadChunkResponse{Data: &FileUploadChunkResponse_Info{Info: &UploadFileResponse{Id: c.Uuid, Size: c.Size, Url: "https://somewhere.com/" + c.Name}}}, nil

	}

}

func (as *aFileUploadServiceImp) initFile(ch *FileUploadChunkRequest) {
	var f file
	f.name = ch.Name
	f.t = ch.Type
	f.size = ch.Size
	f.data = make([]byte, ch.Size)
	f.lastAccess = time.Now()
	fileBuffer[ch.Uuid] = &f
}
func (as *aFileUploadServiceImp) addData(chunk []byte, offset uint64, f file) {
	// `offset` is the client-side offset *after* the chunk was read, so the
	// chunk starts at offset-len(chunk). f is passed by value, but f.data is
	// a slice header sharing the same backing array, so the writes are visible.
	start := offset - uint64(len(chunk))
	if start >= f.size {
		return
	}
	end := start + uint64(len(chunk))
	if end > f.size {
		end = f.size
	}
	copy(f.data[start:end], chunk)
}

Timeouts especially should be handled and tested carefully, as they also depend on the user's bandwidth...

P.S. If anyone has a solution for providing downloads in the browser using the server-side streaming functionality, let me know.

P.P.S. A fork of https://github.com/23/resumable.js would provide a more solid frontend solution with nice additional features; "just" the XHR part would need to be replaced somehow by the gRPC functionality...

P.P.P.S. Perhaps some skilled JS people can help to actually merge the chunk reader and the upload part, as the current bottleneck is that all chunks get sent one after another without waiting. Better would be: 1. read the first chunk and wait for the file reader callback, 2. upload the chunk via gRPC, 3. wait for the upload callback, then continue chunking...
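That sequential read-then-upload loop can be sketched with async/await, which also removes the need for waitFor-style polling entirely. Here uploadChunk is a placeholder for the generated grpc-web call (e.g. somegrpcwebClient.uploadFileChunk wrapped in a promise):

```javascript
// Sequential chunked upload: read one chunk, await its upload, then read the
// next. `uploadChunk(bytes, offset)` is an assumed async function standing in
// for the real grpc-web client call.
async function uploadFileSequentially(file, uploadChunk, chunkSize = 256 * 1024) {
  for (let offset = 0; offset < file.size; offset += chunkSize) {
    const blob = file.slice(offset, offset + chunkSize);
    const chunk = new Uint8Array(await blob.arrayBuffer());
    // Backpressure: the next read does not start until this upload resolves,
    // so chunks arrive in order and memory use stays bounded to one chunk.
    await uploadChunk(chunk, offset);
  }
}
```

With ordered, awaited uploads the server can also append chunks directly instead of buffering the whole file, since no chunk arrives out of order.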

@madneal

madneal commented Apr 1, 2021

@maltegrosse The example is so complicated 😭

@SimonBiggs

SimonBiggs commented Jun 23, 2021

That being said, Google does something that may interest you: create a method on the server side which returns a URL that can be used to upload octet streams using plain old HTTP.

For others reading this, creating a signed URL for uploading a file to a bucket and sending that URL over the gRPC connection is the way I plan to handle this:

https://cloud.google.com/storage/docs/access-control/signed-urls
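A sketch of that flow, assuming a hypothetical getUploadUrl RPC on the gRPC-Web client that returns a signed URL; everything except the signed-URL mechanism itself is illustrative:

```javascript
// Hypothetical flow: fetch a signed URL over gRPC-Web, then PUT the file
// bytes to it with plain HTTP. `client.getUploadUrl` is an assumed RPC, not
// part of any real generated client.
async function uploadViaSignedUrl(client, file) {
  const resp = await client.getUploadUrl({ name: file.name, type: file.type });
  const put = await fetch(resp.url, {
    method: 'PUT',
    headers: { 'Content-Type': file.type },
    body: file,
  });
  if (!put.ok) throw new Error(`upload failed: ${put.status}`);
  // strip the signature query parameters to get the canonical object URL
  return resp.url.split('?')[0];
}
```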

Or, alternatively, just handle this with resumable uploads: initialise the upload server-side, then pass the required session URI through to complete the upload client-side:

https://googleapis.dev/python/google-resumable-media/latest/resumable_media/requests.html#resumable-uploads
