Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid row data #25

Closed
benoist opened this issue Jan 27, 2017 · 7 comments
Closed

Invalid row data #25

benoist opened this issue Jan 27, 2017 · 7 comments

Comments

@benoist
Copy link
Contributor

benoist commented Jan 27, 2017

Hi,

I was having some problems with query results when using specific data.
When reading a table with 1.3 million records it was retrieving all the information correctly when using a limited amount of data per row. When the data per row was larger, it returned the wrong results.

It was able to trace it back to the Packet reader

Currently is says:

  def read(slice : Bytes)
    return 0 unless @remaining > 0
    read_bytes = @io.read(slice)
    @remaining -= read_bytes
    read_bytes
  rescue IO::EOFError
    raise DB::ConnectionLost.new(@connection)
  end

When I change the @io.read(slice) to @io.read_fully(slice) like below, my query run correctly.

  def read(slice : Bytes)
    return 0 unless @remaining > 0
    read_bytes = @io.read_fully(slice)
    @remaining -= read_bytes
    read_bytes
  rescue IO::EOFError
    raise DB::ConnectionLost.new(@connection)
  end

I'm not sure if the fix is the correct fix.

@asterite
Copy link
Member

Thanks for reporting this!

This is for the one that's going to tackle this issue: I don't think using @io.read_fully there is the correct way to fix this, read can return less than available. However, read_fully should be used from the outside: in several places I see packet.read(some_slice) where it should be packet.read_fully(some_slice).

@benoist
Copy link
Contributor Author

benoist commented Mar 8, 2017

I've been trying to think what you said about the read_fully, but are you sure that's correct in this case? If I'm not mistaken the read function in ReadPacket expects the slice to be given to be of the correct size. I don't see it's being used with a buffer slice.

Perhaps you mean that the read function should be renamed read_fully as one would expect all bytes in the slice to be filled?

I would really like to know if the solution I proposed is the correct one so I can submit a pull request and close this issue and safely use the crystal-mysql shard

@benoist
Copy link
Contributor Author

benoist commented Mar 30, 2017

@bcardiff is anyone working on this, or can someone give me some direction on what is required to get a pull request accepted. As always I'm happy to do the work :-)

@bcardiff
Copy link
Member

@benoist could you confirm that the caller to that read(slice : Bytes) is row_packet.read(@null_bitmap_slice) in https://github.com/crystal-lang/crystal-mysql/blob/master/src/mysql/result_set.cr#L54 ? If that the case, then you are probably right and a read_fully should be used.

After a bit of review I think that is the only call that may cause it.

How many columns did you need to experience this issue? Because the @null_bitmap_slice size does depends on columns but not on row.

@benoist
Copy link
Contributor Author

benoist commented Mar 30, 2017

I indeed traced it back to that call with the @null_bitmap_slice and suspected read_fully should have been used. However I've since refactored the code a lot whilst using the read_fully method and I'm using different columns now so I can't really try that out anymore. I forgot which columns I was using exactly and my git commits in my repo are days apart.

@bcardiff
Copy link
Member

Fixed on master now. I was unable to reproduce the issue though. I tried the following spec (which I didn't commit) that inserts 2 million records of 1000 columns with just nulls.

  it "gets many columns from table" do
    with_test_db do |db|
      columns = 1000
      row = 2_000_000
      db.exec "create table table1 (#{(1..columns).map { |i| "c#{i} int null" }.join(", ")})"
      row.times do
        db.exec %(insert into table1 (c1) values (NULL))
      end

      db.query "select * from table1" do |rs|
        row.times do
          rs.move_next.should be_true
          columns.times do
            rs.read.should be_nil
          end
        end
        rs.move_next.should be_false
      end
    end
  end

Even with that I was unable to repro, but it's true that read_fully should be used to completely fill a slice from the underlying IO.

@benoist
Copy link
Contributor Author

benoist commented Mar 30, 2017

Great thanks!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants