QuasiRandomTraining(sampling_alg = SobolSample()) is not working correctly #609

Closed
YichengDWu opened this issue Sep 27, 2022 · 18 comments · Fixed by #610

Comments

@YichengDWu
Contributor

YichengDWu commented Sep 27, 2022

julia> s = NeuralPDE.generate_quasi_random_points(300, [[-1.0,-2.0],[1.0,2.0]], Float64, LatinHypercubeSample())
2×300 Matrix{Float64}:
 -1.08     -1.33333  -1.86     -1.49333  …  -1.54667  -1.66667  -1.47333
  1.35667   1.53667   1.84333   1.07333      1.28      1.66667   1.24667

julia> scatter(s[1,:],s[2,:])

[scatter plot of s]

julia> s2 = NeuralPDE.generate_quasi_random_points(300, [[-1.0,-2.0],[1.0,2.0]], Float64, SobolSample())
2×300 Matrix{Float64}:
 -1.00586  -1.50586  -1.75586  …  -1.2373  -1.7373  -1.9873  -1.4873
  1.00586   1.50586   1.75586      1.2373   1.7373   1.9873   1.4873

julia> scatter(s2[1,:],s2[2,:])

[scatter plot of s2]

@ChrisRackauckas
Member

What if the call is made directly to QuasiMonteCarlo.jl? Can this be isolated without NeuralPDE.jl?

@YichengDWu
Contributor Author

Just SobolSample.

@YichengDWu
Contributor Author

julia> s3 = QuasiMonteCarlo.sample(300, [-3.0,1.0],[-2,2], SobolSample())
2×300 Matrix{Float64}:
 -2.99414  -2.49414  -2.24414  …  -2.7627  -2.2627  -2.0127  -2.5127
  1.49805   1.99805   1.24805      1.3291   1.8291   1.0791   1.5791

julia> scatter(s3[1,:],s3[2,:])

[scatter plot of s3]

@YichengDWu
Contributor Author

YichengDWu commented Sep 27, 2022

The bug is in NeuralPDE:

julia> s3 =  NeuralPDE.generate_quasi_random_points(300, [[0.0,1.0],[0.0,1.0]], Float64, SobolSample())
2×300 Matrix{Float64}:
 0.00585938  0.505859  0.755859  0.255859  …  0.737305  0.987305  0.487305
 0.00585938  0.505859  0.755859  0.255859     0.737305  0.987305  0.487305
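
A quick check on the matrix above (a hypothetical follow-up using the same s3) confirms that both coordinates are identical, i.e. every sampled point lies on the diagonal:

julia> all(s3[1, :] .== s3[2, :])
true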

@ChrisRackauckas
Member

🤔 How... alright. I haven't looked at the sampling code in NeuralPDE.jl, but, like the other things I refactored, the answer is probably just to aggressively simplify and delete a bunch of cruft.

@YichengDWu
Contributor Author

There is no randomness in the Sobol sequence?

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

NeuralPDE just samples along each dimension independently and vcats the results.
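
A minimal sketch of the difference (using the QuasiMonteCarlo.sample signature from above; the vcat pattern mirrors what generate_quasi_random_points appears to do, not its exact code):

using QuasiMonteCarlo

# Anti-pattern: sample each dimension independently and vcat.
# Each call replays the same 1-D Sobol sequence (only affinely rescaled),
# so the rows are perfectly correlated and the points fall on a line.
buggy = vcat(QuasiMonteCarlo.sample(300, [-1.0], [1.0], SobolSample()),
             QuasiMonteCarlo.sample(300, [-2.0], [2.0], SobolSample()))

# Correct: one joint 2-D sample, which actually fills the rectangle.
good = QuasiMonteCarlo.sample(300, [-1.0, -2.0], [1.0, 2.0], SobolSample())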

@YichengDWu
Contributor Author

julia> a = QuasiMonteCarlo.sample(20, [0.0], [1.0], SobolSample())
1×20 Matrix{Float64}:
 0.09375  0.59375  0.84375  0.34375  0.46875  …  0.546875  0.796875  0.296875

julia> b = QuasiMonteCarlo.sample(20, [1.0], [2.0], SobolSample())
1×20 Matrix{Float64}:
 1.09375  1.59375  1.84375  1.34375  …  1.04688  1.54688  1.79688  1.29688

julia> a .+1 .== b
1×20 BitMatrix:
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

@ChrisRackauckas
Member

> NeuralPDE just samples along each dimension independently and vcats the results.

wtf? You definitely cannot do that for high-dimensional samplers. The whole point is how they handle the high-dimensional space as more than just a tensor product of 1-D samples.

> There is no randomness in the Sobol sequence?

It should be quasi-random. Check direct calls to https://github.com/stevengj/Sobol.jl (QuasiMonteCarlo.jl is just a wrapper over that).
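
For example, a direct check against Sobol.jl (a sketch, assuming its SobolSeq/next! API):

using Sobol

s = SobolSeq(2)     # one stateful 2-D low-discrepancy iterator
x1 = next!(s)       # first point of the sequence
x2 = next!(s)       # second point; the iterator advances

# A freshly constructed SobolSeq replays the sequence from the start:
t = SobolSeq(2)
@assert next!(t) == x1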

@YichengDWu
Contributor Author

@ChrisRackauckas
Member

Yup, both more complicated than just directly sampling, and incorrect. Good find, thank you. You handling this or should I?

@YichengDWu
Contributor Author

This doesn't look too hard; I'll try to fix it over the weekend.

@YichengDWu
Contributor Author

> It should be quasi-random.

QuasiMonteCarlo.jl is probably incorrect as well. A new SobolSeq is created inside sample, and that kills any variation between calls.

https://github.com/SciML/QuasiMonteCarlo.jl/blob/master/src/QuasiMonteCarlo.jl#L170
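
A sketch of the point (assuming Sobol.jl's API): because sample builds a fresh SobolSeq internally, every call replays the sequence from the start, whereas keeping one iterator alive would continue it:

using Sobol

seq = SobolSeq(1)

# Two batches drawn from the SAME iterator: the second batch continues
# where the first left off instead of replaying it.
batch1 = [next!(seq)[1] for _ in 1:10]
batch2 = [next!(seq)[1] for _ in 1:10]
@assert batch1 != batch2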

@ChrisRackauckas
Member

The Sobol sequence is a quasi-random series, not a pseudo-random series; there is no randomness in its construction. The only thing you could randomize is the index i at which you start in the sequence, but I'm not sure convergence is guaranteed if you don't start at the beginning, so I believe each integration would need to start from 1, making it a deterministic low-discrepancy sequence.

Latin Hypercubes, in contrast, do have randomness in their construction, because there are (finitely) many different Latin Hypercubes one can construct to tile n dimensions. So we should make sure that one showcases the correct randomness, while the Sobol sequence does not necessarily need to.
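
The contrast is easy to check (a hypothetical snippet using the QuasiMonteCarlo.sample call from above):

using QuasiMonteCarlo

# Sobol: fully deterministic, identical across calls.
a = QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
b = QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
@assert a == b

# Latin hypercube: the construction involves random permutations,
# so repeated calls should give different point sets.
c = QuasiMonteCarlo.sample(10, [0.0], [1.0], LatinHypercubeSample())
d = QuasiMonteCarlo.sample(10, [0.0], [1.0], LatinHypercubeSample())
@assert c != d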

@YichengDWu
Contributor Author

It doesn't make any difference whether the SobolSeq is reconstructed if you only sample once. But for resampling, I think it should iterate from the end of the last sequence. Although this setup seems odd to me, the data is resampled on each call to the loss function: https://github.com/SciML/NeuralPDE.jl/blob/master/src/training_strategies.jl#L222.

@YichengDWu
Contributor Author

resampling=true and resampling=false currently behave the same for SobolSample().

@ChrisRackauckas
Member

> But for resampling, I think it should iterate from the end of the last sequence.

Reference on whether that gives a convergent integration? There are high-dimensional sequences, like sparse grids, which are deterministic tilings that require you not cut off the start to get a convergent integration. I don't know whether Sobol has that property.

@YichengDWu
Contributor Author

I'm reading https://arxiv.org/pdf/2207.10289.pdf. I'm just saying that resampling needs to generate different points, and that seems to work well for PINNs in the paper.

@ChrisRackauckas
Member

Yes, resampling is required for a PINN to get a good loss sampling; otherwise things overfit. However, that doesn't actually answer the question. The QMC methods are supposed to be convergent integration schemes: is starting from n a convergent integration scheme for Sobol? For sparse grids, the answer is no; I am not sure about Sobol. If it's not, then it's not a good sampling of the loss function. If it is, then having a mutable counter inside of SobolSample that updates the starting location would be a good idea.
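
A sketch of that idea (hypothetical type, not the actual QuasiMonteCarlo/NeuralPDE API), assuming Sobol.jl underneath:

using Sobol

# Hypothetical stateful sampler: holds one SobolSeq plus a counter so
# each resampling call starts where the previous batch ended.
mutable struct StatefulSobol
    seq::SobolSeq
    count::Int
end
StatefulSobol(d::Integer) = StatefulSobol(SobolSeq(d), 0)

function sample!(s::StatefulSobol, n, lb, ub)
    pts = reduce(hcat, (next!(s.seq) for _ in 1:n))  # d × n points in [0, 1]^d
    s.count += n
    return lb .+ (ub .- lb) .* pts                   # rescale to the box [lb, ub]
end

sampler = StatefulSobol(2)
batch1 = sample!(sampler, 100, [-1.0, -2.0], [1.0, 2.0])
batch2 = sample!(sampler, 100, [-1.0, -2.0], [1.0, 2.0])  # continues the sequence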

Also, I wouldn't take the LHS result too far; it doesn't seem to use the same optimization: https://mrurq.github.io/LatinHypercubeSampling.jl/stable/man/lhcoptim/ vs https://scikit-optimize.github.io/stable/modules/generated/skopt.sampler.Lhs.html.
