QuasiRandomTraining(sampling_alg = SobolSample()) is not working correctly #609

Closed
YichengDWu opened this issue Sep 27, 2022 · 18 comments · Fixed by #610

Comments

@YichengDWu
Contributor

YichengDWu commented Sep 27, 2022

julia> s = NeuralPDE.generate_quasi_random_points(300, [[-1.0,-2.0],[1.0,2.0]], Float64, LatinHypercubeSample())
2×300 Matrix{Float64}:
 -1.08     -1.33333  -1.86     -1.49333  …  -1.54667  -1.66667  -1.47333
  1.35667   1.53667   1.84333   1.07333      1.28      1.66667   1.24667

julia> scatter(s[1,:],s[2,:])

[scatter plot of s]

julia> s2 = NeuralPDE.generate_quasi_random_points(300, [[-1.0,-2.0],[1.0,2.0]], Float64, SobolSample())
2×300 Matrix{Float64}:
 -1.00586  -1.50586  -1.75586  …  -1.2373  -1.7373  -1.9873  -1.4873
  1.00586   1.50586   1.75586      1.2373   1.7373   1.9873   1.4873

julia> scatter(s2[1,:],s2[2,:])

[scatter plot of s2]

@ChrisRackauckas
Member

What if the call is made directly to QuasiMonteCarlo.jl? Can this be isolated without NeuralPDE.jl?

@YichengDWu
Contributor Author

Just SobolSample.

@YichengDWu
Contributor Author

julia> s3 = QuasiMonteCarlo.sample(300, [-3.0,1.0],[-2,2], SobolSample())
2×300 Matrix{Float64}:
 -2.99414  -2.49414  -2.24414  …  -2.7627  -2.2627  -2.0127  -2.5127
  1.49805   1.99805   1.24805      1.3291   1.8291   1.0791   1.5791

julia> scatter(s3[1,:],s3[2,:])

[scatter plot of s3]

@YichengDWu
Contributor Author

YichengDWu commented Sep 27, 2022

The bug is in NeuralPDE:

julia> s3 =  NeuralPDE.generate_quasi_random_points(300, [[0.0,1.0],[0.0,1.0]], Float64, SobolSample())
2×300 Matrix{Float64}:
 0.00585938  0.505859  0.755859  0.255859  …  0.737305  0.987305  0.487305
 0.00585938  0.505859  0.755859  0.255859     0.737305  0.987305  0.487305
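
A quick check on the matrix above (a hypothetical follow-up using the same s3) confirms that both coordinates are identical, i.e. every sampled point lies on the diagonal:

julia> all(s3[1, :] .== s3[2, :])
true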

@ChrisRackauckas
Member

🤔 How... alright. I haven't looked at the sampling code in NeuralPDE.jl, but, like the other things I refactored, the answer is probably just to aggressively simplify and delete a bunch of cruft.

@YichengDWu
Contributor Author

There is no randomness in the Sobol sequence?

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

NeuralPDE just samples along each dimension independently and vcats the results.
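
A minimal sketch of the difference (using the QuasiMonteCarlo.sample signature from above; the vcat pattern mirrors what generate_quasi_random_points appears to do, not its exact code):

using QuasiMonteCarlo

# Anti-pattern: sample each dimension independently and vcat.
# Each call replays the same 1-D Sobol sequence (only affinely rescaled),
# so the rows are perfectly correlated and the points fall on a line.
buggy = vcat(QuasiMonteCarlo.sample(300, [-1.0], [1.0], SobolSample()),
             QuasiMonteCarlo.sample(300, [-2.0], [2.0], SobolSample()))

# Correct: one joint 2-D sample, which actually fills the rectangle.
good = QuasiMonteCarlo.sample(300, [-1.0, -2.0], [1.0, 2.0], SobolSample())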

@YichengDWu
Contributor Author

julia> a = QuasiMonteCarlo.sample(20, [0.0], [1.0], SobolSample())
1×20 Matrix{Float64}:
 0.09375  0.59375  0.84375  0.34375  0.46875  …  0.546875  0.796875  0.296875

julia> b = QuasiMonteCarlo.sample(20, [1.0], [2.0], SobolSample())
1×20 Matrix{Float64}:
 1.09375  1.59375  1.84375  1.34375  …  1.04688  1.54688  1.79688  1.29688

julia> a .+1 .== b
1×20 BitMatrix:
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

@ChrisRackauckas
Member

> NeuralPDE just samples along each dimension independently and vcats the results.

wtf? You definitely cannot do that for high-dimensional samplers. The whole point is how they handle the high-dimensional space as more than just a tensor product of 1-D samples.

> There is no randomness in the Sobol sequence?

It should be quasi-random. Check direct calls to https://github.com/stevengj/Sobol.jl (QuasiMonteCarlo.jl is just a wrapper over that).
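
For example, a direct check against Sobol.jl (a sketch, assuming its SobolSeq/next! API):

using Sobol

s = SobolSeq(2)     # one stateful 2-D low-discrepancy iterator
x1 = next!(s)       # first point of the sequence
x2 = next!(s)       # second point; the iterator advances

# A freshly constructed SobolSeq replays the sequence from the start:
t = SobolSeq(2)
@assert next!(t) == x1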

@YichengDWu
Contributor Author

@ChrisRackauckas
Member

Yup, both more complicated than just directly sampling, and incorrect. Good find, thank you. You handling this or should I?

@YichengDWu
Contributor Author

This doesn't look too hard; I'll try to fix it over the weekend.

@YichengDWu
Contributor Author

> It should be quasi-random.

QuasiMonteCarlo.jl is probably incorrect as well. A new SobolSeq is created inside sample, and that kills any variation between calls.

https://github.com/SciML/QuasiMonteCarlo.jl/blob/master/src/QuasiMonteCarlo.jl#L170
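
A sketch of the point (assuming Sobol.jl's API): because sample builds a fresh SobolSeq internally, every call replays the sequence from the start, whereas keeping one iterator alive would continue it:

using Sobol

seq = SobolSeq(1)

# Two batches drawn from the SAME iterator: the second batch continues
# where the first left off instead of replaying it.
batch1 = [next!(seq)[1] for _ in 1:10]
batch2 = [next!(seq)[1] for _ in 1:10]
@assert batch1 != batch2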

@ChrisRackauckas
Member

The Sobol sequence is a quasi-random series, not a pseudo-random series; there is no randomness in its construction. The only thing you could randomize is the index i at which you start in the sequence, but I'm not sure convergence is guaranteed if you don't start at the beginning, so I believe each integration would need to start from 1, making it a deterministic low-discrepancy sequence.

Latin Hypercubes, in contrast, do have randomness in their construction, because there are (finitely) many different Latin Hypercubes one can construct to tile n dimensions. So we should make sure that one showcases the correct randomness, while the Sobol sequence does not necessarily need to.
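
The contrast is easy to check (a hypothetical snippet using the QuasiMonteCarlo.sample call from above):

using QuasiMonteCarlo

# Sobol: fully deterministic, identical across calls.
a = QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
b = QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
@assert a == b

# Latin hypercube: the construction involves random permutations,
# so repeated calls should give different point sets.
c = QuasiMonteCarlo.sample(10, [0.0], [1.0], LatinHypercubeSample())
d = QuasiMonteCarlo.sample(10, [0.0], [1.0], LatinHypercubeSample())
@assert c != d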

@YichengDWu
Contributor Author

It doesn't make any difference whether the SobolSeq is reconstructed if you only sample once. But for resampling, I think it should iterate from the end of the last sequence. Although this setup seems odd to me, the data is resampled on each call to the loss function: https://github.com/SciML/NeuralPDE.jl/blob/master/src/training_strategies.jl#L222.

@YichengDWu
Contributor Author

resampling=true and resampling=false currently behave the same for SobolSample().

@ChrisRackauckas
Member

> But for resampling, I think it should iterate from the end of the last sequence.

Reference on whether that gives a convergent integration? There are high-dimensional sequences, like sparse grids, which are deterministic tilings that require you not cut off the start to get a convergent integration. I don't know whether Sobol has that property.

@YichengDWu
Contributor Author

I'm reading https://arxiv.org/pdf/2207.10289.pdf. I'm just saying that resampling needs to generate different points, and that seems to work well for PINNs in the paper.

@ChrisRackauckas
Member

Yes, resampling is required for a PINN to get a good loss sampling; otherwise things overfit. However, that doesn't actually answer the question. The QMC methods are supposed to be convergent integration schemes: is starting from n a convergent integration scheme for Sobol? For sparse grids, the answer is no; I am not sure about Sobol. If it's not, then it's not a good sampling of the loss function. If it is, then having a mutable counter inside of SobolSample that updates the starting location would be a good idea.
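
A sketch of that idea (hypothetical type, not the actual QuasiMonteCarlo/NeuralPDE API), assuming Sobol.jl underneath:

using Sobol

# Hypothetical stateful sampler: holds one SobolSeq plus a counter so
# each resampling call starts where the previous batch ended.
mutable struct StatefulSobol
    seq::SobolSeq
    count::Int
end
StatefulSobol(d::Integer) = StatefulSobol(SobolSeq(d), 0)

function sample!(s::StatefulSobol, n, lb, ub)
    pts = reduce(hcat, (next!(s.seq) for _ in 1:n))  # d × n points in [0, 1]^d
    s.count += n
    return lb .+ (ub .- lb) .* pts                   # rescale to the box [lb, ub]
end

sampler = StatefulSobol(2)
batch1 = sample!(sampler, 100, [-1.0, -2.0], [1.0, 2.0])
batch2 = sample!(sampler, 100, [-1.0, -2.0], [1.0, 2.0])  # continues the sequence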

Also, I wouldn't take the LHS result too far; it doesn't seem to use the same optimization: https://mrurq.github.io/LatinHypercubeSampling.jl/stable/man/lhcoptim/ vs https://scikit-optimize.github.io/stable/modules/generated/skopt.sampler.Lhs.html.
