Is -DLARGE_BOND=ON always needed? #81
Hi, thanks for reporting the problem. For the case where you get zero correlation energies, please provide (1) the complete Python script and (2) the complete output file with at least …
OK. I re-submitted the job with …
Thanks for providing the files. I tested your Python script but got different outputs for the MRCISD part. As MRCISD was purely implemented using …
Notes:

```python
# diag overlap rsabtupq* size = 65536
assert abs(np.linalg.norm(xn) - 485.73353755963177) < 1E-5
assert abs(np.linalg.norm(umats[key]) - 3300.4487602047843) < 1E-5
assert umats[key].shape == (65536, 32896)
# diag overlap rabctpqg* size = 65536
assert abs(np.linalg.norm(xn) - 558.281417562479) < 1E-5
assert abs(np.linalg.norm(umats[key]) - 53532.995978565734) < 1E-5
assert umats[key].shape == (65536, 65280)
```

Note that …
You said that it "requires more than 500 GB memory". But my compute node only has around 180 GB of memory (or only 180 GB was allowed in my previous Slurm script). So I switched to a very large node (with 1.4 TB of total memory allowed) and modified the Python code:

```python
lib.logger.debug(ic, 'diag overlap %s size = %d', key, len(xn))
w, v = np.linalg.eigh(xn)
idx = w > ic.mrci_thrds
umats[key] = v[:, idx] * (w[idx] ** (-0.5))
ntr += umats[key].shape[1]  # nothing changed until here
print(np.linalg.norm(xn))
print(np.linalg.norm(umats[key]))
print(umats[key].shape)
```

Here are the input and output files of this job (again, the correlation energy is zero). Python 3.9.13 and … were used. A computation using a more recent numpy version is still running.
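For context, the `v[:, idx] * (w[idx] ** (-0.5))` step above is canonical orthogonalization of the overlap matrix: the retained columns `u` should satisfy `u.T @ xn @ u = I`. A minimal numpy sketch of that invariant, using a small random rank-deficient symmetric matrix in place of the real overlap (all names and sizes here are illustrative, not taken from block2):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((50, 20))
xn = a @ a.T                        # rank-20 symmetric PSD "overlap" (50 x 50)
thrds = 1e-10
w, v = np.linalg.eigh(xn)
idx = w > thrds                     # drop the (near-)null space
u = v[:, idx] * (w[idx] ** (-0.5))  # canonical orthogonalization
# In the orthonormalized basis the overlap becomes the identity:
print(np.allclose(u.T @ xn @ u, np.eye(idx.sum()), atol=1e-8))
```

If `np.linalg.eigh` silently returns garbage (as suspected later in this thread), this identity check fails, which makes it a quick sanity test.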
Thanks for the feedback.
What I said was "The last step requires the full diagonalization of a 98177 x 98177 matrix ... which requires more than 500 GB memory". In your output, the number in "HMAT basis size = 12957" was wrong, so you did not get the 98177 x 98177 matrix with the correct shape. Let's first solve the problem of the wrong "HMAT basis size"; then one can worry about the memory requirement. Your new output showed that the norm of the overlap matrix …
I created a new virtual environment with …
This time, the job terminated abnormally without any error information. Here are the input and output files of this job. I'll try other versions of Python and numpy.
Thanks very much for the help with the testing. To simplify the checking a little, could you run the following Python script in your environment (any one that you have used previously) and let me know the output? You may consider two cases: …
```python
import numpy as np, time
np.random.seed(1234)
n = 2 ** 16
mrci_thrds = 1E-10
wmat = np.random.random(n)
wmat[:n // 2] *= mrci_thrds
wmat[n // 2:] += mrci_thrds
np.random.shuffle(wmat)
pmat = np.random.random((n, n))
umat = np.linalg.qr(pmat)[0]
xmat = (umat * wmat[None]) @ umat.T
t1 = time.perf_counter()
w, v = np.linalg.eigh(xmat)
t2 = time.perf_counter()
idx = w > mrci_thrds
u = v[:, idx] * (w[idx] ** (-0.5))
t3 = time.perf_counter()
print('psum = ', np.sum(pmat))
print('pnorm = ', np.linalg.norm(pmat))
print('pcubic = ', np.sum(pmat ** 3))
print('xsum = ', np.sum(xmat))
print('xnorm = ', np.linalg.norm(xmat))
print('xcubic = ', np.sum(xmat ** 3))
print('vsum = ', np.sum(v))
print('vnorm = ', np.linalg.norm(v))
print('vcubic = ', np.sum(v ** 3))
print('ushape = ', u.shape)
print('usum = ', np.sum(u))
print('unorm = ', np.linalg.norm(u))
print('ucubic = ', np.sum(u ** 3))
print('wnorm = ', np.linalg.norm(w))
print(w[:4], '\n', w[-4:], '\n', idx[:10], '\n', idx[-10:])
print('time = ', t2 - t1, t3 - t2)
```
The above script used random matrices, which may not match the actual situation. The following script can exactly reproduce the data that you will have in DMRG-FIC-MRCISD (only for the … part). Please use the following script for testing instead of the one provided in the previous reply.

```python
import numpy as np, time
nact, nvirt, nelec = 16, 16, 16
mrci_thrds = 1E-10
deltaAA = np.eye(nact)
deltaEE = np.eye(nvirt)
E2 = np.zeros((nact, nact, nact, nact), order='C')
with open("06-twopdm.txt", "r") as f:
    E2.reshape(-1)[:] = [float(l.strip().split()[-1]) for l in f.readlines()[1:]]
E2 = np.ascontiguousarray(2.0 * E2.transpose(0, 1, 3, 2))
xmat = np.zeros((nvirt, nvirt, nact, nact, nvirt, nvirt, nact, nact))
xmat += np.einsum('ur,ts,pqba->rsabtupq', deltaEE, deltaEE, E2, optimize=True)
xmat += np.einsum('us,tr,pqab->rsabtupq', deltaEE, deltaEE, E2, optimize=True)
xmat = xmat.reshape((-1, nvirt * nvirt * nact * nact))
print('xsum = ', np.sum(xmat))
print('xnorm = ', np.linalg.norm(xmat))
print('xcubic = ', np.sum(xmat ** 3))
t1 = time.perf_counter()
w, v = np.linalg.eigh(xmat)
t2 = time.perf_counter()
idx = w > mrci_thrds
u = v[:, idx] * (w[idx] ** (-0.5))
t3 = time.perf_counter()
print('vsum = ', np.sum(v))
print('vnorm = ', np.linalg.norm(v))
print('vcubic = ', np.sum(v ** 3))
print('ushape = ', u.shape)
print('usum = ', np.sum(u))
print('unorm = ', np.linalg.norm(u))
print('ucubic = ', np.sum(u ** 3))
print('wnorm = ', np.linalg.norm(w))
print(w[:4], '\n', w[-4:], '\n', idx[:10], '\n', idx[-10:])
print('time = ', t2 - t1, t3 - t2)
```

The reference correct output for the above: …
Here is the output using the Python script and 06-twopdm.txt you provided:
It can be seen that our results differ, since the …
Thanks for the feedback. Your output shows that the data in …

```python
import block2
w, v = np.zeros(len(xmat)), np.copy(xmat)
block2.MatrixFunctions.eigs(v, w)
v = v.T
```

If 3 or 4 works ("works" means that you get non-zero …
It seems that if you are using the Intel Distribution for Python, the problem you see is related to this: https://community.intel.com/t5/Intel-Distribution-for-Python/numpy-linalg-eigh-fails-for-large-matrices/m-p/1540256. The page shows that the last reply was on 01-25-2024, so I guess they have not yet had a chance to release the fix. Besides the suggestions in the above reply, you can switch to the normal Python implementation, which does not have this bug.
I use Anaconda Python 3 (a large offline package downloaded from the Anaconda website or its mirror), and it already contains …
It can be found that …
So it seems the Anaconda Python is indeed used. I'll create a new virtual environment (with the same Python and scipy) and use …
I've run some tests for the 06-twopdm.txt case: …
I suspect this could be a similar issue to numpy/numpy#22590. But I can't explain why numpy from pip behaves differently (because OpenBLAS is different from MKL?).
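To see which BLAS/LAPACK implementation (MKL vs OpenBLAS) a given numpy actually links against, one can inspect its build configuration. A quick sketch (the exact output format varies across numpy versions, but the library names appear in it):

```python
import numpy as np

print(np.__version__)
# Prints the linked BLAS/LAPACK libraries; look for "mkl" vs "openblas".
np.show_config()
```

Running this in each environment (conda vs pip) makes it clear which backend each numpy is using.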
To check whether the Python is Intel or Anaconda, one can start the Python interpreter interactively in the terminal. If it prints the following, it is Anaconda Python (the Python/GCC version and the date are not important):

```
Python 3.8.18 (default, Sep 11 2023, 13:40:15)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
```

If it prints the following, it is Intel Python:

```
Python 3.9.16 (default, Jun 4 2021, 06:52:02)
[GCC 9.3.0] :: Intel Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
```

Note that even if you have Anaconda Python, you may end up with an Intel implementation of …
The bug is in the Intel implementation of …
Update: using scipy from conda also gives the right result but takes 5x more time. The reason may be that scipy.linalg.eigh uses the dsyevr solver, which is more robust but slower (compared with numpy's dsyevd solver).
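The solver difference can be exercised directly: `scipy.linalg.eigh` exposes a `driver` argument (`'evd'` corresponds to LAPACK dsyevd, `'evr'` to dsyevr), so both code paths can be compared on the same matrix. A small sketch, assuming scipy is available (matrix size and seed are arbitrary):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(42)
a = rng.standard_normal((200, 200))
x = (a + a.T) / 2                    # symmetric test matrix

w_evd = eigh(x, eigvals_only=True, driver='evd')  # divide-and-conquer (dsyevd)
w_evr = eigh(x, eigvals_only=True, driver='evr')  # RRR solver (dsyevr)
# With a healthy LAPACK, both drivers return the same spectrum:
print(np.allclose(w_evd, w_evr))
```

On a buggy MKL build, comparing the two drivers (or numpy's eigh against scipy's) on the failing matrix size is a direct way to isolate which solver is broken.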
Thanks for the information. So yes, numpy from pip should be a good solution for this issue. Let's see if this solves @1234zou's original question.
I tried in the corresponding virtual environment, and it is Anaconda Python.
But as you said, the …
Here are the input and output files using … I do not know whether the numerical value of …
Thanks for the update.
I can confirm that …
It is okay to see different … Please go ahead and redo the DMRG-FIC-MRCISD calculation, and let me know whether the original problem can be solved. Remember that the whole calculation will need more than 500 GB of memory, because the last step requires the full diagonalization of a 98177 x 98177 matrix to get the correlation energy.
With the correct installation of …
As this does not seem to be a problem of …
Dear Huanchen,
I've performed a DMRG-FIC-MRCISD calculation for the linear H16 chain using the 3-21G basis set. This system has 0 core orbitals, 16 active orbitals, and 16 virtual orbitals; the obtained correlation energy is zero.
However, when I calculated a smaller system using only 2 virtual orbitals, the result seemed normal.
My guess is that I did not add -DLARGE_BOND=ON when compiling block2, so some integers are beyond their range. Is there any criterion to determine whether -DLARGE_BOND=ON is needed for a large active space like (16,16)?

Best,
Jingxiang
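A rough way to quantify the integer-range guess above is to check whether the element counts involved exceed the signed 32-bit limit. This is a hypothetical back-of-the-envelope sketch only; which quantities block2 actually indexes with 32-bit integers without `-DLARGE_BOND=ON` is an assumption here, not something established in this thread:

```python
# Hypothetical overflow check: compare illustrative element counts
# against the signed 32-bit limit.  The quantities below mirror the
# (16,16) active space with 16 virtual orbitals discussed above; they
# are not block2's actual internal indices.
nact, nvirt = 16, 16
int32_max = 2 ** 31 - 1
n_basis = (nvirt * nact) ** 2   # 65536: rows of the rsab-type overlap block
n_elems = n_basis * n_basis     # elements of the full 65536 x 65536 matrix
print(n_basis, n_elems, n_elems > int32_max)
```

By this (assumed) criterion, the element count of a 65536 x 65536 matrix already exceeds the 32-bit range, while the 2-virtual-orbital system stays well below it.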