
DAOS-17344 build: Add dependencies as submodules #16162


Merged: 54 commits, Apr 19, 2025
Commits (54)
54937f3
DAOS-17344 build: Enhance SCons options
jolivier23 Mar 26, 2025
3f3141b
DAOS-17344 build: Add dependencies as submodules
jolivier23 Mar 27, 2025
f40a625
fix pylint issue
jolivier23 Mar 27, 2025
70bad8f
Merge branch 'jvolivie/fix_copy' into jvolivie/sub_deps
jolivier23 Mar 27, 2025
98fb972
Fix scripts
jolivier23 Mar 27, 2025
d65c5a8
Don't copy patches already in right place
jolivier23 Mar 27, 2025
d5b9c54
Add argobots patches and fix script
jolivier23 Mar 27, 2025
93fdf2a
Fix isort linting issue
jolivier23 Mar 27, 2025
ef33152
Get the submodules
jolivier23 Mar 27, 2025
b92318d
Fix spdk issue
jolivier23 Mar 27, 2025
8ae8f96
Merge branch 'jvolivie/fix_copy' into jvolivie/sub_deps
jolivier23 Mar 27, 2025
45100d1
Extra commit after merge
jolivier23 Mar 27, 2025
a74e861
Fix archive script
jolivier23 Mar 28, 2025
a05ae66
Fix archive script
jolivier23 Mar 28, 2025
fce0ae4
Merge branch 'jvolivie/fix_copy' into jvolivie/sub_deps
jolivier23 Mar 28, 2025
7a400e8
DAOS-504 pool: Add backoffs to rsvc operations (#16074)
liw Apr 3, 2025
457582b
Fix build issue
jolivier23 Apr 3, 2025
dc364fe
Merge remote-tracking branch 'comm/master' into jvolivie/fix_copy
jolivier23 Apr 3, 2025
8b13f7a
Extra commit
jolivier23 Apr 3, 2025
f73565f
Merge branch 'jvolivie/fix_copy' into jvolivie/sub_deps
jolivier23 Apr 3, 2025
b492a19
Empty commit
jolivier23 Apr 3, 2025
e4d0738
Try removing deps from bandit
jolivier23 Apr 3, 2025
1a6ea6a
Try putting submodule checkout in pipeline-lib
jolivier23 Apr 3, 2025
7918ff4
Revert "Try putting submodule checkout in pipeline-lib"
jolivier23 Apr 3, 2025
4f8c078
Try another option for bandit
jolivier23 Apr 4, 2025
62a9be5
Merge remote-tracking branch 'comm/master' into jvolivie/sub_deps
jolivier23 Apr 4, 2025
4737670
Test doing a recurive submodule update in dockerfile
jolivier23 Apr 4, 2025
76fc23c
That didn't work
jolivier23 Apr 4, 2025
013101b
Try disabling bandit folders in pyproject
jolivier23 Apr 4, 2025
e85d704
Try adding checkoutScm
jolivier23 Apr 4, 2025
d9823f5
try checkout in expression
jolivier23 Apr 4, 2025
ac36b59
Try removing dockerignore
jolivier23 Apr 4, 2025
41cac79
Try another approach
jolivier23 Apr 4, 2025
454d34e
Try proper checkout
jolivier23 Apr 4, 2025
3747b43
Another attempt at this
jolivier23 Apr 4, 2025
cd4b3aa
another attempt
jolivier23 Apr 4, 2025
3cd66c7
Merge remote-tracking branch 'comm/master' into jvolivie/sub_deps
jolivier23 Apr 9, 2025
b524778
Try another checkout
jolivier23 Apr 9, 2025
fd59638
See if bandit check is messing stuff up
jolivier23 Apr 9, 2025
a65c419
Merge remote-tracking branch 'comm/master' into jvolivie/sub_deps
jolivier23 Apr 12, 2025
115c210
Try removing pr repos arg
jolivier23 Apr 12, 2025
9114164
Fix bandit
jolivier23 Apr 12, 2025
2d7c30b
Get stuff
jolivier23 Apr 12, 2025
53d2f10
Re-push without checkout step
jolivier23 Apr 14, 2025
336ceab
Merge remote-tracking branch 'comm/master' into jvolivie/sub_deps
jolivier23 Apr 14, 2025
1020d5a
Add empty commit for testing
jolivier23 Apr 14, 2025
7a7d64a
Do this one at a time
jolivier23 Apr 14, 2025
a6e61ae
Try bandit fix
jolivier23 Apr 14, 2025
f804a1f
Not sure why more logs are used
jolivier23 Apr 15, 2025
c70b58a
Do recursive update in daos_build test
jolivier23 Apr 16, 2025
d843e47
Merge remote-tracking branch 'comm/master' into jvolivie/sub_deps
jolivier23 Apr 16, 2025
bb14766
Empty commit
jolivier23 Apr 16, 2025
5b90e18
Review suggestions
jolivier23 Apr 16, 2025
295f814
Merge remote-tracking branch 'comm/master' into jvolivie/sub_deps
jolivier23 Apr 17, 2025
Files changed
1 change: 1 addition & 0 deletions .clang-format-ignore
@@ -0,0 +1 @@
deps/**
2 changes: 1 addition & 1 deletion .dockerignore
@@ -5,8 +5,8 @@
# just generate noise and extra work for docker.
*
!src
!deps
!utils/build.config
!utils/*.patch
!utils/certs
!utils/ci
!utils/completion
3 changes: 3 additions & 0 deletions .github/workflows/landing-builds.yml
@@ -66,6 +66,7 @@ jobs:
- name: Checkout code
uses: actions/checkout@v4
with:
submodules: 'recursive'
fetch-depth: 500
- name: Setup git hash
run: ./ci/gha_helper.py --single
@@ -257,6 +258,7 @@
uses: actions/checkout@v4
with:
submodules: 'recursive'
fetch-depth: 500
- name: Build dependencies in image.
run: docker build . --file utils/docker/Dockerfile.${{ matrix.base }}
--build-arg DEPS_JOBS
@@ -345,6 +347,7 @@
uses: actions/checkout@v4
with:
submodules: 'recursive'
fetch-depth: 500
- name: Build dependencies in image.
run: docker build . --file utils/docker/Dockerfile.${{ matrix.base }}
--build-arg DEPS_JOBS
30 changes: 30 additions & 0 deletions .gitmodules
@@ -1,3 +1,33 @@
[submodule "raft"]
path = src/rdb/raft
url = https://github.com/daos-stack/raft.git
[submodule "argobots"]
path = deps/argobots
url = https://github.com/pmodels/argobots.git
[submodule "fused"]
path = deps/fused
url = https://github.com/daos-stack/fused.git
[submodule "ofi"]
path = deps/ofi
url = https://github.com/ofiwg/libfabric.git
[submodule "ucx"]
path = deps/ucx
url = https://github.com/openucx/ucx.git
[submodule "pmdk"]
path = deps/pmdk
url = https://github.com/pmem/pmdk.git
[submodule "isal"]
path = deps/isal
url = https://github.com/intel/isa-l.git
[submodule "isal_crypto"]
path = deps/isal_crypto
url = https://github.com/intel/isa-l_crypto.git
[submodule "protobufc"]
path = deps/protobufc
url = https://github.com/protobuf-c/protobuf-c.git
[submodule "spdk"]
path = deps/spdk
url = https://github.com/spdk/spdk.git
[submodule "mercury"]
path = deps/mercury
url = https://github.com/mercury-hpc/mercury.git
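
With the dependencies now tracked as git submodules, a fresh checkout has to pull them in before building; the workflow changes above do this through actions/checkout with submodules: 'recursive'. A minimal local equivalent is sketched below (assuming the usual daos-stack/daos clone URL; adjust for forks):

    # Clone the repository and all dependency submodules in one step
    git clone --recurse-submodules https://github.com/daos-stack/daos.git

    # Or, in an existing clone, fetch the submodules after the fact
    git submodule update --init --recursive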
1 change: 1 addition & 0 deletions .yamllint.yaml
@@ -35,3 +35,4 @@ ignore: |
/venv/
/build/
/install/
/deps/
1 change: 1 addition & 0 deletions ci/bandit.config
@@ -398,4 +398,5 @@ weak_cryptographic_key:
weak_key_size_rsa_high: 1024
weak_key_size_rsa_medium: 2048
include: ['*.py', "*SConstruct", '*SConscript']
exclude_dirs: [".venv", "./deps", "./utils/rpms/_topdir"]

4 changes: 1 addition & 3 deletions ci/python_bandit_check.sh
@@ -4,6 +4,4 @@ set -uex

git clean -dxf

bandit --format xml -o bandit.xml -r . \
--exclude ./utils/rpms/_topdir \
-c ci/bandit.config || true
bandit --format xml -o bandit.xml -r . -c ci/bandit.config || true
2 changes: 1 addition & 1 deletion ci/unit/test_nlt_node.sh
@@ -41,5 +41,5 @@ pip install /opt/daos/lib/daos/python/
sudo prlimit --nofile=1024:262144 --pid $$
prlimit -n

HTTPS_PROXY="${HTTPS_PROXY:-}" ./utils/node_local_test.py --max-log-size 1900MiB \
HTTPS_PROXY="${HTTPS_PROXY:-}" ./utils/node_local_test.py --max-log-size 1950MiB \
--dfuse-dir /localhome/jenkins/ --log-usage-save nltir.xml --log-usage-export nltr.json all
1 change: 1 addition & 0 deletions deps/argobots
Submodule argobots added at 6d216a
1 change: 1 addition & 0 deletions deps/fused
Submodule fused added at ebb74d
1 change: 1 addition & 0 deletions deps/isal
Submodule isal added at 2df39c
1 change: 1 addition & 0 deletions deps/isal_crypto
Submodule isal_crypto added at 175708
1 change: 1 addition & 0 deletions deps/mercury
Submodule mercury added at 27d7c2
1 change: 1 addition & 0 deletions deps/ofi
Submodule ofi added at 159219
14 changes: 14 additions & 0 deletions (new argobots patch; full path not shown)
@@ -0,0 +1,14 @@
diff --git a/src/info.c b/src/info.c
index 4127edf1..5e5bb4b8 100644
--- a/src/info.c
+++ b/src/info.c
@@ -1097,7 +1097,8 @@ void ABTI_info_check_print_all_thread_stacks(void)

/* Decrement the barrier value. */
int dec_value = ABTD_atomic_fetch_sub_int(&print_stack_barrier, 1);
- if (dec_value == 0) {
+ /* previous value should be 1 ! */
+ if (dec_value == 1) {
/* The last execution stream resets the flag. */
ABTD_atomic_release_store_int(&print_stack_flag,
PRINT_STACK_FLAG_UNSET);
19 changes: 19 additions & 0 deletions (new argobots patch; full path not shown)
@@ -0,0 +1,19 @@
diff --git a/configure.ac b/configure.ac
index 9c5e4739..8e4f134b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -259,6 +259,14 @@ AC_ARG_WITH([libunwind],
AS_HELP_STRING([--with-libunwind=PATH],
[specify path where libunwind include directory and lib directory can be found]))

+# --enable-stack-unwind
+AC_ARG_ENABLE([stack-unwind],
+[ --enable-stack-unwind@<:@=OPTS@:>@ enable stack unwinding, which is disabled by default.
+ yes|verbose - enable stack unwinding. Dump the raw stack information too
+ unwind-only - enable stack unwinding. Do not dump the raw stack information
+ no|none - disable stack unwinding
+],,[enable_stack_unwind=no])
+
# --with-papi
AC_ARG_WITH([papi],
AS_HELP_STRING([--with-papi=PATH],
110 changes: 110 additions & 0 deletions deps/patches/mercury/0001_na_ucx.patch
@@ -0,0 +1,110 @@
diff --git a/src/na/na_ucx.c b/src/na/na_ucx.c
index 84eb8b0..e4b6676 100644
--- a/src/na/na_ucx.c
+++ b/src/na/na_ucx.c
@@ -614,7 +614,7 @@ na_ucx_addr_map_update(struct na_ucx_class *na_ucx_class,
*/
static na_return_t
na_ucx_addr_map_remove(
- struct na_ucx_map *na_ucx_map, ucs_sock_addr_t *addr_key);
+ struct na_ucx_map *na_ucx_map, struct na_ucx_addr *remove_addr);

/**
* Hash connection ID.
@@ -1688,8 +1688,12 @@ na_ucp_listener_conn_cb(ucp_conn_request_h conn_request, void *arg)
.addr = (const struct sockaddr *) &conn_request_attrs.client_address,
.addrlen = sizeof(conn_request_attrs.client_address)};
na_ucx_addr = na_ucx_addr_map_lookup(&na_ucx_class->addr_map, &addr_key);
- NA_CHECK_SUBSYS_ERROR_NORET(addr, na_ucx_addr != NULL, error,
- "An entry is already present for this address");
+
+ if (na_ucx_addr != NULL) {
+ NA_LOG_SUBSYS_WARNING(addr,
+ "An entry is already present for this address");
+ na_ucx_addr_map_remove(&na_ucx_class->addr_map, na_ucx_addr);
+ }

/* Insert new entry and create new address */
na_ret = na_ucx_addr_map_insert(na_ucx_class, &na_ucx_class->addr_map,
@@ -1937,10 +1941,14 @@ na_ucp_ep_error_cb(
static void
na_ucp_ep_close(ucp_ep_h ep)
{
- ucs_status_ptr_t status_ptr = ucp_ep_close_nb(ep, UCP_EP_CLOSE_MODE_FORCE);
+ const ucp_request_param_t close_params = {
+ .op_attr_mask = UCP_OP_ATTR_FIELD_FLAGS,
+ .flags = UCP_EP_CLOSE_FLAG_FORCE};
+ ucs_status_ptr_t status_ptr = ucp_ep_close_nbx(ep, &close_params);
+
NA_CHECK_SUBSYS_ERROR_DONE(addr,
status_ptr != NULL && UCS_PTR_IS_ERR(status_ptr),
- "ucp_ep_close_nb() failed (%s)",
+ "ucp_ep_close_nbx() failed (%s)",
ucs_status_string(UCS_PTR_STATUS(status_ptr)));
}

@@ -2722,7 +2730,7 @@ unlock:

/*---------------------------------------------------------------------------*/
static na_return_t
-na_ucx_addr_map_remove(struct na_ucx_map *na_ucx_map, ucs_sock_addr_t *addr_key)
+na_ucx_addr_map_remove(struct na_ucx_map *na_ucx_map, struct na_ucx_addr *remove_addr)
{
struct na_ucx_addr *na_ucx_addr = NULL;
na_return_t ret = NA_SUCCESS;
@@ -2731,13 +2739,14 @@ na_ucx_addr_map_remove(struct na_ucx_map *na_ucx_map, ucs_sock_addr_t *addr_key)
hg_thread_rwlock_wrlock(&na_ucx_map->lock);

na_ucx_addr = hg_hash_table_lookup(
- na_ucx_map->key_map, (hg_hash_table_key_t) addr_key);
- if (na_ucx_addr == HG_HASH_TABLE_NULL)
+ na_ucx_map->key_map, (hg_hash_table_key_t) &remove_addr->addr_key);
+
+ if (na_ucx_addr == HG_HASH_TABLE_NULL || na_ucx_addr->ucp_ep != remove_addr->ucp_ep)
goto unlock;

/* Remove addr key from primary map */
rc = hg_hash_table_remove(
- na_ucx_map->key_map, (hg_hash_table_key_t) addr_key);
+ na_ucx_map->key_map, (hg_hash_table_key_t) &na_ucx_addr->addr_key);
NA_CHECK_SUBSYS_ERROR(addr, rc != 1, unlock, ret, NA_NOENTRY,
"hg_hash_table_remove() failed");

@@ -2841,7 +2850,7 @@ na_ucx_addr_release(struct na_ucx_addr *na_ucx_addr)
NA_UCX_PRINT_ADDR_KEY_INFO("Removing address", &na_ucx_addr->addr_key);

na_ucx_addr_map_remove(
- &na_ucx_addr->na_ucx_class->addr_map, &na_ucx_addr->addr_key);
+ &na_ucx_addr->na_ucx_class->addr_map, na_ucx_addr);
}

if (na_ucx_addr->ucp_ep != NULL) {
@@ -3023,6 +3032,18 @@ na_ucx_rma(struct na_ucx_class NA_UNUSED *na_ucx_class, na_context_t *context,

/* There is no need to have a fully resolved address to start an RMA.
* This is only necessary for two-sided communication. */
+ /* The above assumption is now in question, so the following will resolve
+ * the address if required. */
+
+ /* Check addr to ensure the EP for that addr is still valid */
+ if (!(hg_atomic_get32(&na_ucx_addr->status) & NA_UCX_ADDR_RESOLVED)) {
+ ret = na_ucx_addr_map_update(
+ na_ucx_class, &na_ucx_class->addr_map, na_ucx_addr);
+ NA_CHECK_SUBSYS_NA_ERROR(
+ addr, error, ret, "Could not update NA UCX address");
+ }
+ NA_CHECK_SUBSYS_ERROR(msg, na_ucx_addr->ucp_ep == NULL, error, ret,
+ NA_ADDRNOTAVAIL, "UCP endpoint is NULL for that address");

/* TODO UCX requires the remote key to be bound to the origin, do we need a
* new API? */
@@ -3061,6 +3082,9 @@ na_ucx_rma_key_resolve(ucp_ep_h ep, struct na_ucx_mem_handle *na_ucx_mem_handle,

hg_thread_mutex_lock(&na_ucx_mem_handle->rkey_unpack_lock);

+ NA_CHECK_SUBSYS_ERROR(
+ mem, ep == NULL, error, ret, NA_INVALID_ARG, "Invalid endpoint (%p)", ep);
+
switch (hg_atomic_get32(&na_ucx_mem_handle->type)) {
case NA_UCX_MEM_HANDLE_REMOTE_PACKED: {
ucs_status_t status = ucp_ep_rkey_unpack(ep,
64 changes: 64 additions & 0 deletions deps/patches/mercury/0002_na_ucx_ep_flush.patch
@@ -0,0 +1,64 @@
diff --git a/src/na/na_ucx.c b/src/na/na_ucx.c
index 6e9c3b0..2f157da 100644
--- a/src/na/na_ucx.c
+++ b/src/na/na_ucx.c
@@ -441,6 +441,12 @@ na_ucp_ep_create(ucp_worker_h worker, ucp_ep_params_t *ep_params,
static void
na_ucp_ep_error_cb(void *arg, ucp_ep_h ep, ucs_status_t status);

+/**
+ * Flush endpoint.
+ */
+static ucs_status_ptr_t
+na_ucp_ep_flush(ucp_ep_h ep);
+
/**
* Close endpoint.
*/
@@ -1940,6 +1946,21 @@ na_ucp_ep_error_cb(
na_ucx_addr_ref_decr(na_ucx_addr);
}

+/*---------------------------------------------------------------------------*/
+static ucs_status_ptr_t
+na_ucp_ep_flush(ucp_ep_h ep)
+{
+ const ucp_request_param_t flush_params = {
+ .op_attr_mask = 0};
+ ucs_status_ptr_t status_ptr = ucp_ep_flush_nbx(ep, &flush_params);
+
+ NA_CHECK_SUBSYS_ERROR_DONE(addr,
+ status_ptr != NULL && UCS_PTR_IS_ERR(status_ptr),
+ "ucp_ep_flush_nb() failed (%s)",
+ ucs_status_string(UCS_PTR_STATUS(status_ptr)));
+ return status_ptr;
+}
+
/*---------------------------------------------------------------------------*/
static void
na_ucp_ep_close(ucp_ep_h ep)
@@ -2859,8 +2880,23 @@ na_ucx_addr_release(struct na_ucx_addr *na_ucx_addr)
if (na_ucx_addr->ucp_ep != NULL) {
/* NB. for deserialized addresses that are not "connected" addresses, do
* not close the EP */
- if (na_ucx_addr->worker_addr == NULL)
+ if (na_ucx_addr->worker_addr == NULL) {
+ if (!na_ucx_addr->na_ucx_class->ucp_listener) {
+ ucs_status_ptr_t status_ptr = na_ucp_ep_flush(na_ucx_addr->ucp_ep);
+
+ if (UCS_PTR_IS_PTR(status_ptr)) {
+ ucs_status_t status;
+
+ do {
+ ucp_worker_progress(na_ucx_addr->na_ucx_class->ucp_worker);
+ status = ucp_request_check_status(status_ptr);
+ } while (status == UCS_INPROGRESS);
+ ucp_request_free(status_ptr);
+ }
+ }
+
na_ucp_ep_close(na_ucx_addr->ucp_ep);
+ }
na_ucx_addr->ucp_ep = NULL;
}

12 changes: 12 additions & 0 deletions (new pmdk patch; full path not shown)
@@ -0,0 +1,12 @@
diff --git a/src/libpmem2/aarch64/init.c b/src/libpmem2/aarch64/init.c
index d4dd8812b21..f0b504b4b89 100644
--- a/src/libpmem2/aarch64/init.c
+++ b/src/libpmem2/aarch64/init.c
@@ -7,6 +7,7 @@

#include "auto_flush.h"
#include "flush.h"
+#include "log_internal.h"
#include "out.h"
#include "pmem2_arch.h"

61 changes: 61 additions & 0 deletions (new spdk patch; full path not shown)
@@ -0,0 +1,61 @@
diff --git a/scripts/setup.sh b/scripts/setup.sh
index d0c09430a6f..a56c74dd686 100755
--- a/scripts/setup.sh
+++ b/scripts/setup.sh
@@ -141,6 +141,10 @@ function linux_bind_driver() {

pci_dev_echo "$bdf" "$old_driver_name -> $driver_name"

+ if [[ $driver_name == "none" ]]; then
+ return 0
+ fi
+
echo "$ven_dev_id" > "/sys/bus/pci/drivers/$driver_name/new_id" 2> /dev/null || true
echo "$bdf" > "/sys/bus/pci/drivers/$driver_name/bind" 2> /dev/null || true

@@ -248,6 +252,17 @@ function collect_devices() {
if [[ $PCI_ALLOWED != *"$bdf"* ]]; then
pci_dev_echo "$bdf" "Skipping not allowed VMD controller at $bdf"
in_use=1
+ elif [[ " ${drivers_d[*]} " =~ "nvme" ]]; then
+ if [[ "${DRIVER_OVERRIDE}" != "none" ]]; then
+ if [ "$mode" == "config" ]; then
+ cat <<- MESSAGE
+ Binding new driver to VMD device. If there are NVMe SSDs behind the VMD endpoint
+ which are attached to the kernel NVMe driver,the binding process may go faster
+ if you first run this script with DRIVER_OVERRIDE="none" to unbind only the
+ NVMe SSDs, and then run again to unbind the VMD devices."
+ MESSAGE
+ fi
+ fi
fi
fi
fi
@@ -305,7 +320,9 @@ function configure_linux_pci() {
fi
fi

- if [[ -n "${DRIVER_OVERRIDE}" ]]; then
+ if [[ "${DRIVER_OVERRIDE}" == "none" ]]; then
+ driver_name=none
+ elif [[ -n "${DRIVER_OVERRIDE}" ]]; then
driver_path="$DRIVER_OVERRIDE"
driver_name="${DRIVER_OVERRIDE##*/}"
# modprobe and the sysfs don't use the .ko suffix.
@@ -337,10 +354,12 @@ function configure_linux_pci() {
fi

# modprobe assumes the directory of the module. If the user passes in a path, we should use insmod
- if [[ -n "$driver_path" ]]; then
- insmod $driver_path || true
- else
- modprobe $driver_name
+ if [[ $driver_name != "none" ]]; then
+ if [[ -n "$driver_path" ]]; then
+ insmod $driver_path || true
+ else
+ modprobe $driver_name
+ fi
fi

for bdf in "${!all_devices_d[@]}"; do
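
The setup.sh change above teaches the script to treat DRIVER_OVERRIDE="none" as "unbind only, do not bind a replacement driver". Based on the message text embedded in the patch, the intended usage is roughly the two-pass sequence below (a sketch; the sudo invocation and working directory are assumptions):

    # First pass: unbind the NVMe SSDs from the kernel nvme driver without rebinding them
    sudo DRIVER_OVERRIDE=none ./scripts/setup.sh

    # Second pass: run again so the VMD devices themselves are handled as usual
    sudo ./scripts/setup.sh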