What's Changed
New Tests
- Added Python model REST protocol test on triton for Modelmesh by @Raghul-M in #2070
- Added Python model gRPC protocol test on triton for Kserve by @Raghul-M in #2025
- Added Python model REST protocol test on triton for Kserve by @Raghul-M in #2009
- Added FIL model REST protocol test on triton for Kserve by @rpancham in #2018
- Added FIL model GRPC protocol test on triton for Kserve by @rpancham in #2028
- Added Tensorflow model REST protocol test on triton for Kserve by @Raghul-M in #1846
Enhancements
- RHOAIENG-15919 Refactor to reflect 'create workbench' UI changes (#2059) by @FedeAlonso in #2077
- [fix] This bumps the expected CUDA version for the workbench images by @jstourac in #2098
- [fix] of the GPU tests for the IDE module by @jstourac in #2102
- Fix Dashboard Smoke Tests for RHOAI 2.17 by @manosnoam in #2142
- Fix Dashboard Suites 0408, 0409, 0410 Tests and Keywords by @manosnoam in #2152
- Update AMD Operator and NFD install scripts by @bdattoma in #2139
- [fix] the IDE Elyra tests based on the RHOAI 2.17 Dashboard changes by @jstourac in #2166
- [fix] tests for the BYON feature by @jstourac in #2177
- Default GPU node replicas to 1 avoiding 0 nodes in SNO clusters by @bdattoma in #2167
- [fix] the Verify Notebook Has Not Restarted upgrade test by @jstourac in #2173
- Fix dsp tests on interop 2.17 & fix variable conflict in Permissions provoking xpath issue by @jgarciao in #2219
Other Changes
- Fix model mesh tests and update runtime images by @rnetser in #2060
- Add changes to handle RHOAI deployment from stage for SelfManaged by @aloganat in #2080
- Increase timeout for kuberay tests by @ChughShilpa in #2097
- Update kfp dependency to 2.10.1. Rebuilds pipeline samples by @jgarciao in #2088
- Increase codeflare-sdk tests timeout by @Srihari1192 in #2093
- Remove Suite variables to avoid overriding values provided by tests in pre/post upgrade tests by @lugi0 in #2078
- Moved fetch cluster type to RHOSi.resources and assign correct subscr… by @asanzgom in #2066
- Add image references to the Distributed Workload image digests by @sutaakar in #2089
- Add tag to disruptive tests so that they can be excluded from Operator suite runs by @mattmahoneyrh in #2079
- update caikit runtime image and runtime validate param by @tarukumar in #2106
- Smoke Test failure - Name fix for Runtime template by @Raghul-M in #2103
- Update images used in nvidia and rocm pipeline testing for 2.16 (master) by @jgarciao in #2086
- Chores on the upgrade test suites by @asanzgom in #2110
- Fix `Verify Model Can Be Deployed Via UI For Upgrade` by @rnetser in #2112
- Update chmod dir path in download model to pvc and increase timeout in non-admin use case by @rnetser in #2109
- Added AutomationBug Tag to RHOAIENG-14306 by @asanzgom in #2116
- add AutomationBug tag to RHOAIENG-14840 by @kobihk in #2121
- DW: Add test coverage for custom ROCm Ray image by @ChughShilpa in #2124
- Add AutomationBug Tag to Verify RHODS User Groups by @asanzgom in #2128
- Remove ROCm tag from CPU tests by @ChughShilpa in #2127
- Updated locators for Verify RHODS Accept Multiple Admin Groups And CR… by @asanzgom in #2130
- [model server] Update post upgrade expected inference response by @rnetser in #2135
- Replace AMD GPU community operator with RH certified one by @bdattoma in #2134
- Add DW UI test to Verify CPU and Memory resource usage Exceeds warning threshold by @Srihari1192 in #2138
- RHOAIENG-12192 - Extend DSP e2e tests by @jiripetrlik in #2129
- Replace all Xpaths with `pf-v5` to `pf-v6` by @manosnoam in #2140
- update the install type to Cli instead of CLi by @kobihk in #2144
- Add keyword to find ROSA_HCP environment by @ChughShilpa in #2143
- update path for KFTO and FMS tests by @ChughShilpa in #2156
- Add KFTO_MNIST training operator tests by @ChughShilpa in #2159
- Add keyword to find cluster type based on cluster infrastructure by @ChughShilpa in #2154
- Skip FMS Training operator tests for version less than 2.12.0 by @sutaakar in #2158
- Add Monitoring Tag to Test Cases by @asanzgom in #2163
- enhancement: add operator integration tag to test that interacts with the component status by @CFSNM in #2162
- Migrating Python model REST protocol test on triton for Kserve ( UI -> API ) by @Raghul-M in #2133
- enhancement: add monitoring test to check targets are up and running by @CFSNM in #2165
- Update DW UI tests to align with the latest Dashboard UI improvements by @Srihari1192 in #2160
- Migration of Python model kserve grpc testcase UI -> API by @Raghul-M in #2155
- Migration of Onnx model kserve Rest testcase UI -> API by @Raghul-M in #2176
- Migration of Pytorch Rest Protocol test on triton for Kserve (UI -> API) by @rpancham in #2172
- Migration of Onnx model kserve Grpc testcase UI -> API by @Raghul-M in #2178
- enhancement: add monitoring test to check rhoai dashboard metrics by code by @CFSNM in #2175
- Add new runtime images by @tarukumar in #2181
- Migration of Keras Rest Protocol test on triton for Kserve (UI -> API) by @rpancham in #2182
- Update Codeflare SDK release tag for 2.17 release by @jiripetrlik in #2184
- Update KFTO multi-node test names according to recent updates in orig… by @abhijeet-dhumal in #2164
- Migration of Dali model kserve Rest testcase UI -> API by @Raghul-M in #2193
- Onboard ODS-CI testing on OpenShift CI by @liswang89 in #2189
- Upgrade fms-hf-tuning image to 2.5.0 by @sutaakar in #2195
- Add option in ocm.py script to just check if the cluster exists by @bdattoma in #2196
- update KFTO tests to utilise storage bucket in case of disconnected e… by @abhijeet-dhumal in #2198
- Migration of TensorRT Rest Protocol test on triton for Kserve (UI -> API) by @rpancham in #2197
- Update dry_run.yml upload artifact action from version v3 -> v4 by @Raghul-M in #2206
- RHOAIENG-13260 - ODS-194: Verify RHOAI Dashboard metrics are defined by @MarianMacik in #2205
- Add KFTO huggingface trainer tests by @abhijeet-dhumal in #2199
- fix: solving disruptive tests issues due to operator reconciliation refactor changes by @CFSNM in #2208
- Migration of FIL Rest Protocol test on triton for Kserve (UI -> API) by @rpancham in #2204
- Update odh generic notebook images for RHOAI 2.17.0 by @abhijeet-dhumal in #2212
- post_upgrade: Add ExcludeOnDisconnected tag to the must-gather test by @MarianMacik in #2215
- Disable kueue to run ray e2e tests by @ChughShilpa in #2217
New Contributors
- @liswang89 made their first contribution in #2189
Full Changelog: 2.16.0...2.17.0