Symlinks are not found when loading configuration files #3972

Closed
ElenaKhaustova opened this issue Jun 28, 2024 · 3 comments
Labels
Issue: Bug Report 🐞 Bug that needs to be fixed

Comments

@ElenaKhaustova
Contributor

ElenaKhaustova commented Jun 28, 2024

Description

When parameters live in conf/base/sub_folder/parameters_A.yml and sub_folder is a symlink, the pipeline fails with the error below. The same parameters are read correctly from conf/base/parameters_A.yml.

ValueError: Pipeline input(s) {'params:length', 'params:width'} not found in the DataCatalog

Context

The issue is that the fsspec filesystem's glob() method (called as self._fs.glob), which we use to find the config paths recursively, doesn't find files behind symlinks.

for each in self._fs.glob(Path(f"{str(conf_path)}/{pattern}").as_posix()):

Steps to Reproduce

For a default spaceflights-pandas project, create a symlinked folder inside the conf directory and place parameters_data_science.yml in the folder the link points to.

Run the pipeline.
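
The layout from the steps above can be sketched in a throwaway directory, without a full Kedro project. This is only an illustration: the link target name shared_params, the helper make_symlinked_conf, and the file contents are assumptions, not part of the original report.

```python
import os
import tempfile
from pathlib import Path


def make_symlinked_conf(root: Path) -> Path:
    """Create conf/base/sub_folder as a symlink to a directory outside conf/."""
    # Hypothetical link target; any directory outside conf/ would do.
    real_dir = root / "shared_params"
    real_dir.mkdir(parents=True)
    # Placeholder contents, not the real spaceflights parameters.
    (real_dir / "parameters_data_science.yml").write_text("length: 1\nwidth: 2\n")

    conf_base = root / "conf" / "base"
    conf_base.mkdir(parents=True)
    os.symlink(real_dir, conf_base / "sub_folder", target_is_directory=True)
    return conf_base


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conf_base = make_symlinked_conf(Path(tmp))
        # The file is reachable through the link when the path is given explicitly.
        print(sorted(p.name for p in (conf_base / "sub_folder").iterdir()))
```

Pointing a project's conf source at a directory built like this should be enough to check whether the config loader picks up the parameters file.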

Expected Result

Symlinks are found when loading configuration files.

Your Environment

  • Kedro version used (pip show kedro or kedro -V): kedro, version 0.19.6
  • Python version used (python -V): Python 3.11.9
  • Operating system and version: macOS Sonoma version 14.5
@ElenaKhaustova
Contributor Author

Additional context from the user side:

  • they use symlinks in their assembler system on CX;
  • the inability to use symlinks blocks them from migrating to the latest Kedro version and/or away from kedro glass.

@noklam
Contributor

noklam commented Jul 3, 2024

For comparison, the Python documentation for glob.glob says:

If recursive is true, the pattern “**” will match any files and zero or more directories, subdirectories and symbolic links to directories. If the pattern is followed by an os.sep or os.altsep then files will not match.
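
The documented standard-library behaviour quoted above can be checked with a short script. This is a minimal sketch, assuming a POSIX-like system where creating symlinks is permitted; the helper find_with_stdlib_glob and the directory names are made up for illustration. It exercises only glob.glob from the standard library, not fsspec.

```python
import glob
import os
import tempfile
from pathlib import Path


def find_with_stdlib_glob(conf_path: Path, pattern: str) -> list[str]:
    """Recursively match `pattern` under `conf_path` using the standard library."""
    # With recursive=True, "**" also descends into symlinks to directories.
    return sorted(glob.glob(str(conf_path / pattern), recursive=True))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        target = root / "shared_params"  # hypothetical link target
        target.mkdir()
        (target / "parameters_A.yml").write_text("length: 1\n")

        conf_base = root / "conf" / "base"
        conf_base.mkdir(parents=True)
        os.symlink(target, conf_base / "sub_folder", target_is_directory=True)

        for match in find_with_stdlib_glob(conf_base, "**/parameters*"):
            print(match)
```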

@noklam
Contributor

noklam commented Jul 3, 2024

I created a PoC fix that seems to work for me: https://github.com/kedro-org/kedro/compare/noklam/glob-tmp-fix?expand=1
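
The contents of the linked branch are not reproduced here. Independently of it, one possible way to collect config files through symlinked folders on a local filesystem is os.walk with followlinks=True. The sketch below is an assumption-laden illustration, not Kedro's or fsspec's implementation; find_config_files and its parameters are hypothetical names.

```python
import fnmatch
import os
from pathlib import Path


def find_config_files(conf_path: str, name_pattern: str = "parameters*") -> list[str]:
    """Return POSIX-style paths of files under conf_path whose name matches name_pattern.

    followlinks=True makes os.walk descend into symlinked directories,
    which it skips by default.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(conf_path, followlinks=True):
        for filename in fnmatch.filter(filenames, name_pattern):
            matches.append(Path(dirpath, filename).as_posix())
    return sorted(matches)
```

Note that following links can recurse indefinitely if a symlink points back to one of its own parent directories, so a real fix would likely need a guard against cycles. This approach also only applies to local paths, whereas the fsspec-based loader may target remote filesystems.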

linear bot closed this as not planned on Jan 4, 2025