Fix git ls-file parsing in iter_gitworktree() #421

Merged
merged 3 commits into from
Jun 16, 2023

christian-monch
Contributor

This PR modifies the regular expression that iter_gitworktree() uses to parse the output of git ls-files.

The new version ensures that exactly one tab character is treated as the separator between the file info and the file name. File names that start with tab characters are therefore handled properly (the previous version would treat all tab characters, including those belonging to the name, as part of the file-info/file-name separator).

The PR includes a regression test that verifies correct handling of file names starting with tab characters.
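
To illustrate the separator problem described above, here is a minimal sketch. The two patterns are hypothetical and written only for this illustration; they are not copied from datalad-next. The sample line assumes the unquoted `<mode> <object> <stage><TAB><file>` layout that `git ls-files --stage -z` produces.

```python
import re

# Hypothetical patterns for illustration only; NOT the ones used in datalad-next.
# GREEDY: the stage field is matched with `.*`, so it can swallow tab characters
# and the separator becomes the LAST tab on the line.
GREEDY = re.compile(
    r"(?P<mode>[0-9]+) (?P<gitsha>[0-9a-f]+) (?P<stage>.*)\t(?P<fname>.*)$"
)
# STRICT: the stage field may not contain tabs, so exactly one tab (the first)
# separates the file info from the file name.
STRICT = re.compile(
    r"(?P<mode>[0-9]+) (?P<gitsha>[0-9a-f]+) (?P<stage>[^\t]*)\t(?P<fname>.*)$"
)


def parse_fname(pattern, line):
    """Return the file name captured by `pattern`, or None if nothing matches."""
    match = pattern.match(line)
    return match.group("fname") if match else None


# One record for a file whose name starts with a tab character.
line = "100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0\t\ttab-name.txt"

greedy_name = parse_fname(GREEDY, line)  # leading tab is lost
strict_name = parse_fname(STRICT, line)  # leading tab is preserved
```

With the greedy pattern the leading tab ends up in the stage field; with the strict pattern it stays part of the file name.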

This commit changes the regular expression that
analyses git output so that it consumes only a
single tab character before the file name. That
change makes it work with file names that start
with a tab character.
@christian-monch christian-monch requested a review from mih as a code owner June 15, 2023 10:13
@christian-monch christian-monch force-pushed the fix-git-lsfile-parsing branch 2 times, most recently from a6822fd to 938e29c Compare June 15, 2023 15:35
@codecov

codecov bot commented Jun 15, 2023

Codecov Report

Patch coverage: 92.85% and project coverage change: -0.05 ⚠️

Comparison is base (59a6317) 92.06% compared to head (d1e9efe) 92.01%.

❗ Current head d1e9efe differs from pull request most recent head bec45ae. Consider uploading reports for the commit bec45ae to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #421      +/-   ##
==========================================
- Coverage   92.06%   92.01%   -0.05%     
==========================================
  Files         122      122              
  Lines        9020     9033      +13     
==========================================
+ Hits         8304     8312       +8     
- Misses        716      721       +5     
Impacted Files Coverage Δ
datalad_next/iter_collections/gitworktree.py 100.00% <ø> (ø)
...ext/iter_collections/tests/test_itergitworktree.py 97.77% <92.85%> (-2.23%) ⬇️

... and 2 files with indirect coverage changes


@christian-monch christian-monch force-pushed the fix-git-lsfile-parsing branch 2 times, most recently from baa144e to fc2fcd5 Compare June 16, 2023 06:57
Member

@mih mih left a comment

Thanks. This looks good to me in general. However, the test conditions need to be adjusted. This is not about Windows, but about whether the FS can handle such filenames at all -- see the crippled FS test failures.

This commit adds a regression test to ensure
that file-names starting with a tab-character
are properly parsed from the output of
`git ls-files`.

The test instantiates PurePosixPath-instances
to check against names in the output of
`iter_gitworktree`, because `iter_gitworktree`
explicitly creates PurePosixPath-instances in
the function `_lsfiles_line2props`.
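
As a rough sketch of why comparing against PurePosixPath instances works for such names: `PurePosixPath` keeps a leading tab as part of the name and behaves the same on every platform. The helper `to_posix_paths` and the sample names below are hypothetical and only stand in for the values that `iter_gitworktree` would yield.

```python
from pathlib import PurePosixPath


def to_posix_paths(raw_names):
    """Wrap raw file-name strings in PurePosixPath (hypothetical helper)."""
    return [PurePosixPath(name) for name in raw_names]


# Made-up names standing in for parsed `git ls-files` records.
raw_names = ["\ttab-name.txt", "dir/\tnested.txt", "plain.txt"]
paths = to_posix_paths(raw_names)

# The leading tab survives as part of the final path component.
tab_path = PurePosixPath("\ttab-name.txt")
found = tab_path in paths
nested_name = paths[1].name
```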
@christian-monch christian-monch force-pushed the fix-git-lsfile-parsing branch from fc2fcd5 to d1e9efe Compare June 16, 2023 07:17
@christian-monch
Contributor Author

Crippled-FS conditions fixed now by using:

    if ds.repo.is_crippled_fs():
        pytest.skip("not applicable on crippled filesystems")
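
For comparison only, a more direct check of whether a filesystem accepts such names could look like the sketch below. This is not what the PR does (the PR relies on `ds.repo.is_crippled_fs()`); the function `fs_supports_leading_tab` is a hypothetical, stdlib-only helper.

```python
import tempfile
from pathlib import Path


def fs_supports_leading_tab(directory):
    """Try to create a file whose name starts with a tab (hypothetical probe)."""
    probe = Path(directory) / "\tprobe"
    try:
        probe.write_text("x")
    except OSError:
        # The filesystem (or OS) rejected the name outright.
        return False
    # Guard against filesystems that silently alter the name.
    listed = probe.name in {p.name for p in Path(directory).iterdir()}
    probe.unlink(missing_ok=True)  # requires Python 3.8+
    return listed


with tempfile.TemporaryDirectory() as tmp:
    supported = fs_supports_leading_tab(tmp)
```

A test could then skip itself when `supported` is false, in the same spirit as the crippled-FS condition above.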

@christian-monch christian-monch force-pushed the fix-git-lsfile-parsing branch from 87c3c4e to bec45ae Compare June 16, 2023 08:47
Member

@mih mih left a comment

LGTM! Thx!

@mih mih merged commit c3619bb into datalad:main Jun 16, 2023
@christian-monch christian-monch deleted the fix-git-lsfile-parsing branch July 16, 2024 10:12