Ability to exclude file patterns #35

Open
uptickmetachu opened this issue Feb 27, 2025 · 8 comments
Labels
enhancement New feature or request

Comments

@uptickmetachu

Is your feature request related to a problem? Please describe.

We get these errors: [error]: Too many open files : 'SOME_FILE.d.ts'

when using configcat scan.

Describe the solution you'd like

It would be good if we could exclude file patterns when using configcat scan.

Describe alternatives you've considered

none.

Additional context

uptickmetachu added the enhancement (New feature or request) label Feb 27, 2025
@laliconfigcat
Member

Hello @uptickmetachu,

Thanks for reaching out.

By default, the scan function skips files that match the ignore files in your repository.
The most commonly used one is .gitignore, but we also introduced a custom ignore file, .ccignore, so you can fine-tune your scan without affecting .gitignore:
https://github.com/configcat/cli/blob/main/src/ConfigCat.Cli.Services/FileSystem/Ignore/IgnoreFile.cs#L11

Could you please create a .ccignore file in the repository root and list the files you want to ignore, just as you would in a .gitignore file? I hope that solves the problem.

@uptickmetachu
Author

se]: /Users/williamchu/dev/workforce/frontend/services/mixpanel/src/hooks/use-mixpanel-api.ts - scan completed.
[verbose]: /Users/williamchu/dev/workforce/frontend/utils/datetime/node_modules/@ladle/react/node_modules/remark-gfm/node_modules/micromark-extension-gfm/node_modules/micromark-extension-gfm-table/dev/lib/html.d.ts - scan completed.
[verbose]: /Users/williamchu/dev/workforce/frontend/app-components/insights/node_modules/recharts/types/util/TickUtils.d.ts - scan completed.
[verbose]: /Users/williamchu/dev/workforce/frontend/app-components/insights/node_modules/recharts/types/shape/Cross.d.ts - scan completed.
[verbose]: /Users/williamchu/dev/workforce/frontend/app-components/insights/node_modules/recharts/types/util/ActiveShapeUtils.d.ts - scan completed.
[verbose]: /Users/williamchu/dev/workforce/frontend/app-components/insights/node_modules/recharts/lib/util/ChartUtils.js - scan completed.
[verbose]: /Users/williamchu/dev/workforce/frontend/app-components/insights/node_modules/recharts/types/context/chartLayoutContext.d.ts - scan completed.
[verbose]: /Users/williamchu/dev/workforce/frontend/app-components/insights/node_modules/recharts/types/shape/Rect

No combination in my .ccignore is getting node_modules ignored.

>> configcat --version
2.3.3

>> cat .ccignore
.*
**
*
d.ts
node_modules/**
.map
**node_modules**
*node_modules*
.*node_modules.*
node_modules/

@uptickmetachu
Author

Right; I think the Homebrew version doesn't have this code.

@laliconfigcat
Member

Hello @uptickmetachu,

I just tested it on my computer and the .ccignore file worked fine for me.
v2.3.3 is the latest version, so it should include the ignore functionality (https://github.com/configcat/cli/releases).

Could you please show me the full log, or at least the part after the HTTP response: 200 OK log message?
It should show whether the ignore file is being used, like this:

[Screenshot of the log output; image not preserved]

@uptickmetachu
Author

uptickmetachu commented Mar 2, 2025

Hi @laliconfigcat, thanks for your response.

I've managed to get the .gitignore/.ccignore picked up, and I now realise why it wasn't being picked up before.

I was pointing configcat at a subdirectory, e.g. configcat scan SOME_SUBDIR -c XXX, run from the root directory of a git repo.

ConfigCat only looks for ignore files within that subdirectory; it does not detect .gitignore files at the root of the repo.
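
One way a scanner could honour ignore files above the scanned directory is to walk up the directory tree until it reaches the repository root. The sketch below only illustrates that idea: IgnoreFileLocator and FindIgnoreFiles are hypothetical names, not part of the ConfigCat CLI, and the repository root is assumed to be the first ancestor that contains a .git entry.

```csharp
using System.Collections.Generic;
using System.IO;

public static class IgnoreFileLocator
{
    // Hypothetical helper, not part of the ConfigCat CLI.
    // Walks from the scanned directory up towards the repository root and
    // returns every .gitignore / .ccignore file found along the way.
    public static IReadOnlyList<FileInfo> FindIgnoreFiles(DirectoryInfo scanDirectory)
    {
        string[] ignoreFileNames = { ".gitignore", ".ccignore" };
        var found = new List<FileInfo>();

        for (var current = scanDirectory; current is not null; current = current.Parent)
        {
            foreach (var name in ignoreFileNames)
            {
                var candidate = new FileInfo(Path.Combine(current.FullName, name));
                if (candidate.Exists) found.Add(candidate);
            }

            // Assumption: the first ancestor with a ".git" entry is the repository root.
            // ".git" is a directory in a normal clone and a file in worktrees and submodules.
            var gitEntry = Path.Combine(current.FullName, ".git");
            if (Directory.Exists(gitEntry) || File.Exists(gitEntry)) break;

            // If no ".git" entry exists anywhere, the loop runs to the filesystem root;
            // a real implementation would probably want to stop earlier.
        }

        return found;
    }
}
```

Patterns in each returned file would still need to be interpreted relative to the directory that file lives in, as Git does.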

So, a short summary of the problems:

  1. ConfigCat does not pick up ignore files from the root of a git repo, only from within the directory passed as the argument.
  2. ConfigCat does not follow standard .gitignore pattern semantics.
    A typical .gitignore pattern such as node_modules/ ignores node_modules within all folders. For it to work with configcat, I have to add an additional pattern, **/node_modules/**, to my .gitignore to filter out node_modules within other repos. Perhaps something like https://github.com/Guiorgy/GitignoreParserNet could be used instead of the custom parser.
  3. I'm still getting an IOException: Too many open files error. I'm fairly certain it's because the code starts an unbounded number of async tasks without a semaphore to limit how many files are scanned concurrently. In a local branch of configcat, a semaphore of 10 seems to have fixed it. (Threading or a queue may be faster, to be honest, but this is a quick fix.)
[error]: System.IO.IOException: Too many open files : '/Users/williamchu/dev/workforce/features/tasks/review-task-margins.feature'
   at Interop.ThrowExceptionForIoErrno(ErrorInfo, String, Boolean)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String, OpenFlags, Int32, Boolean, Boolean& , Func`4)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String, FileMode, FileAccess, FileShare, FileOptions, Int64, UnixFileMode, Int64& , UnixFileMode& , Boolean, Boolean& , Func`4 )
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String, FileMode, FileAccess, FileShare, FileOptions, Int64, Nullable`1)
   at System.IO.FileStream..ctor(String, FileMode, FileAccess, FileShare, Int32, FileOptions, Int64)
   at System.IO.File.AsyncStreamReader(String, Encoding)
   at System.IO.File.ReadLinesAsync(String, Encoding, CancellationToken )
   at ConfigCat.Cli.Services.Scan.ReferenceCollector.CollectAsync(IEnumerable`1 flags, FileInfo file, Int32 contextLines, String[] usagePatterns, List`1 warningTracker, CancellationToken token)
   at ConfigCat.Cli.Services.Scan.FileScanner.<>c__DisplayClass5_0.<<ScanAsync>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Trybot.Timeout.TimeoutBot`1.ExecuteAsync(IAsyncBotOperation`1, ExecutionContext, CancellationToken)
   at Trybot.Timeout.TimeoutBot`1.ExecuteAsync(IAsyncBotOperation`1, ExecutionContext, CancellationToken)
   at Trybot.BotPolicy`1.ExecuteAsync(IAsyncBotOperation`1, Object, CancellationToken)
   at ConfigCat.Cli.Services.Scan.FileScanner.ScanAsync(FlagModel[] flags, FileInfo[] filesToScan, String[] matchPatterns, String[] usagePatterns, Int32 contextLines, List`1 warningTracker, CancellationToken token)
   at ConfigCat.Cli.Commands.Scan.InvokeAsync(DirectoryInfo directory, String configId, Int32 lineCount, Boolean print, Boolean upload, String repo, String branch, String commitHash, String fileUrlTemplate, String commitUrlTemplate, String runner, String[] aliasPatterns, String[] usagePatterns, String[] excludeFlagKeys, CancellationToken token)
   at System.CommandLine.Invocation.CommandHandler.GetExitCodeAsync(Object, InvocationContext)
   at System.CommandLine.Invocation.ModelBindingCommandHandler.InvokeAsync(InvocationContext)
   at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<<BuildInvocationChain>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass23_0.<<UseParseErrorReporting>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at ConfigCat.Cli.Program.<>c__DisplayClass0_0.<<Main>b__3>d.MoveNext()
--- End of stack trace from previous location ---
   at ConfigCat.Cli.Program.<>c__DisplayClass0_0.<<Main>b__2>d.MoveNext()
--- End of stack trace from previous location ---
   at ConfigCat.Cli.Program.<>c__DisplayClass0_0.<<Main>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass16_0.<<UseHelp>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass27_0.<<UseVersionOption>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass25_0.<<UseTypoCorrections>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass14_0.<<UseExceptionHandler>b__0>d.MoveNext()
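
Item 2 above comes down to a rule in Git's ignore format: a pattern with no slash other than a trailing one, such as node_modules/, matches a directory of that name at any depth below the ignore file. A minimal sketch of just that rule follows. IgnoreSketch and IsUnderIgnoredDirectory are hypothetical names, and wildcards, negation, and anchored patterns are deliberately left out, so this is not a substitute for a real gitignore parser.

```csharp
using System;
using System.Linq;

public static class IgnoreSketch
{
    // Hypothetical helper: handles only literal directory-name patterns such as
    // "node_modules/" (no wildcards, no negation, no slash except the trailing one).
    // relativePath is assumed to use '/' separators and to be relative to the
    // directory that contains the ignore file.
    public static bool IsUnderIgnoredDirectory(string relativePath, string pattern)
    {
        var directoryName = pattern.TrimEnd('/');
        var segments = relativePath.Split('/', StringSplitOptions.RemoveEmptyEntries);

        // Every segment except the last one is a directory on the way to the file,
        // so the file is ignored if any of them equals the directory name.
        return segments.Take(segments.Length - 1)
                       .Any(segment => segment == directoryName);
    }
}
```

Under these assumptions, IsUnderIgnoredDirectory("app/node_modules/recharts/types/util/TickUtils.d.ts", "node_modules/") returns true, while "src/node_modules.ts" is not matched because none of its directory segments is exactly node_modules.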

@uptickmetachu
Author

uptickmetachu commented Mar 2, 2025

src/ConfigCat.Cli.Services/Scan/FileScanner.cs

using ConfigCat.Cli.Models.Api;
using ConfigCat.Cli.Models.Scan;
using ConfigCat.Cli.Services.Rendering;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Trybot;

namespace ConfigCat.Cli.Services.Scan;

public interface IFileScanner
{
    Task<IEnumerable<FlagReferenceResult>> ScanAsync(FlagModel[] flags,
        FileInfo[] filesToScan,
        string[] matchPatterns,
        string[] usagePatterns,
        int contextLines,
        List<string> warningTracker,
        CancellationToken token);
}

public class FileScanner : IFileScanner
{
    private readonly IReferenceCollector referenceCollector;
    private readonly IAliasCollector aliasCollector;
    private readonly IBotPolicy<IEnumerable<FlagReferenceResult>> botPolicy;
    private readonly IOutput output;
    private readonly SemaphoreSlim semaphore = new SemaphoreSlim(10); // Limit to 10 concurrent file operations

    public FileScanner(IReferenceCollector referenceCollector,
        IAliasCollector aliasCollector,
        IBotPolicy<IEnumerable<FlagReferenceResult>> botPolicy,
        IOutput output)
    {
        this.referenceCollector = referenceCollector;
        this.aliasCollector = aliasCollector;
        this.botPolicy = botPolicy;
        this.output = output;
        this.botPolicy.Configure(p => p.Timeout(t => t.After(TimeSpan.FromMinutes(30))));
    }

    public async Task<IEnumerable<FlagReferenceResult>> ScanAsync(FlagModel[] flags,
        FileInfo[] filesToScan,
        string[] matchPatterns,
        string[] usagePatterns,
        int contextLines,
        List<string> warningTracker,
        CancellationToken token)
    {
        using var spinner = this.output.CreateSpinner(token);
        return await this.botPolicy.ExecuteAsync(async (_, cancellation) =>
        {
            this.output.Verbose($"Searching for flag ALIASES...", ConsoleColor.Magenta);
            if (matchPatterns.Length > 0)
                this.output.Verbose($"Using the following custom alias patterns: {string.Join(", ", matchPatterns.Select(p => $"'{p}'"))}");
            if (usagePatterns.Length > 0)
                this.output.Verbose($"Using the following custom usage patterns: {string.Join(", ", usagePatterns.Select(p => $"'{p}'"))}");

            var aliasTasks = filesToScan.Select(file => ProcessFileAsync(
                () => this.aliasCollector.CollectAsync(flags, file, matchPatterns, warningTracker, cancellation)));

            var aliasResults = (await Task.WhenAll(aliasTasks)).Where(r => r is not null).ToArray();

            foreach (var (key, aliases) in aliasResults.SelectMany(r => r))
            {
                var flag = flags.FirstOrDefault(f => f.Key == key);
                if (flag is null) continue;

                flag.Aliases ??= [];
                flag.Aliases.AddRange(aliases);
                flag.Aliases = flag.Aliases.Distinct().ToList();
            }

            this.output.Verbose($"Scanning for flag REFERENCES...", ConsoleColor.Magenta);
            var scanTasks = filesToScan.Select(file => ProcessFileAsync(
                () => this.referenceCollector.CollectAsync(flags, file, contextLines, usagePatterns, warningTracker, cancellation)));

            var referenceResults = (await Task.WhenAll(scanTasks)).Where(r => r is not null);

            return referenceResults;
        }, token);
    }

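    // Runs one file operation while holding a semaphore slot, so that at most
    // 10 operations (and therefore open files) are in flight at any time.
    // The slot is released in the finally block even if the operation throws.
    private async Task<T> ProcessFileAsync<T>(Func<Task<T>> fileTask)
    {
        await semaphore.WaitAsync();
        try
        {
            return await fileTask();
        }
        finally
        {
            semaphore.Release();
        }
    }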
}

Some GPT-generated code that seems to have fixed the issue.
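
As an alternative to the semaphore in the patch above, .NET 6 and later provide Parallel.ForEachAsync, which caps concurrency without creating one pending task per file up front. The sketch below is a simplified stand-in, not the CLI's scanner: BoundedScan and FindFilesContainingAsync are hypothetical names, and a plain substring search takes the place of the real alias and reference collectors.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedScan
{
    // Hypothetical helper: returns the paths whose text contains `needle`,
    // reading at most `maxConcurrency` files at the same time.
    public static async Task<IReadOnlyCollection<string>> FindFilesContainingAsync(
        IEnumerable<string> paths, string needle, int maxConcurrency, CancellationToken token)
    {
        var matches = new ConcurrentBag<string>(); // safe to add to from parallel bodies
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxConcurrency,
            CancellationToken = token
        };

        await Parallel.ForEachAsync(paths, options, async (path, ct) =>
        {
            // The file is opened and closed inside the body, so no more than
            // maxConcurrency files are open because of this loop.
            var text = await File.ReadAllTextAsync(path, ct);
            if (text.Contains(needle, StringComparison.Ordinal)) matches.Add(path);
        });

        return matches;
    }
}
```

Whether this is faster than the semaphore version would need measuring; the main difference is that work items are pulled on demand instead of all being started and parked on the semaphore.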

@z4kn4fein
Member

z4kn4fein commented Mar 3, 2025

Hi @uptickmetachu, thank you for the insights!

What you mentioned is correct. I'll look into options for including the ignore files from the root of the repo and for fixing the ignore parsing mechanism. Thank you for the alternative!

As for the too many open files problem, the CLI should probably detect the ulimit value that applies to the process and take it into account during file scanning. I'll look into our options for this as well and let you know the results.
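
For illustration, one way a .NET process could read the limit that ulimit -n reports is to call the POSIX getrlimit function through P/Invoke. This is only a sketch under stated assumptions, not what the CLI does: OpenFileLimit and TryGetSoftLimit are hypothetical names, the struct layout assumes a 64-bit platform, and the RLIMIT_NOFILE constants are the values used on macOS and on common Linux architectures such as x86-64 and arm64.

```csharp
using System;
using System.Runtime.InteropServices;

public static class OpenFileLimit
{
    // Mirrors "struct rlimit" on 64-bit Linux and macOS, where rlim_t is 64 bits wide.
    [StructLayout(LayoutKind.Sequential)]
    private struct RLimit
    {
        public ulong Current; // soft limit (rlim_cur)
        public ulong Maximum; // hard limit (rlim_max)
    }

    [DllImport("libc")]
    private static extern int getrlimit(int resource, out RLimit limit);

    // Hypothetical helper: returns the soft limit on open file descriptors,
    // or null when the platform is unsupported or the call fails.
    // An "unlimited" setting is reported as a very large number.
    public static ulong? TryGetSoftLimit()
    {
        int resource;
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux)) resource = 7;    // RLIMIT_NOFILE on x86-64 / arm64 Linux
        else if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX)) resource = 8; // RLIMIT_NOFILE on macOS
        else return null;

        try
        {
            return getrlimit(resource, out var limit) == 0 ? limit.Current : (ulong?)null;
        }
        catch (Exception e) when (e is DllNotFoundException or EntryPointNotFoundException)
        {
            return null;
        }
    }
}
```

The returned value counts every descriptor in the process, including standard streams and network sockets, so a scanner using it would need to leave some headroom when choosing how many files to open at once.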

z4kn4fein mentioned this issue Mar 4, 2025
4 tasks
@z4kn4fein
Member

Hi @uptickmetachu, I've released a new version of the CLI (v2.4.0), which contains the fixes for the issues you mentioned. Could you please try it and let me know the results? Thank you!
