Include-Exclude Pattern Matching Guide

Introduction

This document explains the syntax of include-exclude patterns, which is the same syntax used in .gitignore files. These patterns specify which files to include or exclude when creating knowledge bases or performing other file-filtering operations.

The pattern syntax allows for powerful and flexible matching of file paths using wildcards, directory specifiers, and character classes. This guide demonstrates each pattern type with concrete examples to help you understand exactly how they work.

Basic File Pattern Matching

Basic patterns match files regardless of their location in the directory structure.

Basic File Name Matching

| Pattern | Description | Example Path | Matches? |
| --- | --- | --- | --- |
| `README.md` | Matches any file named README.md in any directory | `README.md` | ✓ Yes |
| `README.md` | Matches README.md in subdirectories too | `folder/README.md` | ✓ Yes |
| `README.md` | Matches at any depth in the directory structure | `folder/subfolder/README.md` | ✓ Yes |
| `*.py` | Matches any Python file in any directory | `example.py` | ✓ Yes |
| `*.py` | Matches Python files in subdirectories | `folder/example.py` | ✓ Yes |
| `*.py` | Matches Python files at any depth | `folder/subfolder/example.py` | ✓ Yes |
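As a rough illustration of the table above, a slash-free pattern can be compared against the final path segment alone. The sketch below uses Python's standard fnmatch module; the helper name matches_basename is hypothetical, and it covers only file names, not the directory-name matching described later in this guide.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def matches_basename(pattern: str, path: str) -> bool:
    """Hypothetical helper: test a slash-free pattern against the file name only."""
    # basename() drops every leading directory, so depth does not matter.
    return fnmatchcase(basename(path), pattern)


print(matches_basename("README.md", "folder/subfolder/README.md"))  # True
print(matches_basename("*.py", "folder/subfolder/example.py"))      # True
print(matches_basename("*.py", "folder/README.md"))                 # False
```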

Directory-Specific Matching

A pattern that includes a directory prefix matches files only in that location.

| Pattern | Description | Example Path | Matches? |
| --- | --- | --- | --- |
| `src/*.py` | Matches Python files directly in the src directory | `src/example.py` | ✓ Yes |
| `src/*.py` | Does NOT match Python files in other directories | `example.py` | ✗ No |
| `src/*.py` | Does NOT match Python files in other directories | `folder/example.py` | ✗ No |
| `src/*.py` | Does NOT match Python files in subdirectories of src | `src/folder/example.py` | ✗ No |
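The rows above depend on * never crossing a path separator. One way to sketch that rule in Python is to compare the pattern and the path one segment at a time, because applying fnmatch to the whole path would let * match a slash. The helper name matches_anchored is hypothetical, and the sketch ignores ** entirely.

```python
from fnmatch import fnmatchcase


def matches_anchored(pattern: str, path: str) -> bool:
    """Hypothetical helper: match a directory-prefixed pattern segment by segment."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    # '*' stays inside one segment, so both sides need the same segment count.
    if len(pattern_parts) != len(path_parts):
        return False
    return all(fnmatchcase(seg, pat) for pat, seg in zip(pattern_parts, path_parts))


for candidate in ["src/example.py", "example.py", "folder/example.py", "src/folder/example.py"]:
    print(candidate, matches_anchored("src/*.py", candidate))
```

Only src/example.py is reported as a match, in line with the table.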

Recursive Directory Matching

The ** pattern matches files recursively, through any number of directory levels (including none).

| Pattern | Description | Example Path | Matches? |
| --- | --- | --- | --- |
| `src/**/*.py` | Matches Python files directly in the src directory | `src/example.py` | ✓ Yes |
| `src/**/*.py` | Matches Python files in subdirectories of src | `src/folder/example.py` | ✓ Yes |
| `src/**/*.py` | Matches Python files at any depth under src | `src/folder/subfolder/example.py` | ✓ Yes |
| `docs/source/*/**/index.rst` | Matches index.rst files at least one directory level below docs/source | `docs/source/Section-1/index.rst` | ✓ Yes |
| `docs/source/*/**/index.rst` | Matches index.rst files at deeper levels | `docs/source/Section-1/Section-1-1/index.rst` | ✓ Yes |
| `docs/source/*/**/index.rst` | Does NOT match index.rst directly in docs/source | `docs/source/index.rst` | ✗ No |
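The behaviour in this table can be approximated by treating ** as a segment that absorbs zero or more directory names. The following is a minimal sketch under that assumption, not the matcher the tool itself uses; match_segments and matches_recursive are hypothetical names.

```python
from fnmatch import fnmatchcase


def match_segments(pattern_parts: list[str], path_parts: list[str]) -> bool:
    """Hypothetical helper: recursive segment matcher where '**' spans directories."""
    if not pattern_parts:
        return not path_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # Try absorbing 0, 1, 2, ... leading path segments.
        return any(match_segments(rest, path_parts[i:]) for i in range(len(path_parts) + 1))
    if not path_parts:
        return False
    return fnmatchcase(path_parts[0], head) and match_segments(rest, path_parts[1:])


def matches_recursive(pattern: str, path: str) -> bool:
    return match_segments(pattern.split("/"), path.split("/"))


print(matches_recursive("src/**/*.py", "src/example.py"))                      # True
print(matches_recursive("docs/source/*/**/index.rst", "docs/source/index.rst"))  # False
```

In the second call, the single * must consume one directory name before ** can match nothing, which is why index.rst directly in docs/source is rejected.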

Directory Name Matching

A pattern that is a plain name can refer to a directory, in which case it matches every file under any directory with that name.

| Pattern | Description | Example Path | Matches? |
| --- | --- | --- | --- |
| `tmp` | Matches files inside any directory named tmp | `tmp/file.txt` | ✓ Yes |
| `tmp` | Matches files in subdirectories of tmp | `tmp/folder/file.txt` | ✓ Yes |
| `tmp` | Matches files at any depth under tmp | `tmp/folder/subfolder/file.txt` | ✓ Yes |
| `tmp` | Matches files in tmp directories anywhere in the path | `tests/tmp/file.txt` | ✓ Yes |
| `tmp` | Matches files in nested tmp directories | `tests/tmp/folder/file.txt` | ✓ Yes |
| `tmp` | Matches deeply nested files in tmp directories | `tests/tmp/subfolder/file.txt` | ✓ Yes |

Directory Contents Matching

A trailing slash restricts the pattern to directories: it matches the contents of any directory with that name, but not a path that consists of the name alone.

| Pattern | Description | Example Path | Matches? |
| --- | --- | --- | --- |
| `tmp/` | Does NOT match the tmp directory itself | `tmp` | ✗ No |
| `tmp/` | Matches files inside any tmp directory | `tmp/file.txt` | ✓ Yes |
| `tmp/` | Matches files in subdirectories of tmp | `tmp/folder/file.txt` | ✓ Yes |
| `tmp/` | Matches files at any depth under tmp | `tmp/folder/subfolder/file.txt` | ✓ Yes |
| `tmp/` | Matches files in tmp directories anywhere in the path | `tests/tmp/file.txt` | ✓ Yes |
| `tmp/` | Matches files in nested tmp directories | `tests/tmp/folder/file.txt` | ✓ Yes |
| `tmp/` | Matches deeply nested files in tmp directories | `tests/tmp/subfolder/file.txt` | ✓ Yes |
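The difference between tmp and tmp/ can be sketched by looking at where the name appears among the path segments. This example assumes every path refers to a file, so the last segment is a file name and all earlier segments are directories. It also assumes literal names without wildcards, and matches_directory_name is a hypothetical helper. That a bare tmp pattern also matches a file named tmp is standard .gitignore behaviour rather than something the tables above state.

```python
def matches_directory_name(pattern: str, path: str) -> bool:
    """Hypothetical helper: literal directory-name matching, with or without a trailing slash."""
    parts = path.split("/")
    if pattern.endswith("/"):
        # Trailing slash: the name must be a directory, so it cannot be the last segment.
        return pattern.rstrip("/") in parts[:-1]
    # No trailing slash: the name may be any segment, including the final one.
    return pattern in parts


print(matches_directory_name("tmp/", "tmp"))                       # False
print(matches_directory_name("tmp/", "tests/tmp/folder/file.txt"))  # True
print(matches_directory_name("tmp", "tmp"))                        # True
```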

Character Class Matching

A character class in square brackets matches exactly one character from the listed set.

| Pattern | Description | Example Path | Matches? |
| --- | --- | --- | --- |
| `*.py[cod]` | Matches Python bytecode files (.pyc) | `test.pyc` | ✓ Yes |
| `*.py[cod]` | Matches Python optimized bytecode files (.pyo) | `test.pyo` | ✓ Yes |
| `*.py[cod]` | Matches Python dynamic library files (.pyd) | `test.pyd` | ✓ Yes |
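Python's standard fnmatch module uses the same bracket syntax, so it offers a quick way to check which names a character class accepts. The file names below are made up for illustration, and the result only reflects fnmatch, which is assumed here to agree with the pattern engine on this point.

```python
from fnmatch import fnmatchcase

names = ["test.pyc", "test.pyo", "test.pyd", "test.py", "test.pyx"]

# [cod] consumes exactly one character, so "test.py" (no extra character)
# and "test.pyx" (a character outside the set) are both rejected.
compiled = [name for name in names if fnmatchcase(name, "*.py[cod]")]
print(compiled)
```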

Summary of Pattern Syntax

Here's a quick reference for the pattern syntax demonstrated:

Pattern Syntax Summary

| Pattern Element | Description |
| --- | --- |
| `*` | Matches any sequence of characters within a single path segment (never a path separator) |
| `**` | Matches zero or more directory levels, so the match can span path separators |
| `/` | At the end of a pattern, matches the contents of a directory rather than a path consisting of that name alone |
| `[abc]` | Character class that matches any single character from the set (a, b, or c) |
| `dir/` | As a prefix (for example, src/*.py), restricts matches to that directory |
| `dir/**` | As a prefix (for example, src/**/*.py), matches in that directory and all of its subdirectories |

These patterns can be combined to create powerful file selection rules for your knowledge base configuration. Use the examples above as a reference when creating your own patterns to ensure they match exactly the files you intend.