pycollect

class pycollect.PythonFileCollector(use_regex_patterns: bool = False, additional_file_exclusion_patterns: Iterable[str] = None, additional_dir_exclusion_patterns: Iterable[str] = None)

PythonFileCollector provides the collect() method, which collects files while applying exclusion patterns to files and directories.

Exclusion patterns are matched against file and directory names only, NOT against the absolute or relative path of the file or directory.

When not using regex patterns, a single wildcard, *, can be used anywhere in a pattern to filter names “starting with” and/or “ending with”. Also, a single exclamation mark, !, can be used at the beginning of a pattern to negate it. Both apply only when the parameter use_regex_patterns is False.

Note

Regex patterns may be slower, because matching them consumes more CPU.
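The non-regex syntax described above can be illustrated with a small stand-alone matcher. This is only a sketch of the documented rules, not the library's implementation: the function name name_matches is hypothetical, and it assumes that a pattern without a wildcard must equal the name exactly.

```python
def name_matches(pattern: str, name: str) -> bool:
    """Hypothetical matcher for the non-regex pattern syntax (not pycollect code)."""
    # A leading "!" negates the rest of the pattern.
    negate = pattern.startswith("!")
    if negate:
        pattern = pattern[1:]

    if "*" in pattern:
        # Only a single wildcard is supported: split into a prefix and a suffix.
        prefix, suffix = pattern.split("*", 1)
        matched = (
            len(name) >= len(prefix) + len(suffix)
            and name.startswith(prefix)
            and name.endswith(suffix)
        )
    else:
        # Assumption: without a wildcard the name must equal the pattern.
        matched = name == pattern

    # Flip the result when the pattern was negated.
    return matched != negate


print(name_matches("test_*", "test_utils.py"))  # True  ("starting with")
print(name_matches("*.py", "setup.cfg"))        # False ("ending with")
print(name_matches("!*.py", "setup.cfg"))       # True  (negated)
print(name_matches("conf*.py", "conftest.py"))  # True  (prefix and suffix)
```

Because these are exclusion patterns, a negated pattern such as !*.py excludes every name that does not end in .py.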

Parameters
  • use_regex_patterns – (default: False) flag indicating whether to use regex to match patterns. When this flag is set to False, the * character is interpreted as a wildcard and patterns starting with the ! character are negated.

  • additional_file_exclusion_patterns – (default: None) additional patterns for files to exclude from collection. Any file that matches these patterns is excluded, in addition to those matched by PythonFileCollector._DEFAULT_FILE_EXCLUSION_PATTERNS or PythonFileCollector._DEFAULT_FILE_EXCLUSION_REGEX_PATTERNS.

  • additional_dir_exclusion_patterns – (default: None) additional patterns for directories to exclude from collection. Any directory that matches these patterns is excluded, in addition to those matched by PythonFileCollector._DEFAULT_DIR_EXCLUSION_PATTERNS or PythonFileCollector._DEFAULT_DIR_EXCLUSION_REGEX_PATTERNS.

DEFAULT_DIR_EXCLUSION_PATTERNS = {'*~', '.*', '__pycache__', 'bin', 'build', 'develop-eggs', 'dist', 'eggs', 'htmlcov', 'parts', 'pyvenv*', 'sdist', 'tmp', 'var', 'venv*', 'wheelhouse'}

The default set of directory exclusion patterns.

Excluded by default:

  1. Directories starting with a dot (.);

  2. Directories ending with a tilde (~); and

  3. Directories with common names indicating auto-generated content, binaries, caches and temporary content (e.g., __pycache__, tmp, dist).
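To see which directory names the default set would reject, the patterns can be tried against a few sample names. The sketch below uses the standard library's fnmatch.fnmatchcase as a stand-in matcher; that is an assumption, since fnmatch also understands ? and [seq], which the pattern syntax documented here does not mention, and the library's own matching (including case sensitivity) may differ. The helper is_excluded_dir is hypothetical.

```python
from fnmatch import fnmatchcase

# Copied from the documented default set of directory exclusion patterns.
DIR_PATTERNS = {
    '*~', '.*', '__pycache__', 'bin', 'build', 'develop-eggs', 'dist',
    'eggs', 'htmlcov', 'parts', 'pyvenv*', 'sdist', 'tmp', 'var',
    'venv*', 'wheelhouse',
}


def is_excluded_dir(name: str) -> bool:
    """Hypothetical helper: True if any default pattern matches the name."""
    return any(fnmatchcase(name, pattern) for pattern in DIR_PATTERNS)


candidates = ["src", ".git", "backup~", "venv38", "__pycache__", "tests", "binaries"]
kept = [name for name in candidates if not is_excluded_dir(name)]
print(kept)  # ['src', 'tests', 'binaries']
```

Note that binaries is kept: the pattern bin has no wildcard, so it only matches a directory named exactly bin.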

DEFAULT_FILE_EXCLUSION_PATTERNS = {'!*.py', '.*', '~*'}

The default set of file exclusion patterns.

Excluded by default:

  1. Files without the .py extension;

  2. Files starting with a dot (.); and

  3. Files starting with a tilde (~).

collect(search_path: Optional[str] = None, recursion_limit: Optional[int] = None, follow_symlinks: bool = True) → Set[posix.DirEntry]

Collects Python files from the specified search path, respecting the exclusion patterns configured on the collector instance.

Parameters
  • search_path – (default: uses the caller’s path) absolute or relative path in which to search for Python files. This is expected to be a directory.

  • recursion_limit – (default: None) directory recursion limit. The directory indicated by search_path is considered level 0 of recursion.

  • follow_symlinks – (default: True) boolean indicating whether to follow symbolic links when collecting Python files.

Returns

A set of DirEntry instances referring to each collected file is returned.
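The interaction between recursion_limit and the directory levels can be sketched with os.scandir from the standard library. This is an illustration under stated assumptions, not the library's code: the function collect_py_files and its _level argument are hypothetical, the exclusion rules are reduced to a hard-coded subset of the defaults, and a limit of 0 is assumed to mean "search_path only".

```python
import os
from typing import Optional, Set


def collect_py_files(
    search_path: str,
    recursion_limit: Optional[int] = None,
    follow_symlinks: bool = True,
    _level: int = 0,  # hypothetical bookkeeping: search_path itself is level 0
) -> Set[os.DirEntry]:
    """Hypothetical sketch of a recursive, depth-limited .py file collection."""
    collected: Set[os.DirEntry] = set()
    with os.scandir(search_path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=follow_symlinks):
                # Simplified stand-in for the directory exclusion patterns.
                if entry.name.startswith(".") or entry.name == "__pycache__":
                    continue
                # Descend only while the limit has not been reached.
                if recursion_limit is None or _level < recursion_limit:
                    collected |= collect_py_files(
                        entry.path, recursion_limit, follow_symlinks, _level + 1
                    )
            elif entry.is_file(follow_symlinks=follow_symlinks):
                # Simplified stand-in for the file exclusion patterns.
                if entry.name.endswith(".py") and not entry.name.startswith((".", "~")):
                    collected.add(entry)
    return collected


if __name__ == "__main__":
    for entry in sorted(collect_py_files(".", recursion_limit=1), key=lambda e: e.path):
        print(entry.path)
```

With follow_symlinks=False, symbolic links are neither descended into nor collected in this sketch. It also does not guard against symlink cycles when follow_symlinks=True.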

pycollect.find_module_name(filepath: Union[str, os.PathLike], innermost: bool = False) → Optional[str]

Utility function to find the Python module name of a Python file.

Parameters
  • filepath – The absolute filepath as a DirEntry object, path string or PathLike object.

  • innermost – (default: False) By default, the outermost possible module name is returned. When this flag is set to True, the first, innermost possible module name found is returned without looking further.

Returns

The module name string or None if no module was found for the specified filepath.
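One plausible reading of the default (outermost) behaviour is sketched below. It rests on assumptions that this page does not confirm: that a directory counts as a package when it contains an __init__.py file, and that None is returned for paths that are not existing .py files. The function name outermost_module_name is hypothetical, and the innermost flag is deliberately not modelled.

```python
import os
from typing import Optional, Union


def outermost_module_name(filepath: Union[str, os.PathLike]) -> Optional[str]:
    """Hypothetical sketch: dotted module name built from enclosing packages."""
    path = os.path.abspath(os.fspath(filepath))
    # Assumption: only existing .py files have a module name.
    if not path.endswith(".py") or not os.path.isfile(path):
        return None

    directory, filename = os.path.split(path)
    parts = [filename[:-3]]  # strip the ".py" extension

    # Assumption: walk upwards while each directory is a package,
    # i.e. while it contains an __init__.py file.
    while os.path.isfile(os.path.join(directory, "__init__.py")):
        directory, package = os.path.split(directory)
        if not package:  # reached the filesystem root
            break
        parts.append(package)

    return ".".join(reversed(parts))
```

For a file pkg/sub/mod.py where both pkg and sub contain an __init__.py, this sketch yields pkg.sub.mod. Since os.DirEntry implements os.PathLike, entries returned by a collection step can be passed in directly.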