ID Assist Filtering is designed to eliminate irrelevant matches and results, effectively reducing false positives that are considered noise. At the full file level, this includes data types that are not meaningful to consider as matches. At the partial match level, irrelevant matches include small code snippets (such as comments, import statements, license declarations, empty functions, and getter/setter methods) that are too generic and commonplace to constitute valid matches.
Using information returned by the CLI, the Workbench ensures that only pertinent findings (with a configurable minimum number of characters) are reported, filtering out irrelevant snippet matches.
The precursor of ID Assist Filtering was the Noise reduction feature, introduced in 22.2 with support for Java and C only. Starting with Workbench 24.2, the new match filtering feature supports an extended list of programming languages: Java, C, C++, C#, JavaScript, TypeScript, Objective-C, and Rust.
The ‘ID Assist’ section containing Filtering is available for all users who have access to the scans interface.
Before running a scan:
- ID Assist Filtering can be enabled/disabled by checking/unchecking the box
- The minimum number of characters that a snippet must still have after filtering to be considered a valid match can be changed.
This minimum number of characters can also be configured globally in fossid.conf. If it is not set, the default value is 300.
webapp_match_filtering_threshold=300
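The fallback to the default can be illustrated with a short Python sketch. It is not Workbench code: the function name `read_match_filtering_threshold` is hypothetical, and the sketch assumes only that fossid.conf holds plain `key=value` lines like the one shown above.

```python
DEFAULT_THRESHOLD = 300  # default value stated in this documentation


def read_match_filtering_threshold(conf_text: str) -> int:
    """Return webapp_match_filtering_threshold from key=value text.

    Hypothetical helper for illustration; falls back to 300 when the
    key is absent.
    """
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if "=" not in line:
            continue  # skip blank lines and anything that is not key=value
        key, _, value = line.partition("=")
        if key.strip() == "webapp_match_filtering_threshold":
            return int(value.strip())
    return DEFAULT_THRESHOLD


if __name__ == "__main__":
    print(read_match_filtering_threshold("webapp_match_filtering_threshold=500"))
    print(read_match_filtering_threshold("webapp_enable_noise_reduction=1"))
```

The first call reads the configured value; the second finds no matching key and returns the default.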
Before version 24.2, noise reduction was enabled with the following settings in fossid.conf:
webapp_cli_command=/fossid/bin/fossid-cli --config /fossid/etc/fossid.conf --fields +classification
webapp_enable_noise_reduction=1
How it works
ID Assist Filtering analyzes both partial and full file matches, employing various techniques and algorithms to determine whether a match is relevant and significant, or irrelevant and therefore noise to be removed. Any match deemed irrelevant is ignored and not presented in the Workbench. This process can be combined with a threshold (minimum number of characters), so that any remaining match must exceed this threshold to be included.
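The interplay between filtering and the threshold can be sketched in Python. This is only an illustration under stated assumptions, not the actual algorithm: the real techniques are not described here, and the function names, the simplified rules (dropping C-style comments and import/include lines), and the inclusive comparison are all hypothetical.

```python
import re


def remaining_characters(snippet: str) -> int:
    """Count characters left after removing generic content.

    Illustrative rules only (assumed, not the Workbench's): drop block
    comments, line comments, blank lines, and import/include lines.
    """
    without_block_comments = re.sub(r"/\*.*?\*/", "", snippet, flags=re.DOTALL)
    total = 0
    for line in without_block_comments.splitlines():
        code = line.split("//", 1)[0].strip()  # naive line-comment removal
        if not code or code.startswith(("import ", "#include")):
            continue
        total += len(code)
    return total


def is_valid_match(snippet: str, threshold: int = 300) -> bool:
    """Keep a match only if enough characters survive the filtering.

    Treating the threshold as an inclusive minimum is an assumption.
    """
    return remaining_characters(snippet) >= threshold


if __name__ == "__main__":
    snippet = "/* license */\nimport java.util.List;\n// helper\nint x = 1;\n"
    print(remaining_characters(snippet))
    print(is_valid_match(snippet))
```

In this example only `int x = 1;` survives the filtering, so the snippet falls well below the default threshold of 300 and would be treated as noise.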
If all matches are filtered out, the messages “ignored” and “Match filtering” will be displayed in the Match section of the scan.
For debugging purposes, a message will be logged in the scan log whenever a match reported by the CLI is filtered out by the ID Assist Filtering function.