Post-Scan Options¶
Each Post-Scan option activates its corresponding post-scan plugin, which carries out the task.
All “Post-Scan” Options¶
- --mark-source
Set the “is_source” flag to true for directories that contain over 90% of source files as direct children and descendants. Count the number of source files in a directory as a new “source_file_counts” attribute.
Sub-Option of --info.
- --consolidate
Group resources by Packages or license and copyright holder and return those groupings as a list of consolidated packages and a list of consolidated components.
Sub-Option of --copyright, --license and --package.
- --filter-clues
Filter redundant duplicated clues already contained in detected licenses, copyright texts and notices.
- --is-license-text
Set the “is_license_text” flag to true for files that contain mostly license texts and notices (e.g. over 90% of the content).
Sub-Option of --info and --license-text.
Warning
--is-license-text is an experimental Option.
- --license-clarity-score
Compute a summary license clarity score at the codebase level.
Sub-Option of --classify.
- --license-policy FILE
Load a License Policy file and apply it to the scan at the Resource level.
- --summary
Summarize license, copyright and other scans at the codebase level.
Sub-Options: --summary-by-facet, --summary-key-files, --summary-with-details
- --summary-by-facet
Summarize license, copyright and other scans and group the results by facet.
Sub-Option of --summary and --facet.
- --summary-key-files
Summarize license, copyright and other scans for key, top-level files. Key files are top-level codebase files such as COPYING, README and package manifests as reported by the --classify option “is_legal”, “is_readme”, “is_manifest” and “is_top_level” flags.
Sub-Option of --classify and --summary.
- --summary-with-details
Summarize license, copyright and other scans at the codebase level, keeping intermediate details at the file and directory level.
Note
Scan commands vary with the install method.
The basic command to perform a scan, in case of a download and configure installation (on Linux/MacOS), is:
path/to/scancode [OPTIONS] <OUTPUT FORMAT OPTION(s)> <SCAN INPUT>
The basic usage, if ScanCode is installed from pip, or on Windows, is:
scancode [OPTIONS] <OUTPUT FORMAT OPTION(s)> <SCAN INPUT>
For more information on how the scan command varies across installation methods and operating systems, refer to Commands Variation.
To see all plugins available via command line help, use --plugins.
Note
Plugins that are shown by using --plugins include the following:
Post-Scan Plugins (and, the following)
Pre-Scan Plugins
Output Options
Output Control
Basic Scan Options
--mark-source Option¶
Dependency
The option --mark-source is a sub-option of and requires the option --info.
The mark-source option sets the “is_source” attribute of a directory to “True” if more than 90% of the files under that directory are source files, i.e. their “is_source” attribute is “True”.
When the following command is executed to scan the samples directory with this option enabled:
scancode -clpieu --json-pp output.json samples --mark-source
Then, the following directories are marked as “Source”, i.e. their “is_source” attribute is changed from “false” to “True”.
samples/JGroups/src
samples/zlib/iostream2
samples/zlib/gcc_gvmat64
samples/zlib/ada
samples/zlib/infback9
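To pull the marked directories back out of the JSON output, a short script can help. The following is a minimal sketch and not part of ScanCode: the helper name source_directories is hypothetical, and it assumes the output has the top-level "files" list shown elsewhere on this page, with each entry carrying "path", "type" and "is_source".

```python
import json


def source_directories(scan):
    """Return paths of directories whose "is_source" flag is true."""
    return [
        resource["path"]
        for resource in scan.get("files", [])
        if resource.get("type") == "directory" and resource.get("is_source")
    ]


# Tiny hand-written stand-in for a real scan result (illustrative data only).
example_scan = json.loads("""
{
  "files": [
    {"path": "samples/zlib/ada", "type": "directory", "is_source": true},
    {"path": "samples/arch", "type": "directory", "is_source": false},
    {"path": "samples/zlib/ada/zlib.adb", "type": "file", "is_source": true}
  ]
}
""")

print(source_directories(example_scan))
```

With a real scan, the stand-in would be replaced by the parsed contents of output.json, for example via json.load.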
--consolidate Option¶
Dependency
The option --consolidate is a sub-option of and requires the options --license, --copyright and --package.
An example Scan:
scancode -clpieu --json-pp output.json samples --consolidate
The JSON output file is structured as follows:
{
  "headers": [
    {...}
  ],
  "consolidated_components": [
    {...
    },
    {
      "type": "license-holders",
      "identifier": "dmitriy_anisimkov_1",
      "consolidated_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
      "consolidated_holders": [
        "Dmitriy Anisimkov"
      ],
      "consolidated_copyright": "Copyright (c) Dmitriy Anisimkov",
      "core_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
      "core_holders": [
        "Dmitriy Anisimkov"
      ],
      "other_license_expression": null,
      "other_holders": [],
      "files_count": 1
    },
    {...
    }
  ],
  "consolidated_packages": [],
  "files": [
  ]
}
Each consolidated component has the following information:
"consolidated_components": [
  {
    "type": "license-holders",
    "identifier": "dmitriy_anisimkov_1",
    "consolidated_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
    "consolidated_holders": [
      "Dmitriy Anisimkov"
    ],
    "consolidated_copyright": "Copyright (c) Dmitriy Anisimkov",
    "core_license_expression": "gpl-2.0-plus WITH ada-linking-exception",
    "core_holders": [
      "Dmitriy Anisimkov"
    ],
    "other_license_expression": null,
    "other_holders": [],
    "files_count": 1
  },
In addition to this, every file/directory where the consolidated part (i.e. license information) was present gets a “consolidated_to” attribute pointing to the “identifier” of the matching entry in “consolidated_components”:
"consolidated_to": [
  "dmitriy_anisimkov_1"
],
Note that multiple files may have the same “consolidated_to” attribute.
--filter-clues Option¶
The --filter-clues Plugin filters redundant duplicated clues already contained in detected licenses, copyright texts and notices.
Warning
Running the following scan generates an error:
./scancode -clp --json-pp sample_filter_clues.json samples --filter-clues
--is-license-text Option¶
Dependency
The option --is-license-text is a sub-option of and requires the options --info and --license-text. Also, the option --license-text is a sub-option of and requires the option --license.
If --is-license-text is used, then the “is_license_text” flag is set to true for files that contain mostly license texts and notices. Here mostly means over 90% of the content of the file.
An example Scan:
scancode -clpieu --json-pp output.json samples --license-text --is-license-text
If the samples directory is scanned with this plugin, the files containing mostly license texts will have the following attribute set to ‘true’:
"is_license_text": true,
The files in samples that will have “is_license_text” set to true are:
samples/JGroups/EULA
samples/JGroups/LICENSE
samples/JGroups/licenses/apache-1.1.txt
samples/JGroups/licenses/apache-2.0.txt
samples/JGroups/licenses/bouncycastle.txt
samples/JGroups/licenses/cpl-1.0.txt
samples/JGroups/licenses/lgpl.txt
samples/zlib/dotzlib/LICENSE_1_0.txt
Note that the license objects for each detected license in the files already have an “is_license_text” attribute by default, but the file objects do not. File objects only have this attribute if the plugin is used.
Warning
--is-license-text is an experimental Option.
--license-clarity-score Option¶
Dependency
The option --license-clarity-score is a sub-option of and requires the option --classify.
The --license-clarity-score plugin, when used in a scan, computes a summary license clarity score at the codebase level.
An example Scan:
scancode -clpieu --json-pp output.json samples --classify --license-clarity-score
The “license_clarity_score” will have the following attributes:
“score”
“declared”
“discovered”
“consistency”
“spdx”
“license_texts”
The whole JSON file is structured as follows, when it has “license_clarity_score”:
{
  "headers": [
    { ... }
  ],
  "license_clarity_score": {
    "score": 17,
    "declared": false,
    "discovered": 0.69,
    "consistency": false,
    "spdx": false,
    "license_texts": false
  },
  "files": [
    ...
  ]
}
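The “license_clarity_score” object lends itself to a simple automated check. The following is a minimal sketch under stated assumptions: the helper name clarity_report and the threshold of 50 are hypothetical choices for illustration, not ScanCode defaults, and the sketch only reads the attributes listed above.

```python
def clarity_report(scan, minimum_score=50):
    """Return (passed, unmet) for the "license_clarity_score" object.

    `minimum_score` is an arbitrary illustrative threshold.
    """
    clarity = scan["license_clarity_score"]
    # Boolean attributes that are false are reported as unmet criteria.
    unmet = [key for key, value in clarity.items() if value is False]
    return clarity["score"] >= minimum_score, unmet


# Values copied from the sample output on this page.
example_scan = {
    "license_clarity_score": {
        "score": 17,
        "declared": False,
        "discovered": 0.69,
        "consistency": False,
        "spdx": False,
        "license_texts": False,
    }
}

passed, unmet = clarity_report(example_scan)
print(passed, unmet)
```

A script like this could run after a scan to flag codebases whose licensing is not clearly documented, with the threshold tuned to local policy.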
--license-policy FILE Option¶
The Policy file is a YAML (.yml) document with the following structure:
license_policies:
- license_key: mit
  label: Approved License
  color_code: '#008000'
  icon: icon-ok-circle
- license_key: agpl-3.0
  label: Approved License
  color_code: '#008000'
  icon: icon-ok-circle
Note
In the policy file only the “license_key” is a required field.
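Because “license_key” is the only required field, a policy file can be sanity-checked before a scan. The following is a minimal sketch, not part of ScanCode: the helper name missing_license_keys is hypothetical, and it works on the already-parsed document, i.e. the dictionary a YAML library would return (for instance yaml.safe_load from PyYAML, assumed to be installed separately and not used here).

```python
def missing_license_keys(policy_document):
    """Return indexes of policy entries that lack the required "license_key"."""
    entries = policy_document.get("license_policies", [])
    return [
        index
        for index, entry in enumerate(entries)
        if not entry.get("license_key")
    ]


# Hand-written stand-in for a parsed policy file; the second entry is
# deliberately incomplete to show what gets reported.
policy_document = {
    "license_policies": [
        {"license_key": "mit", "label": "Approved License"},
        {"label": "Approved License", "color_code": "#008000"},
    ]
}

print(missing_license_keys(policy_document))
```

An empty result means every entry names a license key; the optional fields are left unchecked on purpose.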
Applying License Policies during a ScanCode scan, using the --license-policy Plugin:
scancode -clipeu --json-pp output.json samples --license-policy policy-file.yml
Note
--license-policy FILE is not a sub-option of --license. It works normally without -l.
This adds to every file/directory an object “license_policy”, having as further attributes under it the fields specified in the YAML file. According to our example YAML file, the attributes will be:
“license_key”
“label”
“color_code”
“icon”
Here the samples directory is scanned, and the Scan Results for a sample file are as follows:
{
  "path": "samples/JGroups/licenses/apache-2.0.txt",
  ...
  ...
  ...
  "licenses": [
    ...
    ...
    ...
  ],
  "license_expressions": [
    "apache-2.0"
  ],
  "copyrights": [],
  "holders": [],
  "authors": [],
  "packages": [],
  "emails": [],
  "license_policy": {
    "license_key": "apache-2.0",
    "label": "Approved License",
    "color_code": "#008000",
    "icon": "icon-ok-circle"
  },
  "urls": [],
  "files_count": 0,
  "dirs_count": 0,
  "size_count": 0,
  "scan_errors": []
},
More information on the License Policy Plugin and usage.
--summary Option¶
Sub-Option
The options --summary-by-facet, --summary-key-files and --summary-with-details are sub-options of --summary. These Sub-Options are all Post-Scan Options.
An example Scan:
scancode -clpieu --json-pp output.json samples --summary
The whole JSON file is structured as follows, when the --summary plugin is applied:
{
  "headers": [
    { ... }
  ],
  "summary": {
    "license_expressions": [ ... ],
    "copyrights": [ ... ],
    "holders": [ ... ],
    "authors": [ ... ],
    "programming_language": [ ... ],
    "packages": []
  },
  "files": [ ... ]
}
The Summary object has the following attributes.
“license_expressions”
“copyrights”
“holders”
“authors”
“programming_language”
“packages”
Each attribute holds a list of entries, and each entry pairs a “value” with its “count”.
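The “value”/“count” entries described above are easy to rank in a script. The following is a minimal sketch, not part of ScanCode: the helper name top_values is hypothetical, and it assumes each summary attribute is a list of objects with “value” and “count” keys, where “value” may be null.

```python
def top_values(summary, attribute, limit=3):
    """Return up to `limit` (value, count) pairs, highest count first.

    Entries whose value is null (None in Python) are skipped.
    """
    entries = [e for e in summary.get(attribute, []) if e["value"] is not None]
    # sorted() is stable, so entries with equal counts keep their order.
    ranked = sorted(entries, key=lambda e: e["count"], reverse=True)
    return [(e["value"], e["count"]) for e in ranked[:limit]]


# "holders" values taken from the sample summary on this page.
summary = {
    "holders": [
        {"value": None, "count": 10},
        {"value": "Mark Adler", "count": 4},
        {"value": "Red Hat, Inc. and individual contributors", "count": 1},
        {"value": "The Apache Software Foundation", "count": 1},
    ],
    "packages": [],
}

print(top_values(summary, "holders", limit=2))
```

Skipping null values is a design choice of this sketch; keeping them would instead show how many resources had no detected holder.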
A sample summary object generated:
"summary": {
  "license_expressions": [
    { "value": "zlib", "count": 13 }
  ],
  "copyrights": [
    { "value": "Copyright (c) Mark Adler", "count": 4 },
    { "value": "Copyright (c) Free Software Foundation, Inc.", "count": 2 },
    { "value": "Copyright (c) The Apache Software Foundation", "count": 1 },
    { "value": "Copyright Red Hat, Inc. and individual contributors", "count": 1 }
  ],
  "holders": [
    { "value": null, "count": 10 },
    { "value": "Mark Adler", "count": 4 },
    { "value": "Red Hat, Inc. and individual contributors", "count": 1 },
    { "value": "The Apache Software Foundation", "count": 1 }
  ],
  "authors": [
    { "value": "Bela Ban", "count": 4 },
    { "value": "Brian Stansberry", "count": 1 },
    { "value": "the Apache Software Foundation (http://www.apache.org/)", "count": 1 }
  ],
  "programming_language": [
    { "value": "C++", "count": 13 },
    { "value": "Java", "count": 7 }
  ],
  "packages": []
}
--summary-by-facet Option¶
Dependency
The option --summary-by-facet is a sub-option of and requires the options --facet and --summary.
Running the scan with the --summary --summary-by-facet Plugins creates individual summaries for all the facets with the same license, copyright and other scan information, at a codebase level (in addition to the codebase level general summary generated by the --summary Plugin).
An example scan using the --summary-by-facet Plugin:
scancode -clieu --json-pp output.json samples --summary --facet dev="*.java" --facet dev="*.c" --summary-by-facet
Note
All other files which are not dev are marked to be included in the facet core.
Warning
Running the same scan with ./scancode -clpieu, i.e. with -p, generates an error. Avoid this.
The JSON file containing scan results is structured as follows:
{
  "headers": [ ... ],
  "summary": { ... },
  "summary_by_facet": [
    { "facet": "core", "summary": { ... } },
    { "facet": "dev", "summary": { ... } },
    { "facet": "tests", "summary": { ... } },
    { "facet": "docs", "summary": { ... } },
    { "facet": "data", "summary": { ... } },
    { "facet": "examples", "summary": { ... } }
  ],
  "files": [ ... ]
}
A sample “summary_by_facet” object generated by the previous scan (shortened):
"summary_by_facet": [
  {
    "facet": "core",
    "summary": {
      "license_expressions": [
        { "value": "mit", "count": 1 }
      ],
      "copyrights": [
        { "value": "Copyright (c) Free Software Foundation, Inc.", "count": 2 }
      ],
      "holders": [
        { "value": "The Apache Software Foundation", "count": 1 }
      ],
      "authors": [
        { "value": "Gilles Vollant", "count": 1 }
      ],
      "programming_language": [
        { "value": "C++", "count": 8 }
      ]
    }
  },
  {
    "facet": "dev",
    "summary": {
      "license_expressions": [
        { "value": "zlib", "count": 5 }
      ],
      "copyrights": [
        { "value": "Copyright Red Hat Middleware LLC, and individual contributors", "count": 1 }
      ],
      "holders": [
        { "value": "Mark Adler", "count": 3 }
      ],
      "authors": [
        { "value": "Brian Stansberry", "count": 1 }
      ],
      "programming_language": [
        { "value": "Java", "count": 7 },
        { "value": "C++", "count": 5 }
      ]
    }
  }
],
Note
Summaries for all the facets are generated by default, even for facets that have no files under them.
To learn more about facets, see What is a Facet?.
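Since a summary is emitted for every facet, a script reading the output may want to skip the facets that carry no information. The following is a minimal sketch, not part of ScanCode: the helper name non_empty_facets is hypothetical, and it assumes that a facet without files has only empty lists in its “summary”, which this page does not confirm.

```python
def non_empty_facets(scan):
    """Return facet names whose summary has at least one non-empty list."""
    names = []
    for entry in scan.get("summary_by_facet", []):
        # An empty list is falsy, so any() is true only when some
        # summary attribute actually holds entries.
        if any(entry.get("summary", {}).values()):
            names.append(entry["facet"])
    return names


# Hand-written stand-in for a scan result (illustrative data only).
example_scan = {
    "summary_by_facet": [
        {"facet": "core", "summary": {"license_expressions": [{"value": "mit", "count": 1}]}},
        {"facet": "dev", "summary": {"license_expressions": [{"value": "zlib", "count": 5}]}},
        {"facet": "tests", "summary": {"license_expressions": [], "copyrights": []}},
    ]
}

print(non_empty_facets(example_scan))
```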
--summary-key-files Option¶
Dependency
The option --summary-key-files is a sub-option of and requires the options --classify and --summary.
An example Scan:
scancode -clpieu --json-pp output.json samples --classify --summary --summary-key-files
Running the scan with the --summary --summary-key-files Plugins creates summaries for key files with the same license, copyright and other scan information, at a codebase level (in addition to the codebase level general summary generated by the --summary Plugin).
The resulting JSON file containing the scan results is structured as follows:
{
  "headers": [ ... ],
  "summary": {
    "license_expressions": [ ... ],
    "copyrights": [ ... ],
    "holders": [ ... ],
    "authors": [ ... ],
    "programming_language": [ ... ],
    "packages": []
  },
  "summary_of_key_files": {
    "license_expressions": [
      { "value": null, "count": 1 }
    ],
    "copyrights": [
      { "value": null, "count": 1 }
    ],
    "holders": [
      { "value": null, "count": 1 }
    ],
    "authors": [
      { "value": null, "count": 1 }
    ],
    "programming_language": [
      { "value": null, "count": 1 }
    ]
  },
  "files": [ ... ]
}
The following flags are also present for each file/directory (generated by --classify):
“is_legal”
“is_manifest”
“is_readme”
“is_top_level”
“is_key_file”
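The classification flags make it possible to see why a file counts as a key file. The following is a minimal sketch, not part of ScanCode: the helper name key_file_roles and the example paths are hypothetical, and it assumes each entry in “files” carries the flags listed above.

```python
CLASSIFY_FLAGS = ("is_legal", "is_manifest", "is_readme")


def key_file_roles(scan):
    """Map each key file path to the classification flags that are set on it."""
    roles = {}
    for resource in scan.get("files", []):
        if resource.get("is_key_file"):
            roles[resource["path"]] = [
                flag for flag in CLASSIFY_FLAGS if resource.get(flag)
            ]
    return roles


# Hand-written stand-in for a scan result (hypothetical paths).
example_scan = {
    "files": [
        {"path": "project/README", "is_readme": True, "is_top_level": True, "is_key_file": True},
        {"path": "project/COPYING", "is_legal": True, "is_top_level": True, "is_key_file": True},
        {"path": "project/src/main.c", "is_top_level": False, "is_key_file": False},
    ]
}

print(key_file_roles(example_scan))
```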
--summary-with-details Option¶
The --summary plugin summarizes license, copyright and other scan information at the codebase level. Running the scan with the --summary-with-details plugin instead also creates summaries for individual files and directories with the same license, copyright and other scan information (in addition to the codebase level summary).
An example Scan:
scancode -clpieu --json-pp output.json samples --summary-with-details
Note
--summary is redundant in a scan when --summary-with-details is already selected.
A sample file object in the scan results (a directory level summary of samples/arch) is structured as follows:
{
  "path": "samples/arch",
  "type": "directory",
  "name": "arch",
  "base_name": "arch",
  "extension": "",
  "size": 0,
  "date": null,
  "sha1": null,
  "md5": null,
  "mime_type": null,
  "file_type": null,
  "programming_language": null,
  "is_binary": false,
  "is_text": false,
  "is_archive": false,
  "is_media": false,
  "is_source": false,
  "is_script": false,
  "licenses": [],
  "license_expressions": [],
  "copyrights": [],
  "holders": [],
  "authors": [],
  "packages": [],
  "emails": [],
  "urls": [],
  "is_legal": false,
  "is_manifest": false,
  "is_readme": false,
  "is_top_level": true,
  "is_key_file": false,
  "summary": {
    "license_expressions": [
      { "value": "zlib", "count": 3 },
      { "value": null, "count": 1 }
    ],
    "copyrights": [
      { "value": null, "count": 1 },
      { "value": "Copyright (c) Jean-loup Gailly", "count": 1 },
      { "value": "Copyright (c) Jean-loup Gailly and Mark Adler", "count": 1 },
      { "value": "Copyright (c) Mark Adler", "count": 1 }
    ],
    "holders": [
      { "value": null, "count": 1 },
      { "value": "Jean-loup Gailly", "count": 1 },
      { "value": "Jean-loup Gailly and Mark Adler", "count": 1 },
      { "value": "Mark Adler", "count": 1 }
    ],
    "authors": [
      { "value": null, "count": 4 }
    ],
    "programming_language": [
      { "value": "C++", "count": 3 },
      { "value": null, "count": 1 }
    ]
  },
  "files_count": 4,
  "dirs_count": 2,
  "size_count": 127720,
  "scan_errors": []
},
The following flags are also present for each file/directory (generated by --classify):
“is_legal”
“is_manifest”
“is_readme”
“is_top_level”
“is_key_file”
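The per-directory “summary” objects can be queried to find where a given license shows up. The following is a minimal sketch, not part of ScanCode: the helper name directories_with_license and the second example directory are hypothetical, and it assumes directory entries carry a “summary” object shaped like the samples/arch example.

```python
def directories_with_license(scan, license_expression):
    """Return directory paths whose own summary lists the given expression."""
    matches = []
    for resource in scan.get("files", []):
        if resource.get("type") != "directory":
            continue
        entries = resource.get("summary", {}).get("license_expressions", [])
        if any(entry["value"] == license_expression for entry in entries):
            matches.append(resource["path"])
    return matches


example_scan = {
    "files": [
        {
            "path": "samples/arch",
            "type": "directory",
            "summary": {
                "license_expressions": [
                    {"value": "zlib", "count": 3},
                    {"value": None, "count": 1},
                ]
            },
        },
        {
            "path": "samples/other",  # hypothetical directory for contrast
            "type": "directory",
            "summary": {"license_expressions": [{"value": None, "count": 2}]},
        },
    ]
}

print(directories_with_license(example_scan, "zlib"))
```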