/v3/businesses/submit/file/{scanId}
Scan files to check their originality and find where their content has been used elsewhere. The submit-file method scans various file types for plagiarism and identifies copied content. See supported formats.
For integration testing, use sandbox mode, which is free.
Request
URL Parameters
scanId
REQUIRED. We recommend using the same ID that represents the scan in your own database as the scan ID in the Copyleaks database. This will help you debug incidents.
Reusing the same ID for the same file also protects you from network problems that could otherwise lead to multiple scans of the same file.
Allowed characters are [a-z0-9] and the following symbols: !@$^&-+%=_(){}<>';:/.",~`|
Learn more about the criteria for creating a Scan ID.
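The character rule above can be checked on the client before a request is sent. The following is a minimal sketch; the function name is_valid_scan_id is hypothetical, and the sketch checks only the character set listed here, not any further criteria (such as length) that the linked Scan ID page may define.

```python
import re

# Symbols copied from the list above; letters and digits are limited to [a-z0-9].
ALLOWED_SYMBOLS = "!@$^&-+%=_(){}<>';:/.\",~`|"
SCAN_ID_PATTERN = re.compile("[a-z0-9" + re.escape(ALLOWED_SYMBOLS) + "]+")


def is_valid_scan_id(scan_id: str) -> bool:
    """Return True if scan_id is non-empty and uses only the allowed characters."""
    return SCAN_ID_PATTERN.fullmatch(scan_id) is not None
```

Uppercase letters and whitespace are rejected, so an ID such as "Report 1" would need to be normalized first.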
Body Parameters
base64
REQUIRED. Example: aGVsbG8gd29ybGQ=
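As an illustration, the value of the base64 field can be produced with the Python standard library. This is a minimal sketch; the helper name encode_file_content is hypothetical.

```python
import base64


def encode_file_content(raw: bytes) -> str:
    """Base64-encode raw file bytes into an ASCII string for the base64 field."""
    return base64.b64encode(raw).decode("ascii")


# The example value shown above is the encoding of the bytes b"hello world".
print(encode_file_content(b"hello world"))  # aGVsbG8gd29ybGQ=
```

For a real file, read it in binary mode first, for example with pathlib.Path("Myfile.pdf").read_bytes().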
filename
REQUIRED. Example: Myfile.pdf
Max length: 255 characters.
properties.action
Possible values:
- Scan: Start scan immediately.
- Check Credits: Check how many credits will be used for this scan.
- Index Only: Only index the file in the Copyleaks internal database or Copyleaks Repository (depending on your submit request). No credits will be used.
Default: 0
Optional values:
0: Scan
1: Check Credits
2: Index Only
properties.includeHtml
Default: false
Possible values:
True: results will be generated in HTML format, if possible. Otherwise, they will be generated in text format.
False: results will be generated in text format.
properties.developerPayload
Default: null
properties.sandbox
You can submit content for a scan and get back mock results that simulate the way Copyleaks works, so you can verify that you have integrated with the API successfully.
Turn this feature off in production environments.
Default: false
Rate limiting: this method allows a maximum of 100 sandbox scans within 1 hour. See the 429 response code section at the bottom of this page.
properties.expiration
When expired, the scan will be deleted and will no longer be accessible.
Default: 2800
Range: 1 to 2800
properties.scanMethodAlgorithm
Default: 0 (MaximumCoverage)
Available options:
0 - MaximumCoverage: prioritize a higher similarity score.
1 - MaximumResults: prioritize finding more sources.
properties.customMetadata
If this document is found as a repository result, your custom properties will be added to the result.
Default: []
Example:
[
{
"key":"Test1",
"value":"Test1"
},
...
]
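The key/value shape shown in the example above can be built from an ordinary dictionary. A minimal sketch, with to_custom_metadata as a hypothetical helper name:

```python
def to_custom_metadata(props: dict[str, str]) -> list[dict[str, str]]:
    """Convert {"name": "value"} pairs into the list-of-objects shape shown above."""
    return [{"key": key, "value": value} for key, value in props.items()]


print(to_custom_metadata({"Test1": "Test1"}))
# [{'key': 'Test1', 'value': 'Test1'}]
```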
properties.author.id
Using this feature, Copyleaks can detect the author's writing patterns and get better results.
Default: null
properties.webhooks.newResult
Default: null
Example: https://yoursite.com/webhook/new-result
properties.webhooks.newResultHeaders
[
[
"header-key",
"header-value"
],
...
]
properties.webhooks.status
REQUIRED. Use the special token {STATUS} to track the current scan status. The Copyleaks servers automatically replace this token with one of the following values: completed, error, creditsChecked or indexed.
Read more about webhooks.
Example: https://yoursite.com/webhook/{STATUS}
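To see which endpoints your server needs to expose, the {STATUS} substitution can be reproduced locally. This is a minimal sketch: expand_status_urls is a hypothetical helper, and treating a missing token as an error is a choice made for this sketch, not a documented API rule.

```python
STATUS_TOKEN = "{STATUS}"
# Status values listed in the description above.
STATUS_VALUES = ("completed", "error", "creditsChecked", "indexed")


def expand_status_urls(template: str) -> dict[str, str]:
    """Map each status value to the URL produced by substituting it for the token."""
    if STATUS_TOKEN not in template:
        raise ValueError("webhook URL does not contain the {STATUS} token")
    return {status: template.replace(STATUS_TOKEN, status) for status in STATUS_VALUES}


urls = expand_status_urls("https://yoursite.com/webhook/{STATUS}")
print(urls["completed"])  # https://yoursite.com/webhook/completed
```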
properties.webhooks.statusHeaders
[
[
"header-key",
"header-value"
],
...
]
properties.filters.identicalEnabled
Default: true
properties.filters.minorChangesEnabled
Default: true
properties.filters.relatedMeaningEnabled
Default: true
properties.filters.minCopiedWords
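Select results with at least minCopiedWords copied words.
Default: null
properties.filters.safeSearch
Default: false
properties.filters.domains
A list of domains that is included or excluded according to domainsMode.
Default: []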
properties.filters.domainsMode
Applies to the list in the domains property.
When Include is selected, Copyleaks will filter out all results that are not part of the properties.filters.domains list.
When Exclude is selected, Copyleaks will only find results outside of the properties.filters.domains list.
Default: 1
Optional values:
0: Include
1: Exclude
properties.scanning.internet
Default: true
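The Include and Exclude semantics of properties.filters.domainsMode described above can be illustrated with a local filter over result URLs. This is only a sketch of the documented behavior, not how the Copyleaks servers implement it; filter_results is a hypothetical name, and matching on the exact host name (no subdomain handling) is an assumption.

```python
from urllib.parse import urlparse

INCLUDE, EXCLUDE = 0, 1  # domainsMode values from the list above


def filter_results(result_urls, domains, domains_mode=EXCLUDE):
    """Keep result URLs according to the Include/Exclude rules described above."""
    listed = {domain.lower() for domain in domains}
    kept = []
    for url in result_urls:
        host = (urlparse(url).hostname or "").lower()
        in_list = host in listed  # assumption: exact host match only
        if (domains_mode == INCLUDE and in_list) or (domains_mode == EXCLUDE and not in_list):
            kept.append(url)
    return kept
```

With the default values (an empty domains list and Exclude mode), nothing is filtered out.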
properties.scanning.exclude.idPattern
Supported pattern wildcards:
*: Matches any, zero or more, characters.
.: Matches a single (non-whitespace) character.
Default: null
Examples:
abc*: will exclude any submissions that have an id starting with 'abc'.
ab..: will exclude any submissions with an id of exactly 4 characters starting with 'ab'.
properties.scanning.repositories[]
Default: []
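To preview which scan IDs a properties.scanning.exclude.idPattern value would cover, its two wildcards can be translated into a regular expression. This is a local approximation under the wildcard rules stated above; the function names are hypothetical and the server-side matching may differ in details such as case handling.

```python
import re


def id_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate the * and . wildcards into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters, including none
        elif ch == ".":
            parts.append(r"\S")  # exactly one non-whitespace character
        else:
            parts.append(re.escape(ch))  # every other character matches literally
    return re.compile("".join(parts), re.DOTALL)


def is_excluded(scan_id: str, pattern: str) -> bool:
    """Return True if the whole scan_id matches the exclusion pattern."""
    return id_pattern_to_regex(pattern).fullmatch(scan_id) is not None
```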
properties.scanning.repositories[].id
Default: null
properties.scanning.repositories[].includeMySubmissions
Default: false
properties.scanning.repositories[].includeOthersSubmissions
Default: false
properties.scanning.crossLanguages.languages[]
Supported languages list.
Default: []
Max length: 5
properties.scanning.crossLanguages.languages[].code
Default: null
properties.indexing.repositories[]
Default: []
properties.indexing.repositories[].id
Default: null
properties.indexing.repositories[].maskingPolicy
If the repository has its own masking policy, the stricter policy will be applied to results from this document.
Default: 0
Available policies:
0: Don't mask results from this document.
1: Mask all results coming from this document, unless the requesting user owns this file.
2: Mask all results from this document.
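The "stricter policy wins" rule for maskingPolicy can be sketched as taking the larger of the two policy numbers. This assumes that the numeric order 0, 1, 2 corresponds to increasing strictness, which the policy descriptions suggest but do not state outright; effective_masking_policy is a hypothetical name.

```python
VALID_POLICIES = (0, 1, 2)


def effective_masking_policy(document_policy: int, repository_policy: int) -> int:
    """Return the stricter of the two policies (assumes a higher value is stricter)."""
    for policy in (document_policy, repository_policy):
        if policy not in VALID_POLICIES:
            raise ValueError(f"unknown masking policy: {policy}")
    return max(document_policy, repository_policy)
```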
properties.exclude.quotes
Default: false
properties.exclude.citations
Default: false
properties.exclude.tableOfContents
Default: false
properties.exclude.titles
Default: false
properties.exclude.htmlTemplate
Default: false
properties.pdf.create
Set to true to generate a PDF report for this scan.
Default: false
properties.pdf.title
Default: null
Max length: 256 characters.
properties.pdf.largeLogo
Only the PNG format is supported.
Default: null
Max file size: 100kb
Recommended size: width 185px, height 50px
properties.pdf.rtl
Default: false
properties.pdf.version
Default: 1
Available values: 1, 2
properties.sensitivityLevel
Default: 3
Optional values: range from 1 (faster) to 5 (slower but more comprehensive).
properties.cheatDetection
Default: false
properties.aiGeneratedText.detect
Upon detection, a scan alert of type "suspected-ai-text" will be added to the scan completion webhook.
Default: false
properties.sensitiveDataProtection.driversLicense
- Australia driver's license number
- Canada driver's license number
- United Kingdom driver's license number
- USA driver's license number
- Japan driver's license number
- Spain driver's license number
- Germany driver's license number
Default: false
properties.sensitiveDataProtection.credentials
- Authentication token
- Amazon Web Services credentials
- Azure JSON Web Token
- HTTP basic authentication header
- Google Cloud Platform service account credentials
- Google Cloud Platform API key
- JSON Web Token
- Encryption key
- Password
Default: false
properties.sensitiveDataProtection.passport
- Canada passport number
- China passport number
- France passport number
- Germany passport number
- Ireland passport number
- Japan passport number
- Korea passport number
- Mexico passport number
- Spain passport number
- United Kingdom passport number
- USA passport number
- Netherlands passport number
- Poland passport number
- Sweden passport number
- Australia passport number
- Singapore passport number
- Taiwan passport number
Default: false
properties.sensitiveDataProtection.network
- IP address
- Local MAC address
- MAC address
Default: false
properties.sensitiveDataProtection.url
Default: false
properties.sensitiveDataProtection.emailAddress
Default: false
properties.sensitiveDataProtection.creditCard
Default: false
properties.sensitiveDataProtection.phoneNumber
Default: false
Request Example
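The following is a minimal sketch of how a request to this endpoint might be assembled with the Python standard library. Several details are assumptions that this page does not state: the base URL https://api.copyleaks.com, the PUT method, the Bearer token header, and the mapping of dotted parameter names such as properties.webhooks.status onto nested JSON objects. The function name build_submit_request is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.copyleaks.com"  # assumed base URL, not stated on this page


def build_submit_request(scan_id, base64_content, filename, status_webhook, token, sandbox=True):
    """Build (but do not send) a submit-file request with the required fields."""
    body = {
        "base64": base64_content,
        "filename": filename,
        "properties": {
            "sandbox": sandbox,  # keep True for integration testing only
            "webhooks": {"status": status_webhook},
        },
    }
    return urllib.request.Request(
        # Scan IDs may contain characters such as "/" or "%", so escape them.
        url=f"{API_BASE}/v3/businesses/submit/file/{quote(scan_id, safe='')}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",  # assumed HTTP method
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed authorization scheme
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; a 201 response code indicates success according to the Codes list on this page.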
Response
Codes
201
400
{ "properties.webhooks.status": [ "The field is required." ] }
401
Authorization has been denied for this request.
409
429
This may happen when sending too many scans in Sandbox mode.
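A 400 response carries a JSON object that maps each offending field to a list of messages, as in the example shown for code 400. A minimal sketch of extracting those messages follows; field_errors is a hypothetical helper, and the assumption is that every 400 body has this same shape.

```python
import json


def field_errors(status_code: int, body: str) -> dict[str, list[str]]:
    """Return {field: [messages]} for a 400 response, or an empty dict otherwise."""
    if status_code != 400:
        return {}
    parsed = json.loads(body)
    return {field: list(messages) for field, messages in parsed.items()}


errors = field_errors(400, '{ "properties.webhooks.status": [ "The field is required." ] }')
print(errors)  # {'properties.webhooks.status': ['The field is required.']}
```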
Other resources:
- Performance Considerations (important) - How to improve your scan performance.
- Exponential Backoff - Algorithm that helps applications define a retry strategy for consuming a network service.
- Technical Specifications - See API's limits and supported formats.
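The exponential backoff strategy mentioned above can be sketched as follows for handling 429 responses. The function names, the one-second base delay, the 60-second cap and the added random jitter are all choices made for this sketch, not values documented by Copyleaks; send stands for any caller-supplied function that performs the request and returns the HTTP status code.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before the next retry: base * 2**attempt, limited to cap seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(send, max_attempts: int = 5, sleep=time.sleep) -> int:
    """Call send() until it returns a status other than 429 or attempts run out."""
    status = 429
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Up to one second of random jitter spreads out concurrent retries.
            sleep(backoff_delay(attempt) + random.uniform(0, 1))
    return status
```

Because the sandbox limit is counted per hour, short retries will not help once the hourly quota is used up; in that case the last 429 is returned to the caller.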
