PUT /v3/education/submit/ocr/{scanId}

Scan images with textual content to find where the content has been used before and check its originality. Using submit-ocr you can scan various image file types for plagiarism and identify infringed content. Only the textual content in the picture will be scanned and not the graphics. See supported formats.

You need to log in with a user and API key in order to access this method.
Add this HTTP header to your request:
Authorization: Bearer <Your-Login-Token>
Not sure how to generate your login token? Read here.

For integration testing purposes, use sandbox mode, which is free.

Request

URI Parameters

Name Description

scanId Required

A unique scan id provided by you.

We recommend you use the same id in your database to represent the scan in the Copyleaks database. This will help you to debug incidents.

Using the same ID for the same file will help you to avoid network problems that may lead to multiple scans for the same file.

String
Length: 3-36 characters.

Allowed characters are [a-zA-Z0-9] and the following symbols: [email protected]$^&-+%=_(){}<>';:/.",~`|
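A scan id that satisfies the constraints above can be generated on the client. The sketch below is one possible approach and not part of the Copyleaks API: it assumes a UUID string (36 characters, hexadecimal digits and hyphens) is acceptable, and the helper names new_scan_id and is_conservative_scan_id are hypothetical. The check accepts only letters, digits, "-" and "_", which is a deliberately narrow subset of the allowed characters.

```python
import re
import uuid

# Conservative subset of the documented character set:
# letters, digits, "-" and "_", with a length of 3 to 36 characters.
_SCAN_ID_RE = re.compile(r"[A-Za-z0-9_-]{3,36}")


def new_scan_id() -> str:
    """Return a random UUID string; it is exactly 36 characters long."""
    return str(uuid.uuid4())


def is_conservative_scan_id(scan_id: str) -> bool:
    """Check the length limits and the conservative character subset."""
    return _SCAN_ID_RE.fullmatch(scan_id) is not None
```

Storing the generated id alongside the file in your own database lets a retry after a network failure reuse the same id instead of starting a second scan.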

Body Parameters

Name Description

url Required

The URL to be scanned.

String (uri)

Example: http://example.com

base64 Required

A base64 data string of a file.

If you would like to scan plain text, encode it as base64 and submit it.

String

Example: aGVsbG8gd29ybGQ=
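The base64 value can be produced with the Python standard library. This is a minimal sketch; the helper names to_base64 and file_to_base64 are hypothetical and only illustrate the encoding step.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string for the base64 field."""
    return base64.b64encode(data).decode("ascii")


def file_to_base64(path: str) -> str:
    """Read a file in binary mode and return its base64 encoding."""
    with open(path, "rb") as handle:
        return to_base64(handle.read())


print(to_base64(b"hello world"))  # prints aGVsbG8gd29ybGQ=
```

The printed value matches the example above, which is the encoding of the plain text "hello world".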

filename Required

The name of the file as it will appear in the Copyleaks scan report.

Make sure to include the right extension for your filetype.

String

Example: Myfile.pdf

Example: image.jpg

Max length: 255 characters.

langCode Required

The language code of your content. The selected language should be on the OCR supported languages list.

String

Example: en

properties.action

Types of content submission actions.

Possible values:

  • Scan: Start scan immediately.
  • Check Credits: Check how many credits will be used for this scan.
  • Index Only: Only index the file in the Copyleaks internal database. No credits will be used.

Integer (enum)

Default: 0

Optional Values:
0: Scan
1: Check-Credits
2: Index Only
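In client code the three integer values can be given names. The enum below is a hypothetical convenience wrapper (ScanAction is not a Copyleaks type); only the numeric values come from the list above.

```python
import json
from enum import IntEnum


class ScanAction(IntEnum):
    """Named constants for properties.action."""
    SCAN = 0           # start the scan immediately
    CHECK_CREDITS = 1  # only report how many credits the scan would use
    INDEX_ONLY = 2     # only index the file, no credits used


# IntEnum members serialize as plain integers in JSON.
print(json.dumps({"action": ScanAction.CHECK_CREDITS}))  # prints {"action": 1}
```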

properties.includeHtml

By default, Copyleaks presents the report in text format. If set to true, Copyleaks also includes the HTML format.

Possible values:
false (Text): results are generated in text format only.
true (Text and Html): results are generated in both text and HTML format.

Boolean

Default: false

properties.developerPayload

Add custom developer payload that will then be provided on the webhooks.

String

Length: up to 512 characters.

Default: null

properties.sandbox

You can test the integration with the Copyleaks API for free using the sandbox mode.

You will be able to submit content for a scan and get back mock results, simulating the way Copyleaks will work to make sure that you successfully integrated with the API.

Turn this feature off in production environments.

Boolean

Default: false

properties.expiration

Specify the maximum life span of a scan in hours on the Copyleaks servers.

When expired, the scan will be deleted and will no longer be accessible.

Integer

Default: 2880
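Because the value is given in hours, it can help to convert it to days or to a concrete timestamp. The sketch below assumes the clock starts when the scan is submitted, which this page does not state; expiry_time is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_EXPIRATION_HOURS = 2880  # documented default


def expiry_time(submitted_at: datetime,
                hours: int = DEFAULT_EXPIRATION_HOURS) -> datetime:
    """Return the moment a scan would expire, counting from submission."""
    return submitted_at + timedelta(hours=hours)


# The default life span expressed in days.
print(timedelta(hours=DEFAULT_EXPIRATION_HOURS).days)  # prints 120
```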

properties.author.id

A unique identifier that represents the author of the content. Make sure to use the same ID for the same author.

With this identifier, Copyleaks can detect the author's writing patterns and produce better results.

String

Default: null

properties.webhooks.newResult

HTTP endpoint to be triggered while the scan is still running and a new result is found. This is useful when the user is viewing the report in real time, so the results load gradually as they are found.

String (uri)

Default: null

Example: https://yoursite.com/webhook/new-result

properties.webhooks.status Required

This webhook event is triggered once the scan status changes.

Use the special token {STATUS} to track the current scan status. The Copyleaks servers automatically replace this token with one of the following values: completed, error, creditsChecked and indexed.

Read more about webhooks.

String (uri)

Example: https://yoursite.com/webhook/{STATUS}
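The token replacement described above can be mirrored on the receiving side to work out which status a callback refers to. This is a sketch under the assumption that the token is substituted literally in the URL path; both helper names are hypothetical.

```python
STATUSES = ("completed", "error", "creditsChecked", "indexed")


def expand_status_urls(template: str) -> dict:
    """Map each documented status to the URL produced by replacing {STATUS}."""
    return {status: template.replace("{STATUS}", status) for status in STATUSES}


def status_from_path(template_path: str, actual_path: str):
    """Recover the status from an incoming request path, or None if unmatched."""
    prefix, _, suffix = template_path.partition("{STATUS}")
    if not actual_path.startswith(prefix) or not actual_path.endswith(suffix):
        return None
    # Whatever sits between the fixed prefix and suffix is the status.
    candidate = actual_path[len(prefix):len(actual_path) - len(suffix)]
    return candidate if candidate in STATUSES else None
```

For example, with the template path /webhook/{STATUS}/my-custom-id, an incoming request to /webhook/completed/my-custom-id resolves to completed.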

properties.filters.identicalEnabled

Enable matching of exact words in the text.

Boolean

Default: true

properties.filters.minorChangesEnabled

Enable matching of nearly identical words with small differences, such as "slow" becoming "slowly".

Boolean

Default: true

properties.filters.relatedMeaningEnabled

Enable matching of paraphrased content stating similar ideas with different words.

Boolean

Default: true

properties.filters.minCopiedWords

Select results with at least minCopiedWords copied words.

Unsigned Integer

Default: null

properties.filters.safeSearch

Block explicit adult content from the scan results such as web pages containing inappropriate images and videos. SafeSearch is not 100% effective with all websites.

Boolean

Default: false

properties.filters.domains

A list of domains to either include or exclude from the scan - depending on the value of domainsMode.

String Array

Default: []

properties.filters.domainsMode

Include or Exclude the list of domains you specified under the domains property.

When Include is selected, Copyleaks will filter out all results that are not part of the properties.filters.domains list.

When Exclude is selected, Copyleaks will only find results outside of the properties.filters.domains list.

Integer (Enum)

Default: 1

Optional Values:
0: Include
1: Exclude
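The two modes can be illustrated with a small client-side filter. This is only a sketch of the semantics described above, not how the Copyleaks servers are implemented; it assumes exact host-name matching, and keep_result is a hypothetical helper.

```python
from urllib.parse import urlparse

INCLUDE, EXCLUDE = 0, 1  # values of properties.filters.domainsMode


def keep_result(url: str, domains: list, mode: int = EXCLUDE) -> bool:
    """Return True if a result URL survives the domain filter."""
    host = urlparse(url).hostname or ""
    listed = host in domains
    # Include keeps only listed domains; Exclude keeps everything else.
    return listed if mode == INCLUDE else not listed
```

With the defaults (an empty domains list and Exclude mode) every result is kept.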

properties.scanning.internet

Compare your content with online sources.

Boolean

Default: true

properties.scanning.copyleaksDb.includeMySubmissions

When set to true: Copyleaks will also compare against content which was uploaded by YOU to the Copyleaks internal database.

If true, it will also index the scan in the Copyleaks internal database.

Boolean

Default: true

properties.scanning.copyleaksDb.includeOthersSubmissions

When set to true: Copyleaks will also compare against content which was uploaded by OTHERS to the Copyleaks internal database.

If true, it will also index the scan in the Copyleaks internal database.

Boolean

Default: true

properties.exclude.quotes

Exclude quoted text from the scan.

Boolean

Default: false

properties.exclude.references

Exclude referenced text from the scan.

Boolean

Default: false

properties.exclude.tableOfContents

Exclude table of contents from the scan.

Boolean

Default: false

properties.exclude.titles

Exclude titles from the scan.

Boolean

Default: false

properties.exclude.htmlTemplate

When the scanned document is an HTML document, exclude irrelevant text that appears across the site like the website footer or header.

Boolean

Default: false

properties.pdf.create

Add a request to generate a customizable export of the scan report, in a pdf format.

Set to true in order to generate a pdf report for this scan.

Boolean

Default: false

properties.pdf.title

Customize the title for the PDF report.

String

Default: null

properties.pdf.largeLogo

Customize the logo image in the PDF report.

String (base64)

Default: null

Max size: 100kb

properties.pdf.rtl

When set to true the text in the report will be aligned from right to left.

Boolean

Default: false

properties.sensitivityLevel

Controls the trade-off between scan speed and plagiarism sensitivity. Choose 1 for a faster scan that reports only the results with the highest amount of plagiarism, or 5 for a slower, more comprehensive scan that also detects the smallest instances.

Integer

Default: 3

Optional Values:

Range from 1 (faster) to 5 (slower but more comprehensive)
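The required fields and a few of the documented limits can be assembled into a request body as follows. This is a minimal sketch; build_ocr_body is a hypothetical helper, and it covers only the fields it names, leaving every other property at its server-side default.

```python
import base64
import json


def build_ocr_body(image_bytes: bytes, filename: str, lang_code: str,
                   status_webhook: str, sensitivity_level: int = 3,
                   sandbox: bool = False) -> dict:
    """Build a request body with the required fields and basic checks."""
    if len(filename) > 255:
        raise ValueError("filename is limited to 255 characters")
    if not 1 <= sensitivity_level <= 5:
        raise ValueError("sensitivityLevel must be between 1 and 5")
    return {
        "base64": base64.b64encode(image_bytes).decode("ascii"),
        "filename": filename,
        "langCode": lang_code,
        "properties": {
            "sandbox": sandbox,
            "webhooks": {"status": status_webhook},
            "sensitivityLevel": sensitivity_level,
        },
    }


body = build_ocr_body(b"hello world", "image.jpg", "en",
                      "https://yoursite.com/webhook/{STATUS}", sandbox=True)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized with json.dumps and sent as the body of the PUT request.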


Request Example

curl -XPUT -H 'Authorization: Bearer YOUR-LOGIN-TOKEN' -H "Content-type: application/json" -d '{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "action": 0,
    "includeHtml": false,
    "developerPayload": "Custom developer payload",
    "sandbox": true,
    "expiration": 480,
    "author": {
      "id": "Author id"
    },
    "webhooks": {
      "newResult": "https://yoursite.com/webhook/new-result",
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    },
    "filters": {
      "identicalEnabled": true,
      "minorChangesEnabled": true,
      "relatedMeaningEnabled": true,
      "minCopiedWords": 10,
      "safeSearch": false,
      "domains": [
        "www.example.com"
      ],
      "domainsMode": 1
    },
    "scanning": {
      "internet": true
    },
    "exclude": {
      "quotes": false,
      "titles": false,
      "htmlTemplate": false
    },
    "sensitivityLevel": 3
  }
}' 'https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id'

curl -XPUT -H 'Authorization: Bearer YOUR-LOGIN-TOKEN' -H "Content-type: application/json" -d '{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "webhooks": {
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    }
  }
}' 'https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id'

PUT https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id
Content-Type: application/json
Authorization: Bearer YOUR-LOGIN-TOKEN

{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "action": 0,
    "includeHtml": false,
    "developerPayload": "Custom developer payload",
    "sandbox": true,
    "expiration": 480,
    "author": {
      "id": "Author id"
    },
    "webhooks": {
      "newResult": "https://yoursite.com/webhook/new-result",
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    },
    "filters": {
      "identicalEnabled": true,
      "minorChangesEnabled": true,
      "relatedMeaningEnabled": true,
      "minCopiedWords": 10,
      "safeSearch": false,
      "domains": [
        "www.example.com"
      ],
      "domainsMode": 1
    },
    "scanning": {
      "internet": true
    },
    "exclude": {
      "quotes": false,
      "titles": false,
      "htmlTemplate": false
    },
    "sensitivityLevel": 3
  }
}

PUT https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id
Content-Type: application/json
Authorization: Bearer YOUR-LOGIN-TOKEN

{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "webhooks": {
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    }
  }
}

import requests

headers = {
    'Authorization': 'Bearer YOUR-LOGIN-TOKEN',
    'Content-type': 'application/json',
}

data = '{\n  "base64": "YOUR BASE64 HERE",\n  "filename": "image.jpg",\n  "langCode": "en",\n  "properties": {\n    "action": 0,\n    "includeHtml": false,\n    "developerPayload": "Custom developer payload",\n    "sandbox": true,\n    "expiration": 480,\n    "author": {\n      "id": "Author id"\n    },\n    "webhooks": {\n      "newResult": "https://yoursite.com/webhook/new-result",\n      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"\n    },\n    "filters": {\n      "identicalEnabled": true,\n      "minorChangesEnabled": true,\n      "relatedMeaningEnabled": true,\n      "minCopiedWords": 10,\n      "safeSearch": false,\n      "domains": [\n        "www.example.com"\n      ],\n      "domainsMode": 1\n    },\n    "scanning": {\n      "internet": true\n    },\n    "exclude": {\n      "quotes": false,\n      "titles": false,\n      "htmlTemplate": false\n    },\n    "sensitivityLevel": 3\n  }\n}'

response = requests.put('https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id', headers=headers, data=data)

import requests

headers = {
    'Authorization': 'Bearer YOUR-LOGIN-TOKEN',
    'Content-type': 'application/json',
}

data = '{\n  "base64": "YOUR BASE64 HERE",\n  "filename": "image.jpg",\n  "langCode": "en",\n  "properties": {\n    "webhooks": {\n      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"\n    }\n  }\n}'

response = requests.put('https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id', headers=headers, data=data)

using (var httpClient = new HttpClient())
{
    // The endpoint expects PUT, and every quote inside the JSON string must be escaped.
    using (var request = new HttpRequestMessage(HttpMethod.Put, "https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id"))
    {
        request.Headers.TryAddWithoutValidation("Authorization", "Bearer YOUR-LOGIN-TOKEN");

        request.Content = new StringContent("{\n  \"base64\": \"YOUR BASE64 HERE\",\n  \"filename\": \"image.jpg\",\n  \"langCode\": \"en\",\n  \"properties\": {\n    \"action\": 0,\n    \"includeHtml\": false,\n    \"developerPayload\": \"Custom developer payload\",\n    \"sandbox\": true,\n    \"expiration\": 480,\n    \"author\": {\n      \"id\": \"Author id\"\n    },\n    \"webhooks\": {\n      \"newResult\": \"https://yoursite.com/webhook/new-result\",\n      \"status\": \"https://yoursite.com/webhook/{STATUS}/my-custom-id\"\n    },\n    \"filters\": {\n      \"identicalEnabled\": true,\n      \"minorChangesEnabled\": true,\n      \"relatedMeaningEnabled\": true,\n      \"minCopiedWords\": 10,\n      \"safeSearch\": false,\n      \"domains\": [\n        \"www.example.com\"\n      ],\n      \"domainsMode\": 1\n    },\n    \"scanning\": {\n      \"internet\": true\n    },\n    \"exclude\": {\n      \"quotes\": false,\n      \"titles\": false,\n      \"htmlTemplate\": false\n    },\n    \"sensitivityLevel\": 3\n  }\n}", Encoding.UTF8, "application/json");

        var response = await httpClient.SendAsync(request);
    }
}

using (var httpClient = new HttpClient())
{
    // The endpoint expects PUT rather than POST.
    using (var request = new HttpRequestMessage(HttpMethod.Put, "https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id"))
    {
        request.Headers.TryAddWithoutValidation("Authorization", "Bearer YOUR-LOGIN-TOKEN"); 

        request.Content = new StringContent("{\n  \"base64\": \"YOUR BASE64 HERE\",\n  \"filename\": \"image.jpg\",\n  \"langCode\": \"en\",\n  \"properties\": {\n    \"webhooks\": {\n      \"status\": \"https://yoursite.com/webhook/{STATUS}/my-custom-id\"\n    }\n  }\n}", Encoding.UTF8, "application/json"); 

        var response = await httpClient.SendAsync(request);
    }
}

var request = require('request');

var headers = {
    'Authorization': 'Bearer YOUR-LOGIN-TOKEN',
    'Content-type': 'application/json'
};

// A template literal (backticks) is required for a string that spans several lines.
var dataString = `{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "action": 0,
    "includeHtml": false,
    "developerPayload": "Custom developer payload",
    "sandbox": true,
    "expiration": 480,
    "author": {
      "id": "Author id"
    },
    "webhooks": {
      "newResult": "https://yoursite.com/webhook/new-result",
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    },
    "filters": {
      "identicalEnabled": true,
      "minorChangesEnabled": true,
      "relatedMeaningEnabled": true,
      "minCopiedWords": 10,
      "safeSearch": false,
      "domains": [
        "www.example.com"
      ],
      "domainsMode": 1
    },
    "scanning": {
      "internet": true
    },
    "exclude": {
      "quotes": false,
      "titles": false,
      "htmlTemplate": false
    },
    "sensitivityLevel": 3
  }
}`;

var options = {
    url: 'https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id',
    method: 'PUT',
    headers: headers,
    body: dataString
};

function callback(error, response, body) {
    if (!error && response.statusCode === 201) { // 201: the scan was created
        console.log(body);
    }
}

request(options, callback);

var request = require('request');

var headers = {
    'Authorization': 'Bearer YOUR-LOGIN-TOKEN',
    'Content-type': 'application/json'
};

// A template literal (backticks) is required for a string that spans several lines.
var dataString = `{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "webhooks": {
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    }
  }
}`;

var options = {
    url: 'https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id',
    method: 'PUT',
    headers: headers,
    body: dataString
};

function callback(error, response, body) {
    if (!error && response.statusCode === 201) { // 201: the scan was created
        console.log(body);
    }
}

request(options, callback);

<?php
include('vendor/rmccue/requests/library/Requests.php');
Requests::register_autoloader();
$headers = array(
    'Authorization' => 'Bearer YOUR-LOGIN-TOKEN',
    'Content-type' => 'application/json'
);
$data = '{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "action": 0,
    "includeHtml": false,
    "developerPayload": "Custom developer payload",
    "sandbox": true,
    "expiration": 480,
    "author": {
      "id": "Author id"
    },
    "webhooks": {
      "newResult": "https://yoursite.com/webhook/new-result",
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    },
    "filters": {
      "identicalEnabled": true,
      "minorChangesEnabled": true,
      "relatedMeaningEnabled": true,
      "minCopiedWords": 10,
      "safeSearch": false,
      "domains": [
        "www.example.com"
      ],
      "domainsMode": 1
    },
    "scanning": {
      "internet": true
    },
    "exclude": {
      "quotes": false,
      "titles": false,
      "htmlTemplate": false
    },
    "sensitivityLevel": 3
  }
}';
$response = Requests::put('https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id', $headers, $data);

<?php
include('vendor/rmccue/requests/library/Requests.php');
Requests::register_autoloader();
$headers = array(
    'Authorization' => 'Bearer YOUR-LOGIN-TOKEN',
    'Content-type' => 'application/json'
);
$data = '{
  "base64": "YOUR BASE64 HERE",
  "filename": "image.jpg",
  "langCode": "en",
  "properties": {
    "webhooks": {
      "status": "https://yoursite.com/webhook/{STATUS}/my-custom-id"
    }
  }
}';
$response = Requests::put('https://api.copyleaks.com/v3/businesses/submit/ocr/my-custom-id', $headers, $data);

Responses

Status Code Description Example
201

The scan was created.

400

Bad request.

{
  "properties.webhooks.status": ["The field is required."]
}

401

Authorization has been denied for this request.

409

A scan with the same id already exists in the system.
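A client can branch on these status codes and turn the 400 body into readable messages. The sketch below assumes every 400 body has the shape shown in the example above (field name mapped to a list of messages); both helper names are hypothetical.

```python
import json


def format_validation_errors(body: str) -> list:
    """Flatten a 400 body of the form {"field": ["message", ...]} into lines."""
    errors = json.loads(body)
    return [f"{field}: {message}"
            for field, messages in errors.items()
            for message in messages]


def describe_status(status_code: int) -> str:
    """Translate the documented status codes into a short outcome label."""
    outcomes = {
        201: "created",
        400: "bad request",
        401: "authorization denied",
        409: "scan id already exists",
    }
    return outcomes.get(status_code, "unexpected")
```

When a submission is retried with the same scan id after a network failure, a 409 may simply mean that the earlier attempt already reached the server.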