Download the PHP package helgesverre/receipt-scanner without Composer

On this page you can find all versions of the PHP package helgesverre/receipt-scanner. You can download and install these versions without Composer; any dependencies are resolved automatically.

FAQ

After downloading, include the autoloader once with require_once('vendor/autoload.php'); and then import the classes you need with use statements.
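A minimal sketch of that include-then-import sequence, assuming the downloaded vendor folder sits next to the script. Only the ModelNames class is imported, because its namespace is given further down this page; other classes in the package may additionally need a booted Laravel application.

```php
<?php

// Load the autoloader from the downloaded vendor folder
// (assumed to sit next to this script).
require_once __DIR__ . '/vendor/autoload.php';

// Import the class with a use statement instead of writing its full name.
use HelgeSverre\ReceiptScanner\ModelNames;

// The constant resolves to the model name string from the ModelNames table.
echo ModelNames::GPT4, PHP_EOL;
```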

Example:
If you use only one package, a project is not needed. If you use more than one package, however, you cannot import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, since an application normally needs more than one library.
Some PHP packages are not free to download and are therefore hosted in private repositories. Accessing such packages requires credentials. If a package comes from a private repository, insert the credentials in the auth.json textarea.

  • Some hosting environments cannot be accessed through a terminal or SSH, so Composer cannot be used there.
  • Using Composer is sometimes complicated, especially for beginners.
  • Composer needs a lot of resources, which are sometimes not available on simple web hosting.
  • If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
  • Simplify your Composer build process. Use our own command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package receipt-scanner

Need more flexibility? Try the Extractor package instead, an AI-powered data extraction library for Laravel.

AI-Powered Receipt and Invoice Scanner for Laravel


Easily extract structured receipt data from images, PDFs, and emails within your Laravel application using OpenAI.

Features

Installation

Install the package via Composer by running composer require helgesverre/receipt-scanner.

Publish the config file:

All the configuration options are documented in the configuration file.

Since this package uses the OpenAI Laravel package, you also need to publish its config file and add OPENAI_API_KEY to your .env file.

Usage

Extracting receipt data from Plain Text

Plain text scanning is useful when you already have the textual representation of a receipt or invoice.

The example is from a Paddle.com receipt email; I copied all the text from the email and removed the empty lines.
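A minimal sketch of a plain-text scan, meant to run inside a Laravel application where the package is installed. The receipt text is a short made-up stand-in rather than the Paddle.com email, and the facade namespace HelgeSverre\ReceiptScanner\Facades\ReceiptScanner is an assumption to check against the installed package.

```php
<?php

// Assumption: the facade lives in the package's Facades namespace.
use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner;

// Made-up receipt text, used only for illustration.
$text = <<<'RECEIPT'
Example Store AS
Order: 12345
Date: 2023-11-01
1 x Annual subscription  100.00 USD
Tax  25.00 USD
Total  125.00 USD
RECEIPT;

// By default, scan() returns the receipt DTO.
$receipt = ReceiptScanner::scan($text);

// toArray() converts the DTO into a plain array for inspection.
print_r($receipt->toArray());
```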

Extracting data from other formats

After loading, you can pass the TextContent or the plain text (which can be retrieved by calling ->toString()) into the ReceiptScanner::scan() method.

Receipt Data Model

The scanned receipt is parsed into a DTO consisting of a main Receipt class, which contains the receipt metadata; a Merchant DTO, representing the seller on the receipt or invoice; and an array of LineItem DTOs, one for each line item.

The DTO has a toArray() method, which will result in a structure like this:

For flexibility, all fields are nullable.
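Because every field may be null, code that consumes the result should not assume a value is present. The sketch below works on the array form and falls back to defaults with the null coalescing operator. The key names (totalAmount, currency, merchant, name, lineItems) are assumptions made for illustration, not taken from this page.

```php
<?php

/**
 * Build a one-line summary from a receipt array without failing on nulls.
 * The key names are assumed for illustration.
 */
function summarizeReceipt(array $receipt): string
{
    // ?? also covers the case where 'merchant' itself is null or missing.
    $merchant = $receipt['merchant']['name'] ?? 'unknown merchant';
    $total    = $receipt['totalAmount'] ?? null;
    $currency = $receipt['currency'] ?? '';
    $items    = $receipt['lineItems'] ?? [];

    $amount = $total === null
        ? 'unknown total'
        : trim(number_format((float) $total, 2, '.', '') . ' ' . $currency);

    return sprintf('%s: %s (%d line item(s))', $merchant, $amount, count($items));
}

echo summarizeReceipt([
    'totalAmount' => 125.0,
    'currency'    => 'USD',
    'merchant'    => ['name' => 'Example Store AS'],
    'lineItems'   => [['text' => 'Annual subscription', 'qty' => 1, 'price' => 100.0]],
]), PHP_EOL;
```

A receipt where the model found nothing, such as ['merchant' => null, 'totalAmount' => null, 'lineItems' => null], still produces a summary instead of an error.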

Returning an Array instead of a DTO

If you prefer to work with an array instead of the built-in DTO, you can specify asArray: true when calling scan().
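A short sketch of that call, for use inside a Laravel application with the package installed. The facade namespace is an assumption and the input text is made up.

```php
<?php

use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner; // assumed namespace

$text = 'Example Store AS, Order 12345, Total 125.00 USD'; // made-up input

// asArray: true returns a plain array instead of the Receipt DTO.
$data = ReceiptScanner::scan($text, asArray: true);

var_dump(is_array($data));
```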

Specifying the model

To use a different model, pass its name with the model named argument when calling the scan() method.
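A sketch of both ways to name the model, using a ModelNames constant or the equivalent string. The facade namespace is an assumption and the input text is made up; note that each call sends its own request to OpenAI.

```php
<?php

use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner; // assumed namespace
use HelgeSverre\ReceiptScanner\ModelNames;

$text = 'Example Store AS, Order 12345, Total 125.00 USD'; // made-up input

// Using the convenience constant...
$receipt = ReceiptScanner::scan($text, model: ModelNames::GPT4_1106_PREVIEW);

// ...or the same model name as a plain string.
$receipt = ReceiptScanner::scan($text, model: 'gpt-4-1106-preview');
```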

All parameters and what they do

$text (TextContent|string)

The input text from the receipt or invoice that needs to be parsed. It accepts either a TextContent object or a string.

$model (string)

This parameter specifies the OpenAI model used for the extraction process.

HelgeSverre\ReceiptScanner\ModelNames is a class containing constants for each model, provided for convenience. However, you can also directly use a string to specify the model if you prefer.

Different models have different speed/accuracy characteristics.

If you require high accuracy, use a GPT-4 model; if you need speed, use a GPT-3.5 model; if you need even more speed, use the gpt-3.5-turbo-instruct model.

The default model is ModelNames::TURBO_INSTRUCT.

ModelNames constant              Value
ModelNames::TURBO                gpt-3.5-turbo
ModelNames::TURBO_INSTRUCT       gpt-3.5-turbo-instruct
ModelNames::TURBO_1106           gpt-3.5-turbo-1106
ModelNames::TURBO_16K            gpt-3.5-turbo-16k
ModelNames::TURBO_0613           gpt-3.5-turbo-0613
ModelNames::TURBO_16K_0613       gpt-3.5-turbo-16k-0613
ModelNames::TURBO_0301           gpt-3.5-turbo-0301
ModelNames::GPT4                 gpt-4
ModelNames::GPT4_32K             gpt-4-32k
ModelNames::GPT4_32K_0613        gpt-4-32k-0613
ModelNames::GPT4_1106_PREVIEW    gpt-4-1106-preview
ModelNames::GPT4_0314            gpt-4-0314
ModelNames::GPT4_32K_0314        gpt-4-32k-0314
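The speed/accuracy guidance for $model can be sketched as a small lookup. The priority labels are hypothetical; the model names are the string values from the table, so the result can be passed directly as the model argument.

```php
<?php

/**
 * Pick a model name for a given priority.
 * The priority labels are hypothetical; the model names come from the table.
 */
function modelForPriority(string $priority): string
{
    return match ($priority) {
        'accuracy' => 'gpt-4',
        'speed'    => 'gpt-3.5-turbo',
        default    => 'gpt-3.5-turbo-instruct', // fastest option, also the package default
    };
}

echo modelForPriority('accuracy'), PHP_EOL;
```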

$maxTokens (int)

The maximum number of tokens that the model will process. The default value is 2000; adjusting it may be necessary for very long text, but 2000 is usually sufficient.

$temperature (float)

Controls the randomness/creativity of the model's output.

A higher value (e.g., 0.8) makes the output more random, which is usually not what we want in this scenario. I usually go with 0.1 or 0.2; anything over 0.5 becomes useless. Defaults to 0.1.

$template (string)

This parameter specifies the template used for the prompt.

The default template is 'receipt'. You can create and use additional templates by adding new Blade files in the resources/views/vendor/receipt-scanner/ directory and specifying the file name (without extension) as the $template value (e.g. "minimal_invoice").
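To make "file name without extension" concrete, here is a small sketch that derives the $template value from a Blade file path. The helper name templateNameFromFile is hypothetical and not part of the package.

```php
<?php

/**
 * Derive the $template value from a Blade file path by dropping
 * the directory and the ".blade.php" extension.
 */
function templateNameFromFile(string $path): string
{
    return basename($path, '.blade.php');
}

echo templateNameFromFile('resources/views/vendor/receipt-scanner/minimal_invoice.blade.php'), PHP_EOL;
```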

$asArray (bool)

If true, returns the response from the AI model as an array instead of a DTO. This is useful if you need more or fewer fields than the default DTO provides, or want to convert the response into your own DTO. Defaults to false.

Example Usage:
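A sketch of a call that spells out every parameter described above with its documented default. The facade namespace is an assumption and the input text is made up; the argument names follow the parameter list.

```php
<?php

use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner; // assumed namespace
use HelgeSverre\ReceiptScanner\ModelNames;

$text = 'Example Store AS, Order 12345, Total 125.00 USD'; // made-up input

$receipt = ReceiptScanner::scan(
    text: $text,
    model: ModelNames::TURBO_INSTRUCT, // the default model
    maxTokens: 2000,                   // the default token limit
    temperature: 0.1,                  // the default; low values keep output stable
    template: 'receipt',               // the default prompt template
    asArray: false,                    // return the DTO rather than an array
);
```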

List of supported models

Enum value        Model name                Endpoint
TURBO_INSTRUCT    gpt-3.5-turbo-instruct    Completion
TURBO_16K         gpt-3.5-turbo-16k         Chat
TURBO             gpt-3.5-turbo             Chat
GPT4              gpt-4                     Chat
GPT4_32K          gpt-4-32k                 Chat

OCR Configuration with AWS Textract

To use AWS Textract for extracting text from large images and multi-page PDFs, the package needs to upload the file to S3 and pass the S3 object location along to the Textract service.

So you need to configure your AWS Credentials in the config/receipt-scanner.php file as follows:

You also need to configure a separate Textract disk where the files will be stored. Open your config/filesystems.php configuration file and add the following:
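A sketch of such a disk entry for the disks array in config/filesystems.php. The driver, key, secret, region, and bucket options are Laravel's standard S3 disk settings; the disk name textract and the four environment variable names are assumptions chosen for illustration.

```php
// config/filesystems.php, inside the 'disks' array
'textract' => [
    'driver' => 's3',
    'key'    => env('TEXTRACT_KEY'),    // hypothetical variable name
    'secret' => env('TEXTRACT_SECRET'), // hypothetical variable name
    'region' => env('TEXTRACT_REGION'), // hypothetical variable name
    'bucket' => env('TEXTRACT_BUCKET'), // hypothetical variable name
],
```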

Ensure the textract_disk setting in config/receipt-scanner.php matches your disk name in config/filesystems.php. You can change it with the TEXTRACT_DISK value in .env.

.env

Note

Textract is not available in all regions:

Q: In which AWS regions is Amazon Textract available? Amazon Textract is currently available in the US East (Northern Virginia), US East (Ohio), US West (Oregon), US West (N. California), AWS GovCloud (US-West), AWS GovCloud (US-East), Canada (Central), EU (Ireland), EU (London), EU (Frankfurt), EU (Paris), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Seoul), and Asia Pacific (Mumbai) Regions.

See: https://aws.amazon.com/textract/faqs/

Publishing Prompts

You may publish the prompt file that is used under the hood by running this command:

This package simply uses Blade files as prompts. The {{ $context }} variable will be replaced by the text you pass to ReceiptScanner::scan("text here").
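To illustrate the substitution, the sketch below uses plain string replacement as a stand-in for Blade rendering. It is not how the package or Blade works internally (Blade compiles the template and HTML-escapes {{ }} output), and both the prompt wording and the helper name fillPrompt are made up.

```php
<?php

/**
 * Illustrative stand-in for Blade rendering: replace the {{ $context }}
 * placeholder in a prompt template with the receipt text.
 */
function fillPrompt(string $template, string $context): string
{
    return str_replace('{{ $context }}', $context, $template);
}

// Nowdoc syntax keeps "$context" literal instead of interpolating it.
$template = <<<'PROMPT'
Extract the receipt fields from the text below.

{{ $context }}
PROMPT;

echo fillPrompt($template, 'Example Store AS, Total 125.00 USD'), PHP_EOL;
```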

Adding prompts/templates

By default, the package uses the receipt.blade.php file as its prompt template. You may add additional templates by creating a Blade file in resources/views/vendor/receipt-scanner/ (for example minimal_invoice.blade.php) and changing the $template parameter when calling scan().

Example prompt:

License

This package is licensed under the MIT License. For more details, refer to the License File.


All versions of receipt-scanner with dependencies

Required package                  Version
php                               ^8.1|^8.2
ext-zip                           *
aws/aws-sdk-php                   ^3.281
illuminate/contracts              ^10.0|^11.0
jstewmc/rtf                       ^0.5.2
league/flysystem-aws-s3-v3        ^3.16
openai-php/laravel                ^v0.8.1
prinsfrank/standards              ^2.1
smalot/pdfparser                  *
spatie/laravel-package-tools      ^1.14.0
symfony/dom-crawler               ^6.3