Download the PHP package batnieluyo/receipt-scanner without Composer

On this page you can find all versions of the php package batnieluyo/receipt-scanner. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.

Vendor: batnieluyo
Package: receipt-scanner
Short description: Use OpenAI to extract structured receipt and invoice data from text, HTML, images and PDFs.
License: MIT
Homepage: https://github.com/batnieluyo/receipt-scanner
Keywords: scanner, receipt, laravel-package

FAQ

After the download, add a single include: require_once('vendor/autoload.php'). After that, import the classes you need with use statements.

Example:
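A minimal sketch of that include and import, assuming the downloaded vendor folder sits next to the script. The Receipt class name is taken from the data model section of this page; the file layout is an assumption.

```php
<?php

// Import a class from the package (class name as listed in the data model section).
use TheAi\ReceiptScanner\Data\Receipt;

// Assumed location: the downloaded "vendor" folder sits next to this script.
$autoload = __DIR__ . '/vendor/autoload.php';

if (is_file($autoload)) {
    // Registers the autoloader for every package in the vendor folder.
    require_once $autoload;
}

// class_exists() triggers autoloading, so this reports whether the import resolves.
$available = class_exists(Receipt::class);

echo $available
    ? "Receipt class loaded\n"
    : "Package not found, check the vendor path\n";
```

The is_file() guard only keeps the sketch from failing when the path is wrong; in a real script you would normally call require_once unconditionally.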

If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, since an application normally needs more than one library.

Some PHP packages are not free to download and are therefore hosted in private repositories. Credentials are needed to access such packages. If a package comes from a private repository, insert the credentials in the auth.json textarea.

Some hosting environments cannot be accessed by terminal or SSH, so Composer cannot be used there.
Composer can be complicated to use, especially for beginners.
Composer needs a lot of resources, which are sometimes not available on simple webspace.
If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
Simplify your Composer build process: use our command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Example code of batnieluyo/receipt-scanner

Information about the package receipt-scanner

Need more flexibility? Try the Extractor package instead, an AI-powered data extraction library for Laravel.

AI-Powered Receipt and Invoice Scanner for Laravel

Easily extract structured receipt data from images, PDFs, and emails within your Laravel application using OpenAI.

Features

Light wrapper around OpenAI Chat and Completion endpoints.
Accepts text as input and returns structured receipt information.
Includes a well-tuned prompt for parsing receipts.
Supports various input formats including Plain Text, PDF, Images, Word documents, and Web content.
Integrates with Textract for OCR functionality.

Installation

Install the package via Composer by running composer require batnieluyo/receipt-scanner.

Publish the config file:

All the configuration options are documented in the configuration file.

Because this package uses the OpenAI Laravel package, you also need to publish its config and add OPENAI_API_KEY to your .env file:

Usage

Extracting receipt data from Plain Text

Plain text scanning is useful when you already have the textual representation of a receipt or invoice.

The example is from a Paddle.com receipt email, where I copied all the text in the email, and removed all the empty lines.
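A minimal sketch of a plain-text scan follows. It assumes the facade lives at TheAi\ReceiptScanner\Facades\ReceiptScanner (inferred from the namespace of the data classes, not confirmed on this page) and that the code runs inside a booted Laravel application. The receipt text is invented for illustration; it is not the Paddle.com receipt mentioned above.

```php
<?php

// Assumed facade namespace, inferred from TheAi\ReceiptScanner\Data\*.
use TheAi\ReceiptScanner\Facades\ReceiptScanner;

// Invented sample text, with empty lines already removed.
$text = <<<RECEIPT
Example Store Ltd
Order reference: 12345
Date: 2024-01-15
1 x Widget 10.00
Tax 2.50
Total USD 12.50
RECEIPT;

$receipt = null;

// Guarded so the snippet does nothing when the package is not installed.
if (class_exists(ReceiptScanner::class)) {
    // Without further arguments, scan() returns a Receipt DTO.
    $receipt = ReceiptScanner::scan($text);
}
```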

Extracting data from other formats

After loading, you can pass the TextContent or the plain text (which can be retrieved by calling ->toString()) into the ReceiptScanner::scan() method.

Receipt Data Model

The scanned receipt is parsed into a DTO made up of a main Receipt class, which contains the receipt metadata; a Merchant DTO, which represents the seller on the receipt or invoice; and an array of LineItem DTOs, one for each individual line item.

TheAi\ReceiptScanner\Data\Receipt
TheAi\ReceiptScanner\Data\Merchant
TheAi\ReceiptScanner\Data\LineItem

The DTO has a toArray() method, which will result in a structure like this:

For flexibility, all fields are nullable.
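Because every field can be null, code that reads a scanned receipt should guard each access. The sketch below works on a plain array; its key names (totalAmount, currency, merchant, name, lineItems) are illustrative assumptions, since the actual structure is not reproduced on this page.

```php
<?php

/**
 * Builds a one-line summary from a receipt array, tolerating missing or null fields.
 * The key names are hypothetical placeholders, not the package's confirmed schema.
 */
function summarizeReceipt(array $receipt): string
{
    // The null coalescing operator covers both missing keys and null values.
    $merchant = $receipt['merchant']['name'] ?? 'unknown merchant';
    $currency = $receipt['currency'] ?? '';
    $total    = $receipt['totalAmount'] ?? null;
    $items    = $receipt['lineItems'] ?? [];

    $totalText = $total === null
        ? 'total unknown'
        : trim(sprintf('%s %.2f', $currency, $total));

    return sprintf('%s: %s, line items: %d', $merchant, $totalText, count($items));
}

// A fully populated (invented) receipt.
echo summarizeReceipt([
    'totalAmount' => 12.5,
    'currency'    => 'USD',
    'merchant'    => ['name' => 'Example Store Ltd'],
    'lineItems'   => [['text' => 'Widget', 'qty' => 1, 'price' => 10.0]],
]), "\n";

// A receipt where the model could not extract anything.
echo summarizeReceipt(['merchant' => null, 'lineItems' => null]), "\n";
```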

Returning an Array instead of a DTO

If you prefer to work with an array instead of the built-in DTO, you can pass asArray: true when calling scan().
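As a sketch, the call might look like this. The asArray argument name comes from the text above; the facade namespace and the sample input are assumptions.

```php
<?php

use TheAi\ReceiptScanner\Facades\ReceiptScanner; // assumed facade namespace

$text = "Example Store Ltd\nTotal USD 12.50"; // invented sample input

$data = null;

// Guarded so the snippet does nothing when the package is not installed.
if (class_exists(ReceiptScanner::class)) {
    // With asArray: true the result is a plain PHP array rather than a Receipt DTO.
    $data = ReceiptScanner::scan($text, asArray: true);
}
```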

Specifying the model

To use a different model, pass its name via the model named argument when calling scan().

All parameters and what they do

$text (TextContent|string)

The input text from the receipt or invoice that needs to be parsed. It accepts either a TextContent object or a string.

$model (string)

This parameter specifies the OpenAI model used for the extraction process.

TheAi\ReceiptScanner\ModelNames is a class containing constants for each model, provided for convenience. However, you can also directly use a string to specify the model if you prefer.

Different models have different speed/accuracy characteristics.

If you require high accuracy, use a GPT-4 model. If you need speed, use a GPT-3.5 model. If you need even more speed, use the gpt-3.5-turbo-instruct model.

The default model is ModelNames::TURBO_INSTRUCT.

`ModelNames` Constant	Value
`ModelNames::TURBO`	`gpt-3.5-turbo`
`ModelNames::TURBO_INSTRUCT`	`gpt-3.5-turbo-instruct`
`ModelNames::TURBO_1106`	`gpt-3.5-turbo-1106`
`ModelNames::TURBO_16K`	`gpt-3.5-turbo-16k`
`ModelNames::TURBO_0613`	`gpt-3.5-turbo-0613`
`ModelNames::TURBO_16K_0613`	`gpt-3.5-turbo-16k-0613`
`ModelNames::TURBO_0301`	`gpt-3.5-turbo-0301`
`ModelNames::GPT4`	`gpt-4`
`ModelNames::GPT4_32K`	`gpt-4-32k`
`ModelNames::GPT4_32K_0613`	`gpt-4-32k-0613`
`ModelNames::GPT4_1106_PREVIEW`	`gpt-4-1106-preview`
`ModelNames::GPT4_0314`	`gpt-4-0314`
`ModelNames::GPT4_32K_0314`	`gpt-4-32k-0314`

$maxTokens (int)

The maximum number of tokens that the model will process. The default value is 2000. Adjusting this value may be necessary for very long text, but 2000 is usually enough.

$temperature (float)

Controls the randomness/creativity of the model's output.

A higher value (e.g., 0.8) makes the output more random, which is usually not what we want in this scenario. I usually go with 0.1 or 0.2; anything over 0.5 becomes useless. Defaults to 0.1.

$template (string)

This parameter specifies the template used for the prompt.

The default template is 'receipt'. You can create and use additional templates by adding new Blade files in the resources/views/vendor/receipt-scanner/ directory and specifying the file name (without extension) as the $template value (e.g. "minimal_invoice").

$asArray (bool)

If true, returns the response from the AI model as an array instead of a DTO. This is useful if you need to modify the default DTO to have more or fewer fields, or want to convert the response into your own DTO. Defaults to false.

Example Usage:
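A sketch that passes every parameter described above as a named argument. The argument names follow the parameter list; the facade namespace and the sample text are assumptions, and the values shown are the documented defaults except for asArray.

```php
<?php

use TheAi\ReceiptScanner\Facades\ReceiptScanner; // assumed facade namespace

$text = "Example Store Ltd\nTotal USD 12.50"; // invented sample input

// Keys mirror the parameter names; values are the documented defaults,
// except asArray, which is switched on here.
$options = [
    'model'       => 'gpt-3.5-turbo-instruct', // value listed for ModelNames::TURBO_INSTRUCT
    'maxTokens'   => 2000,
    'temperature' => 0.1,
    'template'    => 'receipt',
    'asArray'     => true,
];

$result = null;

// Guarded so the snippet does nothing when the package is not installed.
if (class_exists(ReceiptScanner::class)) {
    // Unpacking a string-keyed array passes each entry as a named argument (PHP 8.0+).
    $result = ReceiptScanner::scan($text, ...$options);
}
```

Keeping the options in one array makes it easy to store them in your own config and override single values per call.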

List of supported models

Enum Value	Model name	Endpoint
TURBO_INSTRUCT	gpt-3.5-turbo-instruct	Completion
TURBO_16K	gpt-3.5-turbo-16k	Chat
TURBO	gpt-3.5-turbo	Chat
GPT4	gpt-4	Chat
GPT4_32K	gpt-4-32k	Chat

OCR Configuration with AWS Textract

To use AWS Textract for extracting text from large images and multi-page PDFs, the package needs to upload the file to S3 and pass the S3 object location along to the Textract service.

So you need to configure your AWS Credentials in the config/receipt-scanner.php file as follows:

You also need to configure a separate Textract disk where the files will be stored. Open your config/filesystems.php configuration file and add the following:
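A possible disk definition, as a sketch: the driver, key, secret, region and bucket options are Laravel's standard S3 disk settings, while the disk name textract and the TEXTRACT_KEY, TEXTRACT_SECRET, TEXTRACT_REGION and TEXTRACT_BUCKET environment variable names are assumptions chosen for illustration.

```php
<?php

// config/filesystems.php (excerpt); env() is Laravel's helper for reading .env values.
return [
    'disks' => [
        // Hypothetical disk name; it has to match textract_disk in config/receipt-scanner.php.
        'textract' => [
            'driver' => 's3',
            'key'    => env('TEXTRACT_KEY'),    // hypothetical variable names
            'secret' => env('TEXTRACT_SECRET'),
            'region' => env('TEXTRACT_REGION'),
            'bucket' => env('TEXTRACT_BUCKET'),
        ],
    ],
];
```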

Ensure the textract_disk setting in config/receipt-scanner.php matches your disk name in config/filesystems.php. You can change it with the .env value TEXTRACT_DISK.

.env

Note

Textract is not available in all regions:

Q: In which AWS regions is Amazon Textract available? Amazon Textract is currently available in the US East (Northern Virginia), US East (Ohio), US West (Oregon), US West (N. California), AWS GovCloud (US-West), AWS GovCloud (US-East), Canada (Central), EU (Ireland), EU (London), EU (Frankfurt), EU (Paris), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Seoul), and Asia Pacific (Mumbai) Regions.

See: https://aws.amazon.com/textract/faqs/

Publishing Prompts

You may publish the prompt file that is used under the hood by running this command:

This package simply uses Blade files as prompts. The {{ $context }} variable will be replaced by the text you pass to ReceiptScanner::scan("text here").

Adding prompts/templates

By default, the package uses the receipt.blade.php file as its prompt template. You can add further templates by creating a Blade file in the resources/views/vendor/receipt-scanner/ directory, for example resources/views/vendor/receipt-scanner/minimal_invoice.blade.php, and passing its name as the $template parameter when calling scan().

Example prompt:

License

This package is licensed under the MIT License. For more details, refer to the License File.

All versions of receipt-scanner with dependencies

Version 1.0.1 Release 23. Jul 2024

Download

Download latest version of receipt-scanner from vendor batnieluyo

Requires:
php: ^8.1|^8.2
ext-zip: *
aws/aws-sdk-php: ^3.281
illuminate/contracts: ^10.0|^11.0
jstewmc/rtf: ^0.5.2
league/flysystem-aws-s3-v3: ^3.16
openai-php/laravel: ^0.10.1
prinsfrank/standards: ^2.1
smalot/pdfparser: *
spatie/laravel-package-tools: ^1.14.0
symfony/dom-crawler: ^7.1.1

Composer command for our command line client (download client). This client runs in any environment; you don't need a specific PHP version. The first 20 API calls are free.
Standard Composer command
