- 06 Oct, 2017 40 commits
-
-
Jérome Perrin authored
-
Julien Muchembled authored
-
Tristan Cavelier authored
-
Tomáš Peterka authored
-
Tomáš Peterka authored
-
francois authored
-
francois authored
This commit contains a testing business template for the receipt recognition module. It tests the "Receipt" type update as well as OCR success and failure on a set of poor … This commit contains binary files that are test images.
-
francois authored
This commit contains the business template that takes a receipt image as a source, binarizes it, segments it, and applies OCR to it. It then extracts the meaning with regular expressions. The image must already be loaded in the image module before it can be read.
The business template contains:
* The receipt recognition module
* An extension containing the code that binarizes, crops and segments the image, then analyzes it
* A new type "Receipt" that contains a source image and the field holding the "total" value
* A portal skin folder containing the extension externalMethods as well as the conversion script that calls the recognition and updates the Receipt "total" field
Improvements (not limited to this list):
- Easier loading of pictures: directly from the receipt page.
- Easier loading of pictures 2: from a phone with an OfficeJS (or any renderJS) application?
- Detect when images are sideways and rotate them upright
- Better "boxing" and segmentation: some lines are deleted from the original image during segmentation when they are too close to others
- Modify the neural network (LSTM) to increase the weight of signs like $, euro, / and numbers
- Use a faster/smaller neural network: most of the time is lost loading the neural network
- Cache the neural network: see previous statement.
- Extract currency, date and receipt issuer.
- Use a neural network for the meaning extraction?
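The meaning-extraction step described above (pulling the "total" out of OCR text with a regular expression) could look roughly like the following. This is only a sketch: the pattern, the `extract_total` name and the sample receipt text are assumptions made for illustration, not the code shipped in the business template.

```python
import re
from decimal import Decimal

# Hypothetical pattern: the standalone word "total", then any non-digit
# characters (currency label, spaces), then an amount such as 12.34 or 12,34.
TOTAL_PATTERN = re.compile(r"\btotal\b\D*?(\d+[.,]\d{2})\b", re.IGNORECASE)


def extract_total(ocr_text):
    """Return the last amount that follows the word "total", or None."""
    matches = TOTAL_PATTERN.findall(ocr_text)
    if not matches:
        return None
    # Normalize a decimal comma to a decimal point before parsing.
    return Decimal(matches[-1].replace(",", "."))


# Made-up OCR output, for illustration only.
sample_text = "BREAD 2.50\nMILK 1,20\nSUBTOTAL 3.70\nTOTAL EUR 3,70\n"
total = extract_total(sample_text)
```

The word boundary before "total" keeps "SUBTOTAL" from matching, and taking the last match reflects the assumption that the grand total is printed near the bottom of a receipt.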
-
Vincent Pelletier authored
Avoid iterating over all columns known to the catalog and then restricting to a single table: use the SQLCatalog API instead. Only check for one range column suffix, as the code relies anyway on the triplet of columns being consistently present. Document this in the code and get rid of the now-unneeded range_column_set mechanism.
-
Julien Muchembled authored
-
Julien Muchembled authored
str.encode() first performs an implicit conversion to unicode using sys.getdefaultencoding(), which is usually 'ascii'. The 'isort' module changes the default encoding to utf-8, leading to UnicodeEncodeError instead of UnicodeDecodeError. Let's simplify all this.
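The behaviour described above is specific to Python 2. The sketch below emulates it so that it runs on Python 3: `py2_style_encode` is a hypothetical helper that makes the implicit decode step explicit, and `default_encoding` stands in for `sys.getdefaultencoding()`.

```python
def py2_style_encode(raw, target_encoding, default_encoding="ascii"):
    """Emulate Python 2 str.encode(): implicit decode, then the requested encode."""
    text = raw.decode(default_encoding)  # the implicit conversion to unicode
    return text.encode(target_encoding)


def failure_name(raw, target_encoding, default_encoding):
    """Return the name of the exception raised by the emulated call, or None."""
    try:
        py2_style_encode(raw, target_encoding, default_encoding)
    except UnicodeError as exc:
        return type(exc).__name__
    return None


raw = u"\u00e9".encode("utf-8")  # a non-ASCII byte string

# Default encoding 'ascii': the implicit decode step fails.
with_ascii_default = failure_name(raw, "ascii", "ascii")
# Default encoding 'utf-8': the decode succeeds and the encode step fails.
with_utf8_default = failure_name(raw, "ascii", "utf-8")
```

With an 'ascii' default the failure is a `UnicodeDecodeError`; once the default is 'utf-8' the same call fails with a `UnicodeEncodeError`, which is why code that catches only one of the two can break.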
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Vincent Bechu authored
-
Vincent Bechu authored
-
Ayush Tiwari authored
-
Ayush Tiwari authored
Loosen the security on the methods that will be used in the ERP5 catalog view in a restricted environment.
-
Ayush Tiwari authored
-
Ayush Tiwari authored
-
Ayush Tiwari authored
-
Ayush Tiwari authored
Required as preparation for migrating to the ERP5 catalog, which itself has an accessor `isIndexable` that we plan to keep disabled all the time. It is therefore better to rely directly on isIndexable from the ERP5 site object.
-
Ayush Tiwari authored
This would also remove the need of evaluating the globals every time.
-
Julien Muchembled authored
-
Julien Muchembled authored
/reviewed-on nexedi/erp5!314
-
Julien Muchembled authored
-
Vincent Bechu authored
-
Vincent Bechu authored
-
Vincent Bechu authored
-
Vincent Bechu authored
-
Vincent Bechu authored
-
Nicolas Wavrant authored
-
Nicolas Wavrant authored
on Organisation and Person property sheets
-
Nicolas Wavrant authored
-
Jérome Perrin authored
-
Jérome Perrin authored
-
Jérome Perrin authored
-
Jérome Perrin authored
correct case is *validated*
-
Jérome Perrin authored
- get tools from portal, not context, to prevent slow acquisition
- stop using abbreviated names (ctool, stool etc)
-
Jérome Perrin authored
-
Jérome Perrin authored
This is not public API and is not supposed to be used outside of the catalog tool. Using getObject here is also wrong because it does not apply security and does not check the type of the provided argument. To prevent retrieving too many documents, we usually pass limit=2 for safety. To detect inconsistencies, when only one result is expected we rely on `brain, = portal_catalog(limit=2)` style unpacking to fail if more than one result was found.
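The `limit=2` unpacking idiom can be illustrated without a running ERP5 instance. In this sketch, `fake_catalog_search` and the sample rows are stand-ins invented for the example; only the `result, = search(limit=2)` pattern comes from the commit message.

```python
def fake_catalog_search(rows, limit=None, **criteria):
    """Hypothetical stand-in for a catalog query: filter rows, then apply the limit."""
    matches = [
        row for row in rows
        if all(row.get(key) == value for key, value in criteria.items())
    ]
    return matches[:limit]


def get_unique(rows, **criteria):
    """Return the single matching row, raising ValueError otherwise."""
    # limit=2 is enough to tell "exactly one" from "more than one"
    # without fetching every duplicate.
    result, = fake_catalog_search(rows, limit=2, **criteria)
    return result


rows = [
    {"uid": 1, "reference": "A"},
    {"uid": 2, "reference": "B"},
    {"uid": 3, "reference": "B"},
]

unique_row = get_unique(rows, reference="A")

try:
    get_unique(rows, reference="B")  # two rows share this reference
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```

The single-element unpacking raises `ValueError` both when nothing matches and when two rows match, so an inconsistency surfaces immediately instead of being hidden by taking the first result.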
-