Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Support
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
X
xfw
Project overview
Project overview
Details
Activity
Releases
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Issues
0
Issues
0
List
Boards
Labels
Milestones
Merge Requests
0
Merge Requests
0
Analytics
Analytics
Repository
Value Stream
Wiki
Wiki
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Create a new issue
Commits
Issue Boards
Open sidebar
nexedi
xfw
Commits
570e15a7
Commit
570e15a7
authored
Dec 26, 2011
by
Vincent Pelletier
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Initial commit.
parents
Changes
3
Expand all
Hide whitespace changes
Inline
Side-by-side
Showing
3 changed files
with
186 additions
and
0 deletions
+186
-0
README
README
+166
-0
setup.py
setup.py
+20
-0
xfw.py
xfw.py
+0
-0
No files found.
README
0 → 100644
View file @
570e15a7
xfw is an eXtensible Fixed-Width file handling module.
Features
========
- field types (integers, strings, dates) are declared independently of
file structure, and can be extended through subclassing. (BaseField
subclasses)
- multi-field structure declaration (FieldList class)
- non-homogeneous file file structure declaration (FieldListFile)
- checksum/hash computation helpers (ChecksumedFile succlasses)
- does not depend on line notion (file may not contain CR/LF chars at all
between successive field sets)
Missing features / bugs
=======================
- string trucating is multi-byte (UTF-8, ...) agnostic, and will mindless cut
in the middle of any entity
- proper interface declaration
- fields (IntegerField, DateTimeField) should cast by default when parsing
- FieldList total length should be made optional, and only used to
auto-generate annonymous padding at record end when longer that the sum of
individual fields lengths.
Example
=======
Dislaimer: give file format is purely hypothetical, does not some from any spec
I know of, should not be taken as a guideline but just as a showcase of xfw
capabilities.
Let's assume a file composed of a general header, containing some
constant-value 5-char identifier, a 3-char integer giving the number of records
contained, and an optional 20-char comment. It is followed by records, composed
of a header itself composed of a date (YYYYMMDD), a row type (2-char integer)
and number of rows (2-char integer), and followed by rows. Row types all start
with a time (HHMMSS), followed by fields which depend on row type:
- type 1: a 10-char string
- type 2: a 2-char integer, 8 chars of padding, a 1-char integer
To run the following code as a doctest, run:
python -m doctest README
Declare all file structures:
>>> import xfw
>>> ROOT_HEADER = xfw.FieldList([
... (xfw.StringField(5), True, 'header_id'),
... (xfw.IntegerField(3, cast=True), True, 'block_count'),
... (xfw.StringField(15), False, 'comment'),
... ], 23, fixed_value_dict={
... 'header_id': 'HEAD1',
... })
>>> BLOCK_HEADER = xfw.FieldList([
... (xfw.DateTimeField('%Y%m%d', cast=True), True, 'date'),
... (xfw.IntegerField(2, cast=True), True, 'row_type'),
... (xfw.IntegerField(2, cast=True), True, 'row_count'),
... ], 12)
>>> ROW_BASE = xfw.FieldList([
... (xfw.DateTimeField('%H%M%S', cast=True), True, 'time'),
... ], 6)
>>> ROW_TYPE_DICT = {
... 1: xfw.FieldList([
... ROW_BASE,
... (xfw.StringField(10), True, 'description'),
... ], 16),
... 2: xfw.FieldList([
... ROW_BASE,
... (xfw.IntegerField(2, cast=True), True, 'some_value'),
... (xfw.StringField(8), False, None), # annonymous padding
... (xfw.IntegerField(1, cast=True), True, 'another_value'),
... ], 17),
... }
>>> def blockCallback(head, item_list=None):
... if item_list is None:
... row_count = head['row_count']
... else:
... row_count = len(item_list)
... return row_count, ROW_TYPE_DICT[head['row_type']]
>>> FILE_STRUCTURE = xfw.ConstItemTypeFile(
... ROOT_HEADER,
... 'block_count',
... xfw.FieldListFile(
... BLOCK_HEADER,
... blockCallback,
... separator='\n',
... ),
... separator='\n',
... )
Parse sample file through a hash helper wrapper (SHA1):
>>> from cStringIO import StringIO
>>> sample_file = StringIO(
... 'HEAD1002blah \n'
... '201112260101\n'
... '115500other str \n'
... '201112260201\n'
... '11550099 8'
... )
>>> from datetime import datetime
>>> checksumed_wrapper = xfw.SHA1ChecksumedFile(sample_file)
>>> parsed_file = FILE_STRUCTURE.parseStream(checksumed_wrapper)
>>> parsed_file == \
... (
... {
... 'header_id': 'HEAD1',
... 'block_count': 2,
... 'comment': 'blah',
... },
... [
... (
... {
... 'date': datetime(2011, 12, 26, 0, 0),
... 'row_type': 1,
... 'row_count': 1,
... },
... [
... {
... 'time': datetime(1900, 1, 1, 11, 55),
... 'description': 'other str',
... },
... ]
... ),
... (
... {
... 'date': datetime(2011, 12, 26, 0, 0),
... 'row_type': 2,
... 'row_count': 1,
... },
... [
... {
... 'time': datetime(1900, 1, 1, 11, 55),
... 'some_value': 99,
... 'another_value': 8,
... },
... ]
... ),
... ],
... )
True
Verify SHA1 was properly accumulated:
>>> import hashlib
>>> hashlib.sha1(sample_file.getvalue()).hexdigest() == checksumed_wrapper.getHexDigest()
True
Generate a file from parsed data (as it was verified correct above):
>>> generated_stream = StringIO()
>>> FILE_STRUCTURE.generateStream(generated_stream, parsed_file)
>>> generated_stream.getvalue() == sample_file.getvalue()
True
setup.py
0 → 100644
View file @
570e15a7
from
distutils.core
import
setup
setup
(
name
=
'xfw'
,
version
=
'0.9'
,
description
=
'Extensible fixed-width file handling module'
,
keywords
=
'fixed width file'
,
author
=
'Nexedi'
,
author_email
=
'erp5-dev@erp5.org'
,
url
=
'http://git.erp5.org/gitweb/xfw.git'
,
license
=
'GPL 2+'
,
platforms
=
[
"any"
],
py_modules
=
[
'xfw'
],
classifiers
=
[
'Intended Audience :: Developers'
,
'License :: OSI Approved :: GNU General Public License (GPL)'
,
'Operating System :: OS Independent'
,
],
)
xfw.py
0 → 100644
View file @
570e15a7
This diff is collapsed.
Click to expand it.
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment