nexedi / ebulk
Commit eab67c55, authored Nov 15, 2018 by roqueporchetto@gmail.com
New commands init and store-credentials
parent 6d052c97
Showing 15 changed files with 91 additions and 42 deletions
ebulk  +32 -11
ebulk-data/config/download-config_template.yml  +0 -2
ebulk-data/config/ingestion-config_template.yml  +1 -4
ebulk-data/config/ingestion-custom-config_template.yml  +1 -2
ebulk-data/config/ingestion-ftp-config_template.yml  +1 -2
ebulk-data/config/ingestion-http-config_template.yml  +1 -2
ebulk-data/config/ingestion-s3-config_template.yml  +1 -2
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/dataset_utils.rb  +36 -0
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/filelogger.rb  +1 -1
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/input/fif.rb  +5 -3
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/input/wendelin.rb  +2 -6
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/output/wendelin.rb  +5 -7
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/parser/binary.rb  +2 -0
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/wendelin_client.rb  +1 -0
ebulk-data/help.md  +2 -0
ebulk
@@ -9,6 +9,7 @@ DATASET_REPORT_FILE_NAME="/.dataset-task-report"
 DATASET_COMPLETE_FILE_NAME="/.dataset-completed"
 DISCARD_CHANGES_FILE_NAME="/.discard-changes"
 LOG_DIR="$EBULK_DATA_PATH/logs"
+CREDENTIALS_FILE="$EBULK_DATA_PATH/.credentials"
 TOOL_PATH="$(dirname "$0")/ebulk-data"
 DOWN_FILE="$EBULK_DATA_PATH/download-config.yml"
 DOWN_TEMPLATE_FILE="$TOOL_PATH/config/download-config_template.yml"
@@ -125,11 +126,11 @@ function checkParameters {
     fi
 }
 
-function askCredentials {
+function storeCredentials {
     echo
     echo "Please, enter your ebulk user and password:"
     echo
-    echo "** You can use the same Telecom-Wendelin-IA site user and password **"
+    echo "** You can use your Telecom-Wendelin-IA credentials **"
     echo "** If you don't have a user, please feel free to request one to roqueporchetto@gmail.com **"
     echo
     echo "User:"
@@ -142,6 +143,8 @@ function askCredentials {
         echo >&2; return 1
     fi
     PASS= read -s -e -p Password: pwd
+    echo
+    echo "$USER;$pwd" > "$CREDENTIALS_FILE" 2>/dev/null
 }
 
 function updateConfigFile {
@@ -176,10 +179,9 @@ function updateConfigFile {
         $PARAMETER_FUNCTION
     fi
 
-    TOOL_DIR=\"$LOG_DIR\"
+    TOOL_DIR=\"$EBULK_DATA_PATH\"
     DATA_SET=\"$DATA_SET\"
     USER=\"$USER\"
-    pwd=\"$pwd\"
     CHUNK=\"$CHUNK\"
     DATASET_DIR=\"$DATASET_DIR\"
     DOWN_URL=\"$DOWN_URL\"
@@ -226,12 +228,6 @@ function runProcess {
             fi
         fi
     fi
-    if [ -z "$STATUS" ]; then
-        if ! askCredentials; then
-            return 1
-        fi
-    fi
-    echo
     updateConfigFile
     echo "[INFO] Starting operation..."
     if [ ! -d $LOG_DIR ]; then
@@ -516,7 +512,10 @@ while [ "$1" != "" ]; do
     -r | --readme )         less $TOOL_PATH/README.md
                             exit
                             ;;
-    status | push | pull )  OPERATION=$1
+    store-credentials )     storeCredentials
+                            exit
+                            ;;
+    status | push | pull | init ) OPERATION=$1
                             ;;
     add | remove | reset )  OPERATION=$1
                             shift
@@ -586,6 +585,28 @@ case $OPERATION in
         echo
         runProcess
     ;;
+    init)
+        EBULK_DATASET_FILE="$DATASET_DIR$EBULK_DATASET_FILE_NAME"
+        if [ -f "$EBULK_DATASET_FILE" ]; then
+            echo
+            echo -e "${ORANGE}[WARNING] The specified directory was already init as data set."
+            echo -e "[WARNING] Do you want to reset this directory as dataset? (Y/n)${NC}"
+            read -e OPTION
+            if [ "$OPTION" = "n" ]; then
+                exit
+            fi
+        fi
+        checkParameters
+        if [ ! $? -eq 0 ]; then
+            exit
+        fi
+        DATASET_REPORT_FILE="$DATASET_DIR$DATASET_REPORT_FILE_NAME"
+        DATASET_COMPLETE_FILE="$DATASET_DIR$DATASET_COMPLETE_FILE_NAME"
+        rm $DATASET_REPORT_FILE 2>/dev/null
+        rm $DATASET_COMPLETE_FILE 2>/dev/null
+        touch $DATASET_REPORT_FILE 2>/dev/null
+        touch $DATASET_COMPLETE_FILE 2>/dev/null
+    ;;
     pull)
         welcome
         FILE=$DOWN_FILE
ebulk-data/config/download-config_template.yml
@@ -4,8 +4,6 @@ exec:
 in:
   type: wendelin
   erp5_url: $DOWN_URL
-  user: $USER
-  password: $pwd
   data_set: $DATA_SET
   chunk_size: $CHUNK
   output_path: $DATASET_DIR
ebulk-data/config/ingestion-config_template.yml
@@ -8,16 +8,13 @@ in:
   data_set: $DATA_SET
   chunk_size: $CHUNK
   erp5_url: $DOWN_URL
-  user: $USER
-  password: $pwd
   tool_dir: $TOOL_DIR
   status: $STATUS
 
 out:
   type: wendelin
   erp5_url: $ING_URL
-  user: $USER
-  password: $pwd
   type_input: "filesystem"
   data_set: $DATA_SET
   erp5_base_url: $DOWN_URL
+  tool_dir: $TOOL_DIR
ebulk-data/config/ingestion-custom-config_template.yml
@@ -28,8 +28,7 @@ in:
 out:
   type: wendelin
   erp5_url: $ING_URL
-  user: $USER
-  password: $pwd
+  tool_dir: $TOOL_DIR
   data_set: $DATA_SET
   erp5_base_url: $DOWN_URL
ebulk-data/config/ingestion-ftp-config_template.yml
@@ -23,8 +23,7 @@ in:
 out:
   type: wendelin
   erp5_url: $ING_URL
-  user: $USER
-  password: $pwd
+  tool_dir: $TOOL_DIR
   data_set: $DATA_SET
   erp5_base_url: $DOWN_URL
ebulk-data/config/ingestion-http-config_template.yml
@@ -25,8 +25,7 @@ in:
 out:
   type: wendelin
   erp5_url: $ING_URL
-  user: $USER
-  password: $pwd
+  tool_dir: $TOOL_DIR
   data_set: $DATA_SET
   erp5_base_url: $DOWN_URL
ebulk-data/config/ingestion-s3-config_template.yml
@@ -29,8 +29,7 @@ in:
 out:
   type: wendelin
   erp5_url: $ING_URL
-  user: $USER
-  password: $pwd
+  tool_dir: $TOOL_DIR
   data_set: $DATA_SET
   erp5_base_url: $DOWN_URL
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/dataset_utils.rb
@@ -15,6 +15,7 @@ class DatasetUtils
   SPLIT_FILE = ".split-operation"
   SPLIT_CONTROL_FILE = ".control-split-operation"
   FIRST_INGESTION_FILE = ".first-ingestion"
+  CREDENTIALS_FILE = ".credentials"
 
   RUN_DONE = "done"
   RUN_ERROR = "error"
@@ -70,6 +71,41 @@ class DatasetUtils
     }.flatten.select{ |file| File.file?(file) }
   end
 
+  def getCredentials(tool_dir)
+    credential_path = appendSlashTo(tool_dir) + CREDENTIALS_FILE
+    if File.exist?(credential_path)
+      credentials = File.open(credential_path).read.chomp.split(RECORD_SEPARATOR)
+      user = credentials[0]
+      password = credentials[1]
+      @logger.info("Using stored credentials for user '#{user}'", print=TRUE)
+    else
+      puts
+      puts "Please, enter your ebulk user and password:"
+      puts
+      puts "** You can use your Telecom-Wendelin-IA credentials **"
+      puts "** If you don't have a user, please feel free to request one to roqueporchetto@gmail.com **"
+      puts
+      @logger.info("Remember that you can store your credentials for automatic authentication by running 'ebulk store-credentials'", print=TRUE)
+      puts
+      print "User:"
+      user = gets.chomp
+      if not /^[A-Za-z][-A-Za-z0-9_]*$/.match(user)
+        puts
+        @logger.error("Invalid user name. Did enter an invalid character?", print=TRUE)
+        @logger.info("Please enter a valid user name.", print=TRUE)
+        @logger.abortExecution(error=FALSE)
+      end
+      print "Password:"
+      password = STDIN.noecho(&:gets).chomp
+      puts
+      puts
+      @logger.info("Remember that you can store your credentials for automatic authentication by running 'ebulk store-credentials'", print=TRUE)
+      puts
+      sleep 1
+    end
+    return user, password
+  end
+
   def getLocalFiles(remove=nil)
     local_files = {}
     begin
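The stored-credentials branch of `getCredentials` above can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `read_stored_credentials` and `CREDENTIALS_SEPARATOR` are hypothetical names, and the `;` separator is an assumption taken from the `"$USER;$pwd"` line that the `ebulk` script writes (the value of `RECORD_SEPARATOR` is not shown in this diff). Unlike the diff, the sketch limits the split to two fields, so a `;` inside the password is not lost.

```ruby
require 'tmpdir'

# Assumption: ';' matches the "$USER;$pwd" line written by the ebulk script.
CREDENTIALS_SEPARATOR = ";"
CREDENTIALS_FILE = ".credentials"
# Same user-name pattern that getCredentials applies to interactive input.
USER_PATTERN = /^[A-Za-z][-A-Za-z0-9_]*$/

# Hypothetical helper: returns [user, password], or nil when nothing usable is stored.
def read_stored_credentials(tool_dir)
  path = File.join(tool_dir, CREDENTIALS_FILE)
  return nil unless File.exist?(path)
  # Limit the split to two fields so a ';' inside the password is preserved.
  user, password = File.read(path).chomp.split(CREDENTIALS_SEPARATOR, 2)
  return nil if user.nil? || password.nil? || !USER_PATTERN.match(user)
  [user, password]
end

# Usage: write a sample file in the assumed format, then read it back.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, CREDENTIALS_FILE), "alice;s3cr;et\n")
  p read_stored_credentials(dir)
end
```

Returning `nil` for a missing or malformed file lets a caller fall back to the interactive prompt, which is how the `else` branch in the diff behaves when no credentials file exists.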
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/filelogger.rb
@@ -14,7 +14,7 @@ class LogManager
   end
 
   def setFilename(tool_dir, prefix)
-    log_dir = "#{tool_dir}/"
+    log_dir = "#{tool_dir}/logs/"
     if not File.directory?(log_dir)
       Dir.mkdir log_dir
     end
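Since `setFilename` now derives the log directory from `tool_dir` itself, the directory creation could also be written with `FileUtils.mkdir_p`, which creates missing parent directories and does not raise when the directory already exists. The sketch below is only an alternative under that assumption; `ensure_log_dir` is a hypothetical helper, not part of `LogManager`.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper mirroring the log_dir computation in setFilename.
def ensure_log_dir(tool_dir)
  log_dir = File.join(tool_dir, "logs") + "/"
  # mkdir_p creates missing parents and is a no-op if the directory exists.
  FileUtils.mkdir_p(log_dir)
  log_dir
end

# Usage: calling it twice on the same directory is safe.
Dir.mktmpdir do |dir|
  puts ensure_log_dir(dir)
  puts ensure_log_dir(dir)
end
```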
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/input/fif.rb
@@ -79,7 +79,7 @@ module Embulk
           @logger = LogManager.instance()
           @logger.setFilename(tool_dir, "ingestion")
           task = { 'paths' => [] }
-          task['supplier'] = config.param('supplier', :string)
+          task['supplier'] = config.param('supplier', :string, default: "default")
           task['data_set'] = config.param('data_set', :string)
           task['chunk_size'] = config.param('chunk_size', :float, default: 0) * DatasetUtils::MEGA
           if task['chunk_size'] == 0
@@ -108,14 +108,16 @@ module Embulk
               end
             end
           end
+          if not @status
+            user, password = @dataset_utils.getCredentials(tool_dir)
+            @wendelin = WendelinClient.new(config.param('erp5_url', :string), user, password)
+          end
           @logger.info("Checking local files...", print=TRUE)
           task['paths'] = @dataset_utils.getLocalPaths(paths)
           if @status
             self.status(task)
             @logger.abortExecution(error=FALSE)
           end
-          @wendelin = WendelinClient.new(config.param('erp5_url', :string), config.param('user', :string), config.param('password', :string))
           @logger.info("Checking remote dataset...", print=TRUE)
           data_stream_dict = @wendelin.getDataStreams(task['data_set'])
           if data_stream_dict["status_code"] != 0
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/input/wendelin.rb
@@ -87,21 +87,15 @@ module Embulk
           @erp5_url = config.param('erp5_url', :string)
           @data_set = config.param('data_set', :string)
           @logger.info("Dataset name: #{@data_set}")
-          @user = config.param("user", :string, defualt: nil)
-          @logger.info("User: #{@user}")
-          @password = config.param("password", :string, default: nil)
           @chunk_size = config.param('chunk_size', :float, default: 0) * DatasetUtils::MEGA
           @output_path = config.param("output_path", :string, :default => nil)
           if not File.directory?(@output_path)
             @logger.error("Output directory not found.", print=TRUE)
             @logger.abortExecution()
           end
-          @wendelin = WendelinClient.new(@erp5_url, @user, @password)
           task = {
             'erp5_url' => @erp5_url,
             'data_set' => @data_set,
-            'user' => @user,
-            'password' => @password,
             'chunk_size' => @chunk_size,
             'output_path' => @output_path,
             'tool_dir' => @tool_dir
@@ -125,6 +119,8 @@ module Embulk
               @logger.abortExecution(error=FALSE)
             end
           end
+          task['user'], task['password'] = @dataset_utils.getCredentials(@tool_dir)
+          @wendelin = WendelinClient.new(@erp5_url, task['user'], task['password'])
           @logger.info("Getting remote file list from dataset '#{@data_set}'...", print=TRUE)
           data_stream_list = @wendelin.getDataStreams(@data_set)
           if data_stream_list["status_code"] == 0
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/output/wendelin.rb
@@ -11,18 +11,18 @@ module Embulk
       def self.transaction(config, schema, count, &control)
         task = {
           "erp5_url" => config.param("erp5_url", :string),
-          "user" => config.param("user", :string, defualt: nil),
-          "password" => config.param("password", :string, default: nil),
           "path_prefix" => config.param("path_prefix", :string, :default => nil),
           "type_input" => config.param("type_input", :string, :default => nil),
           "data_set" => config.param("data_set", :string, default: nil),
-          "erp5_base_url" => config.param("erp5_base_url", :string, default: nil)
+          "erp5_base_url" => config.param("erp5_base_url", :string, default: nil),
+          "tool_dir" => config.param('tool_dir', :string)
         }
         storage_ingestion = ! task["type_input"]
+        @dataset_utils = DatasetUtils.new(Dir.pwd)
+        task["user"], task["password"] = @dataset_utils.getCredentials(task["tool_dir"])
         task_reports = yield(task)
         next_config_diff = {}
         @logger = LogManager.instance()
-        @dataset_utils = DatasetUtils.new(Dir.pwd)
         if task_reports.length > 0
           @logger.info("Your ingested files will be available in the site in a few minutes. Thank for your patience.", print=TRUE)
           if storage_ingestion
@@ -50,10 +50,8 @@ module Embulk
       def init
         credentials = {}
         @erp5_url = task["erp5_url"]
-        @user = task["user"]
-        @password = task["password"]
         @logger = LogManager.instance()
-        @wendelin = WendelinClient.new(@erp5_url, @user, @password)
+        @wendelin = WendelinClient.new(@erp5_url, task["user"], task["password"])
       end
 
       def add(page)
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/parser/binary.rb
@@ -93,7 +93,9 @@ module Embulk
             @dataset_utils.deleteSplitOperationControlFile(reference)
             @logger.abortExecution()
           rescue Exception => e
+            puts
             @logger.error("An error occurred during file ingestion: " + e.to_s, print=TRUE)
+            puts
             @logger.error(e.backtrace)
             puts "[INFO] For more detailed information, please refer to the log file: " + @logger.getLogPath()
             @dataset_utils.deleteSplitOperationControlFile(reference)
ebulk-data/embulk-wendelin-dataset-tool/lib/embulk/wendelin_client.rb
@@ -8,6 +8,7 @@ require_relative 'filelogger'
 class WendelinClient
 
   HTTP_MESSAGE_401 = "Unauthorized access. Please check your user credentials and try again."
+  HTTP_MESSAGE_400 = "There was a problem with the http request. If the error persists, please contact the administrator."
   HTTP_MESSAGE_5XX = "Internal Server Error: if the error persists, please contact the administrator."
   HTTP_MESSAGE_OTHER = "Sorry, an error ocurred. If the error persists, please contact the administrator."
   HTTP_MESSAGE_NOT_2XX = "HTTP-NOT-OK"
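One way message constants like these could be selected from a response status is sketched below. The constants are copied from the diff; `message_for_status` is a hypothetical function, and the mapping is an assumption, since the part of `WendelinClient` that uses these constants is not shown in this commit.

```ruby
# Message constants as they appear in the diff above.
HTTP_MESSAGE_401 = "Unauthorized access. Please check your user credentials and try again."
HTTP_MESSAGE_400 = "There was a problem with the http request. If the error persists, please contact the administrator."
HTTP_MESSAGE_5XX = "Internal Server Error: if the error persists, please contact the administrator."
HTTP_MESSAGE_OTHER = "Sorry, an error ocurred. If the error persists, please contact the administrator."

# Hypothetical mapping from a non-2xx status code to a user-facing message.
def message_for_status(code)
  case code
  when 401 then HTTP_MESSAGE_401
  when 400 then HTTP_MESSAGE_400
  when 500..599 then HTTP_MESSAGE_5XX
  else HTTP_MESSAGE_OTHER
  end
end

# Usage: any 5xx code maps to the generic server-error message.
puts message_for_status(503)
```

The function is meant to be called only after the caller has already determined that the response is not a success; it does not special-case 2xx codes.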
ebulk-data/help.md
@@ -9,12 +9,14 @@ commands:
   pull [<dataset>]           Downloads the content of the target dataset from the site into the output location
   push [<dataset>]           Ingests the content of the input location into a target dataset on the site
   status [<dataset>]         Lists the local changes of target dataset
+  init [<dataset>]           Sets the directory as a dataset, so add and remove operations can be performed
   add <path>                 Marks new or modified files in path for ingestion
   remove <path>              Marks files in path for removal
   reset <path>               Resets marked files in path
   -h, --help                 Tool help
   -r, --readme               Opens README file
   -e, --examples             Shows some tool usage examples
+  store-credentials          Stores user and password for automatic authentication
 
 argument:
   dataset argument           Unique reference for the target dataset