The ebulk tool is a wrapper for Embulk, an open-source bulk data loader that helps transfer data between various databases, storage systems, file formats, and cloud services. It supports any kind of input file format, parallel and distributed execution to deal with big data sets, transaction control to guarantee all-or-nothing file transfer, and operation resuming. Ebulk is as easy to use as git, allowing big data transfers to be done with very few commands.
# BIG DATA SHARING PLATFORM
Combined with the Wendelin platform, ebulk forms an easy-to-use Data Lake for sharing petabytes of data grouped into data sets. This project offers a solution to the big data sharing problem by addressing the following key points:
- Huge transfers (over slow and unreliable networks)