Dataset Importer
Dataset Importer is a tool for uploading JSON Lines, CSV, and Excel files into Elasticsearch so that they become available to TEXTA Toolkit.
Creation
Parameters:
description - Free-text description used to tell this import task apart from others.
index - Name of the newly created index. Note that Elasticsearch index naming restrictions apply.
separator - Only needed for .csv files. Sets the field separator used in the CSV file; defaults to a comma (,).
file - File to import (JSON Lines, CSV, or Excel).
Note
- Because Elasticsearch restricts index names, the dataset name must follow the Elasticsearch index naming constraints.
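The naming restrictions can be checked locally before an import task is created. The following is a minimal sketch, assuming the rules published in the Elasticsearch documentation (lowercase only, no reserved characters, no leading `-`, `_` or `+`, not `.` or `..`, at most 255 bytes). The function name `is_valid_index_name` is hypothetical and is not part of TEXTA Toolkit; the server may apply additional checks of its own.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of TEXTA Toolkit): returns 0 if the given
# name satisfies the documented Elasticsearch index naming rules, 1 otherwise.
is_valid_index_name() {
  local name="$1"

  # Must be non-empty and must not be "." or "..".
  [ -n "$name" ] || return 1
  [ "$name" != "." ] && [ "$name" != ".." ] || return 1

  # Must not be longer than 255 bytes.
  (( $(printf '%s' "$name" | wc -c) <= 255 )) || return 1

  # Must not start with "-", "_" or "+".
  case "$name" in
    [-_+]*) return 1 ;;
  esac

  # Must be lowercase: lowercasing the name must not change it.
  [ "$name" = "$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')" ] || return 1

  # Must not contain \ / * ? " < > | space , # or :
  # (deleting those characters must leave the name unchanged).
  [ "$name" = "$(printf '%s' "$name" | tr -d '\\/*?"<>| ,#:')" ] || return 1

  return 0
}
```

For example, `is_valid_index_name "en_articles"` succeeds, while `is_valid_index_name "En Articles"` fails because of the uppercase letters and the space.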
GUI
Set a description for the import task
Set the dataset name (this becomes the index name).
Specify a separator for CSV files (usually a comma)
Browse and choose the file to be uploaded by clicking on the folder button.
Click the Create button to start the import task. Once the task completes, you can add the dataset to your project.
API
In a curl -F form field, the @ prefix tells curl to upload the contents of the named file rather than sending the file name as text.
curl -H "Authorization: Token 8229898dccf960714a9fa22662b214005aa2b049" \
-F "description=Articles" \
-F "index=en_articles" \
-F "file=@FILE_NAME.csv" \
http://localhost:8000/api/v1/projects/11/dataset_imports/
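For a CSV file that uses a different delimiter, the separator parameter described above can be sent as an additional form field. The following is a sketch under stated assumptions: the wrapper function `import_csv`, the `DRY_RUN` switch, and the `API_URL`, `PROJECT_ID` and `TOKEN` variables are hypothetical names introduced for illustration, and the token value is a placeholder. The text fields are sent with `--form-string` because curl's `-F` treats `;` inside a value as the start of extra options such as `;type=`.

```bash
#!/usr/bin/env bash

# Placeholders (assumptions): replace these with your own values.
API_URL="${API_URL:-http://localhost:8000/api/v1}"
PROJECT_ID="${PROJECT_ID:-11}"
TOKEN="${TOKEN:-YOUR_API_TOKEN}"

# Hypothetical wrapper around the dataset_imports endpoint.
# Usage: import_csv DESCRIPTION INDEX FILE [SEPARATOR]
# With DRY_RUN=1 the curl arguments are printed one per line instead of run.
import_csv() {
  local description="$1" index="$2" file="$3" separator="${4:-,}"

  # --form-string sends the value literally; -F with @ uploads the file contents.
  local cmd=(curl -H "Authorization: Token ${TOKEN}"
    --form-string "description=${description}"
    --form-string "index=${index}"
    --form-string "separator=${separator}"
    -F "file=@${file}"
    "${API_URL}/projects/${PROJECT_ID}/dataset_imports/")

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[@]}"
  else
    "${cmd[@]}"
  fi
}
```

Running `DRY_RUN=1 import_csv "Articles" "en_articles" "FILE_NAME.csv" ";"` prints the arguments that would be passed to curl without contacting the server; dropping `DRY_RUN=1` sends the request.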