#########
Embedding
#########
.. _create_embedding:
Create
*******
GUI
=====
Navigate to **Models -> Embeddings** and click the **CREATE** button in the top-left corner. Choose a name for your :ref:`embedding ` (*Description*).
Define the :ref:`query ` and select the indices on which the :ref:`query ` will be executed. If you leave *Query* empty, all documents from the selected indices are used.
If you have any searches defined in your :ref:`project `, they appear in a dropdown menu when you click on the *Query* field, so you can reuse existing searches as queries.
Choose the *fields* on which the :ref:`embedding ` will be trained. The selected fields should contain textual data.

.. note::

    It is recommended to use lemmatized or tokenized data. Lemmatization is especially useful with morphologically complex languages. You can tokenize and lemmatize the data with :ref:`MLP `.

The *Number of dimensions* field defines the length of the word vectors.
A value of 100-200 dimensions is usually a good starting point.
The *Minimum frequency* field sets the minimum number of times a word must occur in the data to be included in the :ref:`embedding `. If you are unsure which value to pick, keep the default value *5*.

.. note::

    The quality of the embedding depends on the size of the dataset. The larger the better.

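
To get a feel for what *Minimum frequency* does, the following sketch counts the words in a tiny made-up corpus and keeps only those that occur at least ``min_freq`` times. It illustrates the thresholding idea with standard Unix tools only; it is not how the vocabulary is actually built during training, and the sample corpus and variable names are invented for the example.

.. code-block:: bash

    # Hypothetical sample corpus: one tokenized document per line.
    corpus='the cat sat on the mat
    the dog sat on the log
    the cat saw the dog'
    min_freq=2

    # Put one word per line, count the occurrences of each word,
    # and keep only the words seen at least min_freq times.
    vocab=$(printf '%s\n' "$corpus" \
        | tr -s ' ' '\n' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk -v min="$min_freq" '$1 >= min {print $2}')

    printf '%s\n' "$vocab"

With ``min_freq=2``, the words ``mat``, ``log``, and ``saw`` occur only once each and are dropped, which is the same effect a higher *Minimum frequency* has on rare words in a real dataset.
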
API
===
Endpoint: **/projects/{project_pk}/embeddings/**
Example:

.. code-block:: bash

    # Train a new embedding on the selected index and field
    curl -X POST "http://localhost:8000/api/v1/projects/11/embeddings/" \
        -H "accept: application/json" \
        -H "Content-Type: application/json" \
        -H "Authorization: Token 8229898dccf960714a9fa22662b214005aa2b049" \
        -d '{
            "description": "My embedding",
            "indices": [{"name": "texta_test_index"}],
            "fields": ["comment_content_lemmas"],
            "num_dimensions": 100,
            "max_documents": 10000,
            "min_freq": 5
        }'

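
The request above hard-codes the host, the project ID, and the token. A minimal sketch of keeping these in shell variables is shown below. The variable names ``TEXTA_URL``, ``TEXTA_TOKEN``, and ``PROJECT_ID`` and the function ``create_embedding`` are assumptions chosen for illustration, not established conventions.

.. code-block:: bash

    # Hypothetical variable names; adjust the defaults to your own deployment.
    TEXTA_URL="${TEXTA_URL:-http://localhost:8000/api/v1}"
    TEXTA_TOKEN="${TEXTA_TOKEN:-replace-with-your-token}"
    PROJECT_ID="${PROJECT_ID:-11}"

    # Endpoint for creating and listing embeddings of one project
    endpoint="${TEXTA_URL}/projects/${PROJECT_ID}/embeddings/"

    # Same request body as in the example above
    payload='{
      "description": "My embedding",
      "indices": [{"name": "texta_test_index"}],
      "fields": ["comment_content_lemmas"],
      "num_dimensions": 100,
      "max_documents": 10000,
      "min_freq": 5
    }'

    # Sends the request only when the function is called explicitly.
    create_embedding() {
        curl -X POST "$endpoint" \
            -H "accept: application/json" \
            -H "Content-Type: application/json" \
            -H "Authorization: Token ${TEXTA_TOKEN}" \
            -d "$payload"
    }

    printf 'Endpoint: %s\n' "$endpoint"

Calling ``create_embedding`` then issues the same POST request, while the token can be supplied through the environment instead of being stored in a script.
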
View
*******
GUI
=====
Navigate to **Models -> Embeddings** to view your existing :ref:`embedding ` models.
If any of your embeddings is still training, the view shows the progress of the training (:numref:`embedding_view`).
Apart from that, the view shows general information about your embeddings.

.. _embedding_view:

.. figure:: images/embedding/embedding_view.png

    *Embedding view*

API
===
Endpoint: **/projects/{project_pk}/embeddings/**
Example:

.. code-block:: bash

    # List the embeddings of project 9
    curl -X GET "http://localhost:8000/api/v1/projects/9/embeddings/" \
        -H "Authorization: Token 8229898dccf960714a9fa22662b214005aa2b049"

Delete
*******
GUI
=====
Navigate to **Models -> Embeddings**, click on the three dots in the **Actions** column, and choose **Delete** (:numref:`embedding_actions`).

.. _embedding_actions:

.. figure:: images/embedding/embedding_actions.png

    *Embedding actions*

API
===
Endpoint: **/projects/{project_pk}/embeddings/{embedding_id}/**
Example:

.. code-block:: bash

    # Delete embedding 9 from project 9
    curl -X DELETE "http://localhost:8000/api/v1/projects/9/embeddings/9/" \
        -H "Authorization: Token 8229898dccf960714a9fa22662b214005aa2b049"

Edit
*******
GUI
=====
Navigate to **Models -> Embeddings**, click on the three dots in the **Actions** column, and choose **Edit** (:numref:`embedding_actions`).
API
===
Endpoint: **/projects/{project_pk}/embeddings/{embedding_id}/**
Example:

.. code-block:: bash

    # Change the description of embedding 8 in project 9
    curl -X PATCH "http://localhost:8000/api/v1/projects/9/embeddings/8/" \
        -H "accept: application/json" \
        -H "Content-Type: application/json" \
        -H "Authorization: Token 8229898dccf960714a9fa22662b214005aa2b049" \
        -d '{"description":"changed"}'

Apply phraser
*************
GUI
=====
Navigate to **Models -> Embeddings**, click on the three dots in the **Actions** column, and choose **Phrase** (:numref:`embedding_actions`).
Enter the text that you want to phrase and click **Post**. You should see the phrased version of the text (:numref:`apply_phraser`).

.. _apply_phraser:

.. figure:: images/embedding/phraser_gui.png

    *Apply phraser*

API
===
Endpoint: **/projects/{project_pk}/embeddings/{embedding_id}/phrase_text/**
Example:

.. code-block:: bash

    # Apply the phraser of embedding 8 to the given text
    curl -X POST "http://localhost:8000/api/v1/projects/9/embeddings/8/phrase_text/" \
        -H "accept: application/json" \
        -H "Content-Type: application/json" \
        -H "Authorization: Token 8229898dccf960714a9fa22662b214005aa2b049" \
        -d '{
            "text": "Venus is the second planet from the Sun."
        }'

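
When the text to phrase comes from a variable rather than a literal, quotes inside it can break the JSON body. The sketch below builds the ``text`` payload with basic escaping. ``json_escape`` is a hypothetical helper written for this example, and it handles only backslashes and double quotes, so text with newlines or other control characters would need a proper JSON tool.

.. code-block:: bash

    # Hypothetical helper: escape backslashes first, then double quotes,
    # so the result can be placed inside a JSON string.
    json_escape() {
        printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
    }

    text='He said "hello"'

    # Build the request body for the phrase_text endpoint
    payload=$(printf '{"text": "%s"}' "$(json_escape "$text")")

    printf '%s\n' "$payload"

    # The body can then be passed to curl with: -d "$payload"

For the sample text this prints ``{"text": "He said \"hello\""}``, which can be sent in place of the literal body used in the request above.
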