Upload Data to Data Attribute Recommendation

2020-04-05
Upload a dataset to your Data Attribute Recommendation service instance so that you can later train your machine learning model.

You will learn

  • How to authorize your client to communicate with your Data Attribute Recommendation service instance
  • How to upload data to your Data Attribute Recommendation service instance to train a machine learning model

To try out Data Attribute Recommendation, the first step is to upload data that will be used to train a machine learning model. For more information, see Data Attribute Recommendation. For definitions of specific terms, see Concepts.


Step 1: Get an access token

To communicate with your service instance, you need to retrieve an OAuth access token which will grant you access to the Data Attribute Recommendation APIs. This access token is added to all your service instance requests.

Open Postman and make sure that your Data Attribute Recommendation environment is selected. For detailed steps, see Set Up Postman Environment and Collection to call Data Attribute Recommendation APIs.

On the left, expand the Data Attribute Recommendation collection and open the subfolder Setup. In this folder, select the request Get Authorization.

Click Send to send the request to your service instance.

The response includes your access_token plus its expiration time expires_in. If the token expires, you need to request a new one. There is no need to copy the access token as the collection automatically adds the token to all requests.
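Outside Postman, the same token exchange could be sketched in Python. This is only a sketch: it assumes that the token endpoint accepts the OAuth client-credentials grant with HTTP Basic authentication, and the function names and the example URL are hypothetical. The request is built but not sent, so the real URL, client ID, and client secret from your own service key would have to be supplied.

```python
import base64
import json
import time
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) a client-credentials token request.

    Assumption: the endpoint accepts grant_type=client_credentials with
    HTTP Basic authentication. Check your service key for the real URL.
    """
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def parse_token_response(raw_body, now=None):
    """Return (access_token, expiry_timestamp) from a token response body."""
    payload = json.loads(raw_body)
    now = time.time() if now is None else now
    return payload["access_token"], now + payload["expires_in"]


def is_expired(expiry_timestamp, now=None, safety_margin=30):
    """Treat the token as expired slightly early to avoid racing the deadline."""
    now = time.time() if now is None else now
    return now >= expiry_timestamp - safety_margin
```

Sending the request with `urllib.request.urlopen(request)` and passing the response body to `parse_token_response` would yield the token and the point in time after which a new one has to be requested.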

Step 2: Create dataset schema

Now, you need to create a new dataset schema. A dataset schema describes the structure of datasets.

In these tutorials, you are using a dataset from Best Buy. The original dataset, as well as other datasets from Best Buy, can be found here. From the original dataset, the product information (description, manufacturer, and price) as well as three levels of product categories were selected to illustrate how the service handles such information. In a generic use case, you can choose the number and combination of properties yourself.

Expand the subfolder Upload Data and select the request Create new Dataset Schema. Click the Body tab to see the dataset schema that you are going to create.

The schema is divided into features and labels. The features are the inputs for the machine learning model, whereas the labels are the fields to be predicted. Thus, this schema takes the product information as input and the product categories as the values to predict.
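As an illustration of that split, a schema of this kind could be written as the following Python dictionary. The field names (`label`, `type`), the type names (`TEXT`, `CATEGORY`, `NUMBER`), the column names, and the schema name are assumptions made for this sketch; the Body tab in Postman shows the payload that is actually sent.

```python
import json

# Hypothetical schema payload; column names and type names are assumptions.
dataset_schema = {
    "name": "example-schema",
    "features": [
        {"label": "description", "type": "TEXT"},
        {"label": "manufacturer", "type": "CATEGORY"},
        {"label": "price", "type": "NUMBER"},
    ],
    "labels": [
        {"label": "level1_category", "type": "CATEGORY"},
        {"label": "level2_category", "type": "CATEGORY"},
        {"label": "level3_category", "type": "CATEGORY"},
    ],
}


def check_schema(schema):
    """Sanity-check that features and labels are non-empty and disjoint."""
    feature_names = [column["label"] for column in schema["features"]]
    label_names = [column["label"] for column in schema["labels"]]
    if not feature_names or not label_names:
        raise ValueError("a schema needs at least one feature and one label")
    overlap = set(feature_names) & set(label_names)
    if overlap:
        raise ValueError(f"columns used as both feature and label: {sorted(overlap)}")
    return feature_names, label_names


features, labels = check_schema(dataset_schema)
print(json.dumps(dataset_schema, indent=2))
```

The check is a local convenience and not part of the service: it only guards against listing the same column as both an input and a prediction target.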

To create this dataset schema, click Send to send the request.

You have successfully created a dataset schema.

Step 3: List all dataset schemas

To see the details of all dataset schemas that you have created, select the request Get Datasets Schemas collection in the folder Upload Data. Click Send to send the request.

In the response, you find your newly created dataset schema plus all other schemas that you have created. For every schema, the details are listed, including its name, id, and features.

Step 4: Create dataset

Next, you need to create a dataset using the dataset schema that you have created. The dataset is a table that holds the data that you will upload later.

To create a dataset, select the request Create new Dataset in the folder Upload Data. Click Send to send the request.

In the response, you find the details of your newly created dataset, including the current status of the dataset. The status is NO_DATA as no data file has been uploaded yet.

You have successfully created a dataset.

Step 5: Upload data

The final step is to upload data to your dataset.

For that, use the sample data available on GitHub. Download the CSV file that contains the data.

If your browser displays the data instead of downloading it, right-click anywhere and click Save as… to save the file.

Take a moment to look at the dataset. As mentioned in step 2, the dataset contains product information as well as product categories. You might ask why the product categories are in the dataset when the goal is to predict them.

The categories are only necessary for training, because the service does not yet know which product information is typical for which categories. During the training process, the service recognizes patterns and establishes these connections. This allows the service to later predict categories based solely on the product information.
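To make the role of the two column groups concrete, the following sketch splits CSV rows into the inputs (features) and the values to be learned (labels). The two rows are made up for illustration and are not taken from the real dataset, and the column names are assumptions.

```python
import csv
import io

# Made-up sample rows for illustration only; not taken from the real dataset.
SAMPLE_CSV = """description,manufacturer,price,level1_category,level2_category,level3_category
Wireless mouse,ExampleCorp,19.99,Computers,Accessories,Mice
Noise-cancelling headphones,SampleAudio,199.00,Audio,Headphones,Over-Ear
"""

# Assumed column names: three feature columns and three label columns.
FEATURE_COLUMNS = ["description", "manufacturer", "price"]
LABEL_COLUMNS = ["level1_category", "level2_category", "level3_category"]


def split_row(row):
    """Split one CSV row into (features, labels) dictionaries."""
    features = {name: row[name] for name in FEATURE_COLUMNS}
    labels = {name: row[name] for name in LABEL_COLUMNS}
    return features, labels


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    row_features, row_labels = split_row(row)
    print(row_features, "->", row_labels)
```

During training, both halves of each row are available. At prediction time, only the feature half is provided and the label half is what the model is expected to fill in.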

In Postman, select the request Upload data in the folder Upload Data. Now select the tab Body where you can add the data. Make sure that the type Binary is selected and click Select File to upload the file that contains the data. Press Send to send the request.
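Selecting Binary in Postman means that the file content itself is sent as the request body. A rough Python equivalent is sketched below. The endpoint path, the `text/csv` content type, and the function name are assumptions for illustration; the request is only built, not sent.

```python
import urllib.request


def build_upload_request(base_url, dataset_id, csv_bytes, access_token):
    """Build (but do not send) a request that posts CSV bytes as the raw body.

    Assumption: the upload endpoint is <base_url>/datasets/<dataset_id>/data.
    Compare with the URL shown in the Postman request before relying on it.
    """
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/datasets/{dataset_id}/data",
        data=csv_bytes,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "text/csv",
        },
        method="POST",
    )
```

The bytes could come from `pathlib.Path("data.csv").read_bytes()`, where `data.csv` stands for the file downloaded earlier in this step.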

In the response, you see that the status of your dataset has changed to VALIDATING. The service is now validating the data that you have uploaded. When validation is done, the status will change accordingly.

You have successfully uploaded data to your dataset.

Step 6: See dataset status

To see the current status of your dataset, select the request Get Dataset by id in the folder Upload Data. Then click Send to send the request.

In the response, you can observe the status of your dataset. If the status is SUCCEEDED, your data has been validated successfully. If the status is VALIDATING, your data is still in the validation process. Wait a minute and then send the request again to check whether the status has changed.
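The wait-and-retry routine described above can be sketched as a small polling loop. The function name, the default interval, and the attempt limit are choices made for this sketch; `fetch_status` stands for any callable that returns the current status string, for example a wrapper around the Get Dataset by id request.

```python
import time


def wait_for_validation(fetch_status, interval_seconds=60, max_attempts=10, sleep=time.sleep):
    """Poll until the dataset status is no longer VALIDATING.

    Returns the first status other than VALIDATING, or raises TimeoutError
    if the status never changes within max_attempts checks.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        if status != "VALIDATING":
            return status
        # Skip the pause after the final check; there is nothing left to wait for.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"dataset still VALIDATING after {max_attempts} checks")
```

The loop returns whatever status follows VALIDATING, so the caller still has to confirm that it is SUCCEEDED before moving on to training.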

Note that you cannot continue with the tutorial Use Data Attribute Recommendation to Train a Machine Learning Model until the dataset is validated and the status changes from VALIDATING to SUCCEEDED.

Your data is now validated and ready to be used to train a machine learning model.

Step 7: Test yourself
What is the status of your dataset right after you upload data?