Get the list of alerts.
Alerts are the notifications to the user about what has gone wrong and how to fix it.
Examples:
/alerts?end=14&limit=3
End at alert ID 14 and show up to 3 alerts, so it could return alerts 12, 13, and 14.
/alerts?start=2&limit=3
Start from alert ID 2 and show up to 3 alerts, so it could return alerts 2, 3, and 4.
/alerts?limit=3&reverse=false
Get up to 3 alerts and do not reverse the order. By default the order is reversed, so results are descending from the latest alert to the earliest.
start | integer (Start) The ID (primary key) of alerts to start from |
end | integer (End) The ID (primary key) of alerts to end at. Note: pass either start or end, but not both. |
limit | integer (Limit) Default: 20 Max number of alerts to be returned. |
resolved | boolean (Resolved) Default: false Filter by whether the alerts are resolved or not |
reverse | boolean (Reverse) Default: true Reverse the sorting of returned results |
dataset_id | integer (Dataset Id) Limit the results to a specific dataset |
is_muted | boolean (Is Muted) Default: false Limit the results to muted or unmuted alerts. A muted alert does not pop up to the user in the interface. Instead, it is an alert related to a bad row that has ended up in the quarantine table, and the user can only see it when inspecting that row in the quarantine table. |
data_source_model_id | integer (Data Source Model Id) Limit the results to a specific data source |
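The query parameters above can be assembled programmatically. The following is a minimal sketch, not part of the API itself: the helper name `build_alerts_query` is hypothetical, and it assumes booleans are sent as lowercase `true`/`false`, as the `reverse=false` example suggests. It enforces the rule that start and end are mutually exclusive and leaves unset parameters out so the server applies its defaults.

```python
from urllib.parse import urlencode


def build_alerts_query(start=None, end=None, limit=None, resolved=None,
                       reverse=None, dataset_id=None, is_muted=None,
                       data_source_model_id=None):
    """Return a relative URL such as '/alerts?end=14&limit=3'."""
    if start is not None and end is not None:
        raise ValueError("pass either start or end, not both")
    params = {
        "start": start,
        "end": end,
        "limit": limit,
        "resolved": resolved,
        "reverse": reverse,
        "dataset_id": dataset_id,
        "is_muted": is_muted,
        "data_source_model_id": data_source_model_id,
    }
    cleaned = {}
    for key, value in params.items():
        if value is None:
            continue  # unset: let the server use its default
        if isinstance(value, bool):
            # Assumption: booleans are serialized as 'true' / 'false'.
            value = "true" if value else "false"
        cleaned[key] = value
    if not cleaned:
        return "/alerts"
    return "/alerts?" + urlencode(cleaned)
```

For instance, `build_alerts_query(end=14, limit=3)` reproduces the first example path shown above.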
{
  "next": {
    "start": 0,
    "end": 0,
    "limit": 20
  },
  "previous": {
    "start": 0,
    "end": 0,
    "limit": 20
  },
  "results": [
    {
      "id": 0,
      "dataset_name": "string",
      "alert_type": "datetime_casting",
      "count": 0,
      "msg": "string",
      "created_at": "2019-08-24T14:15:22Z",
      "resolvable_by_user": true,
      "resolve_after_migration": false,
      "stored_items_to_alerts": []
    }
  ],
  "filters": {
    "resolved": false,
    "reverse": true,
    "dataset_id": 0,
    "data_source_model_id": 0,
    "is_muted": false
  },
  "title": ""
}
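The response carries `next` and `previous` objects alongside the results. A sketch of walking through pages might look like the following. It rests on assumptions the reference does not confirm: that `next` holds the query parameters of the following page, that unused cursor fields are null, and that `next` is absent or null on the last page. `iter_alerts` and `fetch_page` are hypothetical names; `fetch_page` stands for any callable that takes a dict of query parameters and returns the decoded JSON body.

```python
def iter_alerts(fetch_page, first_params, max_pages=100):
    """Yield alerts one by one, following the 'next' cursor between pages."""
    params = dict(first_params)
    for _ in range(max_pages):  # hard stop so a bad cursor cannot loop forever
        page = fetch_page(params)
        yield from page.get("results", [])
        cursor = page.get("next")
        if not cursor:
            break
        # Assumption: the echoed filters should be re-sent with the cursor,
        # and null values mean "not set" (start and end are never both sent).
        filters = page.get("filters") or {}
        merged = {**filters, **cursor}
        params = {key: value for key, value in merged.items() if value is not None}
```

Passing the HTTP call in as `fetch_page` keeps the paging logic independent of any particular HTTP client.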
Get one alert based on its ID
alert_item_id required | integer (Alert Item Id) |
{
  "id": 0,
  "settings_model_id": 0,
  "dataset_name": "string",
  "dataset_id": 0,
  "count": 0,
  "field_name": "string",
  "original_field_name": "string",
  "field_value": "string",
  "row_value": ["string"],
  "row_header": ["string"],
  "original_headers": ["string"],
  "msg": "string",
  "body": {
    "property1": ["string"],
    "property2": ["string"]
  },
  "alert_type": "datetime_casting",
  "previous_values": ["string"],
  "resolved": false,
  "resolve_after_migration": false,
  "blocks_learning": false,
  "blocks_ingestion_for_dataset": false,
  "blocks_ingestion_for_data_source": false,
  "blocks_stored_item": false,
  "created_at": "2019-08-24T14:15:22Z",
  "resolved_at": "2019-08-24T14:15:22Z",
  "expired_at": "2019-08-24T14:15:22Z",
  "training_data_item_id": 0,
  "training_job_id": 0,
  "migration_model_id": 0,
  "datetime_formats": [],
  "null_values": [],
  "null_values_for_this_column": [],
  "alert_actions": [],
  "alert_secondary_actions": {
    "property1": [
      {
        "text": "string",
        "info": "string",
        "action": "apply_migration",
        "url": "string"
      }
    ],
    "property2": [
      {
        "text": "string",
        "info": "string",
        "action": "apply_migration",
        "url": "string"
      }
    ]
  },
  "resolvable_by_user": true,
  "data_source_model_name": "string",
  "data_source_model_id": 0,
  "stored_items_to_alerts": [],
  "submit_button_text": "",
  "redirect_url": "string",
  "new_fields_recommendations": {
    "property1": [null],
    "property2": [null]
  },
  "boundary_data": {
    "property1": ["string"],
    "property2": ["string"]
  },
  "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
  "recommended_settings_module_to_update": "SettingsModel",
  "recommended_settings_field_name": "string",
  "recommended_settings_value": "string",
  "stack_trace": "",
  "related_urls": ["string"]
}
Resolve an alert.
alert_item_id required | integer (Alert Item Id) |
new_datetime_format | string (New Datetime Format) Default: "" The datetime format to be used to resolve a DatetimeCasting alert. You can learn about the formats here: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes |
sheet_name | string (Sheet Name) Default: "" The sheet name to use |
action | string Enum: "apply_migration" "discard_migration" "rerun_training" "expand_fields" "override_model" "edit_destination_settings" "go_to_migration_review" "check_learning_jobs" "edit_settings" "edit_data_source_model_settings" "review_migration_again" "put_file_for_learning" "mark_data_as_valid" "set_value_to_null" "set_value_to_null_for_field" "set_value_to_null_for_source" "set_value_to_null_for_field_for_source" "set_value_to_true" "set_value_to_true_for_field" "set_value_to_true_for_source" "set_value_to_true_for_field_for_source" "set_value_to_false" "set_value_to_false_for_field" "set_value_to_false_for_source" "set_value_to_false_for_field_for_source" "define_datetime_format" "define_datetime_format_for_field" "define_datetime_format_for_source" "define_datetime_format_for_field_for_source" "infer_datetime_format_for_field" "infer_datetime_format_for_field_for_source" "allow_multi_datetime_formats" "allow_multi_formats_n_define_datetime_format" "allow_multi_formats_n_define_datetime_format_for_field" "allow_multi_formats_n_define_datetime_format_for_source" "allow_multi_formats_n_define_datetime_format_for_field_for_source" "not_a_datetime_field" "try_again" "hide" "ignore_file" "ignore_alert_associated_files" "ignore_row" "ignore_row_for_file" "validation_add_a_new_keyword" "validation_rename_to_existing_keyword" "reprocess_file" "proceed_with_sheet_name" "download_file" The action that was chosen by the user in order to resolve the alert. |
new_fields_actions | object (New Fields Actions) The details of how to resolve an |
line_signature | string (Line Signature) The hash signature of the row to be ignored. The user can decide to ignore a row that caused an alert and as a result, resolve that alert. |
{
  "new_datetime_format": "",
  "sheet_name": "",
  "action": "apply_migration",
  "new_fields_actions": {
    "property1": "string",
    "property2": "string"
  },
  "line_signature": "string"
}
{
  "redirect_post_submit_to": ""
}
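As an illustration of building a resolve request, the sketch below prepares two bodies. The function names are hypothetical, and the pairing of fields with actions is an assumption drawn from the parameter descriptions: `new_datetime_format` with the `define_datetime_format` action, and `line_signature` with the `ignore_row` action. Because the format follows Python's strftime/strptime codes, a sample value can be checked locally with `datetime.strptime` before sending.

```python
from datetime import datetime


def build_define_datetime_format_body(new_datetime_format, sample_value=None):
    """Body for resolving a datetime casting alert with an explicit format."""
    if sample_value is not None:
        # strptime raises ValueError if the sample does not match the format.
        datetime.strptime(sample_value, new_datetime_format)
    return {
        "action": "define_datetime_format",  # assumed pairing, see lead-in
        "new_datetime_format": new_datetime_format,
    }


def build_ignore_row_body(line_signature):
    """Body for resolving an alert by ignoring the row that caused it."""
    return {
        "action": "ignore_row",  # assumed pairing, see lead-in
        "line_signature": line_signature,
    }
```

Validating the format against a real value from the alert (for example its field_value) catches typos in the format string before the request is made.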
Get the list of datasets.
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
states | Array of strings (States) Default: ["active","disabled"] |
destination_id | integer (Destination Id) |
{
  "next": {
    "start": 0,
    "end": 0,
    "limit": 20
  },
  "previous": {
    "start": 0,
    "end": 0,
    "limit": 20
  },
  "results": [
    {
      "id": 0,
      "version_id": 0,
      "name": "string",
      "state": "active",
      "destination_name": "string",
      "unresolved_alerts": 0
    }
  ],
  "filters": {
    "destination_id": 0
  },
  "title": ""
}
Creates new datasets. The ids and creation dates will be ignored in the request. name, schema_name, and table_name are required. For the full list of what is required and what is ignored, take a look at the DataSetPostRequestSchema.
Example request:
{ "name": "new", "table_name": "table1", "schema_name": "schema1", "settings_model": { "boolean_true": [ "blah" ] } }
name required | string (Name) |
schema_name required | string (Schema Name) |
database_name required | string (Database Name) |
table_name required | string (Table Name) |
migration_policy | string (MigrationPolicy) Enum: "ask_user" "apply_asap" "migration_window" "locked" An enumeration. |
encrypt_raw_data_during_backup | boolean (Encrypt Raw Data During Backup) |
compression_type_of_backup_data | string (CompressionType) Enum: "gzip" "zip" "snappy_stream" An enumeration. |
backup_key_format | string (Backup Key Format) |
backup_settings_id required | integer (Backup Settings Id) |
destination_id | integer (Destination Id) |
destinations_list | Array of Array of any (Destinations List) [ items 2 items [ items ] ] |
data_loading_process | string (DataLoadingProcess) Enum: "replace" "upsert" "snapshot" An enumeration. |
should_reprocess | boolean (Should Reprocess) |
max_retry_count | integer (Max Retry Count) |
redirect_post_submit_to | string (Redirect Post Submit To) |
strictly_one_datetime_format_in_a_column | boolean (Strictly One Datetime Format In A Column) |
guess_datetime_format_in_ingestion | boolean (Guess Datetime Format In Ingestion) |
successful_rows_count | integer (Successful Rows Count) |
bad_rows_count | integer (Bad Rows Count) |
disabled_fields | Array of strings (Disabled Fields) |
max_tries_to_fix_json | integer (Max Tries To Fix Json) |
whodunit_id | string <uuid> (Whodunit Id) |
settings_model | object (NewSettingsSchema) |
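A client can check a request body against the fields marked required in the table above before sending it. This is a minimal sketch with a hypothetical helper name. Note that the table marks five fields as required while the prose example shows only three of them; the sketch follows the table, which is an assumption about which of the two the server enforces.

```python
# Fields marked "required" in the parameter table above.
REQUIRED_FIELDS = (
    "name",
    "schema_name",
    "database_name",
    "table_name",
    "backup_settings_id",
)


def missing_required_fields(body):
    """Return the required field names that are absent or empty in body."""
    return [field for field in REQUIRED_FIELDS if body.get(field) in (None, "")]
```

An empty list means the body has every field the table requires; it says nothing about whether the values themselves are valid.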
{- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "string",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "state": "active",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_list": [
- [
- 0,
- "string"
]
], - "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
Modify a dataset using a PATCH query.
Example:
PATCH /datasets
Body:
{
  "id": 1,
  "version_id": 1,
  "settings_model": {
    "id": 1,
    "version_id": 2,
    "dollar_to_cent": true
  },
  "name": "foo"
}
In this example we are modifying the dataset with ID 1 to have a new name: foo. We are also modifying its associated settings_model to set dollar_to_cent to true.
id required | integer (Id) |
version_id required | integer (Version Id) |
name | string (Name) |
migration_policy | string (MigrationPolicy) Enum: "ask_user" "apply_asap" "migration_window" "locked" An enumeration. |
encrypt_raw_data_during_backup | boolean (Encrypt Raw Data During Backup) |
compression_type_of_backup_data | string (CompressionType) Enum: "gzip" "zip" "snappy_stream" An enumeration. |
backup_key_format | string (Backup Key Format) |
destinations_list | Array of Array of any (Destinations List) [ items 2 items [ items ] ] |
data_loading_process | string (DataLoadingProcess) Enum: "replace" "upsert" "snapshot" An enumeration. |
should_reprocess | boolean (Should Reprocess) |
max_retry_count | integer (Max Retry Count) |
redirect_post_submit_to | string (Redirect Post Submit To) |
strictly_one_datetime_format_in_a_column | boolean (Strictly One Datetime Format In A Column) |
guess_datetime_format_in_ingestion | boolean (Guess Datetime Format In Ingestion) |
successful_rows_count | integer (Successful Rows Count) |
bad_rows_count | integer (Bad Rows Count) |
max_tries_to_fix_json | integer (Max Tries To Fix Json) |
whodunit_id | string <uuid> (Whodunit Id) |
settings_model | object (SettingsSchemaOptional) |
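Since id and version_id are required on every PATCH, it can help to derive the body from a dataset that was fetched earlier and add only the fields being changed. The following is a sketch under assumptions: the helper name is hypothetical, and it assumes the version_id values to send are the ones currently stored, for both the dataset and its settings_model.

```python
def build_dataset_patch(dataset, changes, settings_changes=None):
    """Build a PATCH body containing identifiers plus only the changed fields."""
    body = {
        "id": dataset["id"],
        "version_id": dataset["version_id"],  # assumed: the current version
        **changes,
    }
    if settings_changes:
        settings = dataset["settings_model"]
        body["settings_model"] = {
            "id": settings["id"],
            "version_id": settings["version_id"],
            **settings_changes,
        }
    return body
```

Calling it with `{"name": "foo"}` and `{"dollar_to_cent": True}` on a fetched dataset whose ID is 1 yields a body of the same shape as the example in the endpoint description.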
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "migration_policy": "ask_user",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "string",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "state": "active",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_list": [
- [
- 0,
- "string"
]
], - "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
Get a dataset object with default values filled in, so the user can use it as the basis for creating a new dataset with a POST request.
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "state": "active",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_list": [
- [
- 0,
- "string"
]
], - "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
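One way to turn this template into a body for creating a dataset is sketched below. The helper name is hypothetical. The create endpoint states that ids and creation dates are ignored, so removing them is optional; treating `version_id` as one of the ignored keys is an assumption.

```python
import copy

# Keys dropped from the template; including version_id here is an assumption.
IGNORED_KEYS = ("id", "version_id", "created_at")


def template_to_post_body(template, **overrides):
    """Copy the default dataset, drop ignored keys, and apply the user's values."""
    body = copy.deepcopy(template)  # leave the fetched template untouched
    for key in IGNORED_KEYS:
        body.pop(key, None)
    settings = body.get("settings_model")
    if isinstance(settings, dict):
        for key in IGNORED_KEYS:
            settings.pop(key, None)
    body.update(overrides)
    return body
```

A call such as `template_to_post_body(template, name="new", table_name="table1")` keeps every default from the template and replaces only the named fields.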
Get a detail view of a dataset including the associated settings model.
Note: There is a 1:1 relationship between datasets and settings_models. They are kept separate because a change to a settings_model can trigger a training job: any change to the settings must be reflected in the output of the training job, which can ultimately change the database schema.
The configs in the dataset itself, however, cannot cause a retraining.
dataset_id required | integer (Dataset Id) |
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "state": "active",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_list": [
- [
- 0,
- "string"
]
], - "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
Get the history of the dataset
dataset_id required | integer (Dataset Id) |
{
  "dataset_history": [
    {
      "version_id": 0,
      "history_items": [
        {
          "action": "changed",
          "field_name": "string",
          "new_value": null,
          "old_value": null
        }
      ],
      "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
      "whodunit_first_name": "string",
      "whodunit_last_name": "string",
      "datetime": "string"
    }
  ],
  "settings_model_history": [
    {
      "version_id": 0,
      "history_items": [
        {
          "action": "changed",
          "field_name": "string",
          "new_value": null,
          "old_value": null
        }
      ],
      "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
      "whodunit_first_name": "string",
      "whodunit_last_name": "string",
      "datetime": "string"
    }
  ]
}
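The nested history response can be flattened into one line per change for display or logging. This is a minimal sketch with a hypothetical function name; it reads only the keys shown in the sample response above, and the wording of the output lines is an arbitrary choice.

```python
def describe_history(history):
    """Return one human-readable line per history item in the response."""
    lines = []
    for section in ("dataset_history", "settings_model_history"):
        for version in history.get(section, []):
            first = version.get("whodunit_first_name") or ""
            last = version.get("whodunit_last_name") or ""
            who = (first + " " + last).strip() or "unknown"
            for item in version.get("history_items", []):
                lines.append(
                    "{} v{}: {} {} from {!r} to {!r} by {}".format(
                        section,
                        version["version_id"],
                        item["field_name"],
                        item["action"],
                        item["old_value"],
                        item["new_value"],
                        who,
                    )
                )
    return lines
```

Dataset changes are listed before settings_model changes, in the order the response provides them.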
Get the row stats for a dataset.
Currently we only return bad_rows_count, which is the number of rows of the dataset that are in the quarantine table.
dataset_id required | integer (Dataset Id) |
{
  "bad_rows_count": 0
}
Make a copy of the configurations of a dataset
dataset_id required | integer (Dataset Id) |
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "state": "active",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_list": [
- [
- 0,
- "string"
]
], - "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
Creates a new dataset. Any ids and creation dates in the request are ignored. name, schema_name, and table_name are required; the field list below also marks database_name and backup_settings_id as required. For the full list of what is required and what is ignored, see the DataSetPostRequestSchema.
Example request:
{ "name": "new", "table_name": "table1", "schema_name": "schema1", "settings_model": { "boolean_true": [ "blah" ] } }
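As a minimal sketch of the rules above, the following Python helper builds a request body like the example. The helper name, the `REQUIRED_FIELDS` tuple, and the `IGNORED_FIELDS` tuple are illustrative assumptions, not part of the API; the authoritative list is the DataSetPostRequestSchema.

```python
import json

# Assumption: the three fields the description names as required.
REQUIRED_FIELDS = ("name", "schema_name", "table_name")

# Assumption: server-assigned fields that the description says are ignored.
IGNORED_FIELDS = ("id", "created_at")


def build_create_dataset_payload(fields: dict) -> str:
    """Return a JSON body for creating a dataset, or raise ValueError."""
    missing = [key for key in REQUIRED_FIELDS if not fields.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    # Drop fields the server would ignore anyway so the body stays minimal.
    body = {k: v for k, v in fields.items() if k not in IGNORED_FIELDS}
    return json.dumps(body)


payload = build_create_dataset_payload(
    {
        "id": 5,  # ignored on create, so it is stripped from the body
        "name": "new",
        "table_name": "table1",
        "schema_name": "schema1",
        "settings_model": {"boolean_true": ["blah"]},
    }
)
```

Validating on the client side only saves a round trip; the server remains the source of truth for which fields are accepted.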
dataset_id required | integer (Dataset Id) |
name required | string (Name) |
schema_name required | string (Schema Name) |
database_name required | string (Database Name) |
table_name required | string (Table Name) |
migration_policy | string (MigrationPolicy) Enum: "ask_user" "apply_asap" "migration_window" "locked" An enumeration. |
encrypt_raw_data_during_backup | boolean (Encrypt Raw Data During Backup) |
compression_type_of_backup_data | string (CompressionType) Enum: "gzip" "zip" "snappy_stream" An enumeration. |
backup_key_format | string (Backup Key Format) |
backup_settings_id required | integer (Backup Settings Id) |
destination_id | integer (Destination Id) |
destinations_list | Array of Array of any (Destinations List) [ items 2 items [ items ] ] |
data_loading_process | string (DataLoadingProcess) Enum: "replace" "upsert" "snapshot" An enumeration. |
should_reprocess | boolean (Should Reprocess) |
max_retry_count | integer (Max Retry Count) |
redirect_post_submit_to | string (Redirect Post Submit To) |
strictly_one_datetime_format_in_a_column | boolean (Strictly One Datetime Format In A Column) |
guess_datetime_format_in_ingestion | boolean (Guess Datetime Format In Ingestion) |
successful_rows_count | integer (Successful Rows Count) |
bad_rows_count | integer (Bad Rows Count) |
disabled_fields | Array of strings (Disabled Fields) |
max_tries_to_fix_json | integer (Max Tries To Fix Json) |
whodunit_id | string <uuid> (Whodunit Id) |
settings_model | object (NewSettingsSchema) |
{- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "string",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "state": "active",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_list": [
- [
- 0,
- "string"
]
], - "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}
}
Get extra information about a dataset, including database migration information, the destination name, and the number of unresolved alerts.
dataset_id required | integer (Dataset Id) |
{- "id": 0,
- "version_id": 0,
- "name": "string",
- "schema_name": "string",
- "database_name": "string",
- "table_name": "string",
- "migration_policy": "ask_user",
- "state": "active",
- "encrypt_raw_data_during_backup": true,
- "compression_type_of_backup_data": "gzip",
- "backup_key_format": "string",
- "backup_settings_list": [
- [
- 0,
- "string"
]
], - "backup_settings_id": 0,
- "destination_id": 0,
- "destinations_list": [
- [
- 0,
- "string"
]
], - "data_loading_process": "replace",
- "should_reprocess": true,
- "max_retry_count": 0,
- "redirect_post_submit_to": "",
- "strictly_one_datetime_format_in_a_column": true,
- "guess_datetime_format_in_ingestion": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "disabled_fields": [
- "string"
], - "max_tries_to_fix_json": 0,
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "settings_model": {
- "id": 0,
- "version_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "null_values": [
- "string"
], - "null_values_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true": [
- "string"
], - "boolean_false": [
- "string"
], - "dollar_to_cent": true,
- "percent_to_decimal": true,
- "decimal_field_padding": 0,
- "string_field_padding": 0,
- "datetime_allowed_characters": "string",
- "datetime_formats": [
- "string"
], - "array_delimiters": [
- "string"
], - "datetime_formats_per_column": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "field_name_full_conversion": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion": {
- "property1": "string",
- "property2": "string"
}, - "default_value_for_field_when_casting_error": { },
- "dollar_value_if_word_in_field_name": [
- "string"
], - "non_string_fields_are_all_nullable": true,
- "use_text_instead_of_string": true,
- "trim_string_instead_of_raising_err": true,
- "string_fields_can_be_nullable": true,
- "ignore_lines_that_include_only_subset_of_characters": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words": [
- "string"
], - "ignore_fields_in_signature_calculation": [
- "string"
], - "quarantine_row_if_row_level_issue": true,
- "ignore_not_seen_before_fields_when_importing": true,
- "ignore_matchers": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "encrypt_columns": [
- "string"
], - "infer_datetime_for_columns": [
- "string"
], - "ignore_column_names": [
- "string"
], - "enable_small_integer": true,
- "enable_integer": true,
- "max_boundary_element_length": 0,
- "monetary_columns_override": {
- "property1": true,
- "property2": true
}, - "data_marked_as_valid": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "xls_date_mode": "mode_0",
- "column_overrides": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "fields_to_expand": [
- "string"
]
}, - "current_migration_model_id": 0,
- "destination_name": "string",
- "unresolved_alerts": 0
}
Are there any training jobs currently running for this dataset?
A training job is a collection of analysis tasks that analyze certain files in the dataset.
The analysis results may lead to recommendations for the user to approve certain changes to the destination database.
dataset_id required | integer (Dataset Id) |
{- "result": true
}
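A client that needs to wait for training to finish could poll this check. The sketch below is an assumption about usage, not a documented client: `is_running` is a hypothetical callable that performs the request and returns the boolean `result` field, and the polling interval and limit are arbitrary.

```python
import time
from typing import Callable


def wait_for_training_jobs(
    is_running: Callable[[], bool],
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until no training job is running.

    Returns True once the check reports no running job, or False if
    max_polls checks all reported a running job.
    """
    for _ in range(max_polls):
        if not is_running():
            return True
        sleep(poll_seconds)
    return False
```

Passing `sleep` as a parameter keeps the function testable without real delays.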
dataset_id required | integer (Dataset Id) |
name required | string (Name) |
{- "name": "string"
}
{- "result": "string"
}
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
dataset_id | integer (Dataset Id) |
states | Array of strings (States) Default: ["active","disabled","draft"] |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "name": "string",
- "last_sync": "2019-08-24T14:15:22Z",
- "dataset_name": "string",
- "destination_name": "string",
- "dataset_id": "string",
- "unresolved_alerts": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active"
}
], - "filters": {
- "settings_model_id": 0
}, - "title": ""
}
First the client requests /datasets/[dataset_id]/data-sources/new to get the data source model with some default values. Once the user fills in those values, a POST request is made to actually create the new data source model.
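The two-step flow can be sketched in Python as follows. The `http_get` and `http_post` callables are hypothetical stand-ins for whatever HTTP client is in use, and the POST path is an assumption; only the `/data-sources/new` path comes from the description above.

```python
from typing import Any, Callable, Dict

Json = Dict[str, Any]


def create_data_source(
    dataset_id: int,
    user_values: Json,
    http_get: Callable[[str], Json],
    http_post: Callable[[str, Json], Json],
) -> Json:
    """Fetch the default data source model, fill it in, and create it."""
    # Step 1: get the data source model pre-filled with default values.
    defaults = http_get(f"/datasets/{dataset_id}/data-sources/new")
    # Step 2: overlay the values the user filled in on top of the defaults.
    body = {**defaults, **user_values}
    # Step 3: POST the completed model. This path is an assumption.
    return http_post(f"/datasets/{dataset_id}/data-sources", body)
```

Values supplied by the user take precedence over the defaults because they are merged last.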
name required | string (Name) |
dataset_name | string (Dataset Name) |
settings_model_id required | integer (Settings Model Id) |
dataset_id | integer (Dataset Id) |
data_source_type | string (DataSourceType) Enum: "s3" "minio" "sftp" "gs" "dropbox" "stream" "custom" An enumeration. |
schedule | string (Schedule) |
state | string (DataSourceState) Enum: "active" "draft" "disabled" "deleted" An enumeration. |
connection_timeout | integer (Connection Timeout) |
delete_source_file_after_backup | boolean (Delete Source File After Backup) |
encoding_probably | string (Encoding Probably) |
identify_header_by_column_names | Array of strings (Identify Header By Column Names) |
raw_headers | Array of strings (Raw Headers) |
chunk_size_to_read_for_header | integer (Chunk Size To Read For Header) |
bad_rows_count | integer (Bad Rows Count) |
pattern | string (Pattern) |
archive_pattern | string (Archive Pattern) |
csv_delimiter | string (Csv Delimiter) |
csv_quotechar | string (Csv Quotechar) |
csv_escapechar | string (Csv Escapechar) |
csv_lineterminator | string (Csv Lineterminator) |
s3_access_key | string (S3 Access Key) |
s3_secret_key | string (S3 Secret Key) |
s3_endpoint_url | string (S3 Endpoint Url) |
s3_region_name | string (S3 Region Name) |
s3_default_buffer_size | integer (S3 Default Buffer Size) |
s3_bucket | string (S3 Bucket) |
s3_prefix | string (S3 Prefix) |
sftp_host | string (Sftp Host) |
sftp_port | integer (Sftp Port) |
sftp_user | string (Sftp User) |
sftp_password | string (Sftp Password) |
sftp_ssh_key | string (Sftp Ssh Key) |
sftp_ssh_key_passphrase | string (Sftp Ssh Key Passphrase) |
sftp_folder | string (Sftp Folder) |
gs_service_account_key | string (Gs Service Account Key) |
gs_bucket | string (Gs Bucket) |
gs_prefix | string (Gs Prefix) |
gpg_private_key | string (Gpg Private Key) |
gpg_passphrase | string (Gpg Passphrase) |
file_password | string (File Password) |
dropbox_access_token | string (Dropbox Access Token) |
dropbox_folder | string (Dropbox Folder) |
ignore_column_ids | Array of strings (Ignore Column Ids) |
excel_sheet_name | string (Excel Sheet Name) |
datetime_formats_per_column_override | object (Datetime Formats Per Column Override) |
should_reprocess_override | boolean (Should Reprocess Override) |
xls_date_mode_override | string (XLSDateMode) Enum: "mode_0" "mode_1" An enumeration. |
null_values_override | Array of strings (Null Values Override) |
null_values_per_column_override | object (Null Values Per Column Override) |
boolean_true_override | Array of strings (Boolean True Override) |
boolean_false_override | Array of strings (Boolean False Override) |
datetime_allowed_characters_override | string (Datetime Allowed Characters Override) |
datetime_formats_override | Array of strings (Datetime Formats Override) |
array_delimiters_override | Array of strings (Array Delimiters Override) |
field_name_full_conversion_override | object (Field Name Full Conversion Override) |
field_name_part_conversion_override | object (Field Name Part Conversion Override) |
ignore_lines_that_include_only_subset_of_characters_override | Array of strings (Ignore Lines That Include Only Subset Of Characters Override) |
ignore_lines_that_include_only_subset_of_words_override | Array of strings (Ignore Lines That Include Only Subset Of Words Override) |
ignore_matchers_override | object (Ignore Matchers Override) |
ignore_column_names_override | Array of strings (Ignore Column Names Override) |
use_text_instead_of_string_override | boolean (Use Text Instead Of String Override) |
quarantine_row_if_row_level_issue_override | boolean (Quarantine Row If Row Level Issue Override) |
trim_string_instead_of_raising_err_override | boolean (Trim String Instead Of Raising Err Override) |
fields_to_expand_override | Array of strings (Fields To Expand Override) |
infer_datetime_for_columns_override | Array of strings (Infer Datetime For Columns Override) |
redirect_post_submit_to | string (Redirect Post Submit To) |
{- "name": "string",
- "dataset_name": "string",
- "settings_model_id": 0,
- "dataset_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": "string"
}
{- "id": 0,
- "name": "string",
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "version_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": ""
}
dataset_id required | integer (Dataset Id) |
{- "id": 0,
- "name": "string",
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "version_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": ""
}
dataset_id required | integer (Dataset Id) |
data_source_model_id required | integer (Data Source Model Id) |
{- "id": 0,
- "name": "string",
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "version_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": ""
}
data_source_model_id required | integer (Data Source Model Id) |
{- "id": 0,
- "name": "string",
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "version_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": ""
}
Pydantic validates and deserializes the request, but the model it returns contains every attribute of the schema, even those that were not part of the request. Hence we use the original request object to pick out only the fields that were actually sent when patching the dataset object.
Example request:
{ "id": 1, "version_id": 0, "pattern": "some regex pattern" }
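The idea of patching only the fields that were sent can be sketched with plain dictionaries. This is an illustration under assumptions, not the server's code: `validated` stands in for the attributes of the deserialized Pydantic model, and `fields_to_patch` is a hypothetical helper name.

```python
from typing import Any, Dict


def fields_to_patch(
    raw_request: Dict[str, Any], validated: Dict[str, Any]
) -> Dict[str, Any]:
    """Keep only the validated values whose keys appeared in the request."""
    return {key: validated[key] for key in raw_request if key in validated}


# The raw body from the example request above.
raw = {"id": 1, "version_id": 0, "pattern": "some regex pattern"}

# After validation the model carries every attribute, including unset ones.
validated = {
    "id": 1,
    "version_id": 0,
    "pattern": "some regex pattern",
    "name": None,
    "schedule": None,
}

patch = fields_to_patch(raw, validated)
```

Here `patch` holds only `id`, `version_id`, and `pattern`, so attributes the client never sent are not overwritten with defaults.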
data_source_model_id required | integer (Data Source Model Id) |
id required | integer (Id) |
name | string (Name) |
settings_model_id | integer (Settings Model Id) |
dataset_id | integer (Dataset Id) |
version_id required | integer (Version Id) |
data_source_type | string (DataSourceType) Enum: "s3" "minio" "sftp" "gs" "dropbox" "stream" "custom" An enumeration. |
schedule | string (Schedule) |
state | string (DataSourceState) Enum: "active" "draft" "disabled" "deleted" An enumeration. |
connection_timeout | integer (Connection Timeout) |
delete_source_file_after_backup | boolean (Delete Source File After Backup) |
encoding_probably | string (Encoding Probably) |
identify_header_by_column_names | Array of strings (Identify Header By Column Names) |
raw_headers | Array of strings (Raw Headers) |
chunk_size_to_read_for_header | integer (Chunk Size To Read For Header) |
bad_rows_count | integer (Bad Rows Count) |
pattern | string (Pattern) |
archive_pattern | string (Archive Pattern) |
csv_delimiter | string (Csv Delimiter) |
csv_quotechar | string (Csv Quotechar) |
csv_escapechar | string (Csv Escapechar) |
csv_lineterminator | string (Csv Lineterminator) |
s3_access_key | string (S3 Access Key) |
s3_secret_key | string (S3 Secret Key) |
s3_endpoint_url | string (S3 Endpoint Url) |
s3_region_name | string (S3 Region Name) |
s3_default_buffer_size | integer (S3 Default Buffer Size) |
s3_bucket | string (S3 Bucket) |
s3_prefix | string (S3 Prefix) |
sftp_host | string (Sftp Host) |
sftp_port | integer (Sftp Port) |
sftp_user | string (Sftp User) |
sftp_password | string (Sftp Password) |
sftp_ssh_key | string (Sftp Ssh Key) |
sftp_ssh_key_passphrase | string (Sftp Ssh Key Passphrase) |
sftp_folder | string (Sftp Folder) |
gs_service_account_key | string (Gs Service Account Key) |
gs_bucket | string (Gs Bucket) |
gs_prefix | string (Gs Prefix) |
gpg_private_key | string (Gpg Private Key) |
gpg_passphrase | string (Gpg Passphrase) |
file_password | string (File Password) |
dropbox_access_token | string (Dropbox Access Token) |
dropbox_folder | string (Dropbox Folder) |
ignore_column_ids | Array of strings (Ignore Column Ids) |
excel_sheet_name | string (Excel Sheet Name) |
datetime_formats_per_column_override | object (Datetime Formats Per Column Override) |
should_reprocess_override | boolean (Should Reprocess Override) |
xls_date_mode_override | string (XLSDateMode) Enum: "mode_0" "mode_1" An enumeration. |
null_values_override | Array of strings (Null Values Override) |
null_values_per_column_override | object (Null Values Per Column Override) |
boolean_true_override | Array of strings (Boolean True Override) |
boolean_false_override | Array of strings (Boolean False Override) |
datetime_allowed_characters_override | string (Datetime Allowed Characters Override) |
datetime_formats_override | Array of strings (Datetime Formats Override) |
array_delimiters_override | Array of strings (Array Delimiters Override) |
field_name_full_conversion_override | object (Field Name Full Conversion Override) |
field_name_part_conversion_override | object (Field Name Part Conversion Override) |
ignore_lines_that_include_only_subset_of_characters_override | Array of strings (Ignore Lines That Include Only Subset Of Characters Override) |
ignore_lines_that_include_only_subset_of_words_override | Array of strings (Ignore Lines That Include Only Subset Of Words Override) |
ignore_matchers_override | object (Ignore Matchers Override) |
ignore_column_names_override | Array of strings (Ignore Column Names Override) |
use_text_instead_of_string_override | boolean (Use Text Instead Of String Override) |
quarantine_row_if_row_level_issue_override | boolean (Quarantine Row If Row Level Issue Override) |
trim_string_instead_of_raising_err_override | boolean (Trim String Instead Of Raising Err Override) |
fields_to_expand_override | Array of strings (Fields To Expand Override) |
infer_datetime_for_columns_override | Array of strings (Infer Datetime For Columns Override) |
redirect_post_submit_to | string (Redirect Post Submit To) |
{- "id": 0,
- "name": "string",
- "settings_model_id": 0,
- "dataset_id": 0,
- "version_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": "string"
}
{- "id": 0,
- "name": "string",
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "version_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": ""
}
data_source_model_id required | integer (Data Source Model Id) |
{- "id": 0,
- "name": "string",
- "last_sync": "2019-08-24T14:15:22Z",
- "dataset_name": "string",
- "destination_name": "string",
- "dataset_id": "string",
- "unresolved_alerts": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active"
}
Get the history of the data source
data_source_model_id required | integer (Data Source Model Id) |
[- {
- "version_id": 0,
- "history_items": [
- {
- "action": "changed",
- "field_name": "string",
- "new_value": null,
- "old_value": null
}
], - "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff",
- "whodunit_first_name": "string",
- "whodunit_last_name": "string",
- "datetime": "string"
}
]
First, the client hits /datasets/[dataset_id]/data-sources/[data_source_model_id]/copy to get a data source model that is a copy of the one passed, with some default values, ready for the same or a different dataset_id. Once the user fills in those values, a POST request is made to actually create the new data source model.
data_source_model_id required | integer (Data Source Model Id) |
name required | string (Name) |
dataset_name | string (Dataset Name) |
settings_model_id required | integer (Settings Model Id) |
dataset_id | integer (Dataset Id) |
data_source_type | string (DataSourceType) Enum: "s3" "minio" "sftp" "gs" "dropbox" "stream" "custom" An enumeration. |
schedule | string (Schedule) |
state | string (DataSourceState) Enum: "active" "draft" "disabled" "deleted" An enumeration. |
connection_timeout | integer (Connection Timeout) |
delete_source_file_after_backup | boolean (Delete Source File After Backup) |
encoding_probably | string (Encoding Probably) |
identify_header_by_column_names | Array of strings (Identify Header By Column Names) |
raw_headers | Array of strings (Raw Headers) |
chunk_size_to_read_for_header | integer (Chunk Size To Read For Header) |
bad_rows_count | integer (Bad Rows Count) |
pattern | string (Pattern) |
archive_pattern | string (Archive Pattern) |
csv_delimiter | string (Csv Delimiter) |
csv_quotechar | string (Csv Quotechar) |
csv_escapechar | string (Csv Escapechar) |
csv_lineterminator | string (Csv Lineterminator) |
s3_access_key | string (S3 Access Key) |
s3_secret_key | string (S3 Secret Key) |
s3_endpoint_url | string (S3 Endpoint Url) |
s3_region_name | string (S3 Region Name) |
s3_default_buffer_size | integer (S3 Default Buffer Size) |
s3_bucket | string (S3 Bucket) |
s3_prefix | string (S3 Prefix) |
sftp_host | string (Sftp Host) |
sftp_port | integer (Sftp Port) |
sftp_user | string (Sftp User) |
sftp_password | string (Sftp Password) |
sftp_ssh_key | string (Sftp Ssh Key) |
sftp_ssh_key_passphrase | string (Sftp Ssh Key Passphrase) |
sftp_folder | string (Sftp Folder) |
gs_service_account_key | string (Gs Service Account Key) |
gs_bucket | string (Gs Bucket) |
gs_prefix | string (Gs Prefix) |
gpg_private_key | string (Gpg Private Key) |
gpg_passphrase | string (Gpg Passphrase) |
file_password | string (File Password) |
dropbox_access_token | string (Dropbox Access Token) |
dropbox_folder | string (Dropbox Folder) |
ignore_column_ids | Array of strings (Ignore Column Ids) |
excel_sheet_name | string (Excel Sheet Name) |
datetime_formats_per_column_override | object (Datetime Formats Per Column Override) |
should_reprocess_override | boolean (Should Reprocess Override) |
xls_date_mode_override | string (XLSDateMode) Enum: "mode_0" "mode_1" An enumeration. |
null_values_override | Array of strings (Null Values Override) |
null_values_per_column_override | object (Null Values Per Column Override) |
boolean_true_override | Array of strings (Boolean True Override) |
boolean_false_override | Array of strings (Boolean False Override) |
datetime_allowed_characters_override | string (Datetime Allowed Characters Override) |
datetime_formats_override | Array of strings (Datetime Formats Override) |
array_delimiters_override | Array of strings (Array Delimiters Override) |
field_name_full_conversion_override | object (Field Name Full Conversion Override) |
field_name_part_conversion_override | object (Field Name Part Conversion Override) |
ignore_lines_that_include_only_subset_of_characters_override | Array of strings (Ignore Lines That Include Only Subset Of Characters Override) |
ignore_lines_that_include_only_subset_of_words_override | Array of strings (Ignore Lines That Include Only Subset Of Words Override) |
ignore_matchers_override | object (Ignore Matchers Override) |
ignore_column_names_override | Array of strings (Ignore Column Names Override) |
use_text_instead_of_string_override | boolean (Use Text Instead Of String Override) |
quarantine_row_if_row_level_issue_override | boolean (Quarantine Row If Row Level Issue Override) |
trim_string_instead_of_raising_err_override | boolean (Trim String Instead Of Raising Err Override) |
fields_to_expand_override | Array of strings (Fields To Expand Override) |
infer_datetime_for_columns_override | Array of strings (Infer Datetime For Columns Override) |
redirect_post_submit_to | string (Redirect Post Submit To) |
{- "name": "string",
- "dataset_name": "string",
- "settings_model_id": 0,
- "dataset_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": "string"
}
{- "id": 0,
- "name": "string",
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "version_id": 0,
- "data_source_type": "s3",
- "schedule": "string",
- "state": "active",
- "connection_timeout": 0,
- "delete_source_file_after_backup": true,
- "encoding_probably": "string",
- "identify_header_by_column_names": [
- "string"
], - "raw_headers": [
- "string"
], - "chunk_size_to_read_for_header": 0,
- "bad_rows_count": 0,
- "pattern": "string",
- "archive_pattern": "string",
- "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "s3_access_key": "string",
- "s3_secret_key": "string",
- "s3_endpoint_url": "string",
- "s3_region_name": "string",
- "s3_default_buffer_size": 0,
- "s3_bucket": "string",
- "s3_prefix": "string",
- "sftp_host": "string",
- "sftp_port": 0,
- "sftp_user": "string",
- "sftp_password": "string",
- "sftp_ssh_key": "string",
- "sftp_ssh_key_passphrase": "string",
- "sftp_folder": "string",
- "gs_service_account_key": "string",
- "gs_bucket": "string",
- "gs_prefix": "string",
- "gpg_private_key": "string",
- "gpg_passphrase": "string",
- "file_password": "string",
- "dropbox_access_token": "string",
- "dropbox_folder": "string",
- "ignore_column_ids": [
- "string"
], - "excel_sheet_name": "string",
- "datetime_formats_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "should_reprocess_override": true,
- "xls_date_mode_override": "mode_0",
- "null_values_override": [
- "string"
], - "null_values_per_column_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "boolean_true_override": [
- "string"
], - "boolean_false_override": [
- "string"
], - "datetime_allowed_characters_override": "string",
- "datetime_formats_override": [
- "string"
], - "array_delimiters_override": [
- "string"
], - "field_name_full_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "field_name_part_conversion_override": {
- "property1": "string",
- "property2": "string"
}, - "ignore_lines_that_include_only_subset_of_characters_override": [
- "string"
], - "ignore_lines_that_include_only_subset_of_words_override": [
- "string"
], - "ignore_matchers_override": {
- "property1": [
- "string"
], - "property2": [
- "string"
]
}, - "ignore_column_names_override": [
- "string"
], - "use_text_instead_of_string_override": true,
- "quarantine_row_if_row_level_issue_override": true,
- "trim_string_instead_of_raising_err_override": true,
- "fields_to_expand_override": [
- "string"
], - "infer_datetime_for_columns_override": [
- "string"
], - "redirect_post_submit_to": ""
}
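The copy-then-create flow for data sources can be sketched as a small helper that turns a copied model plus the user's edits into a create payload. This is only an illustration under stated assumptions: `build_create_payload` is a hypothetical name, and the list of dropped fields comes from comparing the request and response samples above (`id`, `version_id`, and `created_at` appear in the response but not in the request).

```python
from typing import Any, Dict

# Fields that appear in the response sample but not in the create request sample.
SERVER_ASSIGNED_FIELDS = ("id", "version_id", "created_at")
# Body fields marked "required" in the parameter list.
REQUIRED_FIELDS = ("name", "settings_model_id")


def build_create_payload(copied: Dict[str, Any], user_values: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the user's edits into the copied model and drop server-assigned fields."""
    payload = {k: v for k, v in copied.items() if k not in SERVER_ASSIGNED_FIELDS}
    payload.update(user_values)
    missing = [f for f in REQUIRED_FIELDS if payload.get(f) in (None, "")]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return payload
```

The resulting dictionary is what a client would send in the POST request that creates the new data source model.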
data_source_model_id required | integer (Data Source Model Id) |
name required | string (Name) |
{- "name": "string"
}
{- "result": "string"
}
Get the list of stored items (files)
start | integer (Start) |
end | integer (End) |
reverse | boolean (Reverse) Default: true |
limit | integer (Limit) Default: 20 |
data_source_model_id | integer (Data Source Model Id) |
parent_id | integer (Parent Id) |
is_uploaded_via_signed_url | boolean (Is Uploaded Via Signed Url) |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "key": "string",
- "file_type": "snappy",
- "data_source_model_id": 0,
- "data_source_model_name": "string",
- "bad_rows_count": 0,
- "successful_rows_count": 0,
- "ignored_rows_count": 0,
- "ignore_file": false,
- "is_ingested": false,
- "is_uploaded_via_signed_url": false,
- "created_at": "2019-08-24T14:15:22Z"
}
], - "filters": {
- "settings_model_id": 0,
- "data_source_model_id": 0,
- "is_uploaded_via_signed_url": true,
- "parent_id": 0,
- "ignore_file": true
}
}
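The list response above carries `next` and `previous` objects alongside `results`, which suggests a simple paging loop. The sketch below rests on two assumptions that the reference does not state: that `next` is null or absent on the last page, and that its non-null values can be sent back as query parameters. `fetch_page` is a hypothetical callable that performs the GET request for a query string and returns the parsed JSON.

```python
from typing import Any, Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def page_query(cursor: Optional[Dict[str, Any]], **filters: Any) -> str:
    """Build a query string from the filters plus the cursor of the page to fetch."""
    merged = {**filters, **(cursor or {})}
    params = {
        # Booleans are sent in lowercase, as in the reverse=false example.
        name: (str(value).lower() if isinstance(value, bool) else value)
        for name, value in merged.items()
        if value is not None
    }
    return urlencode(params)


def iter_results(
    fetch_page: Callable[[str], Dict[str, Any]], **filters: Any
) -> Iterator[Dict[str, Any]]:
    """Yield every item across pages by following the `next` cursor."""
    cursor: Optional[Dict[str, Any]] = None
    while True:
        page = fetch_page(page_query(cursor, **filters))
        yield from page.get("results", [])
        cursor = page.get("next")
        if not cursor:  # Assumption: no `next` means this was the last page.
            return
```

Injecting `fetch_page` keeps the paging logic independent of any particular HTTP client or authentication scheme.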
To upload a file directly to AWS S3 or Google Cloud Storage, call this endpoint. It creates a Stored Item (File) record on Qluster and returns a pre-signed URL that can be used to upload the file directly to S3 or Google Cloud Storage.
data_source_model_id required | integer (Data Source Model Id) |
key required | string (Key) The key on the backup storage where the file will be put using the pre-signed URL. |
{- "data_source_model_id": 0,
- "key": "string"
}
{- "id": 0,
- "url": "string",
- "http_method": "GET",
- "fields": {
- "property1": "string",
- "property2": "string"
}, - "msg": ""
}
Modify the information about one stored item (file)
For example, the user might decide to ignore a file. That means all the rows of the file need to be deleted from the destination and the signature of the file needs to be added to the files to be ignored in the future.
id required | integer (Id) |
ignore_file required | boolean (Ignore File) Should this file be ignored? If yes and the file is already ingested, its ingested rows are hidden and all the bad rows and alerts associated with it are deleted. |
{- "id": 0,
- "ignore_file": true
}
{- "id": 0,
- "key": "string",
- "file_type": "snappy",
- "data_source_model_id": 0,
- "data_source_model_name": "string",
- "bad_rows_count": 0,
- "successful_rows_count": 0,
- "ignored_rows_count": 0,
- "ignore_file": true,
- "is_ingested": false,
- "is_uploaded_via_signed_url": false,
- "created_at": "2019-08-24T14:15:22Z",
- "parent_id": 0,
- "backup_key": "string",
- "backup_settings_id": 0,
- "encoding": "string",
- "dataset_id": 0,
- "signature": "string",
- "other_names": [
- "string"
], - "duplicate_of_id": 0,
- "ignored_rows_line_number_to_signature": {
- "property1": "string",
- "property2": "string"
}, - "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "is_backup_encrypted": true,
- "compression_type_of_backup_data": "gzip",
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff"
}
Get detailed information of one stored item (file)
Why do we call files stored items? Because file was the name of a built-in type in Python 2.
stored_item_id required | integer (Stored Item Id) |
{- "id": 0,
- "key": "string",
- "file_type": "snappy",
- "data_source_model_id": 0,
- "data_source_model_name": "string",
- "bad_rows_count": 0,
- "successful_rows_count": 0,
- "ignored_rows_count": 0,
- "ignore_file": true,
- "is_ingested": false,
- "is_uploaded_via_signed_url": false,
- "created_at": "2019-08-24T14:15:22Z",
- "parent_id": 0,
- "backup_key": "string",
- "backup_settings_id": 0,
- "encoding": "string",
- "dataset_id": 0,
- "signature": "string",
- "other_names": [
- "string"
], - "duplicate_of_id": 0,
- "ignored_rows_line_number_to_signature": {
- "property1": "string",
- "property2": "string"
}, - "csv_delimiter": "string",
- "csv_quotechar": "string",
- "csv_escapechar": "string",
- "csv_lineterminator": "string",
- "is_backup_encrypted": true,
- "compression_type_of_backup_data": "gzip",
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff"
}
Get the pre-signed link to download the stored item directly from AWS S3 or Google Cloud Storage.
Once you have the link, you can do a GET request on the link to receive the file.
stored_item_id required | integer (Stored Item Id) |
{- "id": 0,
- "url": "string",
- "http_method": "GET",
- "fields": {
- "property1": "string",
- "property2": "string"
}, - "msg": ""
}
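Fetching the file from the pre-signed link can be sketched with the standard library alone. `download_stored_item` is a hypothetical helper; it assumes the response shape shown above and that the link needs no extra headers, since the text only says a GET request on the link returns the file.

```python
import shutil
import urllib.request
from typing import Any, Dict


def download_stored_item(signed: Dict[str, Any], dest_path: str) -> None:
    """Stream the file behind the pre-signed link to `dest_path`."""
    method = signed.get("http_method", "GET")
    if method != "GET":
        raise ValueError(f"expected a GET link, got {method}")
    # Stream in chunks so large files are not held in memory all at once.
    with urllib.request.urlopen(signed["url"]) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Pre-signed links usually expire, so a client would typically request the link right before downloading rather than storing it.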
When a pre-signed URL from Qluster is used to upload a file to AWS S3 or Google Cloud Storage, Qluster has no way of knowing whether the user has finished uploading the file. Once the user has successfully uploaded the file, they can use this endpoint to notify Qluster to take over.
This endpoint creates an ingestion job for the stored item (file) and puts the file on the queue to be ingested.
stored_item_id required | integer (Stored Item Id) |
{- "ingestion_job_id": 0
}
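The three steps of a direct upload (request a pre-signed URL, upload the file, notify Qluster) can be sketched as one function. Everything here is illustrative: the three callables are hypothetical stand-ins for the HTTP calls, so no endpoint paths are assumed, and the sketch assumes the `id` returned with the pre-signed URL is the stored item ID expected by the notification endpoint.

```python
from typing import Any, Callable, Dict


def upload_and_queue(
    create_upload_url: Callable[[int, str], Dict[str, Any]],
    upload: Callable[[Dict[str, Any], bytes], None],
    notify_uploaded: Callable[[int], Dict[str, Any]],
    data_source_model_id: int,
    key: str,
    content: bytes,
) -> int:
    """Run the direct-upload flow and return the ingestion job ID."""
    # Step 1: create the stored item record and receive the pre-signed URL.
    signed = create_upload_url(data_source_model_id, key)
    # Step 2: send the bytes straight to S3 or Google Cloud Storage.
    upload(signed, content)
    # Step 3: tell Qluster the upload is complete so the file is queued for ingestion.
    job = notify_uploaded(signed["id"])  # Assumption: `id` is the stored item ID.
    return job["ingestion_job_id"]
```

Keeping the notification as a separate final step mirrors the reason given above: only the client knows when the upload has finished.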
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
job_name | string (Job Name) |
module | string (Module) |
deleted | boolean (Deleted) Default: false |
reverse | boolean (Reverse) Default: true |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "count": 0,
- "job_name": "string",
- "error": "string",
- "module": "string",
- "msg": "string",
- "dataset_id": 0,
- "created_at": "2019-08-24T14:15:22Z"
}
], - "filters": {
- "job_name": "string",
- "deleted": false,
- "module": false
}, - "title": ""
}
Creates new fatal errors.
Example request:
{ "job_name": "sensor-1", "module": "sensor", "data_source_model_id": 1, "job_type": "job", "error": "InvalidCron", "msg": "Fix it", "meta_data": { "foo": "bar" } }
job_name | string (Job Name) The name of whatever encountered this error, e.g. sensor-1. The name corresponds to the job name in Kubernetes. |
job_type | string Enum: "job" "cronjob" The type of job: cronjob or job. |
error required | string (Error) The error used to aggregate errors, for example AccessDeniedError. Note that these are actual error class names with some optional details and are different from alert types. Some of these could be related to alert types. |
module required | string (Module) The module that raised the error. Example: sensor |
msg required | string (Msg) The msg of the error. |
stack_trace | string (Stack Trace) Default: "" The stack trace of the exception if any. |
data_source_model_id | integer (Data Source Model Id) The data_source_model_id |
settings_model_id | integer (Settings Model Id) The settings_model_id |
alert_item_id | integer (Alert Item Id) The alert item id. |
stored_item_id | integer (Stored Item Id) The stored item id |
meta_data | object (Meta Data) Other meta_data about the error. |
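A client-side check of a fatal error payload against the parameter list above might look like the following. This is a sketch, not the server's validation: `build_fatal_error` is a hypothetical helper, and it only covers the required fields, the `job_type` enum, and the documented default for `stack_trace`.

```python
from typing import Any, Dict

REQUIRED_FIELDS = ("error", "module", "msg")
JOB_TYPES = ("job", "cronjob")


def build_fatal_error(**fields: Any) -> Dict[str, Any]:
    """Validate the documented constraints and return the request body."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    job_type = fields.get("job_type")
    if job_type is not None and job_type not in JOB_TYPES:
        raise ValueError(f"job_type must be one of {JOB_TYPES}")
    payload: Dict[str, Any] = {"stack_trace": ""}  # Documented default.
    payload.update(fields)
    return payload


payload = build_fatal_error(
    job_name="sensor-1",
    module="sensor",
    data_source_model_id=1,
    job_type="job",
    error="InvalidCron",
    msg="Fix it",
    meta_data={"foo": "bar"},
)
```

The call reuses the values from the example request, so the returned dictionary matches it with the default `stack_trace` added.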
{- "job_name": "string",
- "job_type": "job",
- "error": "string",
- "module": "string",
- "msg": "string",
- "stack_trace": "",
- "data_source_model_id": 0,
- "settings_model_id": 0,
- "alert_item_id": 0,
- "stored_item_id": 0,
- "meta_data": {
- "property1": "string",
- "property2": "string"
}
}
{- "job_name": "string",
- "job_type": "job",
- "error": "string",
- "module": "string",
- "msg": "string",
- "stack_trace": "",
- "data_source_model_id": 0,
- "settings_model_id": 0,
- "alert_item_id": 0,
- "stored_item_id": 0,
- "meta_data": {
- "property1": "string",
- "property2": "string"
}, - "id": 0,
- "count": 0,
- "dataset_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "deleted": true
}
{- "job_name": "string",
- "job_type": "job",
- "error": "string",
- "module": "string",
- "msg": "string",
- "stack_trace": "",
- "data_source_model_id": 0,
- "settings_model_id": 0,
- "alert_item_id": 0,
- "stored_item_id": 0,
- "meta_data": {
- "property1": "string",
- "property2": "string"
}, - "id": 0,
- "count": 0,
- "dataset_id": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "deleted": true
}
Deletes fatal errors.
id | integer (Id) |
job_name | string (Job Name) |
job_type | string (JobType) Enum: "job" "cronjob" An enumeration. |
error | string (Error) |
{- "id": 0,
- "job_name": "string",
- "job_type": "job",
- "error": "string"
}
null
start | integer (Start) |
end | integer (End) |
reverse | boolean (Reverse) Default: true |
limit | integer (Limit) Default: 20 |
state | string (MigrationState) Enum: "draft" "applied" "current" "deleted" An enumeration. |
settings_model_id | integer (Settings Model Id) |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "state": "draft",
- "confirmed": true,
- "down_id": 0,
- "updated_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "dataset_name": "string"
}
]
}
{- "id": 0,
- "state": "draft",
- "confirmed": true,
- "down_id": 0,
- "updated_at": "2019-08-24T14:15:22Z",
- "settings_model_id": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "model": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "display_model": [
- [
- true
]
], - "column_overrides_info": {
- "property1": {
- "property1": true,
- "property2": true
}, - "property2": {
- "property1": true,
- "property2": true
}
}, - "format_info": {
- "property1": {
- "property1": "string",
- "property2": "string"
}, - "property2": {
- "property1": "string",
- "property2": "string"
}
}, - "created_at": "2019-08-24T14:15:22Z"
}
This is currently only used when a draft migration is created and the user can rename some fields.
When an unexpected field name alert is created, the field renames, ignored fields, and so on are handled via the alert controller instead.
Here we are not exactly resolving an "Alert"; we are just renaming some fields.
dataset_id required | integer (Dataset Id) |
renames | object (Renames) Rename this field to another field. |
field_name_to_type_suggestion | object (Field Name To Type Suggestion) Suggest a database field type change. If Qluster finds it compatible with your data, it will use this override. |
{- "renames": {
- "property1": "string",
- "property2": "string"
}, - "field_name_to_type_suggestion": {
- "property1": {
- "type_override": "String",
- "args_override": [
- 0
], - "is_percent_override": true,
- "is_dollar_override": true
}, - "property2": {
- "type_override": "String",
- "args_override": [
- 0
], - "is_percent_override": true,
- "is_dollar_override": true
}
}
}
{- "result": "string"
}
If there is no existing draft migration, it creates one that matches the current migration so the user can get to the migration review page.
dataset_id required | integer (Dataset Id) |
{- "alert_item_id": 0
}
Get the list of sets of backup settings.
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
- "email": "user@example.com",
- "first_name": "string",
- "last_name": "string",
- "is_active": true,
- "is_verified": true,
- "user_type": "superuser",
- "created_at": "2019-08-24T14:15:22Z",
- "updated_at": "2019-08-24T14:15:22Z"
}
], - "filters": {
- "settings_model_id": 0
}, - "title": ""
}
Create a new backup settings model
backup_gpg_public_key | string (Backup Gpg Public Key) |
backup_gpg_recipient | string (Backup Gpg Recipient) |
backup_gpg_private_key | string (Backup Gpg Private Key) |
backup_gpg_passphrase | string (Backup Gpg Passphrase) |
backup_type required | string (BackupType) Enum: "s3" "minio" "gs" An enumeration. |
backup_s3_access_key | string (Backup S3 Access Key) |
backup_s3_secret_key | string (Backup S3 Secret Key) |
backup_s3_endpoint_url | string (Backup S3 Endpoint Url) |
backup_s3_region_name | string (Backup S3 Region Name) |
backup_s3_default_buffer_size | integer (Backup S3 Default Buffer Size) |
backup_s3_bucket | string (Backup S3 Bucket) |
backup_gs_service_account_key | string (Backup Gs Service Account Key) |
backup_gs_bucket | string (Backup Gs Bucket) |
redirect_post_submit_to | string (Redirect Post Submit To) |
{- "backup_gpg_public_key": "string",
- "backup_gpg_recipient": "string",
- "backup_gpg_private_key": "string",
- "backup_gpg_passphrase": "string",
- "backup_type": "s3",
- "backup_s3_access_key": "string",
- "backup_s3_secret_key": "string",
- "backup_s3_endpoint_url": "string",
- "backup_s3_region_name": "string",
- "backup_s3_default_buffer_size": 0,
- "backup_s3_bucket": "string",
- "backup_gs_service_account_key": "string",
- "backup_gs_bucket": "string",
- "redirect_post_submit_to": "string"
}
{- "id": 0,
- "version_id": 0,
- "backup_gpg_public_key": "string",
- "backup_gpg_recipient": "string",
- "backup_gpg_private_key": "string",
- "backup_gpg_passphrase": "string",
- "backup_type": "s3",
- "backup_s3_access_key": "string",
- "backup_s3_secret_key": "string",
- "backup_s3_endpoint_url": "string",
- "backup_s3_region_name": "string",
- "backup_s3_default_buffer_size": 0,
- "backup_s3_bucket": "string",
- "backup_gs_service_account_key": "string",
- "backup_gs_bucket": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": "",
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff"
}
Patch an existing backup settings model.
id required | integer (Id) |
version_id required | integer (Version Id) |
backup_gpg_public_key | string (Backup Gpg Public Key) |
backup_gpg_recipient | string (Backup Gpg Recipient) |
backup_gpg_private_key | string (Backup Gpg Private Key) |
backup_gpg_passphrase | string (Backup Gpg Passphrase) |
backup_type | string (BackupType) Enum: "s3" "minio" "gs" An enumeration. |
backup_s3_access_key | string (Backup S3 Access Key) |
backup_s3_secret_key | string (Backup S3 Secret Key) |
backup_s3_endpoint_url | string (Backup S3 Endpoint Url) |
backup_s3_region_name | string (Backup S3 Region Name) |
backup_s3_default_buffer_size | integer (Backup S3 Default Buffer Size) |
backup_s3_bucket | string (Backup S3 Bucket) |
backup_gs_service_account_key | string (Backup Gs Service Account Key) |
backup_gs_bucket | string (Backup Gs Bucket) |
redirect_post_submit_to | string (Redirect Post Submit To) |
{- "id": 0,
- "version_id": 0,
- "backup_gpg_public_key": "string",
- "backup_gpg_recipient": "string",
- "backup_gpg_private_key": "string",
- "backup_gpg_passphrase": "string",
- "backup_type": "s3",
- "backup_s3_access_key": "string",
- "backup_s3_secret_key": "string",
- "backup_s3_endpoint_url": "string",
- "backup_s3_region_name": "string",
- "backup_s3_default_buffer_size": 0,
- "backup_s3_bucket": "string",
- "backup_gs_service_account_key": "string",
- "backup_gs_bucket": "string",
- "redirect_post_submit_to": "string"
}
{- "id": 0,
- "version_id": 0,
- "backup_gpg_public_key": "string",
- "backup_gpg_recipient": "string",
- "backup_gpg_private_key": "string",
- "backup_gpg_passphrase": "string",
- "backup_type": "s3",
- "backup_s3_access_key": "string",
- "backup_s3_secret_key": "string",
- "backup_s3_endpoint_url": "string",
- "backup_s3_region_name": "string",
- "backup_s3_default_buffer_size": 0,
- "backup_s3_bucket": "string",
- "backup_gs_service_account_key": "string",
- "backup_gs_bucket": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": "",
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff"
}
Get the backup settings with the default values. These values are displayed to the user so they can modify them to create a new set of backup settings.
{- "id": 0,
- "version_id": 0,
- "backup_gpg_public_key": "string",
- "backup_gpg_recipient": "string",
- "backup_gpg_private_key": "string",
- "backup_gpg_passphrase": "string",
- "backup_type": "s3",
- "backup_s3_access_key": "string",
- "backup_s3_secret_key": "string",
- "backup_s3_endpoint_url": "string",
- "backup_s3_region_name": "string",
- "backup_s3_default_buffer_size": 0,
- "backup_s3_bucket": "string",
- "backup_gs_service_account_key": "string",
- "backup_gs_bucket": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": "",
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff"
}
Get one backup settings model by id
backup_settings_model_id required | integer (Backup Settings Model Id) |
{- "id": 0,
- "version_id": 0,
- "backup_gpg_public_key": "string",
- "backup_gpg_recipient": "string",
- "backup_gpg_private_key": "string",
- "backup_gpg_passphrase": "string",
- "backup_type": "s3",
- "backup_s3_access_key": "string",
- "backup_s3_secret_key": "string",
- "backup_s3_endpoint_url": "string",
- "backup_s3_region_name": "string",
- "backup_s3_default_buffer_size": 0,
- "backup_s3_bucket": "string",
- "backup_gs_service_account_key": "string",
- "backup_gs_bucket": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": "",
- "whodunit_id": "d26c4284-1039-4c18-ad5c-f54f942c5cff"
}
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
settings_model_id | integer (Settings Model Id) |
dataset_id | integer (Dataset Id) |
data_source_model_id | integer (Data Source Model Id) |
state | string (IngestionJobState) Enum: "created" "running" "blocked" "stopped" "error" "killed" "version_id_mismatch" "finished" "requeued" An enumeration. |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "key": "string",
- "state": "created",
- "alert_item_id": 0,
- "is_alert_resolved": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "ignored_rows_count": 0,
- "msg": "string",
- "try_count": 0,
- "updated_at": "2019-08-24T14:15:22Z"
}
], - "filters": {
- "settings_model_id": 0,
- "data_source_model_id": 0,
- "state": "created"
}, - "title": ""
}
ingestion_job_id required | integer (Ingestion Job Id) |
{- "id": 0,
- "state": "created",
- "key": "string",
- "stored_item_id": 0,
- "data_source_model_id": 0,
- "data_source_model_name": "string",
- "settings_model_id": 0,
- "alert_item_id": 0,
- "is_alert_resolved": true,
- "successful_rows_count": 0,
- "bad_rows_count": 0,
- "ignored_rows_count": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "msg": "string",
- "try_count": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "started_at": "2019-08-24T14:15:22Z",
- "finished_at": "2019-08-24T14:15:22Z"
}
Get the list of training data items.
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
dataset_id | integer (Dataset Id) |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "enabled": true,
- "name": "string",
- "stored_item_id": 0,
- "alert_items_count": 0,
- "settings_model_id": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z"
}
], - "filters": {
- "dataset_id": 0
}, - "title": ""
}
Get the detail view of one training data item.
A training data item represents a file that should be analyzed.
training_data_item_id required | integer (Training Data Item Id) |
{- "id": 0,
- "enabled": true,
- "name": "string",
- "stored_item_id": 0,
- "alert_items_ids": [
- 0
], - "settings_model_id": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "created_at": "2019-08-24T14:15:22Z",
- "updated_at": "2019-08-24T14:15:22Z"
}
Get the list of training jobs
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
dataset_id | integer (Dataset Id) |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "state": "created",
- "step": "analysis",
- "msg": "string",
- "settings_model_id": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "updated_at": "2019-08-24T14:15:22Z"
}
], - "filters": {
- "dataset_id": 0,
- "state": "created",
- "step": "analysis"
}, - "title": ""
}
Get the detailed information about a single training job.
training_job_id required | integer (Training Job Id) |
{- "id": 0,
- "state": "created",
- "step": "analysis",
- "msg": "string",
- "settings_model_id": 0,
- "migration_model_id": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "unresolved_alerts": 0,
- "analysis_tasks": [
- {
- "training_data_item_id": 0,
- "stored_item_key": "string",
- "state": "created",
- "msg": "string",
- "try_count": 0,
- "updated_at": "2019-08-24T14:15:22Z"
}
], - "created_at": "2019-08-24T14:15:22Z",
- "started_at": "2019-08-24T14:15:22Z",
- "finished_at": "2019-08-24T14:15:22Z"
}
Override a training job.
TODO: do we need to also publish on CUEUE_KILL to kill all the analysis jobs and other related jobs on k8s?
training_job_id required | integer (Training Job Id) |
state required | string (TrainingJobState) Enum: "created" "running" "blocked" "stopped" "partial_error" "error" "killed" "version_id_mismatch" "finished" "ignore" "requeued" The state of the training job |
{- "state": "created"
}
{- "id": 0,
- "state": "created",
- "step": "analysis",
- "msg": "string",
- "settings_model_id": 0,
- "migration_model_id": 0,
- "dataset_id": 0,
- "dataset_name": "string",
- "unresolved_alerts": 0,
- "analysis_tasks": [
- {
- "training_data_item_id": 0,
- "stored_item_key": "string",
- "state": "created",
- "msg": "string",
- "try_count": 0,
- "updated_at": "2019-08-24T14:15:22Z"
}
], - "created_at": "2019-08-24T14:15:22Z",
- "started_at": "2019-08-24T14:15:22Z",
- "finished_at": "2019-08-24T14:15:22Z"
}
Launch a training job.
A training job launches analysis tasks on every single file that we have ever used for training in a dataset before.
A file might be later disabled for training purposes in case it does not provide enough insights into what the dataset data looks like.
settings_model_id required | integer (Settings Model Id) |
{- "result": "string"
}
Get one analysis task
An analysis task represents a single run of the analyzer on the contents of a file.
During the analysis we try to find the major characteristics of data in the file.
training_job_id required | integer (Training Job Id) |
training_data_item_id required | integer (Training Data Item Id) |
{- "training_data_item_id": 0,
- "training_job_id": 0,
- "stored_item_id": 0,
- "stored_item_key": "string",
- "state": "created",
- "msg": "string",
- "settings_version_id": 0,
- "analyzed_content": { },
- "display_analyzed_content": [
- [
- true
]
], - "questionable_fields": { },
- "failed_to_infer_fields": [
- "string"
], - "try_count": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "finished_at": "2019-08-24T14:15:22Z"
}
Get the list of destinations
A destination is a database instance, for example a Postgres database running in the cloud.
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "name": "string",
- "destination_type": "postgresql",
- "updated_at": "2019-08-24T14:15:22Z"
}
], - "filters": {
- "destination_type": 0
}, - "title": ""
}
POST destination model info in order to create a new destination.
A destination refers to the credentials for a database instance, for example a Postgres database.
name | string (Name) |
destination_type required | string (DestinationType) Value: "postgresql" An enumeration. |
database_name required | string (Database Name) |
host | string (Host) |
user | string (User) |
port | string (Port) |
password | string (Password) |
connect_timeout | integer (Connect Timeout) |
redirect_post_submit_to | string (Redirect Post Submit To) |
{- "name": "string",
- "destination_type": "postgresql",
- "database_name": "string",
- "host": "string",
- "user": "string",
- "port": "string",
- "password": "string",
- "connect_timeout": 0,
- "redirect_post_submit_to": "string"
}
{- "id": 0,
- "name": "string",
- "destination_type": "postgresql",
- "database_name": "string",
- "updated_at": "2019-08-24T14:15:22Z",
- "host": "string",
- "user": "string",
- "port": "string",
- "password": "string",
- "connect_timeout": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": ""
}
Patch an existing destination to modify it.
id required | integer (Id) |
name | string (Name) |
database_name | string (Database Name) |
host | string (Host) |
user | string (User) |
port | string (Port) |
password | string (Password) |
connect_timeout | integer (Connect Timeout) |
redirect_post_submit_to | string (Redirect Post Submit To) |
{- "id": 0,
- "name": "string",
- "database_name": "string",
- "host": "string",
- "user": "string",
- "port": "string",
- "password": "string",
- "connect_timeout": 0,
- "redirect_post_submit_to": "string"
}
{- "id": 0,
- "name": "string",
- "destination_type": "postgresql",
- "database_name": "string",
- "updated_at": "2019-08-24T14:15:22Z",
- "host": "string",
- "user": "string",
- "port": "string",
- "password": "string",
- "connect_timeout": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": ""
}
Get a new destination object with the default values.
The user can then modify these values and submit them via a POST request
to actually create a new destination.
{- "id": 0,
- "name": "string",
- "destination_type": "postgresql",
- "database_name": "string",
- "updated_at": "2019-08-24T14:15:22Z",
- "host": "string",
- "user": "string",
- "port": "string",
- "password": "string",
- "connect_timeout": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": ""
}
Get the detailed information about a specific destination.
Note: all the sensitive data in this response will be obfuscated.
destination_id required | integer (Destination Id) |
{- "id": 0,
- "name": "string",
- "destination_type": "postgresql",
- "database_name": "string",
- "updated_at": "2019-08-24T14:15:22Z",
- "host": "string",
- "user": "string",
- "port": "string",
- "password": "string",
- "connect_timeout": 0,
- "created_at": "2019-08-24T14:15:22Z",
- "redirect_post_submit_to": ""
}
start | integer (Start) |
end | integer (End) |
limit | integer (Limit) Default: 20 |
reverse | boolean (Reverse) Default: true |
acknowledged | boolean (Acknowledged) |
q_name | string (Q Name) |
data_source_model_id | integer (Data Source Model Id) |
{- "next": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "previous": {
- "start": 0,
- "end": 0,
- "limit": 20
}, - "results": [
- {
- "id": 0,
- "enqueued_at": "2019-08-24T14:15:22Z",
- "dequeued_at": "2019-08-24T14:15:22Z",
- "expected_at": "2019-08-24T14:15:22Z",
- "schedule_at": "2019-08-24T14:15:22Z",
- "q_name": "string",
- "key": "string",
- "data": { },
- "acknowledged": true,
- "data_source_model_id": 0,
- "put_count": 0,
- "receive_count": 0
}
], - "filters": {
- "acknowledged": true,
- "q_name": "string",
- "data_source_model_id": 0
}, - "title": ""
}
{- "id": 0,
- "enqueued_at": "2019-08-24T14:15:22Z",
- "dequeued_at": "2019-08-24T14:15:22Z",
- "expected_at": "2019-08-24T14:15:22Z",
- "schedule_at": "2019-08-24T14:15:22Z",
- "q_name": "string",
- "key": "string",
- "data": { },
- "acknowledged": true,
- "data_source_model_id": 0,
- "put_count": 0,
- "receive_count": 0
}
Login
Takes the username and password to login.
It returns an OAuth2 token that can be used in the subsequent requests in order to authenticate.
The scope, client_id, and client_secret are not currently being used.
grant_type | string (Grant Type) password |
username required | string (Username) |
password required | string (Password) |
scope | string (Scope) Default: "" |
client_id | string (Client Id) |
client_secret | string (Client Secret) |
{- "access_token": "string",
- "token_type": "string",
- "redirect_post_submit_to": ""
}
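A minimal sketch of the login exchange from the client side. It assumes the fields are sent form-encoded, as is usual for the OAuth2 password flow, and that the returned token is presented as a Bearer token on later requests; neither detail is stated above. The helper names `build_login_form` and `auth_header` are hypothetical, and the endpoint URL is left out because it does not appear in this section.

```python
from urllib.parse import urlencode


def build_login_form(username, password):
    """Return a form-encoded body for the login request (hypothetical helper)."""
    if not username or not password:
        raise ValueError("username and password are both required")
    fields = {
        "grant_type": "password",  # the value shown in the parameter table
        "username": username,
        "password": password,
        "scope": "",  # documented default; not currently used
    }
    return urlencode(fields)


def auth_header(login_response):
    """Build a header from the login response (Bearer scheme is an assumption)."""
    return {"Authorization": "Bearer " + login_response["access_token"]}
```

The `client_id` and `client_secret` fields are omitted here because the description says they are not currently used.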
Create a new user
email required | string <email> (Email) |
raw_password2 required | string (Raw Password2) |
raw_password required | string (Raw Password) |
first_name required | string (First Name) |
last_name required | string (Last Name) |
is_active required | boolean (Is Active) |
is_verified required | boolean (Is Verified) |
user_type required | string (UserType) Enum: "superuser" "editor" "limited_to_source" "read_only" "billing" An enumeration. |
data_source_model_ids | Array of integers (Data Source Model Ids) [ items ] |
{- "email": "user@example.com",
- "raw_password2": "string",
- "raw_password": "string",
- "first_name": "string",
- "last_name": "string",
- "is_active": true,
- "is_verified": true,
- "user_type": "superuser",
- "data_source_model_ids": [
- 0
]
}
{- "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
- "email": "user@example.com",
- "first_name": "string",
- "last_name": "string",
- "is_active": true,
- "is_verified": true,
- "user_type": "superuser",
- "created_at": "2019-08-24T14:15:22Z",
- "updated_at": "2019-08-24T14:15:22Z",
- "password": "string",
- "last_failed_logins_at": [
- "2019-08-24T14:15:22Z"
], - "data_source_model_ids": [
- 0
], - "created_by_user_id": "string",
- "created_by_user_email": "string",
- "redirect_post_submit_to": ""
}
Patch an existing user.
Pydantic validates and deserializes the request, but the model we get back has every original attribute, even those that were not part of the request. Hence we use the original request object to grab only what is needed in order to patch the dataset object.
email | string <email> (Email) |
raw_password2 | string (Raw Password2) |
raw_password | string (Raw Password) |
first_name | string (First Name) |
last_name | string (Last Name) |
is_active | boolean (Is Active) |
is_verified | boolean (Is Verified) |
user_type | string (UserType) Enum: "superuser" "editor" "limited_to_source" "read_only" "billing" An enumeration. |
data_source_model_ids | Array of integers (Data Source Model Ids) [ items ] |
id required | string <uuid> (Id) |
{- "email": "user@example.com",
- "raw_password2": "string",
- "raw_password": "string",
- "first_name": "string",
- "last_name": "string",
- "is_active": true,
- "is_verified": true,
- "user_type": "superuser",
- "data_source_model_ids": [
- 0
], - "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08"
}
{- "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
- "email": "user@example.com",
- "first_name": "string",
- "last_name": "string",
- "is_active": true,
- "is_verified": true,
- "user_type": "superuser",
- "created_at": "2019-08-24T14:15:22Z",
- "updated_at": "2019-08-24T14:15:22Z",
- "password": "string",
- "last_failed_logins_at": [
- "2019-08-24T14:15:22Z"
], - "data_source_model_ids": [
- 0
], - "created_by_user_id": "string",
- "created_by_user_email": "string",
- "redirect_post_submit_to": ""
}
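The patching note for this endpoint, that only attributes present in the original request are applied, can be illustrated with plain dictionaries. This is a sketch of the idea under that reading, not the server's code; `apply_patch` and the sample values are hypothetical.

```python
def apply_patch(stored, raw_request, read_only=("id",)):
    """Return a copy of `stored` updated with only the keys the client sent."""
    patched = dict(stored)
    for key, value in raw_request.items():
        if key in read_only:
            continue  # the id selects the object; it is not overwritten
        patched[key] = value
    return patched


stored_user = {
    "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
    "first_name": "Ada",
    "last_name": "Lovelace",
    "is_active": True,
}
# A fully deserialized model would also carry every attribute the client did
# not send; the raw request body holds only what was actually sent.
raw_request = {"id": "497f6eca-6276-4993-bfeb-53cbbbba6f08", "is_active": False}
patched_user = apply_patch(stored_user, raw_request)
```

Here only `is_active` changes; `first_name` and `last_name` keep their stored values because they were absent from the request.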
Get one user
user_id required | string (User Id) |
{- "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
- "email": "user@example.com",
- "first_name": "string",
- "last_name": "string",
- "is_active": true,
- "is_verified": true,
- "user_type": "superuser",
- "created_at": "2019-08-24T14:15:22Z",
- "updated_at": "2019-08-24T14:15:22Z",
- "password": "string",
- "last_failed_logins_at": [
- "2019-08-24T14:15:22Z"
], - "data_source_model_ids": [
- 0
], - "created_by_user_id": "string",
- "created_by_user_email": "string",
- "redirect_post_submit_to": ""
}
This gets the already ingested or quarantine data for the dataset.
Note that depending on the user access level, certain parts of data may or may not be available to the user.
For example, users who are limited to certain data sources can only see the data of each data source and file
they are associated with but not the combined data from all the data sources they are associated with.
Examples
/datasets/1/data?order_by=total_cost
will get the data for the dataset_id of 1 as long as the authenticated user has access to this data.
/datasets/1/data?is_bad_data=true
will get the quarantine data for the dataset_id of 1.
dataset_id required | integer (Dataset Id) |
page | integer (Page) Default: 1 |
limit | integer (Limit) Default: 20 |
data_source_model_id | integer (Data Source Model Id) |
stored_item_id | integer (Stored Item Id) |
is_bad_data | boolean (Is Bad Data) Default: false |
order_by | Array of strings (Order By) Default: ["~id"] |
{- "title": "",
- "total_page_count": 0,
- "total_rows": 0,
- "dataset_name": "string",
- "fields": [
- "string"
], - "results": [
- null
], - "filters": {
- "page": 0,
- "stored_item_id": 0,
- "data_source_model_id": 0,
- "is_bad_data": false,
- "limit": 20,
- "order_by": [
- "~id"
]
}
}
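The example URLs for this view can be assembled from the filter parameters as follows. This is a sketch: `dataset_data_path` is a hypothetical helper, and sending list values such as `order_by` as a repeated query key is an assumption about how the server parses arrays.

```python
from urllib.parse import urlencode


def dataset_data_path(dataset_id, **filters):
    """Build the path and query string for the dataset data view."""
    query = []
    for name, value in filters.items():
        if value is None:
            continue  # omit unset filters so the server defaults apply
        if isinstance(value, bool):
            value = "true" if value else "false"  # lowercase, as in the examples
        query.append((name, value))
    path = f"/datasets/{dataset_id}/data"
    # doseq=True repeats the key once per element for list values
    return path + ("?" + urlencode(query, doseq=True) if query else "")
```

For instance, `dataset_data_path(1, is_bad_data=True)` yields the quarantine URL shown in the examples.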
The view to ignore one or more line signatures in the form of
tuples of data source model id and the line signature.
By doing this, in the same request, multiple lines from multiple data sources
can be ignored.
Such a request first adds the signatures to the list of signatures to be ignored in the future.
Then it puts these signatures on the queue to be deleted from the quarantine table if they are found in that table.
The table will be updated once deleting is completed.
Examples
POST /datasets/1/ignore_line_signature
body = {'data_source_model_to_line_signatures': [
[2, 'ABC'], [1, 'XYZ']
]}
Will add the ABC line signature to the list of signatures to be ignored for data source model 2.
It also adds the XYZ line signature to the list of signatures to be ignored for data source model 1.
The line signatures are returned via the _signature field in the get data view when is_bad_data=true is set in the filters.
dataset_id required | integer (Dataset Id) |
data_source_model_to_line_signatures required | Array of Array of any (Data Source Model To Line Signatures) [ items 2 items [ items ] ] |
{- "data_source_model_to_line_signatures": [
- [
- 0,
- "string"
]
]
}
{- "result": "string"
}
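The request body from the example can be built from pairs of data source model ID and line signature. This is a minimal sketch; `build_ignore_payload` is a hypothetical helper and the type check is an illustrative choice, not documented server behaviour.

```python
import json


def build_ignore_payload(pairs):
    """Build the request body from (data_source_model_id, line_signature) pairs."""
    items = []
    for data_source_model_id, signature in pairs:
        if not isinstance(data_source_model_id, int):
            raise TypeError("data_source_model_id must be an integer")
        # each inner item has exactly two elements: the ID, then the signature
        items.append([data_source_model_id, str(signature)])
    return {"data_source_model_to_line_signatures": items}


payload = build_ignore_payload([(2, "ABC"), (1, "XYZ")])
body = json.dumps(payload)  # JSON text for POST /datasets/1/ignore_line_signature
```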
The view to reprocess one or more bad rows in the quarantine table based on the bad rows IDs.
Note that the request is limited to the dataset passed in the URL.
Such a request puts these ids on the queue to be reprocessed.
The table will be updated asynchronously once the reprocessing is done.
Example
POST /datasets/2/reprocess-bad-rows
body = {'ids': [3, 4]}
Will reprocess the rows with IDs 3 and 4 in the quarantine table for the dataset with ID 2.
dataset_id required | integer (Dataset Id) |
ids required | Array of integers (Ids) [ items ] |
{- "ids": [
- 0
]
}
{- "result": "string"
}
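A small sketch of preparing the reprocess request shown in the example. `build_reprocess_request` is a hypothetical helper; removing duplicate IDs and sorting them before sending is an illustrative choice, not something the API requires.

```python
import json


def build_reprocess_request(dataset_id, ids):
    """Return (path, JSON body) for reprocessing quarantine rows of one dataset."""
    ids = list(ids)
    if not ids:
        raise ValueError("at least one bad row ID is required")
    if not all(isinstance(i, int) and not isinstance(i, bool) for i in ids):
        raise TypeError("bad row IDs must be integers")
    unique_ids = sorted(set(ids))  # list each row once
    path = f"/datasets/{dataset_id}/reprocess-bad-rows"
    return path, json.dumps({"ids": unique_ids})
```

Because the table is updated asynchronously, a client would need to fetch the quarantine data again later to see the outcome.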
Get the validation rules only for one field in the dataset.
dataset_id required | integer (Dataset Id) |
field_name required | string (Field Name) |
{- "to_include": true,
- "is_required": false,
- "json_schema_field_type": "string",
- "limit_to_values": [
- "string"
], - "maximum": 0,
- "minimum": 0,
- "allowed_json_schema_field_type": [
- "string"
]
}
Update the validation rules only for one field in the dataset.
Example:
POST /datasets/1/validation_info/score
Body
{
'is_required': True,
'json_schema_field_type': 'number',
'minimum': -100,
'maximum': 200
}
Will update the validation information for the score field in the dataset with ID 1. Here we are setting the JSON Schema field type to number, with a maximum of 200 and a minimum of -100.
Under the hood Qluster uses JsonSchema for basic validation.
dataset_id required | integer (Dataset Id) |
field_name required | string (Field Name) |
to_include | boolean (To Include) Default: true |
is_required | boolean (Is Required) Default: false |
json_schema_field_type required | string (JsonSchemaFieldType) Enum: "string" "email" "integer" "number" "boolean" "array" "object" An enumeration. |
limit_to_values | Array of strings or integers or numbers or numbers (Limit To Values) [ items ] |
maximum | integer or number or number (Maximum) |
minimum | integer or number or number (Minimum) |
allowed_json_schema_field_type | Array of strings (JsonSchemaFieldType) Items Enum: "string" "email" "integer" "number" "boolean" "array" "object" |
{- "to_include": true,
- "is_required": false,
- "json_schema_field_type": "string",
- "limit_to_values": [
- "string"
], - "maximum": 0,
- "minimum": 0,
- "allowed_json_schema_field_type": [
- "string"
]
}
{- "result": "string"
}
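Since the description says JSON Schema is used for basic validation, the following sketch shows how a field rule like the one in the example could translate into a JSON Schema fragment. The mapping itself is an assumption, including the treatment of `email` as a string format and of `to_include` as dropping the field; it is not Qluster's actual implementation. `rule_to_schema` and `dataset_schema` are hypothetical names.

```python
def rule_to_schema(rule):
    """Translate one field's validation rule into a JSON Schema fragment."""
    field_type = rule["json_schema_field_type"]
    if field_type == "email":
        # JSON Schema has no "email" type; it is a format of the string type
        schema = {"type": "string", "format": "email"}
    else:
        schema = {"type": field_type}
    for bound in ("minimum", "maximum"):
        if rule.get(bound) is not None:
            schema[bound] = rule[bound]
    if rule.get("limit_to_values"):
        schema["enum"] = list(rule["limit_to_values"])
    return schema


def dataset_schema(rules_by_field):
    """Combine per-field rules into one object schema (assumed layout)."""
    included = {n: r for n, r in rules_by_field.items() if r.get("to_include", True)}
    return {
        "type": "object",
        "properties": {n: rule_to_schema(r) for n, r in included.items()},
        "required": [n for n, r in included.items() if r.get("is_required")],
    }


score_rule = {
    "is_required": True,
    "json_schema_field_type": "number",
    "minimum": -100,
    "maximum": 200,
}
schema = dataset_schema({"score": score_rule})
```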
Get the validation rules for a specific data source
Example
GET /data-sources/1/validation-model
will get the validation model for data source 1.
Response:
The first line is the header row. The remaining lines are the values.
['field_name', 'field_type_in_db', 'validation_type', 'is_required', 'minimum', 'maximum', 'the_only_acceptable_values'],
['score', 'SmallInteger', 'integer', False, False, 100, None],
['active', 'boolean', 'boolean', False, None, None, None],
['street', 'text', 'string', True, None, None, None],
['zipcode', 'string', 'string', True, None, None, [90007, 90414, 90064]],
['image_urls', 'Array', 'array', False, None, None, ['url one', 'url two', 'url three']],
['attributes', 'Json', None, False, None, None, None],
data_source_model_id required | integer (Data Source Model Id) |
[- [
- true
]
]
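Because the response is a list of rows whose first row is the header, a client may find it convenient to turn it into one record per field. A minimal sketch using a shortened copy of the example response above; `rows_to_records` is a hypothetical helper.

```python
def rows_to_records(table):
    """Pair each value row with the header row to get one dict per field."""
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


table = [
    ["field_name", "field_type_in_db", "validation_type", "is_required",
     "minimum", "maximum", "the_only_acceptable_values"],
    ["score", "SmallInteger", "integer", False, False, 100, None],
    ["street", "text", "string", True, None, None, None],
]
records = rows_to_records(table)
# Names of the fields marked as required in the validation model
required_fields = [r["field_name"] for r in records if r["is_required"]]
```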