🐛 Source S3: work-around for format.delimiter change '\\t' -> '\t' #9163

Merged: 19 commits, Jan 6, 2022
@@ -2,7 +2,7 @@
 "sourceDefinitionId": "69589781-7828-43c5-9f63-8925b1c1ccc2",
 "name": "S3",
 "dockerRepository": "airbyte/source-s3",
-"dockerImageTag": "0.1.7",
+"dockerImageTag": "0.1.9",
 "documentationUrl": "https://docs.airbyte.io/integrations/sources/s3",
 "icon": "s3.svg"
 }
@@ -593,7 +593,7 @@
 - name: S3
 sourceDefinitionId: 69589781-7828-43c5-9f63-8925b1c1ccc2
 dockerRepository: airbyte/source-s3
-dockerImageTag: 0.1.8
+dockerImageTag: 0.1.9
 documentationUrl: https://docs.airbyte.io/integrations/sources/s3
 icon: s3.svg
 sourceType: file
5 changes: 3 additions & 2 deletions airbyte-config/init/src/main/resources/seed/source_specs.yaml
@@ -6047,7 +6047,7 @@
 path_in_connector_config:
 - "credentials"
 - "client_secret"
-- dockerImage: "airbyte/source-s3:0.1.8"
+- dockerImage: "airbyte/source-s3:0.1.9"
 spec:
 documentationUrl: "https://docs.airbyte.io/integrations/sources/s3"
 changelogUrl: "https://docs.airbyte.io/integrations/sources/s3"
@@ -6103,7 +6103,8 @@
 delimiter:
 title: "Delimiter"
 description: "The character delimiting individual cells in the CSV\
-\ data. This may only be a 1-character string."
+\ data. This may only be a 1-character string. For tab-delimited\
+\ data enter '\\t'."
 default: ","
 minLength: 1
 type: "string"
2 changes: 1 addition & 1 deletion airbyte-integrations/connectors/source-s3/Dockerfile
@@ -17,5 +17,5 @@ COPY source_s3 ./source_s3
 ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
 ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

-LABEL io.airbyte.version=0.1.8
+LABEL io.airbyte.version=0.1.9
 LABEL io.airbyte.name=airbyte/source-s3
@@ -46,7 +46,7 @@
 },
 "delimiter": {
 "title": "Delimiter",
-"description": "The character delimiting individual cells in the CSV data. This may only be a 1-character string.",
+"description": "The character delimiting individual cells in the CSV data. This may only be a 1-character string. For tab-delimited data enter '\\t'.",
 "default": ",",
 "minLength": 1,
 "type": "string"
@@ -3,7 +3,7 @@
 #


-from typing import Optional
+from typing import Any, Mapping, Optional

 from pydantic import BaseModel, Field

@@ -47,3 +47,9 @@ class SourceS3(SourceFilesAbstract):
     stream_class = IncrementalFileStreamS3
     spec_class = SourceS3Spec
     documentation_url = "https://docs.airbyte.io/integrations/sources/s3"
+
+    def read_config(self, config_path: str) -> Mapping[str, Any]:
+        config = super().read_config(config_path)
+        if config.get("format", {}).get("delimiter") == r"\t":
+            config["format"]["delimiter"] = "\t"
+        return config
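
The work-around above can be sketched without the Airbyte base classes. This is a minimal illustration, not the connector's code: `normalize_delimiter` is a hypothetical helper name, and the sketch returns a copy rather than mutating the config in place as `read_config` does. It shows that the JSON text `"\\t"` decodes to a two-character string (backslash followed by `t`), which is then mapped to a real tab.

```python
import json
from typing import Any, Dict, Mapping


def normalize_delimiter(config: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of config where a literal backslash-t delimiter becomes a tab.

    Hypothetical helper for illustration; the PR does this inside read_config.
    """
    result = dict(config)
    fmt = dict(result.get("format") or {})
    # r"\t" is the two-character string: a backslash followed by the letter t.
    if fmt.get("delimiter") == r"\t":
        fmt["delimiter"] = "\t"  # a single tab character
        result["format"] = fmt
    return result


# What a config file might contain if the user typed \t into a text field:
# the JSON escape "\\t" decodes to backslash + t, not to a tab.
raw = json.loads(r'{"format": {"delimiter": "\\t"}}')
fixed = normalize_delimiter(raw)

print(len(raw["format"]["delimiter"]))    # length of the value as entered
print(len(fixed["format"]["delimiter"]))  # length after normalization
```

Any other delimiter, or a config without a `format` section, passes through unchanged.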
@@ -21,7 +21,7 @@ class Config:
     delimiter: str = Field(
         default=",",
         min_length=1,
-        description="The character delimiting individual cells in the CSV data. This may only be a 1-character string.",
+        description="The character delimiting individual cells in the CSV data. This may only be a 1-character string. For tab-delimited data enter '\\t'.",
     )
     quote_char: str = Field(
         default='"', description="The character used optionally for quoting CSV values. To disallow quoting, make this field blank."
@@ -0,0 +1,16 @@
+#
+# Copyright (c) 2021 Airbyte, Inc., all rights reserved.
+#
+
+import json
+
+from source_s3 import SourceS3
+
+
+def test_transform_backslash_t_to_tab(tmp_path):
+    config_file = tmp_path / "config.json"
+    with open(config_file, "w") as fp:
+        json.dump({"format": {"delimiter": "\\t"}}, fp)
+    source = SourceS3()
+    config = source.read_config(config_file)
+    assert config["format"]["delimiter"] == "\t"
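
Why the two-character form needs converting can be illustrated with Python's standard `csv` module, which only accepts a 1-character delimiter, the same constraint the spec description states. This is an illustration under an assumption: the connector's own CSV parser is not shown in this diff and may reject the value differently. The `parse` helper and the sample data are made up for the example.

```python
import csv
import io

# Made-up tab-delimited sample data.
sample = "id\tname\n1\talice\n"


def parse(text: str, delimiter: str) -> list:
    """Parse text with the standard csv module using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# A real tab character is a valid 1-character delimiter.
rows = parse(sample, "\t")

# Backslash + t is two characters, so the csv module refuses it.
try:
    parse(sample, "\\t")
    error = None
except TypeError as exc:
    error = exc

print(rows)
print(type(error).__name__)
```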
1 change: 1 addition & 0 deletions docs/integrations/sources/s3.md
@@ -206,6 +206,7 @@ You can find details on [here](https://arrow.apache.org/docs/python/generated/py

 | Version | Date | Pull Request | Subject |
 | :--- | :--- | :--- | :--- |
+| 0.1.9 | 2022-01-06 | [9163](https://github.com/airbytehq/airbyte/pull/9163) | Work-around for the web UI: a literal `\t` (backslash followed by `t`) entered in the `format.delimiter` field is converted to a tab character. |
 | 0.1.7 | 2021-11-08 | [7499](https://github.com/airbytehq/airbyte/pull/7499) | Remove base-python dependencies |
 | 0.1.6 | 2021-10-15 | [6615](https://github.com/airbytehq/airbyte/pull/6615) & [7058](https://github.com/airbytehq/airbyte/pull/7058) | Memory and performance optimisation. Advanced options for CSV parsing. |
 | 0.1.5 | 2021-09-24 | [6398](https://github.com/airbytehq/airbyte/pull/6398) | Support custom non Amazon S3 services |