-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎉Source HubSpot: Adds form_submission and property_history streams #7787
🎉Source HubSpot: Adds form_submission and property_history streams #7787
Conversation
…-stream-contact_form_submissions
…bmissions' of https://github.com/tinomerl/airbyte into tinomerl/6493-source-hubspot-add-stream-contact_form_submissions
…-stream-contact_form_submissions
…-stream-contact_form_submissions
…bmissions' of https://github.com/tinomerl/airbyte into tinomerl/6493-source-hubspot-add-stream-contact_form_submissions
…-stream-contact_form_submissions
Tino Merl seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
@@ -34,5 +34,5 @@ COPY source_hubspot ./source_hubspot | |||
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py" | |||
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"] | |||
|
|||
LABEL io.airbyte.version=0.1.23 | |||
LABEL io.airbyte.version=0.2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why changing to 0.2.0 and not 0.1.24?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcosmarxm thanks for the feedback. I bumped the minor version up since the two new streams add functionality and are not bugfixes.
…bmissions' of github.com:tinomerl/airbyte into marcos/test-pr-7787
@@ -316,7 +316,7 @@ def _field_to_datetime(value: Union[int, str]) -> pendulum.datetime: | |||
def _filter_old_records(self, records: Iterable) -> Iterable: | |||
"""Skip records that was updated before our start_date""" | |||
for record in records: | |||
updated_at = record[self.updated_at_field] | |||
updated_at = record.get(self.updated_at_field) or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the .get
method already use None as default
I tried to run the tests locally and didn't work. Can you share the results from your tests, check https://docs.airbyte.io/connector-development/testing-connectors/source-acceptance-tests-reference#architecture-of-standard-tests or the checklist Updating connector |
@marcosmarxm currently running in timeouts when running the acceptance tests for basic_read and full_refresh. everything else works fine. also fixed your remark with .get but didn't commit because of the current timeouts. i guess it's because there are no explicit endpoints for form submissions. the contact property changes could be reworked if i use the endpoint for recently changed contacts. but the problem with the form submissions is that in order to get the values for a form submission one needs first to get every form available and afterwards get the form submission data. currently there is no specific endpoint to get every form submission independently from the form. |
You can cache the response from forms? Or, you need to run a full refresh to forms? |
@marcosmarxm how would i do the caching? I reduced the time range which will be called. last week i tried with about a year. also removed the Unit tests
Integration Tests
acceptance Tests
|
…bmissions' of github.com:tinomerl/airbyte into marcos/test-pr-7787
Hi @tinomerl! |
page_filter = "vidOffset" | ||
|
||
def list(self, fields) -> Iterable: | ||
properties = self._api.get(f"/properties/v2/contact/properties") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t understand why we can’t just define the contact
for this stream as self.entity
and not override the list
method, it will be executed as in other streams, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entity can't be used because we have to call a legacy API endpoint. The newer API doesn't have the possibility to call for the history of a contact property.
And since the data is also saved in a two column way with property and value as their names the value column would hold different properties and have a mixed type.
yield from self.read(partial(self._api.get, url=self.url), params) | ||
|
||
def _transform(self, records: Iterable) -> Iterable: | ||
record: Dict[str, Dict] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
record: Dict[str, Dict] | |
I think that this line does not carry any information we need, it is rather superfluous than useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
params = { | ||
"limit": 50, | ||
} | ||
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params) | ||
for row in forms: | ||
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
params = { | |
"limit": 50, | |
} | |
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params) | |
for row in forms: | |
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params) | |
forms = self.read(partial(self._api.get, url="/marketing/v3/forms")) | |
for row in forms: | |
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}")) |
There is no need to explicitly pass only the limit
in the parameters, since we will get it from the class variable anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params) | ||
for row in forms: | ||
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params) | |
for row in forms: | |
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params) | |
forms_stream = FormStream(api=self._api, start_date=str(self._start_date)) | |
for row in forms_stream.list(fields={}): |
I think, it will be more clearly, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is better. i added it.
if key == "lastmodifieddate": | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please leave a comment as to why we are skipping processing for the lastmodifieddate
key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment is added.
}, | ||
"objectTypeId": { | ||
"value": { | ||
"type": ["null", "string"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are u sure, that field which consists Id
in name has string
type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah HubSpot returns a value which consists only of this string "0-1". Therefore String.
"type": ["null", "string"] | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please, add empty string in the end of file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
"source-type": { | ||
"type": ["null", "string"] | ||
}, | ||
"source-id": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question as here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A Property can either be changed by workflow, form or manully. The id for either is saved in the source-id
. The thin is, that for workflow the id is really a id as in integer. but the id of a form is a mixture of characters and numbers. Therefore it is safer to just cast this as a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comments.
Also, please, run command ./gradlew format
to format project. This will format the changes in the project to the required template.
NOTE: If the formatting process also affects files in which you did not make changes, please roll back the changes to those files to the state in which they were before. Thanks 😃
Hey @yevhenii-ldv thanks for the Feedback! I'll get to working to after the 19th since i'm not available next week. Just wanted to give you a heads up and let u know. :) |
@tinomerl I added the label |
Hey @yevhenii-ldv sorry for taking so long to work on your feedback. i implemented everything. The tests still throw the same error for contact lists mentioned here. But additionally i get a sequential read error for the property history stream. The problem lies in the calculated fields like
The second call will of course have a higher value for the same property. Is there a way to filter out a field if it contains a specific value? |
@tinomerl can you solve the conflicts? Other PR add the form submissions stream. |
…e-hubspot-add-stream-contact_form_submissions
@marcosmarxm solved the conflicts |
…bmissions' of github.com:tinomerl/airbyte into marcos/test-pr-7787
* github schema * GitHub dockerfile * formatted * 🎉Source HubSpot: Adds form_submission and property_history streams (#7787) * Began working on HubSpot Form Submission Connector * Added Property History Stream * Added form_guid to as value to form_submissions_stream. * Finalized the Form Submission Stream * Added documentation and test config * Corrected Version Number * updated version number to 0.1.25 * removed or none worked on tests * Changed code due to review comments & merges * readded Propertyhistory after merging * bump connector version Co-authored-by: Tino Merl <tino.merl@park-sieben.com> Co-authored-by: Marcos Marx <marcosmarxm@gmail.com> * bump connector version Co-authored-by: Tino Merl <35485536+tinomerl@users.noreply.github.com> Co-authored-by: Tino Merl <tino.merl@park-sieben.com> Co-authored-by: Marcos Marx <marcosmarxm@gmail.com>
What
How
Describe the solution
Recommended reading order
x.java
y.python
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/SUMMARY.md
docs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampledocs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing./publish
command described hereUpdating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing./publish
command described hereConnector Generator
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates
then checking in your changesThis change is