Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mixpanel: Filtering out individual items based on datetime in state #15091

Merged
merged 10 commits into from
Aug 23, 2022

Conversation

Kopiczek
Copy link

What

Mixpanel export API allow to filter based on DATE (without the time portion).
This means that if a sync is run multiple times a day, we keep duplicating the same data over and over.
This change filters the records after getting them from source.

How

This is done on export stream only by comparing the cursor field of each record to stream state.

Recommended reading order

  1. exports.py
  2. base.py

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Tests

Unit

Test that verifies that a record is filtered by state.

Integration

N/A

Acceptance

N/A

@CLAassistant
Copy link

CLAassistant commented Jul 27, 2022

CLA assistant check
All committers have signed the CLA.

@marcosmarxm
Copy link
Member

marcosmarxm commented Aug 1, 2022

/test connector=connectors/source-mixpanel

🕑 connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2777526749
❌ connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2777526749
🐛 https://gradle.com/s/5dr52h6j75y2i

Build Failed

Test summary info:

Could not find result summary

@sajarin sajarin added the bounty-S Maintainer program: claimable small bounty PR label Aug 5, 2022
Kris added 2 commits August 8, 2022 15:11
# Conflicts:
#	airbyte-integrations/connectors/source-mixpanel/source_mixpanel/streams/export.py
@Kopiczek
Copy link
Author

Kopiczek commented Aug 8, 2022

/test connector=connectors/source-mixpanel

@Kopiczek
Copy link
Author

Kopiczek commented Aug 9, 2022

@marcosmarxm am I doing something wrong trying to initialize those test you did?

@marcosmarxm
Copy link
Member

@Kopiczek sorry the delay here, I'll take a look.

@Kopiczek
Copy link
Author

Kopiczek commented Aug 9, 2022

/test connector=connectors/source-mixpanel

@marcosmarxm
Copy link
Member

marcosmarxm commented Aug 10, 2022

/test connector=connectors/source-mixpanel

🕑 connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2833069734
❌ connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2833069734
🐛 https://gradle.com/s/ncqdmsxtyb752

Build Failed

Test summary info:

	 =========================== short test summary info ============================
	 FAILED unit_tests/test_streams.py::test_export_stream - assert 1 == 0
	 �[31m================== �[31m�[1m1 failed�[0m, �[32m22 passed�[0m, �[33m138 warnings�[0m�[31m in 0.56s�[0m�[31m ==================�[0m

Copy link
Member

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test are broken @Kopiczek please implement a new unit test

@Kopiczek
Copy link
Author

Kopiczek commented Aug 10, 2022

Unit test are broken @Kopiczek please implement a new unit test

@marcosmarxm This should be working now. All unit tests passing locally

@Kopiczek Kopiczek requested a review from marcosmarxm August 10, 2022 23:58
@sajarin sajarin added internal and removed bounty bounty-S Maintainer program: claimable small bounty PR labels Aug 11, 2022
@marcosmarxm
Copy link
Member

marcosmarxm commented Aug 11, 2022

/test connector=connectors/source-mixpanel

🕑 connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2843095431
❌ connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2843095431
🐛 https://gradle.com/s/bkvtammb7ejlc

Build Failed

Test summary info:

	 =========================== short test summary info ============================
	 FAILED unit_tests/test_streams.py::test_export_stream_request_params - assert...
	 �[31m================== �[31m�[1m1 failed�[0m, �[32m23 passed�[0m, �[33m140 warnings�[0m�[31m in 0.56s�[0m�[31m ==================�[0m

@Kopiczek
Copy link
Author

Sorry for all the back and forth.
Since the code treats the long date from source as current timezone we are doing the same when going from cursor to long
https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-mixpanel/source_mixpanel/streams/export.py#L122-L124

We do the same, forgot to include that in the test and hardcoded my local timezone into expected result

@marcosmarxm
Copy link
Member

marcosmarxm commented Aug 12, 2022

/test connector=connectors/source-mixpanel

🕑 connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2846757501
✅ connectors/source-mixpanel https://github.com/airbytehq/airbyte/actions/runs/2846757501
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        77      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  77     17    78%
source_acceptance_test/tests/test_core.py              307    106    65%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  970    246    75%
Name                                         Stmts   Miss  Cover
----------------------------------------------------------------
source_mixpanel/streams/revenue.py              14      0   100%
source_mixpanel/streams/funnels.py              56      0   100%
source_mixpanel/streams/annotations.py           6      0   100%
source_mixpanel/streams/__init__.py              9      0   100%
source_mixpanel/property_transformation.py      19      0   100%
source_mixpanel/__init__.py                      2      0   100%
source_mixpanel/source.py                       45      1    98%
source_mixpanel/streams/engage.py               89      5    94%
source_mixpanel/streams/base.py                 71      7    90%
source_mixpanel/streams/cohorts.py              20      3    85%
source_mixpanel/streams/export.py               51      8    84%
source_mixpanel/streams/cohort_members.py       19      5    74%
source_mixpanel/testing.py                      25     11    56%
----------------------------------------------------------------
TOTAL                                          426     40    91%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_incremental.py:23: `future_state_path` not specified, skipping
================== 26 passed, 1 skipped in 402.92s (0:06:42) ===================

@Kopiczek
Copy link
Author

@marcosmarxm is there anything else you need for this to be merged?

@marcosmarxm
Copy link
Member

@roman-yermilov-gl can you do the final review here?

@Kopiczek
Copy link
Author

@marcosmarxm need your approval as well since you requested changes.

@marcosmarxm
Copy link
Member

@marcosmarxm need your approval as well since you requested changes.

@Kopiczek can you bump the Dockerfile version and documentation, I'll publish after it

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Aug 22, 2022
@Kopiczek
Copy link
Author

@marcosmarxm done

@marcosmarxm
Copy link
Member

marcosmarxm commented Aug 23, 2022

/publish connector=connectors/source-mixpanel

🕑 Publishing the following connectors:
connectors/source-mixpanel
https://github.com/airbytehq/airbyte/actions/runs/2908785371


Connector Did it publish? Were definitions generated?
connectors/source-mixpanel

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@marcosmarxm marcosmarxm merged commit a4916a4 into airbytehq:master Aug 23, 2022
@Kopiczek Kopiczek deleted the mixpanel branch August 23, 2022 16:10
rodireich pushed a commit that referenced this pull request Aug 25, 2022
…15091)

* Mixpanel: Filtering out individual items based on datetime in state

* Mixpanel: Filtering out individual items based on datetime in state

* Mixpanel: Use where API option to further filter out events

* Fixing unit tests

* Fixing unit test timezone issues

* Version bump + docs

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation community connectors/source/mixpanel internal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants