-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[hold for payment 2024-04-17] [$500] Improve performance of getPersonalDetailByEmail #38960
Comments
Triggered auto assignment to @joekaufmanexpensify ( |
Job added to Upwork: https://www.upwork.com/jobs/~014394e09fe6cfc357 |
Triggered auto assignment to Contributor-plus team member for initial proposal review - @rojiphil ( |
ProposalPlease re-state the problem that we are trying to solve in this issue.Improve performance of getPersonalDetailByEmail What is the root cause of that problem?Improve performance What changes do you think we should make in order to solve the problem?We will first set the map here where we set App/src/libs/PersonalDetailsUtils.ts Lines 24 to 25 in 0a3cd81
And we return personal detail result by using App/src/libs/PersonalDetailsUtils.ts Lines 80 to 81 in 0a3cd81
What alternative solutions did you explore? (Optional) |
ProposalPlease re-state the problem that we are trying to solve in this issue.Improve performance of getPersonalDetailByEmail What is the root cause of that problem?Here: App/src/libs/PersonalDetailsUtils.ts Lines 79 to 81 in fe0699c
as the issue suggest, we have to loop over all the items of personalDetails array to get the email that we want.
What changes do you think we should make in order to solve the problem?we can have 2 options i.e.
For implementing hashmap we can do something like this - Define
|
ProposalPlease re-state the problem that we are trying to solve in this issue.Improve performance of getPersonalDetailByEmail What is the root cause of that problem?
What changes do you think we should make in order to solve the problem?we can use a hashmap and have the email as a key and the personaldetail as a value let emailToDetailsCache: Record<string, PersonalDetails> = {}; and define function buildCache() {
emailToDetailsCache = personalDetails.reduce((acc: Record<string, PersonalDetails>, detail) => {
if (detail && detail.login) {
acc[detail.login.toLowerCase()] = detail;
}
return acc;
}, {});
} then inside the oynx.connect we should call the prev function: Onyx.connect({
key: ONYXKEYS.PERSONAL_DETAILS_LIST,
callback: (val) => {
personalDetails = Object.values(val ?? {});
allPersonalDetails = val;
buildCache(); // Rebuild the cache whenever personalDetails changes
},
}); finally we should use the new object: function getPersonalDetailByEmail(email) {
return emailToDetailsCache[email.toLowerCase()];
} |
updated proposal with explaination of 1st method |
Updated to add code implementation example |
Updated Proposal:Not a major change just fixed typescript errors. |
ProposalPlease re-state the problem that we are trying to solve in this issue.Improve performance of What is the root cause of that problem?Currently What changes do you think we should make in order to solve the problem?updating Oynx.connect to store data in const personalDetailsMap = new Map<string, PersonalDetails>();
Onyx.connect({
key: ONYXKEYS.PERSONAL_DETAILS_LIST,
callback: (val) => {
personalDetails = Object.values(val ?? {});
allPersonalDetails = val;
personalDetails.map((p) => p?.login && personalDetailsMap.set(p.login, p));
},
}); update function getPersonalDetailByEmail(email: string): PersonalDetails | undefined {
return personalDetailsMap.get(email);
} What alternative solutions did you explore? (Optional) |
ProposalPlease re-state the problem that we are trying to solve in this issue.Improve performance of getPersonalDetailByEmail What is the root cause of that problem?getPersonalDetailByEmail loop over every value in our personalDetails array and check which one matches with the email we want, making this a O(n) search. What changes do you think we should make in order to solve the problem?I think create let personalDetails: Array<PersonalDetails | null> = [];
let allPersonalDetails: OnyxEntry<PersonalDetailsList> = {};
+const emailToAccountIDMap: Record<string, number> = {};
Onyx.connect({
key: ONYXKEYS.PERSONAL_DETAILS_LIST,
callback: (val) => {
personalDetails = Object.values(val ?? {});
allPersonalDetails = val;
+ for (const personalDetail of personalDetails) {
+ if (personalDetail?.login){
+ emailToAccountIDMap[personalDetail.login] = personalDetail?.accountID;
+ }
+ }
},
}); And getPersonalDetailByEmail will be function getPersonalDetailByEmail(email: string): PersonalDetails | undefined {
if (!emailToAccountIDMap[email]){
return;
}
return (allPersonalDetails?.[emailToAccountIDMap[email]] ?? {}) as PersonalDetails;
} What alternative solutions did you explore? (Optional) |
Pending proposal review |
ProposalPlease re-state the problem that we are trying to solve in this issueWe aim to optimize the retrieval of personal data by email, overcoming the inefficiency of the current linear search method within a potentially large array. What is the root cause of that problem?The linear search approach has a time complexity of O(n), which leads to slow performance as the array's size increases. What changes do you think we should make in order to solve the problem?We can implement a caching mechanism using a hash map that stores each email as a unique key and its corresponding personal details as the value. Changed code+ // Hash map to maintain a cache of email to personal details
+ const emailToDetailsCache = new Map<string, PersonalDetails>();
+ // Function to populate the cache
+ function buildEmailToDetailsCache(allPersonalDetails: OnyxEntry<PersonalDetailsList>) {
+ // Update the hash map whenever there is a change
+ emailToDetailsCache.clear();
+ Object.values(allPersonalDetails ?? {}).forEach((detail) => {
+ if (!detail?.login) {
+ return;
+ }
+ emailToDetailsCache.set(detail.login, detail);
+ });
+ }
let personalDetails: Array<PersonalDetails | null> = [];
let allPersonalDetails: OnyxEntry<PersonalDetailsList> = {};
Onyx.connect({
key: ONYXKEYS.PERSONAL_DETAILS_LIST,
callback: (val) => {
+ buildEmailToDetailsCache(val);
personalDetails = Object.values(val ?? {});
allPersonalDetails = val;
},
});
function getPersonalDetailByEmail(email: string): PersonalDetails | undefined {
- return (Object.values(allPersonalDetails ?? {}) as PersonalDetails[]).find((detail) => detail?.login === email);
+ return emailToDetailsCache.get(email);
} On a HT account with a Average time Overall performance improvement from
Result screenshots (before / after) |
Update: I could not review the proposals today. Tomorrow, I will prioritize the proposal review and share the feedback. |
Please time this with at least 50,000 accounts |
@flodnv I would if I could, but I don't have access to such an account. |
Thanks all for your proposals. @abzokhattab proposal LGTM as that was the first one to have a more complete workable solution than others. As an optimization, we can populate |
Triggered auto assignment to @luacmartins, see https://stackoverflow.com/c/expensify/questions/7972 for more details. |
Yes @rojiphil I know we don't have to define |
📣 It's been a week! Do we have any satisfactory proposals yet? Do we need to adjust the bounty for this issue? 💸 |
📣 @rojiphil 🎉 An offer has been automatically sent to your Upwork account for the Reviewer role 🎉 Thanks for contributing to the Expensify app! |
📣 @abzokhattab 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app! Offer link |
@abzokhattab is there an ETA for PR here? |
The PR will be ready today. thanks Joe |
PR is out on staging |
PR on production. Automation didn't work but payment due in 2 days (2024-04-17) |
All set to issue payment here. We need to pay:
Noting that payment here is $500 each since this issue was opened prior to baseline being set at $250. |
@rojiphil please accept the upwork offer so we can pay you |
@abzokhattab $500 sent and contract ended! |
@joekaufmanexpensify Accepted offer. Thanks |
@rojiphil $500 sent and contract ended! |
Upwork job closed. |
All set, thanks everyone! |
Problem
We sometimes have the need to get personal data based on an email, currently the way to do it is calling
getPersonalDetailByEmail
which will loop over every value in our personalDetails array and check which one matches with the email we want, making this a O(n) search. The problem is that this can be really slow for policies with too many users creating a bad user experience.Solution
Use the data we have stored in personalDetails and create a cached map keyed by email so we can use that map in
getPersonalDetailByEmail
making the search O(1). We should also update the cache whenever personalDetails change.Upwork Automation - Do Not Edit
The text was updated successfully, but these errors were encountered: