Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add enforced HTTP/3 #1177

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Add enforced HTTP/3 #1177

wants to merge 1 commit into from

Conversation

h3xOo
Copy link

@h3xOo h3xOo commented Feb 12, 2025

I hit nullptr dereference, when I manually created MultiPerform, but I didn't add any Interceptors.

@COM8
Copy link
Member

COM8 commented Feb 12, 2025

@h3xOo thanks for contributing!
Do you mind sharing a bit of example code how you run into this bug?
Or even better could you add test case for it?

@h3xOo
Copy link
Author

h3xOo commented Feb 12, 2025

I really have no idea, why that happened.
I have code like:

mpr make_mpr(...)
    mpr mpr;
    for (...) {
        session = ...;
        mpr.AddSession(session);
    }
    return mpr;
}

...
auto mpr = make_mpr(...);
while (true) {
    auto responses = mpr.Get();
    ...
    mpr = make_mpr(...);
}

And Get segfaulted in MultiPerform::intercept(), because of null.
But what's funnier, when I had mpr.Get(); auto responses = mpr.Get(), then it segfaulted in some shared_ptr implementation 🙃

I now think, can it be connected to some move assignment?
I was using multiperform in loop, and first iteration passed, but second segfaulted.

Copy link
Member

@COM8 COM8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing the VERSION_3_0_ONLY feature!
I'm just not happy with your changes regarding the null interceptors.

Other than that LGTM.

Comment on lines 356 to 358
if (*current_interceptor_ == nullptr) {
return std::nullopt;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really not happy this this change. Although at first glance this makes the code more robust, it hides an underlying problem.
To me the only way something like this could ever happen is in case somebody appends a null pointer to the interceptor list (or a pointer that gets null).

In case this happens I would expect my code to segfault highlighting my faulty logic instead of hiding it even further.

@COM8 COM8 added this to the CPR 1.12.0 milestone Feb 16, 2025
@h3xOo
Copy link
Author

h3xOo commented Feb 18, 2025

This is reduced example of a bug (compile with -std=c++20) (make a lot of requests to a search random integer < 100 at google):

#include <array>
#include <cpr/cpr.h>
#include <iostream>
#include <memory>
#include <string_view>
#include <unordered_set>

using std::literals::operator""sv;
using std::literals::operator""s;

static constexpr auto ADDR = "https://google.com/search"sv;

static constexpr auto HTTP_VERSION = cpr::HttpVersionCode::VERSION_2_0_PRIOR_KNOWLEDGE;

template<typename T>
static cpr::MultiPerform make_session(T const &ids)
{
        cpr::MultiPerform multiperform;
        for (auto const &id : ids) {
                auto url = cpr::Url { ADDR };
                auto session = std::make_shared<cpr::Session>();
                session->SetUrl(url);
                session->SetParameters(cpr::Parameters { { "q", id } });
                session->SetHttpVersion(cpr::HttpVersion { HTTP_VERSION });
                multiperform.AddSession(session);
        }
        return multiperform;
}

static std::unordered_set<std::string> fetch_metadata_recursive(std::span<std::string> starting_ids)
{
        std::unordered_set<std::string> obtained { starting_ids.begin(), starting_ids.end() };
        auto layer = make_session(starting_ids);
        std::vector<std::string> next_layer;
        while (true) {
                std::vector<cpr::Response> responses = layer.Get();
                for (auto &response : responses) {
                        if (!cpr::status::is_success(response.status_code)) {
                                std::cout.flush();
                                throw std::runtime_error(std::format("not successful response on '{}' with code={} error={}",
                                                                     response.url.str(), response.status_code, response.error.message));
                        }
                        static constexpr size_t bound = 10;
                        for (size_t i = 0; i < bound; i++) {
                                auto id = std::to_string(std::rand() % 100);
                                if (obtained.contains(id))
                                        continue;
                                obtained.insert(id);
                                next_layer.push_back(id);
                        }
                }
                if (next_layer.empty())
                        break;
                layer = make_session(next_layer);
                next_layer.clear();
        }
        return obtained;
}

int main()
{
        std::array ids = { "219"s, "780"s, "726"s, "488"s, "270"s, "135"s, "929"s, "319"s, "261"s, "898"s };
        auto obtained = fetch_metadata_recursive(ids);
        return 0;
}

Gdb's backtrace shows cpr::MultiPerform::intercept()+84.

@COM8
Copy link
Member

COM8 commented Feb 19, 2025

@h3xOo thanks! To me this sounds like a race condition then. Let me investigate this over the weekend.

@COM8
Copy link
Member

COM8 commented Feb 23, 2025

I was able to reproduce it and I created a separate issue for it here: #1186

Workaround:
Wrap cpr::MultiPerform in a std::uniqe_ptr.

Example:

#include <array>
#include <iostream>
#include <memory>
#include <span>
#include <string_view>
#include <unordered_set>
#include <cpr/cpr.h>

using std::literals::operator""sv;
using std::literals::operator""s;

static constexpr auto ADDR = "https://google.com/search"sv;

static constexpr auto HTTP_VERSION = cpr::HttpVersionCode::VERSION_2_0_PRIOR_KNOWLEDGE;

template <typename T>
static std::unique_ptr<cpr::MultiPerform> make_session(T const& ids) {
    std::unique_ptr<cpr::MultiPerform> multiperform = std::make_unique<cpr::MultiPerform>();
    for (auto const& id : ids) {
        auto url = cpr::Url{ADDR};
        auto session = std::make_shared<cpr::Session>();
        session->SetUrl(url);
        session->SetParameters(cpr::Parameters{{"q", id}});
        session->SetHttpVersion(cpr::HttpVersion{HTTP_VERSION});
        multiperform->AddSession(session);
    }
    return multiperform;
}

static std::unordered_set<std::string> fetch_metadata_recursive(std::span<std::string> starting_ids) {
    std::unordered_set<std::string> obtained{starting_ids.begin(), starting_ids.end()};
    auto layer = make_session(starting_ids);
    std::vector<std::string> next_layer;
    while (true) {
        std::vector<cpr::Response> responses = layer->Get();
        for (auto& response : responses) {
            if (!cpr::status::is_success(response.status_code)) {
                std::cout.flush();
                throw std::runtime_error(std::format("not successful response on '{}' with code={} error={}",
                                                     response.url.str(), response.status_code, response.error.message));
            }
            static constexpr size_t bound = 10;
            for (size_t i = 0; i < bound; i++) {
                auto id = std::to_string(std::rand() % 100);
                if (obtained.contains(id))
                    continue;
                obtained.insert(id);
                next_layer.push_back(id);
            }
        }
        if (next_layer.empty())
            break;
        layer = make_session(next_layer);
        next_layer.clear();
    }
    return obtained;
}

int main() {
    std::array ids = {"219"s, "780"s, "726"s, "488"s, "270"s, "135"s, "929"s, "319"s, "261"s, "898"s};
    auto obtained = fetch_metadata_recursive(ids);
    return 0;
}

@h3xOo
Copy link
Author

h3xOo commented Feb 23, 2025

So should I change this PR to only add enforced HTTP/3 and remove workaround commit?

@COM8
Copy link
Member

COM8 commented Feb 24, 2025

Yes, please keep only the HTTP/3 stuff. Then we are ready to go.

@h3xOo h3xOo changed the title Add enforced HTTP/3 and don't dereference nullptr's Add enforced HTTP/3 Feb 25, 2025
Copy link
Member

@COM8 COM8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants