Skip to content
This repository was archived by the owner on May 26, 2022. It is now read-only.

Introspector in swarm #157

Merged

Conversation

aarshkshah1992
Copy link
Collaborator

@aarshkshah1992 aarshkshah1992 commented Jan 22, 2020

For libp2p/go-libp2p#775

Changes for swarm to be able to register as a data provider & then provide the data/metrics when it is asked to do so.

@aarshkshah1992 aarshkshah1992 force-pushed the feat/swarm-introspection branch from 68ba983 to 2669004 Compare January 23, 2020 10:28
@aarshkshah1992 aarshkshah1992 changed the base branch from master to feat/introspection January 23, 2020 12:02
Copy link
Member

@raulk raulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little to review here; looks sane so far.

swarm.go Outdated
@@ -9,6 +9,7 @@ import (
"sync/atomic"
"time"

coreit "github.com/libp2p/go-libp2p-core/introspection"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shorten introspection to introspect, and drop the import alias.

@aarshkshah1992 aarshkshah1992 changed the title Introspector in swarm [WIP] Introspector in swarm Jan 24, 2020
@aarshkshah1992 aarshkshah1992 changed the title [WIP] Introspector in swarm Introspector in swarm Jan 27, 2020
@aarshkshah1992
Copy link
Collaborator Author

aarshkshah1992 commented Jan 27, 2020

@raulk Here are the fields that still need some discussion before we implement sourcing for them:

Stream

  1. UserProvidedTags:

    • What does this mean ?
  2. Timeline

    • How do we introspect stream.CloseTime given the fact that we remove streams from the Swarm once they are fully closed or reset. This questions ALSO applies to connection.CloseTime
    • Should we use a timeCache for book-keeping non-active streams to solve this ?
  3. Traffic

    • How do we count the number of packets ?
    • Is the bandwidthCounter.RateIn/Rateout a good approximation for instantaneous bandwidth ? Maybe not. What would be a good way to measure it ?
  4. Status

    • Should we add a "half-closed" status or maybe even more granular "closedRead/closedWrite" states to the introspection protocol to capture the corresponding stream states ?
    • This suffers from the same problem as stream.CloseTime. Given that we remove fully closed/ resetted streams from the book-keeping, how would we show these states ?
  5. Latency

    • How do we source this ?

Connections

  1. RelayedOver

    • Need to understand how this works and where to source the Relay connection from
  2. TransportId

    • What does this represent ? Do we plan to add granular transport introspection to the protocol sometime in the future & is this the unique identifier for that ? Or does this resolve to the multiAddrCode for the transport protocol ?
  3. Status

    • Same problem as Stream.Status. How do we show states other than Open ?
  4. Attribs

    • How do we source the name of the muxer & the encryption method in the swarm ? Is it a good idea to do type assertions in the swarm by accessing the transport.CapableConn member of swarm.Conn ?
  5. Timeline

    • Outbound connections are created in Transport.Dial & upgraded by calling the upgrader from within Transport.Dial (let's assume the Transport is raw for this discussion)
    • Inbound connections are created & upgraded in transport-upgrader.listener.handleIncoming
    • In both cases, the swarm only ever sees the upgraded connection. How do we then source the connection creation & upgrade time in swarm ? How do we make these metrics flow from the transport/upgrader to the swarm ?

Copy link
Member

@raulk raulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few initial comments, mostly about style.

}

func (ci *connIntrospector) endPoints() *introspectpb.EndpointPair {
return &introspectpb.EndpointPair{SrcMultiaddr: ci.c.LocalMultiaddr().String(), DstMultiaddr: ci.c.RemoteMultiaddr().String()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: multi-line init is more readable in this case.

// TODO Number of packets & instantaneous bandwidth ?
// Should we have a separate "flow-metre" for a connection to prepare for a message oriented world ?
func (ci *connIntrospector) traffic() *introspectpb.Traffic {
if ci.s.bwc != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In these cases, prefer short-circuiting to reduce one level of nesting/indentation.

if ci.s.bwc == nil {
  return nil
}
<rest>


// IntrospectTraffic introspects & returns the overall traffic for this peer
func (s *Swarm) IntrospectTraffic() (*introspectpb.Traffic, error) {
if s.bwc != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverse the condition, and outdent the logic outside the if.

t := &introspectpb.Traffic{}
metrics := s.bwc.GetBandwidthTotals()

t.TrafficIn = &introspectpb.DataGauge{CumBytes: uint64(metrics.TotalIn), InstBw: uint64(metrics.RateIn)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: multi-line init is more readable.

swarm.conns.RLock()
defer swarm.conns.RUnlock()

containsId := func(ids []introspect.ConnectionID, id string) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More efficient to create a map[introspect.ConnectionID]struct{}, and make the predicate test membership. This is O(m*n) worst case scenario; map-based solution would be O(m) at the expense of a little extra memory.

package swarm_test

import (
"context"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regroup imports.

@@ -209,11 +222,13 @@ func (s *Swarm) addConn(tc transport.CapableConn, dir network.Direction) (*Conn,
}

// Wrap and register the connection.
id := uuid.NewV4()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An UUID is pretty inefficient. Requires string comparison. Why not an int32 atomic counter at the swarm level, which you increment with every new connection?

@aarshkshah1992
Copy link
Collaborator Author

@raulk I'm happy to refactor this PR along with implementing the stream/connection scoped metering if you like.

@raulk raulk merged commit 9685707 into libp2p:feat/introspection Feb 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants