    start working on building the real overseer (#1795) · 798f781f
    Peter Goodspeed-Niklaus authored
    
    
    * start working on building the real overseer
    
    Unfortunately, this does not compile right now because an upstream
    dependency fails to build, probably as a result of the recent upgrade
    to rustc v1.47.
    
    * fill in AllSubsystems internal constructors
    
    * replace fn make_metrics with Metrics::attempt_to_register
    
    * update to account for #1740
    
    * remove Metrics::register, rename Metrics::attempt_to_register
    
    * add 'static bounds to real_overseer type params
    
    * pass authority_discovery and network_service to real_overseer
    
    It's not obvious that this is the best way to handle the case where
    there is no authority discovery service, but it seems to be the best
    option available at the moment.
    
    * select a proper database configuration for the availability store db
    
    * use subdirectory for av-store database path
    
    * apply Basti's patch which avoids needing to parameterize everything on Block
    
    * simplify path extraction
    
    * get all tests to compile
    
    * Fix Prometheus double-registry error
    
    For debugging purposes, I added this to node/subsystem-util/src/lib.rs:472-476:
    
    ```rust
    Some(registry) => Self::try_register(registry).map_err(|err| {
    	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
    	err
    }),
    ```
    
    That pointed out where the registration was failing, which led to
    this fix. The test still doesn't pass, but it now fails in a new
    and different way!
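
    The same debugging pattern can be sketched in isolation. This is a
    minimal, std-only illustration, not the code in subsystem-util:
    `Registry`, `RegisterError` and `AvStoreMetrics` are hypothetical
    stand-ins, and only the `map_err` shape is taken from the snippet above.

    ```rust
    use std::any::type_name;

    /// Hypothetical stand-in for a metrics registry (not the prometheus type).
    #[derive(Default)]
    pub struct Registry {
        names: Vec<String>,
    }

    /// Hypothetical error type for this sketch.
    #[derive(Debug, PartialEq)]
    pub enum RegisterError {
        AlreadyRegistered(String),
    }

    impl Registry {
        /// Reject a second registration under the same name.
        pub fn register(&mut self, name: &str) -> Result<(), RegisterError> {
            if self.names.iter().any(|n| n == name) {
                return Err(RegisterError::AlreadyRegistered(name.to_string()));
            }
            self.names.push(name.to_string());
            Ok(())
        }
    }

    pub trait Metrics: Sized {
        fn try_register(registry: &mut Registry) -> Result<Self, RegisterError>;

        /// Log which concrete type failed, then pass the error on unchanged.
        fn register(registry: Option<&mut Registry>) -> Result<Option<Self>, RegisterError> {
            match registry {
                None => Ok(None),
                Some(registry) => Self::try_register(registry).map(Some).map_err(|err| {
                    eprintln!("error calling {}::register: {:?}", type_name::<Self>(), err);
                    err
                }),
            }
        }
    }

    /// Hypothetical metrics type that registers a single name.
    pub struct AvStoreMetrics;

    impl Metrics for AvStoreMetrics {
        fn try_register(registry: &mut Registry) -> Result<Self, RegisterError> {
            registry.register("av_store_counter")?;
            Ok(AvStoreMetrics)
        }
    }
    ```

    Registering `AvStoreMetrics` twice against the same `Registry` makes
    the second call print the failing type name to stderr and return the
    original error, which is what makes a double registration easy to spot.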
    
    * authorities must have authority discovery, but not necessarily overseer handlers
    
    * fix broken SpawnedSubsystem impls
    
    Detailed logging showed that, with the `Box::new` style of future
    construction, the `self.run` method was never called. The receivers
    for those subsystems were therefore dropped and their senders closed,
    which caused the overseer to shut down immediately.
    
    This is not the final fix needed to get things working properly,
    but it's a good start.
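
    The commit does not show the broken code, so the following is only a
    sketch of one common way a `run` method can end up never executing:
    the future it returns is created inside a wrapper but never awaited.
    `Subsystem`, `start_broken`, `start_fixed` and the tiny `block_on`
    executor are all hypothetical and use only the standard library.

    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::atomic::{AtomicBool, Ordering};
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};

    struct NoopWaker;

    impl Wake for NoopWaker {
        fn wake(self: Arc<Self>) {}
    }

    /// Minimal executor: polls in a loop. Only suitable for futures that
    /// complete without waiting on external events, as in this sketch.
    pub fn block_on<F: Future>(fut: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWaker));
        let mut cx = Context::from_waker(&waker);
        let mut fut = Box::pin(fut);
        loop {
            if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
                return out;
            }
        }
    }

    type BoxedFuture = Pin<Box<dyn Future<Output = ()>>>;

    /// Hypothetical subsystem that records whether `run` actually executed.
    pub struct Subsystem {
        pub ran: Arc<AtomicBool>,
    }

    impl Subsystem {
        async fn run(self) {
            self.ran.store(true, Ordering::SeqCst);
        }

        /// Broken: the future returned by `run` is dropped without being
        /// polled, so the body of `run` never executes.
        pub fn start_broken(self) -> BoxedFuture {
            Box::pin(async move {
                let _ = self.run();
            })
        }

        /// Fixed: the future returned by `run` is itself the spawned future.
        pub fn start_fixed(self) -> BoxedFuture {
            Box::pin(self.run())
        }
    }
    ```

    Driving `start_broken()` to completion leaves the `ran` flag unset,
    while `start_fixed()` sets it. In a real subsystem the dropped future
    would also drop whatever channel ends `run` owned.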
    
    * use prometheus properly
    
    Prometheus lets us register simple counters, which aren't very
    interesting. It also allows us to register CounterVecs, which are.
    With a CounterVec, you can provide a set of labels, which can
    later be used to filter the counts.
    
    We were using them incorrectly, though. This pattern was repeated in
    a variety of places in the code:
    
    ```rust
    // panics with a cardinality mismatch
    let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
    my_counter.with_label_values(&["succeeded"]).inc()
    ```
    
    The problem is that the strings passed to the constructor are not the
    set of legal label values. They are label names, and each name takes
    exactly one arbitrary value whenever a counter is looked up, so the
    number of values supplied must equal the number of names declared.
    
    This commit fixes that.
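
    To make the name/value distinction concrete, here is a toy, std-only
    model of a labelled counter family. It is not the prometheus crate's
    `CounterVec`; `ToyCounterVec` and the label name `"success"` are
    hypothetical, and the mismatch is reported as an `Err` here rather
    than a panic.

    ```rust
    use std::collections::HashMap;

    /// Toy model of a labelled counter family (not the prometheus type).
    pub struct ToyCounterVec {
        label_names: Vec<String>,
        counts: HashMap<Vec<String>, u64>,
    }

    impl ToyCounterVec {
        /// `label_names` declares the label dimensions, not their legal values.
        pub fn new(label_names: &[&str]) -> Self {
            Self {
                label_names: label_names.iter().map(|s| s.to_string()).collect(),
                counts: HashMap::new(),
            }
        }

        /// Increment the counter identified by one value per declared name.
        /// Returns an error on a cardinality mismatch.
        pub fn inc(&mut self, label_values: &[&str]) -> Result<(), String> {
            if label_values.len() != self.label_names.len() {
                return Err(format!(
                    "expected {} label values, got {}",
                    self.label_names.len(),
                    label_values.len()
                ));
            }
            let key: Vec<String> = label_values.iter().map(|s| s.to_string()).collect();
            *self.counts.entry(key).or_insert(0) += 1;
            Ok(())
        }

        /// Current count for a combination of label values (0 if never seen).
        pub fn get(&self, label_values: &[&str]) -> u64 {
            let key: Vec<String> = label_values.iter().map(|s| s.to_string()).collect();
            self.counts.get(&key).copied().unwrap_or(0)
        }
    }
    ```

    Declaring two names (`&["succeeded", "failed"]`) and then supplying a
    single value reproduces the mismatch. Declaring one name
    (`&["success"]`) and supplying `"succeeded"` or `"failed"` as its
    value gives two separately countable series.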
    
    * get av-store subsystem to actually run properly and not die on first signal
    
    * typo fix: incomming -> incoming
    
    * don't disable authority discovery in test nodes
    
    * Fix rococo-v1 missing session keys
    
    * Update node/core/av-store/Cargo.toml
    
    * try dummying out av-store on non-full-nodes
    
    * overseer and subsystems are required only for full nodes
    
    * Reduce the amount of warnings on browser target
    
    * Fix two more warnings
    
    * InclusionInherent should actually have an Inherent module on rococo
    
    * Ancestry: don't return genesis' parent hash
    
    * Update Cargo.lock
    
    * fix broken test
    
    * update test script: specify chainspec as script argument
    
    * Apply suggestions from code review
    
    Co-authored-by: Bastian Köcher <[email protected]>
    
    * Update node/service/src/lib.rs
    
    Co-authored-by: Bastian Köcher <[email protected]>
    
    * node/service/src/lib: Return error via ? operator
    
    * post-merge blues
    
    * add is_collator flag
    
    * prevent occasional av-store test panic
    
    * simplify fix; expand application
    
    * run authority_discovery in Role::Discover when collating
    
    * distinguish between proposer closed channel errors
    
    * add IsCollator enum, remove is_collator CLI flag
    
    * improve formatting
    
    * remove nop loop
    
    * Fix some stuff
    
    Co-authored-by: Andronik Ordian <[email protected]>
    Co-authored-by: Bastian Köcher <[email protected]>
    Co-authored-by: Fedor Sakharov <[email protected]>
    Co-authored-by: Robert Habermeier <[email protected]>
    Co-authored-by: Bastian Köcher <[email protected]>
    Co-authored-by: Max Inden <[email protected]>