Commit ae69d7a0 authored by asynchronous rob, committed by GitHub

guide: validation data refactoring (#1576)



* guide: validation data refactoring

* address grumbles from review

* Update roadmap/implementers-guide/src/types/candidate.md
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* last comments from review
Co-authored-by: Sergei Shulepov <sergei@parity.io>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
parent ae5990c7
Pipeline #103831 passed with stages
in 18 minutes and 45 seconds
......@@ -19,8 +19,8 @@
- [Validators](runtime-api/validators.md)
- [Validator Groups](runtime-api/validator-groups.md)
- [Availability Cores](runtime-api/availability-cores.md)
- [Global Validation Data](runtime-api/global-validation-data.md)
- [Local Validation Data](runtime-api/local-validation-data.md)
- [Persisted Validation Data](runtime-api/persisted-validation-data.md)
- [Full Validation Data](runtime-api/full-validation-data.md)
- [Session Index](runtime-api/session-index.md)
- [Validation Code](runtime-api/validation-code.md)
- [Candidate Pending Availability](runtime-api/candidate-pending-availability.md)
......
......@@ -26,7 +26,7 @@ On `ActiveLeavesUpdate`:
* Otherwise, for each `activated` head in the update:
* Determine if the para is scheduled or is next up on any occupied core by fetching the `availability_cores` Runtime API.
* Determine an occupied core assumption to make about the para. The simplest thing to do is to always assume that, if the para occupies a core, the candidate will become available. Further on, this might be determined based on bitfields seen or validator requests.
* Use the Runtime API subsystem to fetch the global validation data and local validation data.
* Use the Runtime API subsystem to fetch the full validation data.
* Construct validation function params based on validation data.
* Invoke the `collation_producer`.
* Construct a `CommittedCandidateReceipt` using the outputs of the `collation_producer` and signing with the `key`.
......
......@@ -19,8 +19,7 @@ Parachain candidates are validated against their validation function: A piece of
Upon receiving a validation request, the first thing the candidate validation subsystem should do is make sure it has all the necessary parameters to the validation function. These are:
* The Validation Function itself.
* The [`CandidateDescriptor`](../../types/candidate.md#candidatedescriptor).
* The [`LocalValidationData`](../../types/candidate.md#localvalidationdata).
* The [`GlobalValidationSchedule](../../types/candidate.md#globalvalidationschedule).
* The [`ValidationData`](../../types/candidate.md#validationdata).
* The [`PoV`](../../types/availability.md#proofofvalidity).
### Determining Parameters
......
......@@ -64,7 +64,7 @@ To determine availability:
- If the bitfields indicate availability and there is a scheduled `next_up_on_available`, then we can make an `OccupiedCoreAssumption::Included`.
- If the bitfields do not indicate availability, and there is a scheduled `next_up_on_time_out`, and `occupied_core.time_out_at == block_number_under_production`, then we can make an `OccupiedCoreAssumption::TimedOut`.
- If we did not make an `OccupiedCoreAssumption`, then continue on to the next core.
- Now compute the core's `validation_data_hash`: get the `LocalValidationData` from the runtime, given the known `ParaId` and `OccupiedCoreAssumption`; this can be combined with a cached `GlobalValidationData` to compute the hash.
- Now compute the core's `validation_data_hash`: get the `PersistedValidationData` from the runtime, given the known `ParaId` and `OccupiedCoreAssumption`, and hash it.
- Find an appropriate candidate for the core.
- There are two constraints: `backed_candidate.candidate.descriptor.para_id == scheduled_core.para_id && candidate.candidate.descriptor.validation_data_hash == computed_validation_data_hash`.
- In the event that more than one candidate meets the constraints, selection between the candidates is arbitrary. However, not more than one candidate can be selected per core.
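The two selection constraints above can be sketched as a simple filter. This is a minimal illustration, not the provisioner's actual code: `ParaId`, `Hash`, `CandidateDescriptor` and `BackedCandidate` are simplified stand-ins for the guide's types, and `select_candidate` is a hypothetical helper name. Since selection among matching candidates is arbitrary, the sketch simply takes the first match.

```rust
/// Simplified stand-ins for the guide's types (assumptions for illustration).
type ParaId = u32;
type Hash = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct CandidateDescriptor {
    para_id: ParaId,
    validation_data_hash: Hash,
}

#[derive(Clone, Debug, PartialEq)]
struct BackedCandidate {
    descriptor: CandidateDescriptor,
}

/// Returns at most one candidate for the core: the first one whose para ID
/// and validation data hash both match. Taking the first match is only one
/// possible policy, since the choice between matching candidates is arbitrary.
fn select_candidate<'a>(
    candidates: &'a [BackedCandidate],
    scheduled_para: ParaId,
    computed_validation_data_hash: &Hash,
) -> Option<&'a BackedCandidate> {
    candidates.iter().find(|c| {
        c.descriptor.para_id == scheduled_para
            && &c.descriptor.validation_data_hash == computed_validation_data_hash
    })
}
```

Returning an `Option` makes the "not more than one candidate per core" rule explicit in the type.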
......
......@@ -37,3 +37,19 @@ The next sections will contain information on specific runtime APIs. The format
/// best for the implementation to return an error indicating the failure mode.
fn some_runtime_api(at: Block, arg1: Type1, arg2: Type2, ...) -> ReturnValue;
```
Certain runtime APIs concerning the state of a para require the caller to provide an `OccupiedCoreAssumption`. This indicates how the result of the runtime API should be computed if there is a candidate from the para occupying an availability core in the [Inclusion Module](../runtime/inclusion.md).
The choices of assumption are whether the candidate occupying that core should be assumed to have been made available and included or timed out and discarded, along with a third option to assert that the core was not occupied. This choice affects everything from the parent head-data and the validation code to the state of message-queues. Typically, users will take the assumption that either the core was free or that the occupying candidate was included, as timeouts are expected only in adversarial circumstances and, even so, only in a small minority of blocks directly following validator set rotations.
```rust
/// An assumption being made about the state of an occupied core.
enum OccupiedCoreAssumption {
/// The candidate occupying the core was made available and included to free the core.
Included,
/// The candidate occupying the core timed out and freed the core without advancing the para.
TimedOut,
/// The core was not occupied to begin with.
Free,
}
```
\ No newline at end of file
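As a rough illustration of how an `OccupiedCoreAssumption` might influence a result, the sketch below resolves the parent head-data of a para. `ParaState` and `parent_head` are hypothetical names, and the behaviour for `Included` with no pending candidate is an assumption made here, not something the guide specifies. The `None` case mirrors the runtime API doc comments, which return `None` when the core is asserted free but the para occupies one.

```rust
type HeadData = Vec<u8>;

#[derive(Clone, Copy, Debug, PartialEq)]
enum OccupiedCoreAssumption {
    Included,
    TimedOut,
    Free,
}

/// Hypothetical, simplified view of a para's state in the runtime.
struct ParaState {
    /// Head-data of the most recently included candidate.
    current_head: HeadData,
    /// Head-data of the candidate pending availability, if the para occupies a core.
    pending_head: Option<HeadData>,
}

/// Resolve which parent head a new candidate would build on.
fn parent_head(state: &ParaState, assumption: OccupiedCoreAssumption) -> Option<HeadData> {
    match (assumption, &state.pending_head) {
        // The occupying candidate is assumed included: build on its head.
        (OccupiedCoreAssumption::Included, Some(pending)) => Some(pending.clone()),
        // Assumption in this sketch: with nothing pending, fall back to the current head.
        (OccupiedCoreAssumption::Included, None) => Some(state.current_head.clone()),
        // A timeout frees the core without advancing the para.
        (OccupiedCoreAssumption::TimedOut, _) => Some(state.current_head.clone()),
        // The core was asserted free but is occupied: no result.
        (OccupiedCoreAssumption::Free, Some(_)) => None,
        (OccupiedCoreAssumption::Free, None) => Some(state.current_head.clone()),
    }
}
```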
# Full Validation Data
Yields the full [`ValidationData`](../types/candidate.md#validationdata) at the state of a given block.
```rust
fn full_validation_data(at: Block, ParaId, OccupiedCoreAssumption) -> Option<ValidationData>;
```
# Global Validation Data
Yields the [`GlobalValidationData`](../types/candidate.md#globalvalidationschedule) at the state of a given block. This applies to all para candidates with the relay-parent equal to that block.
```rust
fn global_validation_data(at: Block) -> GlobalValidationData;
```
# Local Validation Data
Yields the [`LocalValidationData`](../types/candidate.md#localvalidationdata) for the given [`ParaId`](../types/candidate.md#paraid) along with an assumption that should be used if the para currently occupies a core: whether the candidate occupying that core should be assumed to have been made available and included or timed out and discarded, along with a third option to assert that the core was not occupied. This choice affects everything from the parent head-data, the validation code, and the state of message-queues. Typically, users will take the assumption that either the core was free or that the occupying candidate was included, as timeouts are expected only in adversarial circumstances and even so, only in a small minority of blocks directly following validator set rotations.
The documentation of [`LocalValidationData`](../types/candidate.md#localvalidationdata) has more information on this dichotomy.
```rust
/// An assumption being made about the state of an occupied core.
enum OccupiedCoreAssumption {
/// The candidate occupying the core was made available and included to free the core.
Included,
/// The candidate occupying the core timed out and freed the core without advancing the para.
TimedOut,
/// The core was not occupied to begin with.
Free,
}
/// Returns the local validation data for the given para and occupied core assumption.
///
/// Returns `None` if either the para is not registered or the assumption is `Freed`
/// and the para already occupies a core.
fn local_validation_data(at: Block, ParaId, OccupiedCoreAssumption) -> Option<LocalValidationData>;
```
# Persisted Validation Data
Yields the [`PersistedValidationData`](../types/candidate.md#persistedvalidationdata) for the given [`ParaId`](../types/candidate.md#paraid) along with an assumption that should be used if the para currently occupies a core:
```rust
/// Returns the persisted validation data for the given para and occupied core assumption.
///
/// Returns `None` if either the para is not registered or the assumption is `Free`
/// and the para already occupies a core.
fn persisted_validation_data(at: Block, ParaId, OccupiedCoreAssumption) -> Option<PersistedValidationData>;
```
\ No newline at end of file
......@@ -35,9 +35,6 @@ fn update_configuration(f: impl FnOnce(&mut HostConfiguration)) {
*pending = Some(x);
})
}
/// Get the GlobalValidationData, assuming the context is the parent block.
fn global_validation_data() -> GlobalValidationData;
```
## Entry-points
......
......@@ -62,8 +62,8 @@ All failed checks should lead to an unrecoverable error making the block invalid
1. check that each candidate corresponds to a scheduled core and that they are ordered in the same order the cores appear in assignments in `scheduled`.
1. check that `scheduled` is sorted ascending by `CoreIndex`, without duplicates.
1. check that there is no candidate pending availability for any scheduled `ParaId`.
1. check that each candidate's `validation_data_hash` corresponds to a `(LocalValidationData, GlobalValidationData)` computed from the current state.
> NOTE: With contextual execution in place, local and global validation data will be obtained as of the state of the context block. However, only the state of the current block can be used for such a query.
1. check that each candidate's `validation_data_hash` corresponds to a `PersistedValidationData` computed from the current state.
> NOTE: With contextual execution in place, validation data will be obtained as of the state of the context block. However, only the state of the current block can be used for such a query.
1. If the core assignment includes a specific collator, ensure the backed candidate is issued by that collator.
1. Ensure that any code upgrade scheduled by the candidate does not happen within `config.validation_upgrade_frequency` of `Paras::last_code_upgrade(para_id, true)`, if any, comparing against the value of `Paras::FutureCodeUpgrades` for the given para ID.
1. Check the collator's signature on the candidate data.
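One of the checks above, that `scheduled` is sorted ascending by `CoreIndex` without duplicates, amounts to a strict-ordering test over adjacent pairs. The following is a minimal sketch; `CoreIndex` is modelled here as a newtype over `u32`, and `is_sorted_without_duplicates` is a hypothetical helper name rather than a function from the runtime.

```rust
/// Simplified stand-in for the guide's `CoreIndex` (assumed to wrap a `u32`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct CoreIndex(u32);

/// Returns `true` if every element is strictly greater than its predecessor,
/// which implies both ascending order and the absence of duplicates.
fn is_sorted_without_duplicates(cores: &[CoreIndex]) -> bool {
    cores.windows(2).all(|pair| pair[0] < pair[1])
}
```

Using a strict comparison covers both conditions in one pass; an empty or single-element list passes trivially.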
......
......@@ -113,7 +113,7 @@ OutgoingParas: Vec<ParaId>;
* `is_parathread(ParaId) -> bool`: Returns true if the para ID references any live parathread.
* `last_code_upgrade(id: ParaId, include_future: bool) -> Option<BlockNumber>`: The block number of the last scheduled upgrade of the requested para. Includes future upgrades if the flag is set. This is the `expected_at` number, not the `activated_at` number.
* `local_validation_data(id: ParaId) -> Option<LocalValidationData>`: Get the LocalValidationData of the given para, assuming the context is the parent block. Returns `None` if the para is not known.
* `persisted_validation_data(id: ParaId) -> Option<PersistedValidationData>`: Get the PersistedValidationData of the given para, assuming the context is the parent block. Returns `None` if the para is not known.
## Finalization
......
......@@ -32,32 +32,17 @@ Often referred to as PoV, this is a type-safe wrapper around bytes (`Vec<u8>`) w
struct PoV(Vec<u8>);
```
## Omitted Validation Data
Validation data that is often omitted from types describing candidates as it can be derived from the relay-parent of the candidate. However, with the expectation of state pruning, these are best kept available elsewhere as well.
This contains the [`GlobalValidationData`](candidate.md#globalvalidationschedule) and [`LocalValidationData`](candidate.md#localvalidationdata)
```rust
struct OmittedValidationData {
/// The global validation schedule.
global_validation: GlobalValidationData,
/// The local validation data.
local_validation: LocalValidationData,
}
```
## Available Data
This is the data we want to keep available for each [candidate](candidate.md) included in the relay chain.
This is the data we want to keep available for each [candidate](candidate.md) included in the relay chain. It consists of the PoV of the block as well as the [`PersistedValidationData`](candidate.md#persistedvalidationdata).
```rust
struct AvailableData {
/// The Proof-of-Validation of the candidate.
pov: PoV,
/// The omitted validation data.
omitted_validation: OmittedValidationData,
/// The persisted validation data used to check the candidate.
validation_data: PersistedValidationData,
}
```
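To illustrate that only the persisted half of the validation data is kept available, here is a small sketch that builds an `AvailableData` from a PoV and a full `ValidationData`. The struct fields are trimmed to a couple of representative members, and `to_available_data` is a hypothetical helper, not part of the guide.

```rust
#[derive(Clone, Debug, PartialEq)]
struct PoV(Vec<u8>);

/// Trimmed-down field sets, for illustration only.
#[derive(Clone, Debug, PartialEq)]
struct PersistedValidationData {
    parent_head: Vec<u8>,
    block_number: u32,
}

#[derive(Clone, Debug, PartialEq)]
struct TransientValidationData {
    max_code_size: u32,
    max_head_data_size: u32,
}

#[derive(Clone, Debug, PartialEq)]
struct ValidationData {
    persisted: PersistedValidationData,
    transient: TransientValidationData,
}

#[derive(Clone, Debug, PartialEq)]
struct AvailableData {
    pov: PoV,
    validation_data: PersistedValidationData,
}

/// Keep the PoV and the persisted validation data; the transient part is dropped.
fn to_available_data(pov: PoV, full: ValidationData) -> AvailableData {
    AvailableData {
        pov,
        validation_data: full.persisted,
    }
}
```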
......
......@@ -33,7 +33,7 @@ struct CandidateReceipt {
## Full Candidate Receipt
This is the full receipt type. The `GlobalValidationData` and the `LocalValidationData` are technically redundant with the `inner.relay_parent`, which uniquely describes the a block in the blockchain from whose state these values are derived. The [`CandidateReceipt`](#candidate-receipt) variant is often used instead for this reason.
This is the full receipt type. The `ValidationData` are technically redundant with the `inner.relay_parent`, which uniquely describes the block in the blockchain from whose state these values are derived. The [`CandidateReceipt`](#candidate-receipt) variant is often used instead for this reason.
However, the Full Candidate Receipt type is useful as a means of avoiding the implicit dependency on availability of old blockchain state. In situations such as availability and approval, having the full description of the candidate within a self-contained struct is convenient.
......@@ -41,10 +41,7 @@ However, the Full Candidate Receipt type is useful as a means of avoiding the im
/// All data pertaining to the execution of a para candidate.
struct FullCandidateReceipt {
inner: CandidateReceipt,
/// The global validation schedule.
global_validation: GlobalValidationData,
/// The local validation data.
local_validation: LocalValidationData,
validation_data: ValidationData,
}
```
......@@ -77,8 +74,9 @@ struct CandidateDescriptor {
relay_parent: Hash,
/// The collator's sr25519 public key.
collator: CollatorId,
/// The blake2-256 hash of the local and global validation data. These are extra parameters
/// derived from relay-chain state that influence the validity of the block.
/// The blake2-256 hash of the persisted validation data. These are extra parameters
/// derived from relay-chain state that influence the validity of the block which
/// must also be kept available for secondary checkers.
validation_data_hash: Hash,
/// The blake2-256 hash of the pov-block.
pov_hash: Hash,
......@@ -88,57 +86,39 @@ struct CandidateDescriptor {
}
```
## ValidationData
## GlobalValidationData
The validation data provide information about how to validate both the inputs and outputs of a candidate. There are two types of validation data: [persisted](#persistedvalidationdata) and [transient](#transientvalidationdata). Their respective sections of the guide elaborate on their functionality in more detail.
The global validation schedule comprises of information describing the global environment for para execution, as derived from a particular relay-parent. These are parameters that will apply to all parablocks executed in the context of this relay-parent.
This information is derived from the chain state and will vary from para to para, although some of the fields may be the same for every para.
```rust
/// Extra data that is needed along with the other fields in a `CandidateReceipt`
/// to fully validate the candidate.
///
/// These are global parameters that apply to all candidates in a block.
struct GlobalValidationData {
/// The maximum code size permitted, in bytes.
max_code_size: u32,
/// The maximum head-data size permitted, in bytes.
max_head_data_size: u32,
/// The relay-chain block number this is in the context of.
block_number: BlockNumber,
}
```
## LocalValidationData
This is validation data needed for execution of candidate pertaining to a specific para and relay-chain block.
Persisted validation data are generally derived from some relay-chain state to form inputs to the validation function, and as such need to be persisted by the availability system to avoid dependence on availability of the relay-chain state. The backing phase of the inclusion pipeline ensures that everything that is included in a valid fork of the relay-chain already adheres to the transient constraints.
Unlike the [`GlobalValidationData`](#globalvalidationdata), which only depends on a relay-parent, this is parameterized both by a relay-parent and a choice of one of two options:
1. Assume that the candidate pending availability on this para at the onset of the relay-parent is included.
1. Assume that the candidate pending availability on this para at the onset of the relay-parent is timed-out.
The validation data also serve the purpose of giving collators a means of ensuring that their produced candidate and the commitments submitted to the relay-chain alongside it will pass the checks done by the relay-chain when backing, and give validators the same understanding when determining whether to second or attest to a candidate.
This choice can also be expressed as a choice of which parent head of the para will be built on - either optimistically on the candidate pending availability or pessimistically on the one that is surely included.
Para validation happens optimistically before the block is authored, so it is not possible to predict with 100% accuracy what will happen in the earlier phase of the [`InclusionInherent`](../runtime/inclusioninherent.md) module where new availability bitfields and availability timeouts are processed. This is what will eventually define whether a candidate can be backed within a specific relay-chain block.
Since the commitments of the validation function are checked by the relay-chain, secondary checkers can rely on the invariant that the relay-chain only includes para-blocks for which these checks have already been done. As such, there is no need for the validation data used to inform validators and collators about the checks the relay-chain will perform to be persisted by the availability system. Nevertheless, we expose it so the backing validators can validate the outputs of a candidate before voting to submit it to the relay-chain and so collators can collate candidates that satisfy the criteria implied by these transient validation data.
Design-wise we should maintain two properties about this data structure:
1. The `LocalValidationData` should be relatively lightweight primarly because it is constructed during inclusion for each candidate.
1. To make contextual execution possible, `LocalValidationData` should be constructable only having access to the latest relay-chain state for the past `k` blocks. That implies
1. The `ValidationData` should be relatively lightweight primarily because it is constructed during inclusion for each candidate.
1. To make contextual execution possible, `ValidationData` should be constructable with access only to the latest relay-chain state for the past `k` blocks. That implies
either that the relay-chain should keep all the required data accessible or that the data is provided indirectly with a header-chain proof and a state proof from there.
> TODO: determine if balance/fees are even needed here.
> TODO: message queue watermarks (first downward messages, then XCMP channels)
```rust
struct ValidationData {
persisted: PersistedValidationData,
transient: TransientValidationData,
}
```
## PersistedValidationData
Validation data that needs to be persisted for secondary checkers. See the section on [`ValidationData`](#validationdata) for more details.
```rust
/// Extra data that is needed along with the other fields in a `CandidateReceipt`
/// to fully validate the candidate. These fields are parachain-specific.
struct LocalValidationData {
struct PersistedValidationData {
/// The parent head-data.
parent_head: HeadData,
/// The balance of the parachain at the moment of validation.
balance: Balance,
/// The blake2-256 hash of the validation code used to execute the candidate.
validation_code_hash: Hash,
/// Whether the parachain is allowed to upgrade its validation code.
///
/// This is `Some` if so, and contains the number of the minimum relay-chain
......@@ -150,10 +130,34 @@ struct LocalValidationData {
/// height. This may be equal to the current perceived relay-chain block height, in
/// which case the code upgrade should be applied at the end of the signaling
/// block.
///
/// This informs a relay-chain backing check and the parachain logic.
code_upgrade_allowed: Option<BlockNumber>,
/// The relay-chain block number this is in the context of. This informs the collator.
block_number: BlockNumber,
}
```
## TransientValidationData
These validation data are derived from some relay-chain state to check outputs of the validation function.
```rust
struct TransientValidationData {
/// The maximum code size permitted, in bytes, of a produced validation code upgrade.
///
/// This informs a relay-chain backing check and the parachain logic.
max_code_size: u32,
/// The maximum head-data size permitted, in bytes.
///
/// This informs a relay-chain backing check and the parachain collator.
max_head_data_size: u32,
/// The balance of the parachain at the moment of validation.
balance: Balance,
/// The list of MQC heads for the inbound channels paired with the sender para ids. This
/// vector is sorted ascending by the para id and doesn't contain multiple entries with the same
/// sender.
/// sender. This informs the collator.
hrmp_mqc_heads: Vec<(ParaId, Hash)>,
}
```
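A backing validator or collator could use the transient data to check a candidate's outputs before submission. The sketch below covers only the two size limits. `OutputError` and `check_outputs` are hypothetical names, the struct is trimmed to the two limit fields, and treating the limits as inclusive maxima is an assumption.

```rust
/// Trimmed to the two size limits used in this sketch.
struct TransientValidationData {
    max_code_size: u32,
    max_head_data_size: u32,
}

/// Hypothetical error type for failed output checks.
#[derive(Debug, PartialEq)]
enum OutputError {
    HeadDataTooLarge,
    NewCodeTooLarge,
}

/// Check the produced head-data and an optional validation code upgrade
/// against the limits. Sizes equal to the limit are accepted (assumption).
fn check_outputs(
    limits: &TransientValidationData,
    head_data: &[u8],
    new_validation_code: Option<&[u8]>,
) -> Result<(), OutputError> {
    if head_data.len() > limits.max_head_data_size as usize {
        return Err(OutputError::HeadDataTooLarge);
    }
    if let Some(code) = new_validation_code {
        if code.len() > limits.max_code_size as usize {
            return Err(OutputError::NewCodeTooLarge);
        }
    }
    Ok(())
}
```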
......@@ -215,10 +219,8 @@ This struct encapsulates the outputs of candidate validation.
struct ValidationOutputs {
/// The head-data produced by validation.
head_data: HeadData,
/// The global validation schedule.
global_validation_data: GlobalValidationData,
/// The local validation data.
local_validation_data: LocalValidationData,
/// The validation data, persisted and transient.
validation_data: ValidationData,
/// Messages directed to other paras routed via the relay chain.
horizontal_messages: Vec<OutboundHrmpMessage>,
/// Upwards messages to the relay chain.
......
......@@ -316,13 +316,18 @@ enum RuntimeApiRequest {
SessionIndex(ResponseChannel<SessionIndex>),
/// Get the validation code for a specific para, using the given occupied core assumption.
ValidationCode(ParaId, OccupiedCoreAssumption, ResponseChannel<Option<ValidationCode>>),
/// Get the global validation schedule at the state of a given block.
GlobalValidationData(ResponseChannel<GlobalValidationData>),
/// Get the local validation data for a specific para, with the given occupied core assumption.
LocalValidationData(
/// Get the persisted validation data at the state of a given block for a specific para,
/// with the given occupied core assumption.
PersistedValidationData(
ParaId,
OccupiedCoreAssumption,
ResponseChannel<Option<LocalValidationData>>,
ResponseChannel<Option<PersistedValidationData>>,
),
/// Get the full validation data for a specific para, with the given occupied core assumption.
FullValidationData(
ParaId,
OccupiedCoreAssumption,
ResponseChannel<Option<ValidationData>>,
),
/// Get information about all availability cores.
AvailabilityCores(ResponseChannel<Vec<CoreState>>),
......