rna-velocity/rna-velocity.Rmd (+35 -28)
@@ -48,7 +48,7 @@ _RNA velocity_ (introduced by @La_Manno2018-velocyto) is one approach to address
In practice, RNA velocity analyses are often summarized by plots such as those shown below (from the [scVelo tutorial](https://scvelo.readthedocs.io/DynamicalModeling.html)): on the left, a vector field overlaid on a low-dimensional embedding, visualizing the 'flow' encoded by the velocities, and on the right, phase plots illustrating single genes.
We will see in this lecture how to generate such plots from raw droplet scRNA-seq data, and how to interpret the results.
The RNA velocity is defined as the rate of change of the mature RNA abundance in a cell, and can be estimated from scRNA-seq data by joint modeling of estimated unspliced (pre-mRNA) and spliced (mature mRNA) abundances.
This exploitation of the underlying molecular dynamics of the process sets it apart from other approaches for trajectory analysis, which typically use the similarity of the estimated gene expression profiles among cells to construct a path through the observed data.
@@ -126,7 +126,7 @@ This is, in essence, the idea behind the approach taken by @La_Manno2018-velocyt
If we fix one of the parameter values (e.g., setting $\beta=1$ as in @La_Manno2018-velocyto, corresponding to an assumption of a shared splicing rate between genes) we can estimate the other one ($\gamma$), and consequently obtain an estimate of the RNA velocity $v$, since $$v=\frac{ds(t)}{dt}=\beta u(t)-\gamma s(t).$$
Notably, these velocities can be derived directly from the phase plot:
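As a rough illustration of reading velocities off the phase plot, the sketch below fixes $\beta=1$, estimates $\gamma$ as the slope of a least-squares line through the origin in the $(s, u)$ plane, and computes $v=\beta u-\gamma s$ for each cell. The function names `fit_gamma` and `velocity` and the toy values are hypothetical, and fitting the line to all cells is a simplifying assumption of this sketch rather than the procedure of any particular tool.

```python
def fit_gamma(u, s):
    """Slope of the least-squares line through the origin, u ~ gamma * s.

    Assumes beta is fixed to 1, so that the steady-state line in the
    phase plot has slope gamma.
    """
    num = sum(ui * si for ui, si in zip(u, s))
    den = sum(si * si for si in s)
    return num / den


def velocity(u, s, gamma, beta=1.0):
    """Per-cell velocity v = beta * u - gamma * s."""
    return [beta * ui - gamma * si for ui, si in zip(u, s)]


# Toy phase plot for one gene (hypothetical values):
# two cells on the steady-state line, one above it, one below it.
s = [2.0, 4.0, 2.0, 2.0]   # spliced abundances
u = [1.0, 2.0, 1.5, 0.5]   # unspliced abundances

gamma = fit_gamma(u, s)
v = velocity(u, s, gamma)
print(gamma, v)
```

Cells above the fitted line get a positive velocity (the gene is being upregulated), cells below it a negative one, and cells on the line are at steady state.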
It has two transcript isoforms, one with two exons and one with three exons.
The isoforms are partly overlapping.
@@ -263,18 +263,18 @@ In order to better understand some of these differences, we show below a few exa
* Chkb - overlapping features on the same strand. In this case, only _alevin_ and _STARsolo-diff_ assign a non-zero UMI count; the latter defines the intronic count as the difference between a "gene body count" and the regular gene expression count.
* Rassf1 - overlapping features on different strands.
Whether or not the tool accounts for the strandedness of the reads makes a difference.
* Tspan3 - many ambiguous regions.
The way that the introns are defined makes a substantial difference.
The intronic count is much higher with the 'separate' intron definition approach.
These differences between the counts obtained by different methods also propagate to the estimated velocities, and can affect the biological interpretation of the final results.
@@ -321,9 +321,9 @@ We will practice generating the [_Salmon_](https://salmon.readthedocs.io/en/late
Here, we first set the path to the data (`datadir`), as well as to the folder where we will store the generated index and quantifications (`outdir`).
@@ -746,7 +746,7 @@ The model assumes the existence of four different transcriptional states - two s
The EM algorithm iterates between estimating the latent time of a cell (the 'position' of the cell along the phase space trajectory) and assigning it a transcriptional state, and optimizing the values of the parameters (see Figure below from @Bergen2019-scvelo).
The likelihood is obtained by assuming that the observations follow a normal distribution: $$x_i^{obs}\sim N((\hat{u}(t), \hat{s}(t)), \sigma^2).$$
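To make the alternation between time assignment and likelihood evaluation more concrete, here is a deliberately simplified sketch. It is not the _scVelo_ implementation. It assumes a single induction phase starting at $(u, s) = (0, 0)$ with constant transcription rate `alpha` (a parameter introduced only for this illustration), uses the closed-form solution of the corresponding rate equations (valid for $\beta \neq \gamma$), assigns each observation to the closest point on a grid of candidate times, and evaluates an isotropic bivariate normal log-likelihood. All function names are hypothetical.

```python
import math


def trajectory(t, alpha, beta, gamma):
    """Model (u, s) at time t for induction from (0, 0); assumes beta != gamma."""
    u = alpha / beta * (1 - math.exp(-beta * t))
    s = (alpha / gamma * (1 - math.exp(-gamma * t))
         + alpha / (gamma - beta) * (math.exp(-gamma * t) - math.exp(-beta * t)))
    return u, s


def assign_time(u_obs, s_obs, params, grid):
    """Pick the grid time whose model point is closest to the observation."""
    def sqdist(t):
        u, s = trajectory(t, *params)
        return (u_obs - u) ** 2 + (s_obs - s) ** 2
    return min(grid, key=sqdist)


def log_likelihood(obs, params, grid, sigma):
    """Sum of isotropic bivariate normal log-densities at the assigned times."""
    ll = 0.0
    for u_obs, s_obs in obs:
        t = assign_time(u_obs, s_obs, params, grid)
        u, s = trajectory(t, *params)
        d2 = (u_obs - u) ** 2 + (s_obs - s) ** 2
        ll += -math.log(2 * math.pi * sigma ** 2) - d2 / (2 * sigma ** 2)
    return ll


params = (2.0, 1.0, 0.5)                 # hypothetical alpha, beta, gamma
grid = [0.1 * k for k in range(101)]     # candidate latent times in [0, 10]
obs = [trajectory(grid[10], *params), trajectory(grid[30], *params)]
print(log_likelihood(obs, params, grid, sigma=1.0))
```

In a full EM scheme, a parameter update would follow each round of time assignment, and the two steps would be repeated until the likelihood stops improving.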
Here, we will focus on the dynamical model: it is generally the most accurate and, although somewhat slower than the other methods, usually not prohibitively so.
@@ -779,6 +779,13 @@ This step adds several columns to `adata.var` (see [https://scvelo.readthedocs.i
* estimates of switching time points (`fit_t_`)
* the likelihood value of the fit (`fit_likelihood`), averaged across all cells. The likelihood value for a gene and a cell indicates how well the cell is described by the learned phase trajectory.
+ Since the step above is quite time consuming, we'll save an intermediate object at this point:
for cluster in ['DIplotene/Secondary spermatocytes', 'Mid Round spermatids']:
-     scv.pl.scatter(adata, df[cluster][:5], ylabel = cluster, **kwargs, color = 'celltype')
```
- Moreover, partial gene likelihoods (average likelihood over a subset of the cells) can be computed for each cluster of cells to enable cluster-specific identification of potential drivers.
+ The most recent release of _scVelo_ (0.2.0) introduced a 'differential kinetics' test.
+ Its purpose is to detect genes that display different kinetic behaviour in some cell types than in others, giving rise to multiple trajectories.
+ The `tl.differential_kinetic_test` function performs a likelihood ratio test, evaluating whether allowing different kinetics for different cell populations gives a significantly better likelihood than forcing them all to follow the same kinetics.
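The general logic of a likelihood ratio test can be sketched as follows. This is a generic illustration, not the code behind `tl.differential_kinetic_test`: the function name `likelihood_ratio_test` is hypothetical, and the choice of a chi-square reference distribution with one degree of freedom is an assumption made here for simplicity.

```python
import math


def likelihood_ratio_test(loglik_shared, loglik_separate):
    """Compare a shared-kinetics fit with a fit allowing separate kinetics.

    Returns the test statistic 2 * (loglik_separate - loglik_shared),
    clipped at 0, and a p-value from a chi-square distribution with
    1 degree of freedom (an assumption of this sketch).
    """
    stat = max(0.0, 2.0 * (loglik_separate - loglik_shared))
    # Chi-square(1) survival function: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for one gene and one cell population
stat, p_value = likelihood_ratio_test(loglik_shared=-100.0, loglik_separate=-95.0)
print(stat, p_value)
```

A small p-value suggests that the population is better described by its own kinetics than by the shared one.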