iNKT annotation? #479

Open
mihem opened this issue Mar 3, 2025 · 7 comments
Labels
enhancement New feature or request

Comments

@mihem

mihem commented Mar 3, 2025

@ncborcherding

Hi Nick,

Sorry, I'm not sure this is within the scope of scRepertoire, but in my current project I would like to annotate iNKT (and maybe MAIT) cells. This is possible with marker genes, but it is more reliable when looking at the CDR3 region. The more recent versions of CellRanger seem to do that: https://www.10xgenomics.com/support/software/cell-ranger/latest/hidden/cr-5p-vdj-algorithm-inktmait. However, I am confused, because I thought this needed VDJ data (so, for example, it would not be possible when 3' was used, yet you can still run annotate then).

Also, I have an ongoing project with an older CellRanger version, and I think this would be a nice feature for scRepertoire.

Thank you!
Mischko

@mihem
Author

mihem commented Mar 3, 2025

Okay, sorry, I think I now understand that this annotation is just part of cellranger vdj and has nothing to do with the more recent cell annotation feature: https://www.10xgenomics.com/support/software/cell-ranger/latest/analysis/running-pipelines/cr-cell-annotation-pipeline.

Nonetheless, it would be nice if scRepertoire were able to process this information. Since it's already in clonotypes.csv, could scRepertoire just use it? Or start from scratch and identify iNKT cells based on the CDR3 region? https://github.com/10XGenomics/enclone/blob/cellranger5.0/enclone/src/human_iNKT_CDR3.json

Thanks

ncborcherding added the enhancement label Mar 4, 2025
@ncborcherding
Member

@mihem

Great suggestion - I actually wrote the iNKT and MAIT assigning functions for another project - but there is no reason not to update them and implement them into scRepertoire.

I will add this to the to-do list and update this issue as I make progress.

Thanks,
Nick

@mihem
Author

mihem commented Mar 4, 2025

Great, thanks.

@mihem
Author

mihem commented Mar 5, 2025

@ncborcherding sorry, I need to work on this soon. Could you be so kind as to share the functions you wrote (as I suppose you won't have time to include this in scRepertoire in the next few days ;) )?
I know this is a question for 10X rather than for scRepertoire, but maybe you can help. The iNKT column looks like this in my sample:

[Image: screenshot of the iNKT column]

So lots of (all of the first 50) clonotypes have iNKT evidence. But in the end I want to know: which of these are really iNKT? This is PBMC (of sick patients), so I would expect iNKT to be not more than a few percent of all T cells?

Thanks,
Mischko
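
One way to narrow down such a column is to separate clonotypes by how much evidence they carry, and then look at the share of cells in each group. The following is a minimal sketch in base R, not a 10x or scRepertoire function. It assumes (this is not confirmed in the thread, so check your own file) that the evidence column is called `inkt_evidence` and holds strings such as `TRA:gene+junction;TRB:gene`. The helper name `tierEvidence` and the toy data frame are made up for illustration.

```r
# Toy stand-in for clonotypes.csv; the column name and the evidence string
# format are assumptions, and the values are invented for illustration.
clonotypes <- data.frame(
  clonotype_id  = c("clonotype1", "clonotype2", "clonotype3", "clonotype4"),
  frequency     = c(40, 25, 3, 1),
  inkt_evidence = c("TRB:gene", "TRA:gene;TRB:gene",
                    "TRA:gene+junction;TRB:gene", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "strict" needs a TRA junction match plus a TRB gene
# match, "weak" is any other non-empty evidence, "none" is no evidence.
tierEvidence <- function(evidence) {
  evidence[is.na(evidence)] <- ""
  has.tra.junction <- grepl("TRA:[^;]*junction", evidence)
  has.trb.gene     <- grepl("TRB:[^;]*gene", evidence)
  ifelse(has.tra.junction & has.trb.gene, "strict",
         ifelse(nzchar(evidence), "weak", "none"))
}

clonotypes$inkt_tier <- tierEvidence(clonotypes$inkt_evidence)

# Share of cells (not clonotypes) in each tier
cell.share <- tapply(clonotypes$frequency, clonotypes$inkt_tier, sum) /
  sum(clonotypes$frequency)
print(round(cell.share, 3))
```

If most clonotypes only carry gene-level evidence on one chain, the "strict" share is likely the more useful number to compare against the expected few percent.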

@ncborcherding
Member

Hey @mihem,

I am more familiar with using the gene and amino acid length in terms of defining them. Here is the function I wrote - it will need a little work as it has some internal functionality, but you should be able to get it going:

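scoreInvariant <- function(input.data,
                           type = NULL,
                           species = NULL) {

  # getTCR() is an internal helper that just pulls the TRA (element 1) and
  # TRB (element 2) chains from the Seurat object; you can build the same
  # tables yourself from the meta data.
  TCRS <- getTCR(input.data, chains = "both")

  if (is.null(type) || !type %in% c("MAIT", "iNKT")) {
    stop("Please select either 'MAIT' or 'iNKT' for type.")
  }
  comp <- switch(type,
                 "MAIT" = .MAIT.criteria,
                 "iNKT" = .iNKT.criteria)
  if (is.null(species) || is.null(comp[[species]])) {
    stop("Please select either 'mouse' or 'human' for species.")
  }
  crit <- comp[[species]]

  # TRUE if any observed gene call starts with one of the allowed gene names.
  # A criterion that is not defined (e.g. TRB.v for mouse iNKT) is not enforced.
  matchGene <- function(observed, allowed) {
    if (is.null(allowed)) return(TRUE)
    observed <- observed[!is.na(observed)]
    pattern <- paste0("^(", paste(allowed, collapse = "|"), ")([^0-9]|$)")
    any(grepl(pattern, observed))
  }

  cells <- unique(TCRS[[1]]$barcode)

  individual.scores <- vapply(cells, function(x) {
    tra <- TCRS[[1]][which(TCRS[[1]]$barcode == x), ]
    trb <- TCRS[[2]][which(TCRS[[2]]$barcode == x), ]
    TRA.length <- nchar(tra$cdr3_aa)
    # Score 1 only if every defined criterion is met; a missing chain gives 0
    as.numeric(matchGene(tra$v, crit$TRA.v) &&
                 matchGene(tra$j, crit$TRA.j) &&
                 matchGene(trb$v, crit$TRB.v) &&
                 any(TRA.length %in% crit$TRA.length))
  }, numeric(1))

  # One row per unique barcode
  output.scores <- data.frame(row.names = cells, score = unname(individual.scores))
  colnames(output.scores)[1] <- paste0(type, ".score")
  return(output.scores)
}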

.MAIT.criteria <- list(mouse = list(TRA.v = "TRAV1",
                                    TRA.j = "TRAJ33",
                                    TRA.length = 12,
                                    TRB.v = c("TRBV13", "TRBV19")),
                       human = list(TRA.v = "TRAV1-2",
                                    TRA.j = c("TRAJ33", "TRAJ20", "TRAJ12"),
                                    TRA.length = 12,
                                    TRB.v = c("TRBV6", "TRBV20")))

# No TRB.v criterion is listed for mouse iNKT
.iNKT.criteria <- list(mouse = list(TRA.v = "TRAV11",
                                    TRA.j = "TRAJ18",
                                    TRA.length = 15),
                       human = list(TRA.v = "TRAV10",
                                    TRA.j = "TRAJ18",
                                    TRA.length = c(14, 15, 16),
                                    TRB.v = "TRBV25-1"))

I'm on clinical service this week, would be happy to help out more when I catch my breath next week.

Nick
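
For readers without the internal `getTCR()` helper, the same gene-and-length logic can be sketched on a plain data frame. This is a minimal sketch under stated assumptions, not the scRepertoire implementation. The column names (`TRA.v`, `TRA.j`, `TRA.cdr3`, `TRB.v`), the helper names `geneMatches` and `scoreInvariantDF`, and the toy rows are all hypothetical. Only the human iNKT criteria are copied from the comment above.

```r
# Human iNKT criteria as posted above
inkt.human <- list(TRA.v = "TRAV10", TRA.j = "TRAJ18",
                   TRA.length = c(14, 15, 16), TRB.v = "TRBV25-1")

# TRUE where a gene call starts with one of the allowed gene names and is
# not followed by a further digit (so "TRAV10*01" matches "TRAV10", but a
# call like "TRAV101" would not)
geneMatches <- function(observed, allowed) {
  pattern <- paste0("^(", paste(allowed, collapse = "|"), ")([^0-9]|$)")
  !is.na(observed) & grepl(pattern, observed)
}

# Hypothetical helper: expects one row per cell with one TRA and one TRB call
scoreInvariantDF <- function(df, crit) {
  ok <- geneMatches(df$TRA.v, crit$TRA.v) &
    geneMatches(df$TRA.j, crit$TRA.j) &
    geneMatches(df$TRB.v, crit$TRB.v) &
    nchar(df$TRA.cdr3) %in% crit$TRA.length
  as.integer(ok)
}

# Toy cells; the CDR3 strings are placeholders of a given length,
# not real sequences
cells <- data.frame(
  barcode  = c("cell1", "cell2", "cell3"),
  TRA.v    = c("TRAV10", "TRAV10", "TRAV1-2"),
  TRA.j    = c("TRAJ18", "TRAJ18", "TRAJ33"),
  TRA.cdr3 = c(strrep("A", 15), strrep("A", 11), strrep("A", 12)),
  TRB.v    = c("TRBV25-1", "TRBV25-1", NA),
  stringsAsFactors = FALSE
)
cells$iNKT.score <- scoreInvariantDF(cells, inkt.human)
print(cells[, c("barcode", "iNKT.score")])
```

In this toy input only `cell1` meets all four criteria. `cell2` fails the length check and `cell3` fails on gene usage and has no TRB call.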

@mihem
Author

mihem commented Mar 5, 2025

Thanks, very kind.
So the iNKT criteria are based on these three genes (plus the TRA CDR3 length)?
So you don't use this list, right?
https://github.com/10XGenomics/enclone/blob/cellranger5.0/enclone/src/human_iNKT_CDR3.json

Do you think this does not make sense or is not necessary?

Haha same here regarding clinical service.

@ncborcherding
Member

Oh interesting - thanks for the link. They are using previously published sequences to identify cells. You could do that and replace the length check, I suppose. My one question would be: is that list comprehensive? The way I envisioned using the scores is as a permissive signal: these cells qualify based on their TCR, but that is not a guarantee that they are iNKT or MAIT cells.

I need to do some more reading in this area to see how to implement it.

Nick
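
The two ideas discussed here (gene usage as a permissive signal, a published CDR3 list as a stricter one) could be kept side by side instead of one replacing the other. A minimal sketch, assuming the reference sequences have already been loaded into a character vector. The helper name `flagInvariant`, the column names, and the placeholder sequences are hypothetical. The placeholders are not real published CDR3s.

```r
# Placeholder reference list; in practice this would hold the published
# TRA CDR3 amino acid sequences (e.g. taken from the linked JSON file)
reference.cdr3 <- c("CAAAAAAAAAAAAAF", "CGGGGGGGGGGGGGF")

# Hypothetical helper returning two flags per cell:
#   permissive: TRA V and J gene calls start with the expected gene names
#   strict:     permissive AND the TRA CDR3 is found in the reference list
flagInvariant <- function(df, tra.v, tra.j, reference) {
  permissive <- startsWith(df$TRA.v, tra.v) & startsWith(df$TRA.j, tra.j)
  permissive[is.na(permissive)] <- FALSE  # missing gene calls never qualify
  strict <- permissive & df$TRA.cdr3 %in% reference
  data.frame(barcode = df$barcode, permissive = permissive, strict = strict)
}

# Toy cells, invented for illustration
cells <- data.frame(
  barcode  = c("cell1", "cell2", "cell3"),
  TRA.v    = c("TRAV10", "TRAV10", "TRAV12-1"),
  TRA.j    = c("TRAJ18", "TRAJ18", "TRAJ18"),
  TRA.cdr3 = c("CAAAAAAAAAAAAAF", "CTTTTTTTTTTTTTF", "CAAAAAAAAAAAAAF"),
  stringsAsFactors = FALSE
)
flags <- flagInvariant(cells, "TRAV10", "TRAJ18", reference.cdr3)
print(flags)
```

Cells that are permissive but not strict (like `cell2` in the toy data) are exactly the ones where the comprehensiveness of the reference list matters.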
