Filtering & Processing

Author

Kaitlin Sullivan


Video Tutorial

The following video tutorial demonstrates the functionality of ruFilter() and ruProcess(), two functions that filter and normalize the data, as well as run a Principal Component Analysis (PCA).

Follow along with the code below.

Filtering by Gene Expression with ruFilter()

#These steps happen automatically within the Shiny App
#doing them manually simply gives you more autonomy over the individual steps

### FILTERING
#here we filter for excitatory cells which are Slc17a7+
myobj <- ruFilter(myobj, filter.by = 'Slc17a7', threshold = 0.1, exclude = c('RSC', 'LEC'))
[1] "Running gene exclusions..."
[1] "Filtering data by Slc17a7 at threshold of 0.1..."
[1] "Updating metadata..."

This step populates the @filteredData slot with a data frame that contains only the cells whose Slc17a7 value is greater than 0.1.

It also drops the filtering gene itself, along with any other columns you choose to exclude. Here we have excluded the viral injection data (RSC and LEC) because they are not indicative of gene expression.

head(myobj@filteredData)
       Nnat      Synpr      Pcp4      Cdh9      Ctgf    Slc17a6 Lxn   Slc30a3
1 0.0000000  0.0000000 0.0000000 0.0000000 0.0000000 1.00420555   0 0.0000000
2 0.0000000  0.2438931 0.4888655 0.2438931 0.4888655 0.73275862   0 0.0000000
3 0.1613582  0.2426580 0.8893321 0.3233371 0.3233371 0.08067912   0 0.4040162
4 0.0000000 12.5748426 0.0000000 0.0000000 0.0000000 0.00000000   0 0.0000000
5 0.0000000  1.0240387 0.0000000 0.0000000 0.0000000 0.25628585   0 0.0000000
6 0.1633744  0.4910062 2.2907735 0.6543805 0.0000000 0.32763183   0 0.0000000
      Gfra1     Spon1      Gnb4 id
1 0.3347352 0.6694704 0.6694704  3
2 0.0000000 0.0000000 0.0000000  4
3 0.0000000 3.1520710 2.0207016  5
4 1.3976102 0.0000000 0.0000000  6
5 0.0000000 0.0000000 0.0000000  7
6 0.0000000 0.0000000 0.0000000  8
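
To make the filtering logic concrete, here is a minimal base R sketch of the same idea on a toy data frame. It is not the internal code of ruFilter(); the table `expr`, its values, and the helper variable names are made up for illustration, and only the gene name, threshold, and excluded columns are taken from the call above.

```r
# Toy expression table (made-up values), standing in for the raw data.
expr <- data.frame(
  id      = 1:5,
  Slc17a7 = c(0.00, 0.05, 0.40, 1.20, 0.10),
  Nnat    = c(0.2, 0.0, 0.1, 0.0, 0.3),
  RSC     = c(0, 0, 1, 0, 0),
  LEC     = c(0, 1, 0, 0, 0)
)

filter_gene <- "Slc17a7"
threshold   <- 0.1
exclude     <- c("RSC", "LEC")

# Keep rows whose filter-gene value is strictly above the threshold.
keep <- expr[[filter_gene]] > threshold

# Drop the filter gene and the excluded columns from the filtered table.
drop_cols <- c(filter_gene, exclude)
filtered  <- expr[keep, !(names(expr) %in% drop_cols), drop = FALSE]

print(filtered)
```

Note that a cell sitting exactly at the threshold is removed in this sketch, because the comparison is strict.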

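This step also moves any excluded variables into the @metaData slot and creates a new $fil column that records the filtering result for each cell. In the output below, TRUE marks cells that passed the filter and are kept in @filteredData (ids 3 onward), while FALSE marks cells that were filtered out (ids 1 and 2).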

head(myobj@metaData)
        X      Y id region   anum section RSC LEC   fil
1 134.920 33.948  1    cla 123456       1   0   0 FALSE
2 167.766 32.075  2    cla 123456       1   0   0 FALSE
3 233.271 35.724  3    cla 123456       1   0   0  TRUE
4 279.353 34.970  4    cla 123456       1   0   0  TRUE
5 363.526 37.364  5    cla 123456       1   0   0  TRUE
6 412.295 32.073  6    cla 123456       1   0   0  TRUE
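
A logical column like $fil is convenient for counting or subsetting cells. The following sketch uses a small made-up metadata table rather than a real object, and it assumes, as the output above suggests, that TRUE means the cell was kept.

```r
# Toy metadata table (made-up values) with a logical `fil` column,
# where TRUE is assumed to mean "kept by the filter".
meta <- data.frame(
  id     = 1:6,
  region = "cla",
  fil    = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# How many cells were kept versus removed?
n_kept    <- sum(meta$fil)
n_removed <- sum(!meta$fil)

# ids of the retained cells, e.g. for matching back to a filtered table.
kept_ids <- meta$id[meta$fil]

# Metadata rows for the retained cells only.
meta_kept <- meta[meta$fil, ]
```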

Normalization and PCA with ruProcess()

This function normalizes the data (using either “log” or “PAC” normalization) and then runs a PCA.

myobj <- ruProcess(myobj)
[1] "Normalizing the data..."
[1] "Running PCA..."
[1] "Updating metadata..."

After this step, the @filteredData slot holds the normalized data. The attributes are also updated to include the PCA itself (@attributes$pca) and the number of PCs that together account for ~95% of the variance in the data (@attributes$npc).
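
The idea of picking the number of PCs from a variance cutoff can be sketched with base R alone. This is an illustration under assumptions, not the code inside ruProcess(): the toy matrix is random, log(1 + x) is assumed as the log normalization, and centering and scaling before prcomp() is a choice made here.

```r
set.seed(1)

# Toy non-negative expression matrix: 50 cells x 6 genes (random values).
expr <- matrix(rexp(300, rate = 1), nrow = 50, ncol = 6,
               dimnames = list(NULL, paste0("gene", 1:6)))

# Assumed log normalization: log(1 + x) keeps zeros at zero.
norm <- log1p(expr)

# PCA on centered, unit-variance columns.
pca <- prcomp(norm, center = TRUE, scale. = TRUE)

# Proportion of variance per PC and its running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var       <- cumsum(var_explained)

# Smallest number of PCs whose cumulative variance reaches 95%.
npc <- which(cum_var >= 0.95)[1]
```

Here `npc` plays the role that @attributes$npc is described as playing above: the count of leading components needed to reach the cutoff.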

We can visualize this using plotVar(). The red line indicates the number of PCs that will automatically be used if the user doesn’t override this choice.

plotVar(myobj)
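
For readers who want a similar picture without the package, the sketch below draws a cumulative variance curve with a red vertical line at the cutoff, using base graphics on random toy data. Whether plotVar() shows cumulative or per-component variance is not stated here, so treat the layout as an assumption.

```r
set.seed(1)

# Random toy data, log(1 + x) normalized, then PCA (all assumptions).
expr <- matrix(rexp(300), nrow = 50, ncol = 6)
pca  <- prcomp(log1p(expr), center = TRUE, scale. = TRUE)

# Cumulative proportion of variance and the 95% cutoff.
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
npc     <- which(cum_var >= 0.95)[1]

# Cumulative variance per component, with a red line at the chosen PC count.
plot(seq_along(cum_var), cum_var, type = "b",
     xlab = "Principal component",
     ylab = "Cumulative proportion of variance",
     ylim = c(0, 1))
abline(v = npc, col = "red")
```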
