The main qpcr functions and classes
These are the principal qpcr functions and their underlying classes that carry the main user API.
Function API
qpcr.DataReader
This is the qpcr.DataReader that serves as a general Hub for the qpcr.Readers and allows versatile file reading.
Reading Data Files
Setting up the DataReader is straightforward and works just like setting up any other class in qpcr.
import qpcr

reader = qpcr.DataReader()
# now we can read the file using
assay = reader.read( some_file )
If the file contains multiple assays, we likely want to set multi_assay = True:
assays = reader.read( some_file, multi_assay = True )
If our file contains both “assays-of-interest” and “normaliser” assays, we have to decorate the datafile (or manually sort the read assays in our script). To learn more about pre-processing files so that qpcr can automatically read your setup, check out the Decorator tutorial.
# if we have a decorated file, then we can simply do
assays, normalisers = reader.read( some_file, multi_assay = True, decorator = True )
Reading multiple files
The DataReader is able to read multiple files successively when passed a list.
Because the DataReader functions as a wrapper around the qpcr.Readers, it saves computation by setting up a suitable Reader once and then re-using that same Reader for all successive files. If your files are differently formatted, you may supply reset = True to force the DataReader to set up a new Reader for each file it reads.
many_files = [ ... ]
assays = reader.read( many_files, reset = True )
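To illustrate what reset changes, here is a minimal standalone sketch of a wrapper that caches one Reader and re-uses it. All class names (CsvReader, ExcelReader, ReaderHub) are hypothetical and are not part of qpcr, and choosing the Reader by file suffix is purely an assumption for this toy example; the DataReader's actual setup logic may look quite different.

```python
class CsvReader:
    """Hypothetical stand-in for a dedicated Reader; not a qpcr class."""

    def read(self, filename: str) -> str:
        return f"csv data from {filename}"


class ExcelReader:
    """Hypothetical stand-in for a second kind of Reader."""

    def read(self, filename: str) -> str:
        return f"excel data from {filename}"


class ReaderHub:
    """Toy wrapper that caches one Reader and re-uses it unless reset=True."""

    def __init__(self):
        self._reader = None
        self.setups = 0  # counts how often a Reader had to be set up

    def _setup_reader(self, filename: str):
        # assumption: the file suffix decides which Reader fits the file
        self.setups += 1
        return ExcelReader() if filename.endswith(".xlsx") else CsvReader()

    def read(self, filenames: list, reset: bool = False) -> list:
        data = []
        for filename in filenames:
            # only set up a new Reader if none exists yet or reset was requested
            if reset or self._reader is None:
                self._reader = self._setup_reader(filename)
            data.append(self._reader.read(filename))
        return data


same_format = ReaderHub()
same_format.read(["a.csv", "b.csv"])

mixed_format = ReaderHub()
mixed = mixed_format.read(["a.csv", "b.xlsx"], reset=True)

print(same_format.setups, mixed_format.setups)  # 1 2
```

Without reset the second file is read by whatever Reader was set up for the first one, which is cheap but only works if both files share the same format.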
Using dedicated qpcr.Readers
Not all datafiles will be (easily) readable by the qpcr.DataReader. This is not necessarily because the files are bad, but simply because the DataReader makes use of mostly default Readers.
Hence, it may be that your file will not be readable by the DataReader but will be readable by a dedicated Reader such as a BigTableReader for instance. Check out the qpcr.Readers for more details.
- class qpcr.main.DataReader.DataReader[source]
Bases: qpcr._auxiliary._ID
Handles reading a single file containing input data for qpcr.
Note
This is a top-level class that is designed as the central port through which data is read into qpcr.Assay objects from both regular and irregular, single- and multi-assay files. This is the suggested way to read your data for most users, which should work in most cases.
However, due to the automated setup of the inferred Readers there may be cases where you will either have a hard time or be unable to read your datafiles using the DataReader. In such cases, don’t try too long to make it work with the DataReader, just use one of the qpcr.Readers or even qpcr.Parsers directly.
- read(filename: str, multi_assay: bool = False, big_table: bool = False, decorator: Optional[bool] = None, reset=False, **kwargs)[source]
Reads an input file and extracts available datasets using the specified Reader or by setting up an appropriate Reader.
- Parameters
filename (str) – A filepath to an input datafile.
multi_assay (bool) – Set to True if the file contains multiple assays you wish to read.
big_table (bool) – Set to True if the file is a “Big Table” file. Check out the documentation of the qpcr.Readers for more information on “Big Table” files.
decorator (str or bool) – Set if the file is decorated. This can be set either to True for multi_assay and multi-sheet (excel) or big_table files, or it can be set to a valid qpcr decorator for single assay files or single-sheet files. Check out the documentation of the qpcr.Parsers for more information on decorators.
reset (bool) – If multiple input files should be read but they do not all adhere to the same filetype / datastructure, use reset = True to set up a new Reader each time read is called.
**kwargs – Any additional keyword arguments to be passed to the core Reader. Note that while this tries to be as versatile as possible, there is a limit to the customizability through the kwargs. If you require streamlined data reading, use dedicated qpcr.Readers and/or qpcr.Parsers directly.
- Returns
Either a single qpcr.Assay object or a list thereof. In case of a decorated file, two lists will be returned, one for assays and one for normalisers.
- Return type
assays
- read_bigtable(filename: str, kind: str, decorator: bool = True, assay_col: Optional[str] = None, id_col: Optional[str] = None, ct_col: Optional[str] = None, reset: bool = False, **kwargs)[source]
Reads a single BigTable datafile.
- Parameters
filename (str) – A filepath to an input datafile.
kind (str) – Specifies the kind of Big Table from the file. This may either be “horizontal”, “vertical”, or “hybrid”.
decorator (str or bool) – Set if the file is decorated. This can be set either to True for multi_assay and multi-sheet (excel) or big_table files, or it can be set to a valid qpcr decorator for single assay files or single-sheet files. Check out the documentation of the qpcr.Parsers for more information on decorators.
assay_col (str) – The column header specifying the assay identifiers.
id_col (str) – The column header specifying the replicate identifiers (or “assays” in case of horizontal big tables).
ct_col (str) – The column header specifying the Ct values.
reset (bool) – If multiple input files should be read but they do not all adhere to the same filetype / datastructure, use reset = True to set up a new Reader each time read is called.
**kwargs – Any additional keyword arguments to be passed to the core Reader. Note that while this tries to be as versatile as possible, there is a limit to the customizability through the kwargs. If you require streamlined data reading, use dedicated qpcr.Readers and/or qpcr.Parsers directly.
- Returns
Two lists will be returned, one for assays and one for normalisers. In case of a non-decorated file, the second (normaliser) list is empty.
- Return type
assays
- read_multi_assay(filename: str, decorator: bool = True, reset: bool = False, **kwargs)[source]
Reads a single irregular multi-assay datafile.
- Parameters
filename (str) – A filepath to an input datafile.
decorator (str or bool) – Set if the file is decorated. This can be set either to True for multi_assay and multi-sheet (excel) or big_table files, or it can be set to a valid qpcr decorator for single assay files or single-sheet files. Check out the documentation of the qpcr.Parsers for more information on decorators.
reset (bool) – If multiple input files should be read but they do not all adhere to the same filetype / datastructure, use reset = True to set up a new Reader each time read is called.
**kwargs – Any additional keyword arguments to be passed to the core Reader. Note that while this tries to be as versatile as possible, there is a limit to the customizability through the kwargs. If you require streamlined data reading, use dedicated qpcr.Readers and/or qpcr.Parsers directly.
- Returns
Two lists will be returned, one for assays and one for normalisers. In case of a non-decorated file, the second (normaliser) list is empty.
- Return type
assays
- store()[source]
Will store the read data.
Note
DataReader does NOT have a specific data storage facility to distinguish between assays / normalisers, data types, etc. It simply keeps a dictionary of {filename : data} that can be accessed. This is designed in case multiple files should be read using the same DataReader to allow an easier access of the data in case the data outputs are of the same type.
However, the main intended application of DataReader is to use the read method’s returned data directly.
qpcr.Assay
This is the qpcr.Assay class whose job is to store qPCR datasets. It is a central data-handling class in qpcr.
Setting up a qpcr.Assay
Here is a manual example of creating a qpcr.Assay object. You can use either the qpcr.DataReader or any one of qpcr.Readers directly to
read in your data and generate a pandas DataFrame. Note, the qpcr.Readers are already equipped with make_Assay(s) methods that will handle
setting up qpcr.Assay objects for you.
However, setting up a qpcr.Assay manually can be as simple as:
# get the dataframe from one of the qpcr.Readers
mydata = some_reader.get()
assay = qpcr.Assay( df = mydata, id = "my_assay" )
If your replicate identifiers are the same for all replicates within each group, then the groups are inferred automatically and your assay is already ready to be passed to a qpcr.Analyser. If not, you can specify the replicates manually like this:
# manually specify triplicates during setup
assay = qpcr.Assay( df = mydata, id = "my_assay", replicates = 3 )
# or you can change the replicates after initial setup like
assay = qpcr.Assay( df = mydata, id = "my_assay" )
assay.group( replicates = 3 )
We can now actually interact with the qpcr.Assay. Assays support direct item setting, getting, and deleting on their dataframes.
# we could for instance fill a new column with only ones
assay[ "my_new_column" ] = 1
# or get the id column from the assay
ids = assay[ "id" ]
Specifying (Groups of) Replicates
The groups are essential to analysing our data, so qpcr needs to know about how the data is grouped. By the way, if you are unfamiliar with “groups” check out this
Here’s the best part: usually we don’t need to do anything here, because qpcr.Assay objects are able to infer the groups of replicates in your data
automatically from the replicate identifiers (yeah!). However, you will be asked to manually provide replicate settings in case this fails.
In case you want to / have to manually specify replicate settings, a qpcr.Assay accepts an input replicates which is where you can specify this information.
This input can be either an integer, a tuple, or a string. Why’s that?
Well, normally we perform experiments as “triplicates”, or “duplicates”, or whatever multiplets.
Hence, if we always have the same number of replicates in each group (say all triplicates) we can simply specify this number as replicates = 3.
However, some samples might only be done in unicates (such as the diluent sample), while others are triplicates.
In these cases your dataset does not have uniformly sized groups of replicates and a single number will not do to describe the groups of replicates.
For these cases you can specify the number of replicates in each group separately as a tuple such as replicates = (3,3,3,3,1) or as a string “formula”
which allows you to avoid repeating the same number of replicates many times like replicates = "3:4,1", which will translate into the same tuple as we specified manually.
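To make the formula notation concrete, here is a minimal standalone sketch of how such a string could be expanded into a tuple of group sizes. The helper name expand_replicates is hypothetical and not part of qpcr; the package's own parsing may differ in its details.

```python
def expand_replicates(formula: str) -> tuple:
    """Expand a replicate formula such as "3:4,1" into a tuple of group sizes."""
    sizes = []
    for part in formula.split(","):
        # each part is either "n" or "n:m"; a missing ":m" means the pattern occurs once
        n, _, m = part.strip().partition(":")
        sizes.extend([int(n)] * (int(m) if m else 1))
    return tuple(sizes)


print(expand_replicates("3:4,1"))  # (3, 3, 3, 3, 1)
```

The string form mostly pays off for larger plates, where writing out every group size by hand becomes tedious and error-prone.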
- class qpcr.main.Assay.Assay(df: pandas.DataFrame, id: Optional[str] = None, replicates: Optional[int] = None, group_names: Optional[list] = None)[source]
Bases: qpcr._auxiliary._ID
The central storing unit of single datasets that were read from datafiles. A qpcr.Assay stores the replicate identifiers and Ct values, and also groups these according to the replicates information (which is automatically inferred by default). Groups of replicates can be arbitrarily renamed by the user.
Note
The new implementation of the qpcr.Assay works directly with a DataFrame that was generated by any one of the qpcr.Readers or qpcr.Parsers.
- Parameters
df (pandas.DataFrame) – A DataFrame produced by one of the qpcr.Readers containing an id column for the replicate identifiers and a Ct value column.
id (str) – The identifier of the assay (the Assay name, essentially).
replicates (int or tuple or str) – Can be an integer (equal group sizes, e.g. 3 for triplicates), or a tuple (uneven group sizes, e.g. (3,2,3) if the second group is only a duplicate). Another method to achieve the same thing is to specify a “formula” as a string of how to create a replicate tuple. The allowed structure of such a formula is n:m, where n is the number of replicates in a group and m is the number of times this pattern is repeated (if no :m is specified :1 is assumed). See qpcr.Assay.replicates for an example.
group_names (list) – A list of names to use for the replicates groups. If replicates of the same group share the same identifier, then the group will be inferred automatically. Otherwise, default group names will be set if no group_names are provided.
- property Ct
- Returns
Ct – A pandas Series with the assay’s Ct values. The column is renamed from “Ct” to the assay’s id.
- Return type
pandas.Series
- add_dCt(dCt: pandas.Series)[source]
Adds results from Delta-Ct (first Delta-Ct performed by a qpcr.Analyser).
- Parameters
dCt (pandas.Series) – A pandas Series of Delta-Ct values that will be stored in a column “dCt”. Note, that each Assay can, of course, only store one single Delta-Ct column.
- add_ddCt(normaliser_id: str, ddCt: pandas.Series)[source]
Adds results from Delta-Delta-Ct (“normalisation” performed by a qpcr.Normaliser). These will be stored in a column named “rel_{normaliser_id}”. Hence, an Assay can store an arbitrary number of Delta-Delta-Ct columns against an arbitrary number of different normalisers.
- Parameters
normaliser_id (str) – The id of the normaliser Assay used to compute the Delta-Delta-Ct values.
ddCt (pandas.Series) – A pandas Series of Delta-Delta-Ct values.
- adopt(df: pandas.DataFrame)[source]
Adopts an externally computed dataframe as its own. This is supposed to be used when setting up new qpcr.Assay objects that do not inherit data from one of the qpcr.Readers. If you wish to alter an existing qpcr.Assay use force = True. When doing this, please, make sure to retain the proper data structure!
- Parameters
df (pd.DataFrame) – A pandas DataFrame.
- boxplot(mode: Optional[str] = None, **kwargs)[source]
A shortcut to call a qpcr.Plotters.ReplicateBoxPlot plotter to visualise the loaded replicates.
- Parameters
mode (str) – The plotting mode. May be either “static” (matplotlib) or “interactive” (plotly).
**kwargs – Any additional keyword arguments to be passed to the plotter.
- Returns
fig – The figure generated by ReplicateBoxPlot.
- Return type
plt.figure or plotly.figure
- property columns
- property dCt
- Returns
dCt – A pandas Series with the computed Delta-Ct values. The column is renamed from “dCt” to the assay’s id.
- Return type
pandas.Series
- property data_cols
- Returns
A list of all non-setup columns in the dataframe.
- Return type
cols
- property ddCt
- Returns
ddCt – A pandas DataFrame with all Delta-Delta-Ct values that the Assay has stored. All “rel_{}” columns are renamed to include the assay id to “{id}_rel_{}”.
- Return type
pandas.DataFrame
- property ddCt_cols
- Returns
A list of all rel_{} columns within the Assay’s dataframe.
- Return type
cols
- efficiency(eff: Optional[float] = None)[source]
Gets or sets the amplification efficiency of the Assay.
- Parameters
eff (float) – A new efficiency to assign to the assay.
- Returns
The currently assigned efficiency.
- Return type
float
- get(copy: bool = False)[source]
- Parameters
copy (bool) – If True returns a deepcopy of the stored dataframe.
- Returns
data – The stored dataframe
- Return type
pandas.DataFrame
- group(replicates: Optional[int] = None, infer_names=True)[source]
Groups the data according to replicates-settings specified.
- Parameters
replicates (int or tuple or str) – The replicate settings after which to group the Assay. This will just get forwarded to the replicates method, so there is no need to specify replicates here if the replicates method has already been called. See the documentation of the Assay.replicates method for more details.
infer_names (bool) – Try to infer names of replicate groups based on the individual replicate sample identifiers. Note that this only works if all replicates have an identical sample name!
- groups(as_set=True)[source]
- Parameters
as_set (bool) – If as_set = True (default) it returns a set (as list without duplicates) of assigned group identifiers for replicate groups. If as_set = False it returns the full group column (including all repeated entries).
- Returns
groups – The given numeric group identifiers of all replicate groups.
- Return type
list
- ignore(entries: tuple, drop=False)[source]
Remove lines based on index from the dataframe. This is useful when removing corrupted data entries.
- Parameters
entries (tuple) – Tuple of row indices from the dataframe to drop.
drop (bool) – If True the provided entries will be entirely removed from the dataset. If False, the ignored entries will be set to NaN.
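The difference between the two drop settings can be sketched on a plain list of Ct values. This is only an illustration under the assumption that entries are addressed by their position; the function below is a hypothetical stand-in and does not touch a qpcr.Assay or its dataframe.

```python
import math


def ignore(values: list, entries: tuple, drop: bool = False) -> list:
    """Hypothetical stand-in for the behaviour described above, on a plain list."""
    if drop:
        # remove the entries entirely, which shortens the dataset
        return [v for i, v in enumerate(values) if i not in entries]
    # keep the layout intact and only blank out the ignored values
    return [math.nan if i in entries else v for i, v in enumerate(values)]


cts = [20.1, 20.3, 35.9, 20.2]  # the third value is an obvious outlier
masked = ignore(cts, (2,))
dropped = ignore(cts, (2,), drop=True)

print(len(masked), len(dropped))  # 4 3
```

Setting values to NaN keeps the number of replicates per group unchanged, which matters as soon as replicates are matched by position.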
- n()[source]
- Returns
The number of entries (individual replicates) within the Assay.
- Return type
int
- names(as_set=True)[source]
- Parameters
as_set (bool) – If as_set = True (default) it returns a set (as list without duplicates) of assigned group names for replicate groups. If as_set = False it returns the full group_name column (including all repeated entries).
- Returns
names – The given group names of all replicate groups.
- Return type
list or pd.Series
- rename(names: list)[source]
Replaces the current names of the replicate groups (stored in the “group_name” column).
- Parameters
names (list or dict) – Either a list (new names without repetitions) or dict (key = old name, value = new name) specifying new group names. Group names only need to be specified once, and are applied to all replicate entries.
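As a rough sketch of the two accepted inputs, the following standalone function renames entries of a plain list that stands in for the “group_name” column. The function rename_groups and the example group names are made up for illustration, and matching a list of new names to the old ones by order of first appearance is an assumption.

```python
def rename_groups(group_names: list, names) -> list:
    """Rename group labels given either a list of new names or an old -> new dict."""
    if isinstance(names, dict):
        mapping = names
    else:
        # unique current names in order of first appearance
        current = list(dict.fromkeys(group_names))
        if len(current) != len(names):
            raise ValueError("need exactly one new name per group")
        mapping = dict(zip(current, names))
    # names that are not in the mapping are kept unchanged
    return [mapping.get(name, name) for name in group_names]


column = ["group_a", "group_a", "group_b", "group_b"]
by_list = rename_groups(column, ["ctrl", "KO"])
by_dict = rename_groups(column, {"group_b": "KO"})

print(by_list)  # ['ctrl', 'ctrl', 'KO', 'KO']
print(by_dict)  # ['group_a', 'group_a', 'KO', 'KO']
```

A dict is handy when only some of the groups should be renamed, while a list has to cover every group.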
- rename_cols(cols: dict)[source]
Renames columns according to a dictionary as key -> value.
- Parameters
cols (dict) – A dictionary specifying old column names (keys) and new colums names (values).
- replicates(replicates: Optional[int] = None)[source]
Either sets or gets the replicates settings to be used for grouping. Before they are assigned, replicates are vetted to ensure they cover all data entries.
- Parameters
replicates (int or tuple or str) –
Can be an integer (equal group sizes, e.g. 3 for triplicates), or a tuple (uneven group sizes, e.g. (3,2,3) if the second group is only a duplicate). Another method to achieve the same thing is to specify a “formula” as a string of how to create a replicate tuple. The allowed structure of such a formula is n:m, where n is the number of replicates in a group and m is the number of times this pattern is repeated (if no :m is specified :1 is assumed).
So, as an example, if there are 12 groups which are triplicates, but at the end there is one which only has a single replicate (like the commonly measured diluent qPCR sample), we could either specify the tuple individually as replicates = (3,3,3,3,3,3,3,3,3,3,3,3,1) or we use the formula to specify replicates = "3:12,1". Of course, this works for any arbitrary setting such as "3:5,2:5,10,3:12" (which specifies five triplicates, followed by five duplicates, a single decaplicate, and twelve triplicates again, truly a dataset from another dimension).
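What “covering all data entries” could mean in practice is sketched below. This is an assumption about the kind of check involved, not the actual vetting code of qpcr, and covers_all_entries is a hypothetical name.

```python
def covers_all_entries(replicates, n_entries: int) -> bool:
    """Check whether the replicate settings account for every data entry."""
    if isinstance(replicates, int):
        # equal group sizes: the entries must split evenly into groups
        return replicates > 0 and n_entries % replicates == 0
    # uneven group sizes: the sizes must add up to the number of entries
    return sum(replicates) == n_entries


n_entries = 12 * 3 + 1  # twelve triplicates plus one single diluent sample

print(covers_all_entries(3, n_entries))                 # False
print(covers_all_entries((3,) * 12 + (1,), n_entries))  # True
```

For the diluent example a plain replicates = 3 cannot work, because 37 entries do not split into triplicates.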
- save(filename: str)[source]
Saves the data from the Assay to a csv file.
- Parameters
filename (str) – The filename into which the assay should be stored. If this is a directory, then the assay id will automatically be used as filename.
- stack(n: int = 2)[source]
Expands the dataframe entry-wise n times.
- Parameters
n (int) – The number of stacks to produce. 1 stack will introduce one more copy of each replicate. Note, n == 1 will keep the current entries!
- tile(n: int = 1)[source]
Expands the dataframe to the square number of entries for each group. This is useful for combinatoric normalisation wherein each replicate is normalised against each replicate group-wise from the normaliser, instead of only its supposed partner value.
- Parameters
n (int) – The number of tiles to produce. By default 1 tile will effectively square the number of entries within the dataframe.
qpcr.Analyser
This is the qpcr.Analyser whose function is to perform dataset-internal normalisation to compute the first-step Delta-Ct values within a qpcr.Assay object.
Computing Delta-Ct values
Setting up a qpcr.Analyser is really easy and we see it in virtually every GitHub tutorial.
analyser = qpcr.Analyser()
# and now directly pipe the data through
some_assays = analyser.pipe( some_assays )
Alternatively, we can also directly use the qpcr.delta_ct function that will call on an Analyser for us.
some_assays = qpcr.delta_ct( some_assays )
Delta-Ct values
The computed values are stored in the respective qpcr.Assay dataframe into a column called "dCt".
By default, the qpcr.Analyser will compute the Delta-Ct values already in exponential form, i.e. as \(eff^{-\Delta Ct}\), where \(eff = 2 \cdot efficiency\) is the duplication factor.
This behaviour can be changed by changing the applied function using the provided func method.
Check out this tutorial about custom anchors which will give you an idea of the flavour of editing the qpcr.Analyser.
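The two pre-defined Delta-Ct functions can be written down in a few lines. The following is a standalone sketch based on the formulas given in the documentation of the func method (dCt = eff ** ( -(s-r) ) and dCt = s-r); the function names are hypothetical and the code does not call qpcr.

```python
def delta_ct_exponential(s: float, r: float, efficiency: float = 1.0) -> float:
    """Exponential Delta-Ct of a replicate Ct value s against the anchor r."""
    eff = 2 * efficiency  # duplication factor; 2 for a perfectly efficient assay
    return eff ** (-(s - r))


def delta_ct_linear(s: float, r: float) -> float:
    """Linear Delta-Ct: the plain Ct difference between replicate and anchor."""
    return s - r


cts = [20.0, 21.0, 23.0]
anchor = cts[0]  # corresponds to the "first" anchor

print([delta_ct_exponential(ct, anchor) for ct in cts])  # [1.0, 0.5, 0.125]
print([delta_ct_linear(ct, anchor) for ct in cts])       # [0.0, 1.0, 3.0]
```

A custom function follows the same pattern: it takes the Ct value first and the anchor second, and returns a single float.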
- class qpcr.main.Analyser.Analyser[source]
Bases: qpcr._auxiliary._ID
Performs Single Delta-Ct (first normalisation within dataset against the anchor)
- DeltaCt(**kwargs)[source]
Calculates Delta-Ct for all groups within the dataframe. Any specifics such as anchor or func must have already been set using the respective methods prior to calling DeltaCt()!
- Parameters
**kwargs – Any additional keyword arguments that a custom DeltaCt function may require.
- anchor(anchor: Optional[str] = None, group: int = 0)[source]
Sets the anchor for DeltaCt for internal normalisation.
- Parameters
anchor (str or float or function) – The internal anchor for normalisation. This can be either “first” (default, the very first dataset entry), “mean” (mean of the reference group), “grouped” (first entry for each replicate group), any specified numeric value (as float), or a function that calculates the anchor and returns a single numeric value. If you wish to use a function to compute the anchor, you can access the dataframe stored by the qpcr.Assay that is being analysed through the data argument. data will be automatically forwarded to your custom anchor-function, unless you specify it directly. Please make sure your function can handle **kwargs because any kwargs supplied during DeltaCt()-calling will be passed per default to both the anchor-function and DeltaCt-function.
group (int or str) – The reference group identifier. This can be either the numeric identifier or the group_name. This is only used for anchor = “mean”. By default the first group is assumed.
- Returns
anchor – The currently selected anchor.
ref_group – The current reference group.
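To see how the built-in anchor settings differ, here is a standalone sketch that returns the anchor used for each entry of a small dataset. It works on plain lists instead of a qpcr.Assay, the function find_anchors is hypothetical, and custom anchor functions are left out for brevity.

```python
from statistics import mean


def find_anchors(cts: list, groups: list, anchor="first", ref_group=0) -> list:
    """Return the anchor value that each entry would be compared against."""
    if anchor == "first":
        # the very first dataset entry serves as anchor for everything
        return [cts[0]] * len(cts)
    if anchor == "mean":
        # the mean of the reference group serves as anchor for everything
        ref = mean(ct for ct, g in zip(cts, groups) if g == ref_group)
        return [ref] * len(cts)
    if anchor == "grouped":
        # every group is anchored on its own first entry
        first_of_group = {}
        for ct, g in zip(cts, groups):
            first_of_group.setdefault(g, ct)
        return [first_of_group[g] for g in groups]
    # otherwise treat the anchor as a fixed numeric value
    return [float(anchor)] * len(cts)


cts = [20.0, 22.0, 25.0, 27.0]
groups = [0, 0, 1, 1]

print(find_anchors(cts, groups, "first"))    # [20.0, 20.0, 20.0, 20.0]
print(find_anchors(cts, groups, "mean"))     # [21.0, 21.0, 21.0, 21.0]
print(find_anchors(cts, groups, "grouped"))  # [20.0, 20.0, 25.0, 25.0]
```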
- func(f: str)[source]
Sets the function to be used for DeltaCt (optional)
- Parameters
f (str or function) – The function to be used for DeltaCt computation. Pre-defined functions are either “exponential” (default, which uses dCt = eff ** ( -(s-r) )) or “linear” (which uses dCt = s-r), where s is any replicate entry in the dataframe and r is the anchor. eff = 2 * efficiency is the numeric duplication factor (the default assumed efficiency is 1). It is also possible to assign any defined function that accepts one float Ct value s (first) and one anchor value r (second), alongside any kwargs (which will be forwarded from DeltaCt()). It must also return a single float.
- get()[source]
- Returns
Assay – The analysed qpcr.Assay object that now contains Delta-Ct values.
- Return type
qpcr.Assay
- link(Assay: qpcr.main.Assay.Assay)[source]
Links a qpcr.Assay object to the Analyser.
- Parameters
Assay (qpcr.Assay) – A qpcr.Assay object containing data.
- pipe(Assay: qpcr.main.Assay.Assay, **kwargs)[source]
A quick one-step implementation of link + DeltaCt.
Note
This is the suggested application of the qpcr.Analyser.
- Parameters
Assay (qpcr.Assay) – A qpcr.Assay object to be linked to the Analyser for DeltaCt computation.
**kwargs – Any additional keyword arguments to be passed to the DeltaCt() method.
- Returns
assay – The same qpcr.Assay with computed Delta-Ct values.
- Return type
qpcr.Assay
qpcr.Normaliser
This is the qpcr.Normaliser class whose function is to compute Fold-Changes of Delta-Ct values
from assays and normalisers using qpcr.Assay objects.
Modes of normalisation
The qpcr.Normaliser supports custom functions for normalisation. However, it also comes with three built-in methods to normalise sample-assays against normalisers.
These are accessible via the mode argument of the qpcr.Normaliser.normalise method, which can be set to "pair-wise" (default), "combinatoric", or "permutative".
The default option “pair-wise” is computationally the fastest and will rigidly normalise replicates against their corresponding partner from the normaliser. I.e. first against first, second against second, etc. This mode is appropriate for multiplex qPCR experiments.
For qPCR reactions that were pipetted individually, there is no reason to strictly pair only first with first, second with second, etc. For these cases there are two other options: "combinatoric" and "permutative". "combinatoric" normalisation will calculate all possible group-wise combinations of a sample-assay replicate with all available normaliser replicates of the same group, i.e. first against first, and against second, etc. This will generate \(n^2\) values where \(n\) is the number of replicates within a group. This mode is appropriate for small-scale datasets but will substantially increase computation times for larger datasets.
"permutative" on the other hand will reflect the equivalence of replicates within a group through random permutations within the normaliser replicates. Hence, first may be normalised against first, or second, etc. This normalisation method may be used iteratively to increase the accuracy. Replacement during permutations is allowed (although disabled by default). If replacement is desired by the user, the probability of each replicate being chosen will be weighted based on a fitted normal distribution. This method is appropriate for larger datasets for which combinatoric normalisation is not desired.
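The three modes can be compared on a single group of replicates with a few lines of plain Python. This is a simplified sketch and not the qpcr implementation: the function names are hypothetical, the inputs are assumed to be exponential Delta-Ct values of one group, and replacement draws uniformly here instead of weighting by a fitted normal distribution.

```python
import random
from itertools import product


def pair_wise(sample: list, normaliser: list) -> list:
    """First against first, second against second, and so on."""
    return [s / n for s, n in zip(sample, normaliser)]


def combinatoric(sample: list, normaliser: list) -> list:
    """Every sample replicate against every normaliser replicate."""
    return [s / n for s, n in product(sample, normaliser)]


def permutative(sample: list, normaliser: list, k: int = 1,
                replace: bool = False, seed=None) -> list:
    """Shuffle the normaliser replicates k times and normalise pair-wise each time."""
    rng = random.Random(seed)
    values = []
    for _ in range(k):
        if replace:
            # simplification: uniform draws with replacement
            shuffled = rng.choices(normaliser, k=len(normaliser))
        else:
            shuffled = rng.sample(normaliser, len(normaliser))
        values.extend(pair_wise(sample, shuffled))
    return values


sample = [1.0, 2.0, 4.0]
normaliser = [1.0, 2.0, 4.0]

print(pair_wise(sample, normaliser))                     # [1.0, 1.0, 1.0]
print(len(combinatoric(sample, normaliser)))             # 9
print(len(permutative(sample, normaliser, k=2, seed=1)))  # 6
```

With n replicates per group, pair-wise yields n values, combinatoric n^2, and permutative n * k, which is why the latter scales better for large datasets.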
Normalising data
Setting up a qpcr.Normaliser is really easy and we see it in virtually every GitHub tutorial.
normaliser = qpcr.Normaliser()
# and now directly pipe the data through
results = normaliser.pipe( some_assays, some_normalisers )
Alternatively we can also directly use the qpcr.normalise function that will call on a Normaliser for us.
results = qpcr.normalise( some_assays, some_normalisers )
Preprocessing “normalisers”
The qpcr.Normaliser can work with multiple qpcr.Assay objects as “normaliser-assays”. However, to compute fold changes it requires a single set of numbers.
Therefore, the Normaliser first pre-processes all normaliser assays it receives. By default it will compute their mean and use this to compute fold changes.
However, the qpcr.Normaliser is equipped with a method prep_func which allows you to pass a custom function for preprocessing.
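The idea of combining several normalisers into one can be sketched as follows. The snippet assumes that “mean” refers to an entry-wise arithmetic mean across the normalisers' Delta-Ct columns; the function name and the two example columns are hypothetical, and a real prep_func would work on qpcr.Assay objects rather than plain lists.

```python
from statistics import mean


def combine_normalisers(dct_columns: list) -> list:
    """Average several Delta-Ct columns entry by entry into one pseudo-normaliser."""
    return [mean(values) for values in zip(*dct_columns)]


# two made-up normaliser assays with three replicates each
normaliser_1 = [1.0, 0.5, 2.0]
normaliser_2 = [1.0, 1.5, 4.0]

combined = combine_normalisers([normaliser_1, normaliser_2])
print(combined)  # [1.0, 1.0, 3.0]
```

A custom preprocessing function could swap the arithmetic mean for something else, for instance a geometric mean, as long as it ends up with one Delta-Ct value per entry.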
Normalisation
By default the qpcr.Normaliser will compute normalised fold changes by dividing the assays-of-interest by the pre-processed normaliser.
As a side note here, the qpcr.Analyser already stores the Delta-Ct values it computes in exponential form, as \(eff^{-\Delta Ct}\) with \(eff = 2 \cdot efficiency\).
Hence, by default the Delta-Ct values stored by qpcr.Assay objects after having been "analysed" are exponentials.
However, a custom normalisation function can also be specified using the method norm_func.
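Because the stored Delta-Ct values are exponentials, a simple division is all that is needed to arrive at the familiar \(2^{-\Delta\Delta Ct}\) fold change. The following sketch checks this numerically for one pair of values; the function fold_change is hypothetical and assumes both assays share the same efficiency.

```python
def fold_change(dct_sample: float, dct_normaliser: float, efficiency: float = 1.0) -> float:
    """Divide the exponential Delta-Ct of a sample by that of the normaliser."""
    eff = 2 * efficiency          # duplication factor
    s = eff ** (-dct_sample)      # exponential Delta-Ct of the assay-of-interest
    n = eff ** (-dct_normaliser)  # exponential Delta-Ct of the normaliser
    return s / n


fc = fold_change(3.0, 1.0)
print(fc)                        # 0.25
print(fc == 2 ** -(3.0 - 1.0))   # True
```

Here the linear Delta-Ct values are 3 and 1, so the Delta-Delta-Ct is 2 and the fold change is one quarter.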
- class qpcr.main.Normaliser.Normaliser[source]
Bases: qpcr._auxiliary._ID
Handles the second step in Delta-Delta-Ct (normalisation against normaliser assays).
Note
This requires that all assays and normalisers have been analysed in the same way before!
- get(copy=False)[source]
- Parameters
copy (bool) – Will return a deepcopy of the Results object if copy = True (default is copy = False).
- Returns
Results – A qpcr.Results object containing the normalised dataframe
- Return type
qpcr.Results
- link(assays: Optional[list] = None, normalisers: Optional[list] = None)[source]
Links either normalisers or assays-of-interest qpcr.Assay objects coming from the same qpcr.Analyser.
- Parameters
assays (list or tuple or qpcr.Analyser) – A list of qpcr.Assay objects coming from a qpcr.Analyser. These assays will be normalised against a normaliser.
normalisers (list or tuple) – A list of qpcr.Assay objects coming from a qpcr.Analyser. These assays will be used as normalisers. These will be combined into one single pseudo-normaliser which will then be used to normalise the assays. The method of combining the normalisers can be specified using the qpcr.Normaliser.prep_func method.
- norm_func(f=None)[source]
Sets any defined function to perform normalisation of assays against normalisers. If no f is provided, it returns the current norm_func.
- Parameters
f (function) –
The function may accept two qpcr.Assay objects (named assay and normaliser which will be forwarded from the qpcr.Normaliser). The function may also accept one pandas.DataFrame (named df) containing two numeric columns of delta-Ct values from a sample-assay (named “s”) and a normaliser-assay (named “n”), as well as a group identifier column (named “group”). Whatever inputs it works with, it must return a named numeric pandas.Series of the same length as entries in the Assays’ dataframes.
Note
Support for direct dataframe usage will be dropped at some point in the future.
By default s/n is used, where s is a column of sample-assay deltaCt values, and n is the corresponding “dCt” column from the normaliser.
- normalise(mode='pair-wise', **kwargs)[source]
Normalises all linked assays against the combined pseudo-normaliser (by default, unless a custom prep_func has been specified), and stores the results in a new Results object.
- Parameters
mode (str) – The normalisation mode to use. This can be either pair-wise (default), or combinatoric, or permutative. pair-wise will normalise replicates only by their partner (i.e. first against first, second by second, etc.). combinatoric will normalise all possible combinations of a replicate with all partner replicates of the same group from a normaliser (i.e. first against first, then second, then third, etc.). This will generate n^2 normalised Delta-Delta-Ct values, where n is the number of replicates in a group. permutative will scramble the normaliser replicates randomly and then normalise pair-wise. This mode supports a parameter k which specifies the times this process should be repeated, thus generating n * k normalised Delta-Delta-Ct values. Also, through setting replace = True replacement may be allowed during normaliser scrambling. Note, this setting will be ignored if a custom norm_func is provided.
**kwargs – Any additional keyword arguments that may be passed to a custom norm_func and prep_func (both will receive the kwargs!).
- pipe(assays: list, normalisers: list, mode='pair-wise', **kwargs)[source]
A wrapper for Normaliser.link and Normaliser.normalise
- Parameters
assays (list) – A list of qpcr.Assay objects.
normalisers (list) – A list of qpcr.Assay objects.
mode (str) – The normalisation mode to use. This can be either pair-wise (default), or combinatoric, or permutative. pair-wise will normalise replicates only by their partner (i.e. first against first, second by second, etc.). combinatoric will normalise all possible combinations of a replicate with all partner replicates of the same group from a normaliser (i.e. first against first, then second, then third, etc.). This will generate n^2 normalised Delta-Delta-Ct values, where n is the number of replicates in a group. permutative will scramble the normaliser replicates randomly and then normalise pair-wise. This mode supports a parameter k which specifies the times this process should be repeated, thus generating n * k normalised Delta-Delta-Ct values. Also, through setting replace = True replacement may be allowed during normaliser scrambling. Note, this setting will be ignored if a custom norm_func is provided.
**kwargs – Any additional keyword arguments that may be passed to a custom norm_func and prep_func (both will receive the kwargs!).
- Returns
results – A qpcr.Results object of the assembled results.
- Return type
qpcr.Results
- prep_func(f=None)[source]
Sets any defined function for combined normaliser pre-processing. If no f is provided, it returns the current prep_func.
- Parameters
f (function) – The function may accept one list of qpcr.Assay objects, and must return either a qpcr.Assay object directly or a pandas.DataFrame (that will be migrated to a qpcr.Assay). The returned dataframe must contain a “dCt” column which stores the delta-Ct values ultimately used as “normaliser assay”.
- prune(assays=True, normalisers=True, results=True)[source]
Will clear assays, normalisers, and/or results
- Parameters
assays (bool) – Will clear any sample assays if True (default).
results (bool) – Will clear any computed results if True (default).
normalisers (bool) – Will clear any normalisers if True (default).
qpcr.Results
This is the qpcr.Results class whose function is to accumulate results from various
qpcr.Assay objects and summarize them.
Setting up a qpcr.Results object
Since the Results are supposed to be a central collection hub it makes sense to know how to make them.
The setup is fairly simple. The qpcr.Results already provide a number of methods to directly add specific data
such as Delta-Delta-Ct values to their dataframes from qpcr.Assay objects. However, they also allow more generic
data manipulation through normal item setting, getting, and deleting.
An important first step is usually to adopt the experimental meta-data shared by the Assays.
This can be done using the setup_cols method which copies the id, group, and group_name columns from an Assay.
Once this is done, we can easily add more interesting data.
# initialize the Results
result = Results()
# make sure the metadata is present
result.setup_cols( some_assay )
# now copy actually interesting data
# for example Delta-Delta-Ct values
result.add_ddCt( some_assay )
# now we can continue to assemble data
# for instance with
for assay in a_list_of_assays:
    result.add_ddCt( assay )
# or directly
# result.add_ddCt( a_list_of_assays )
# and now summarize these
result.stats()
# and visualise
result.preview()
Alternatively, we might wish to make use of a Results object for data processing, where we want to assemble a set of Assays from different files into a single BigTable-like file.
For this we might only wish to store the Ct values and then save them to a new file.
# a list of many assays
many_assays = [...]
r = Results()
r.setup_cols( many_assays[0] )
r.add_Ct( many_assays )
# and now save the accumulated file
r.save( ... )
- class qpcr.main.Results.Results(id: Optional[str] = None)[source]
Bases: qpcr._auxiliary._ID
Handles a pandas dataframe for data and computed results from a qpcr class.
Note
This is a central data collection that can inherit directly from qpcr.Assay objects and from externally computed sources. Please note that it will not perform extensive vetting on its data input, so make sure to only provide proper data input when manually assembling your qpcr.Results!
- add(data: pandas.DataFrame, replace: bool = False)[source]
Adds some new data column.
Note
The column argument has to be named for this to work. However, there are already implemented methods dedicated to adding specifically Delta-Ct, Delta-Delta-Ct or just Ct values to the Results. In order to add a generic column from a numpy array or some other iterable just use default item setting (e.g. results["new column"] = [1,2,3,4]).
- Parameters
data (pd.Series or pd.DataFrame) – A named pandas Series or DataFrame that can be joined into the already stored dataframe. Note, a DataFrame may contain multiple columns.
replace (bool) – In case results from a computation with the same identifiers are already stored no new data can be stored under that id. Either the new data must be renamed or replace = True must be set to overwrite the presently stored data.
- add_Ct(assay: qpcr.main.Assay.Assay)[source]
Adds a “Ct” column with Ct values from a qpcr.Assay. It will store these as a new column using the Assay’s id as header.
- Parameters
assay (qpcr.Assay) – A qpcr.Assay object from which to import.
- add_comparisons(comp)[source]
Adds the results of a statistical evaluation of the stored Results in the form of a Comparison object.
- Parameters
comp – Either a Comparison or ComparisonsCollection object.
- add_dCt(assay: qpcr.main.Assay.Assay)[source]
Adds a “dCt” column with Delta-Ct values from a qpcr.Assay. It will store these as a new column using the Assay’s id as header.
- Parameters
assay (qpcr.Assay) – A qpcr.Assay object from which to import.
- add_ddCt(assay: qpcr.main.Assay.Assay)[source]
Adds all “rel_{}” columns with Delta-Delta-Ct values from a qpcr.Assay. It will store these as new columns using the Assay’s id + the _rel_{} composite id.
- Parameters
assay (qpcr.Assay) – A qpcr.Assay object from which to import.
- property columns
- property comparisons
Returns a Comparison object storing the results of statistical analysis that were performed (if any).
- property data_cols
- Returns
A list of all non-setup columns in the dataframe.
- Return type
cols
- property ddCt_cols
- Returns
A list of all {}_rel_{} columns within the Results’s dataframe, or their new names if drop_rel was performed.
- Return type
cols
- drop(*cols)[source]
Drops all specified columns from the dataframe. This is used for normaliser pre-processing. This is the same as calling Results.drop_cols.
- Parameters
*cols – Any column names (as str) to be dropped.
- drop_cols(*cols)[source]
Drops all specified columns from the dataframe. This is used for normaliser pre-processing. This is the same as calling Results.drop.
- Parameters
*cols – Any column names (as str) to be dropped.
- drop_groups(groups: list)[source]
Removes specific groups of replicates from the DataFrame.
- Parameters
groups (list) – Either the numeric group identifiers or the group names (a single one or an iterable thereof) of the groups to be removed, or a regex pattern defining which groups should be dropped. A regex pattern matches multiple group names at once, which is useful for systematically removing RT- groups etc.
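To illustrate how a regex pattern can select several groups at once, here is a minimal sketch using only the standard library. The helper name, the example group names, and the use of re.search are assumptions for illustration; the source does not state which matching function qpcr applies.

```python
import re

def groups_matching(group_names, pattern):
    """Return the group names that match the given regex pattern."""
    regex = re.compile(pattern)
    return [name for name in group_names if regex.search(name)]

# hypothetical group names, two of them RT- controls
names = ["ctrl", "ctrl RT-", "KO", "KO RT-"]

dropped = groups_matching(names, r"RT-$")
kept = [name for name in names if name not in dropped]
```

Here a single pattern removes both RT- groups while leaving the remaining groups in their original order.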
- drop_rel()[source]
Crops the X_rel_Y column names of Delta-Delta-Ct results to just X, i.e. reduces them back to the assay-of-interest name only.
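The renaming can be pictured with a small standalone sketch. It assumes that the part before the first "_rel_" separator is the assay-of-interest name; the helper name and the column names are made up for illustration.

```python
def crop_rel(column):
    # keep only the text before the first "_rel_" separator
    return column.split("_rel_", 1)[0]

# hypothetical Delta-Delta-Ct column names
columns = ["GeneA_rel_Actin", "GeneB_rel_Actin"]
cropped = [crop_rel(c) for c in columns]
```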
- groups(as_set=True)[source]
- Parameters
as_set (bool) – If as_set = True (default) it returns a set (as list without duplicates) of assigned group names for replicate groups. If as_set = False it returns the full group column (including all repeated entries).
- Returns
groups – The given numeric group identifiers of all replicate groups.
- Return type
list
- property is_empty
Checks if any results have been stored so far.
- Returns
True if NO data is yet stored, else False.
- Return type
bool
- merge(*Results, all_cols: bool = False)[source]
Merge any number of qpcr.Results objects into this one. The same can be achieved using the + operator.
Note
This operation will merge the columns of the Results’ dataframes!
- Parameters
*Results – An arbitrary number of qpcr.Results objects.
all_cols (bool) – Set to True to merge not only the Delta-Delta-Ct columns (_rel_ columns) but also any additional columns.
- names(as_set=True)[source]
- Parameters
as_set (bool) – If as_set = True (default) it returns a set (as list without duplicates) of assigned group names for replicate groups. If as_set = False it returns the full group_name column (including all repeated entries).
- Returns
names – The adopted group_names (only works if a qpcr.Assay has already been linked using adopt_names()!)
- Return type
list or None
- preview(kind: Optional[str] = None, mode: Optional[str] = None, **kwargs)[source]
A shortcut to call on a qpcr.Plotters.PreviewResults wrapper to visualise the results.
- Parameters
kind (str) – The kind of Plotter to call. This can be any of the four wrapped Plotters, e.g. kind = “GroupBars”. By default this will be “AssayBars”.
mode (str) – The plotting mode. May be either “static” (matplotlib) or “interactive” (plotly).
- Returns
fig – The figure generated by PreviewResults.
- Return type
plt.figure or plotly.figure
- rename(cols: dict)[source]
Renames columns according to a dictionary as key -> value. This is the same as calling Results.rename_cols.
- Parameters
cols (dict) – A dictionary specifying old column names (keys) and new column names (values).
- rename_cols(cols: dict)[source]
Renames columns according to a dictionary as key -> value. This is the same as calling Results.rename.
- Parameters
cols (dict) – A dictionary specifying old column names (keys) and new column names (values).
- save(path, df=True, stats=True)[source]
Saves a csv file for each specified type of results.
- Parameters
path (str) – Path has to be a filepath if only one type of results shall be saved (i.e. either df or stats), otherwise a path to the directory where both df and stats shall be saved.
df (bool) – Save the results dataframe containing all replicate values (the full results). Default is df = True.
stats (bool) – Save the results dataframe containing summary statistics for all replicate groups. Default is stats = True.
- setup_cols(obj: qpcr.main.Assay.Assay)[source]
Adopts the setup columns id, group, group_name from another object.
- Parameters
obj (qpcr.Assay or qpcr.Results or pd.DataFrame) – Either a qpcr.Assay or a qpcr.Results or a pandas DataFrame that has the given columns.
- stats(recompute=False, iqr_limits: Optional[tuple] = None, ci_level: float = 0.95)[source]
Computes summary statistics about the replicate groups: N (count), Mean, Median, StDev, IQR, and CI of all replicate groups, for all datasets (assays).
- Parameters
recompute (bool) – Statistics will only be computed once unless recompute is set to True. The same dataframe can be directly accessed via this method once it has been computed.
iqr_limits (tuple) – The lower and upper quantiles for the IQR computation. By default (0.25, 0.75).
- Returns
stats_df – A new dataframe containing the computed statistics for each replicate group.
- Return type
pd.DataFrame
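As a rough illustration of what these summary statistics mean for one replicate group, the following standalone sketch computes them with the standard library only. It is not qpcr's implementation: the linear-interpolation quantile and the normal-approximation confidence interval are assumptions made here for simplicity, and the function names are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def quantile(values, q):
    """Linearly interpolated quantile of a list of numbers (0 <= q <= 1)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower, upper = math.floor(pos), math.ceil(pos)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def group_stats(values, iqr_limits=(0.25, 0.75), ci_level=0.95):
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation, needs n >= 2
    iqr = quantile(values, iqr_limits[1]) - quantile(values, iqr_limits[0])
    # normal approximation of the confidence interval of the mean (an assumption)
    z = NormalDist().inv_cdf(0.5 + ci_level / 2)
    half_width = z * stdev / math.sqrt(n)
    return {
        "N": n,
        "Mean": mean,
        "Median": statistics.median(values),
        "StDev": stdev,
        "IQR": iqr,
        "CI": (mean - half_width, mean + half_width),
    }

# made-up replicate values of a single group
summary = group_stats([1.0, 2.0, 3.0, 4.0, 5.0])
```

For small replicate numbers a t-based interval would be wider than this normal approximation, so the CI in the sketch should be read as illustrative only.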
qpcr.Calibrator
This is the qpcr.Calibrator class that is able to compute qPCR amplification efficiencies from qpcr.Assay objects.
Amplification Efficiencies in qpcr
By default qpcr sets the amplification efficiency of a new qpcr.Assay to 1 (100%). However, it can be set to any percentage (also > 1) using the efficiency method of the qpcr.Assay.
qpcr stores the efficiency as percentage but actually calculates with \(2 \ \cdot \ efficiency\) when computing Delta-Ct values.
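A short numeric sketch may help to see what the factor of two implies. It assumes that a Delta-Ct value is computed in the common exponential form, base ** -(Ct - reference Ct) with base = 2 * efficiency; that exponential form and the function names are assumptions for illustration and are not confirmed by the text above.

```python
def amplification_base(efficiency):
    # an efficiency of 1 (100%) corresponds to a doubling per cycle
    return 2 * efficiency

def delta_ct(ct, ref_ct, efficiency=1.0):
    # assumed exponential Delta-Ct: base ** -(Ct - reference Ct)
    return amplification_base(efficiency) ** (-(ct - ref_ct))

perfect = delta_ct(22.0, 20.0)                  # base 2, two cycles later
reduced = delta_ct(22.0, 20.0, efficiency=0.9)  # base 1.8, two cycles later
```

Under these assumptions a lower efficiency gives a smaller base, so the same Ct difference translates into a smaller fold difference.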
Assigning an assay’s efficiency
The qpcr.Calibrator is dedicated to either computing the amplification efficiency of an assay or assigning existing efficiencies that have been calculated elsewhere.
In order to compute new efficiencies the Calibrator requires a set of decorated replicates that come from a dilution series. You can check out this tutorial to learn more.
If appropriate data is available we can use the qpcr.Calibrator to compute and assign new efficiencies by:
calibrator = qpcr.Calibrator()
assay = calibrator.calibrate( assay )
This will read the data, perform a linear regression to determine the efficiency, and assign the computed efficiency to the assay. Also, the efficiency is now stored by the Calibrator. The stored efficiencies can be easily saved to a file using:
calibrator.save( "my_efficiencies.csv" )
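The regression step can be sketched independently of qpcr. The example below uses the textbook standard-curve relation, where Ct is regressed on log10 of the relative concentration and the efficiency follows from the slope as 10 ** (-1 / slope) - 1. Whether the Calibrator uses exactly this formulation is an assumption; the function name and the data are made up.

```python
import math

def efficiency_from_dilutions(dilutions, cts):
    """Estimate the efficiency from a dilution series by least-squares regression.

    dilutions: relative concentrations (e.g. 1, 0.5, 0.25, ...)
    cts: the measured Ct value for each dilution
    """
    xs = [math.log10(d) for d in dilutions]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(cts) / len(cts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # amplification factor per cycle is 10 ** (-1 / slope); subtract 1 so that 1.0 means 100%
    return 10 ** (-1 / slope) - 1

# idealised 1 : 2 dilution series in which every step costs exactly one cycle
eff = efficiency_from_dilutions([1, 0.5, 0.25, 0.125], [20.0, 21.0, 22.0, 23.0])
```

In this idealised series the estimate comes out at 1, i.e. 100% in the convention used above.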
We usually do not have a dilution series in each of our datasets, however. So most often you will wish to assign efficiencies that have been already computed from other qPCR runs.
For these cases, the Calibrator can read a reference “database file” and assign existing efficiencies to new Assays as long as their id is present among the reference efficiencies.
# read already existing efficiencies from a file
calibrator.load( "my_efficiencies.csv" )
# and now simply assign an existing efficiency to the assay
assay = calibrator.assign( assay )
If we have both assays with existing efficiencies and such with new dilution series data, we can actually just use the Calibrator’s pipe method to process all of them at once.
many_assays = [ ... ]
calibrator.load( "my_efficiencies.csv" )
# pipe all assays, which will assign where possible, and calibrate anew where necessary (and data is available)
many_assays = calibrator.pipe( many_assays )
# finally, save the (now updated) database of efficiencies
# this will by default update the file "my_efficiencies.csv" which was already loaded.
calibrator.save()
- class qpcr.main.Calibrator.Calibrator[source]
Bases: qpcr._auxiliary._ID
Calculates qPCR primer efficiency based on a dilution series. The dilution series may either be represented as an entire assay or as a subset of groups within an assay denoted as calibrator : {some_name}. In this mode, calibrator replicates will be removed after calibration is done.
It is possible to specify the dilution steps directly in the groupnames as: calibrator: {some_name}: dil where dil is the inverse dilution step, e.g. calibrator: my_sample: 2 for a 1 : 2 dilution or calibrator: my_sample: 100 for a 1 : 100. Note, this will have to be present in each groupname!
Alternatively, if no dilution is specified in the groupnames or they cannot be inferred for some other reason, it is possible to supply a dilution step via the qpcr.Calibrator.dilution method.
- adopt(effs: dict)[source]
Adopts an externally generated dictionary of assay : efficiency structure as its own.
- Parameters
effs (dict) – A dictionary where keys are Assay Ids (str) and values are float efficiencies.
- assign(assay: qpcr.main.Assay.Assay, remove_calibrators: bool = True)[source]
Assigns an efficiency to a qpcr.Assay based on its Id. This requires that an efficiency corresponding to the Assay’s Id is present in the currently loaded / computed efficiencies.
- Parameters
assay (qpcr.Assay) – A qpcr.Assay object.
remove_calibrators (bool) – If calibrators are present in the assay alongside other groups, remove the calibrator replicates.
- calibrate(assay: qpcr.main.Assay.Assay, remove_calibrators: bool = True)[source]
Computes an efficiency from a qpcr.Assay object.
This method will try to compute a new efficiency. To do this, it will check autonomously if calibrator : {} replicates are present and use these for computation. If none are found it will assume the entire assay is to be used as calibrator.
Note
Calibrators are searched for through the group names not the replicate ids!
- Parameters
assay (qpcr.Assay) – A qpcr.Assay object.
remove_calibrators (bool) – If calibrators are present in the assay alongside other groups, remove the calibrator replicates after efficiency calculation.
- property computed_values
- Returns
The currently stored values from newly computed efficiencies.
- Return type
dict
- dilution(step: Optional[float] = None)[source]
Gets or sets the dilution steps used. This must be a float fraction, e.g. 0.5 for a 1 : 2 dilution series or 0.1 for a 1 : 10 series etc. If there are multiple steps because there is a gap in the dilution series, it is necessary to supply a step for each group individually, e.g. [1,0.5,0.25,0.0625,0.03125] if there are 5 dilution steps (originally six but 0.125 was discarded).
Note, both of the above also work with the inverse dilutions, e.g. 2 or [1,2,4,16,32].
By default the qpcr.Calibrator tries to infer the dilutions automatically. This only works, however, if the calibrator groupnames specify calibrator: {some name} : dil where dil is the inverse dilution step (e.g. calibrator: my_sample: 2 for a 1 : 2 dilution). Note, it is important that the dilution step is given as the inverse (i.e. not as 1:2 or 1/2 or something else!).
- Parameters
step (float or np.ndarray) – The dilution step used.
- Returns
dilution – The currently used dilution step.
- Return type
float or np.ndarray
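The group-name convention and the two equivalent ways of writing a dilution step can be sketched as follows. The regular expression and both helper names are hypothetical; they only mirror the calibrator: {some name} : dil pattern described above and are not taken from qpcr.

```python
import re

# hypothetical pattern for group names such as "calibrator: my_sample: 2"
CALIBRATOR = re.compile(
    r"^calibrator\s*:\s*(?P<name>[^:]+?)\s*:\s*(?P<dil>\d+(?:\.\d+)?)\s*$"
)

def inverse_dilution(group_name):
    """Return the inverse dilution step encoded in a group name, or None."""
    match = CALIBRATOR.match(group_name)
    return float(match.group("dil")) if match else None

def as_fraction(step):
    """Express a dilution step as a fraction, whether it was given as 0.5 or as 2."""
    return step if step <= 1 else 1 / step

steps = [as_fraction(s) for s in [1, 2, 4, 16, 32]]
```

The sketch treats any value above 1 as an inverse dilution, which is how the fraction list and the inverse list from the text end up describing the same series.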
- property efficiencies
- Returns
The currently stored efficiencies.
- Return type
dict
- get(which='efficiencies')[source]
- Returns
Either the stored efficiencies (if which = “efficiencies”) or the computed values of newly computed efficiencies (if which = “values”).
- Return type
dict
- load(filename, merge: bool = True, supersede: bool = False)[source]
Loads a csv file of previously computed efficiencies.
- Parameters
filename (str) – The filepath to load efficiencies from.
merge (bool) – In case efficiencies are already loaded, merge the new and existing ones. If False the current ones will be replaced completely.
supersede (bool) – In case efficiencies of the same assay are already loaded they will be overwritten by the newly incoming ones if supersede = True.
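How merge and supersede interact can be pictured with plain dictionaries. This is a sketch of the semantics described above, not the Calibrator's code; the function name and the assay ids are hypothetical, and keeping the existing value when supersede = False is an interpretation of the parameter description.

```python
def combine(existing, incoming, merge=True, supersede=False):
    if not merge:
        # the currently stored efficiencies are replaced completely
        return dict(incoming)
    combined = dict(existing)
    for assay_id, efficiency in incoming.items():
        # existing entries are only overwritten if supersede is set
        if supersede or assay_id not in combined:
            combined[assay_id] = efficiency
    return combined

existing = {"GeneA": 0.95, "GeneB": 0.90}
incoming = {"GeneB": 0.85, "GeneC": 1.0}

merged = combine(existing, incoming)
superseded = combine(existing, incoming, supersede=True)
replaced = combine(existing, incoming, merge=False)
```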
- merge(*filenames, outfile=None, adopt=True)[source]
Merges multiple efficiency files together into a single one.
- Parameters
filenames (iterable) – Filepaths to load data from which should be merged together.
outfile (str) – The filepath in which to store the merged efficiencies. Not saved if set to None.
adopt (bool) – Will adopt the merged dictionary as its own if True (default).
- Returns
all_effiencies – The merged dictionary of all efficiencies from all files.
- Return type
dict
- pipe(assay: qpcr.main.Assay.Assay, remove_calibrators: bool = True, ignore_uncalibrated: bool = False)[source]
A wrapper for calibrate / assign.
This method will first try to assign pre-computed efficiencies and if no matching ones are found it will try to calculate a new efficiency from the assay.
- Parameters
assay (qpcr.Assay) – A qpcr.Assay object.
remove_calibrators (bool) – If calibrators are present in the assay alongside other groups, remove the calibrator replicates after assignment or efficiency calculation.
ignore_uncalibrated (bool) – If True, assays that could neither be newly calibrated nor be assigned an existing efficiency will be ignored. Otherwise, an error will be raised.
- Returns
assay – The now calibrated qpcr.Assay.
- Return type
qpcr.Assay
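The assign-first, calibrate-second logic can be summarised in a few lines. The sketch below works on assay ids and a dictionary of efficiencies rather than on real qpcr.Assay objects; the function name, the calibrate callable, and the ValueError are hypothetical stand-ins.

```python
def pipe_sketch(assay_id, stored, calibrate, ignore_uncalibrated=False):
    """stored maps ids to efficiencies; calibrate returns an efficiency or None."""
    # 1) try to assign a pre-computed efficiency
    if assay_id in stored:
        return stored[assay_id]
    # 2) otherwise try to compute a new efficiency and remember it
    efficiency = calibrate(assay_id)
    if efficiency is not None:
        stored[assay_id] = efficiency
        return efficiency
    # 3) neither worked: ignore the assay or raise
    if ignore_uncalibrated:
        return None
    raise ValueError(f"no efficiency available for {assay_id!r}")

stored = {"GeneA": 0.95}
dilution_data = {"GeneB": 0.9}  # stands in for assays that carry calibrator replicates
calibrate = dilution_data.get

assigned = pipe_sketch("GeneA", stored, calibrate)
computed = pipe_sketch("GeneB", stored, calibrate)
ignored = pipe_sketch("GeneC", stored, calibrate, ignore_uncalibrated=True)
```

After these calls the newly computed efficiency for the second id is also present in the stored dictionary, mirroring how the Calibrator keeps what it computes.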
- plot(mode: Optional[str] = None, **kwargs)[source]
A shortcut to call a qpcr.Plotters.EfficiencyLines plotter to visualise the regression lines from de novo efficiency computations.
- Parameters
mode (str) – The plotting mode. May be either “static” (matplotlib) or “interactive” (plotly).
**kwargs – Any additional keyword arguments to be passed to the plotter.
- Returns
fig – The figure generated by EfficiencyLines.
- Return type
plt.figure or plotly.figure
- reset()[source]
Resets the Calibrator to initial settings. This will clear all stored efficiency values and computed data!
- save(filename: Optional[str] = None, mode: str = 'write')[source]
Saves the calculated efficiencies to a csv file.
- Parameters
filename (str) – The filepath in which to store the efficiencies. If a file was already loaded then by default the same file will be used to save values again.
mode (str) – Can be either “write” to fully overwrite an existing file, with the newly computed data, or “append” to only add newly computed efficiencies.