Method of summarising replicate fitness observations for individual genotypes.
Euclidean distance finds profiles that lie close together or overlap. Correlation finds profiles with the same pattern, even if they are offset from each other. Mahalanobis distance can be considered a combination of the two.
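The contrast between the three measures can be illustrated with a minimal sketch. It assumes each profile is a vector of fitness values, one per screen, that correlation distance is defined as 1 minus the Pearson correlation, and that the Mahalanobis covariance is estimated from a matrix of all profiles. The function names and example values are hypothetical and are not taken from the profilyzer source.

```python
import numpy as np

def euclidean(a, b):
    """Straight-line distance between two profiles."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def correlation_distance(a, b):
    """1 - Pearson r (assumed definition): 0 when the patterns match exactly."""
    r = np.corrcoef(a, b)[0, 1]
    return float(1.0 - r)

def mahalanobis(a, b, profiles):
    """Distance scaled by the covariance between screens.

    `profiles` is a (genes x screens) array used to estimate the covariance;
    the pseudo-inverse guards against a singular covariance matrix.
    """
    cov = np.cov(profiles, rowvar=False)
    inv_cov = np.linalg.pinv(cov)
    d = a - b
    return float(np.sqrt(d @ inv_cov @ d))

# Hypothetical profiles across three screens.
target = np.array([1.0, 2.0, 3.0])
offset = target + 5.0                 # same pattern, shifted upwards
close = np.array([1.1, 1.9, 3.0])     # nearly overlapping

# Simulated background of library profiles, for the covariance estimate only.
rng = np.random.default_rng(0)
background = rng.normal(size=(50, 3))

print(euclidean(target, offset), correlation_distance(target, offset))
print(euclidean(target, close), correlation_distance(target, close))
print(mahalanobis(target, close, background))
```

In this sketch the offset profile is far from the target by Euclidean distance but has a correlation distance of essentially zero, which is the distinction the text describes.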
Nearest & farthest genes are only highlighted when a single gene is specified in the box above. Note that nearest and farthest genes can only be identified when the target gene is present in all selected screens.
Optionally select groups of related genes from the drop-down list above instead of specifying them manually. Select 'None' to return to highlighting 'Target gene(s)'.
Select the number of nearest (or farthest) profiles to plot & tabulate alongside a single 'Target gene'. Nearest genes may have a similar function to the target. Farthest genes may have an opposite function, but such observations are probably more difficult to interpret.
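As a rough illustration of how nearest and farthest profiles could be selected, the sketch below ranks every other gene by its distance from the target and takes the first and last n entries. It assumes Euclidean distance and a dictionary mapping gene names to profiles. The gene names, values and function names are hypothetical.

```python
import numpy as np

def rank_by_distance(profiles, target):
    """Return (gene, distance) pairs sorted from nearest to farthest."""
    t = np.asarray(profiles[target], dtype=float)
    dists = {
        gene: float(np.linalg.norm(np.asarray(p, dtype=float) - t))
        for gene, p in profiles.items()
        if gene != target  # the target is not compared with itself
    }
    return sorted(dists.items(), key=lambda kv: kv[1])

def nearest_and_farthest(profiles, target, n):
    """Pick the n nearest and n farthest genes relative to the target."""
    ranked = rank_by_distance(profiles, target)
    nearest = ranked[:n]
    farthest = ranked[-n:][::-1]  # most distant first
    return nearest, farthest

# Hypothetical fitness profiles across three screens.
example_profiles = {
    "geneA": [1.0, 1.0, 1.0],
    "geneB": [1.1, 1.0, 0.9],
    "geneC": [3.0, 3.0, 3.0],
    "geneD": [0.0, 2.0, 5.0],
}

nearest, farthest = nearest_and_farthest(example_profiles, "geneA", n=1)
print(nearest, farthest)
```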
Boxplot comparing fitness distributions across multiple QFA screens with different background mutations, overlaid with fitness profiles for individual library deletion strains. The black bar is the median fitness. The edges of the blue box correspond to the 1st & 3rd quartiles. Whiskers extend to the most extreme datapoint that lies no more than 1.5 times the length of the box away from the box. The grey background corresponds to the range of the data.
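The boxplot elements described above can be computed as in the following sketch. It assumes quartiles from NumPy's default linear interpolation, which may differ slightly from the quartile rule used by the plotting software behind the tool. The function name and data are hypothetical.

```python
import numpy as np

def boxplot_stats(values):
    """Summary statistics matching the boxplot description."""
    v = np.sort(np.asarray(values, dtype=float))
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    reach = 1.5 * (q3 - q1)  # 1.5 times the length of the box
    # Whiskers stop at the most extreme datapoints within reach of the box.
    lower_whisker = v[v >= q1 - reach].min()
    upper_whisker = v[v <= q3 + reach].max()
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "lower_whisker": float(lower_whisker),
        "upper_whisker": float(upper_whisker),
        "data_min": float(v[0]),   # grey background spans data_min..data_max
        "data_max": float(v[-1]),
    }

# Hypothetical fitness values for one screen, with one extreme strain.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

Here the value 30 lies beyond the upper whisker, so it falls inside the grey data range but outside the whiskers.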
Nearest
Farthest
Difference distribution: plot of difference magnitude against difference rank. This plot helps us to decide whether library deletions identified as least different from the target are significant outliers or just part of the average behaviour of deletions across all screens. The flat part of this curve represents the average deletion difference from the target. Deletions with differences from the target around this value have essentially indistinguishable profiles. If the curve is flat immediately after the origin, this suggests that the target profile is not very different from that of many other deletions. In this case, the ranked order of deletions above is not likely to be informative. On the other hand, if the curve has a steep slope near the origin, then deletions on the steep section of the curve are unusually similar to the target and should be candidates for further investigation.
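One way to read such a curve numerically is sketched below. It sorts the differences to form the rank curve, takes the median as a stand-in for the flat part, and counts deletions whose difference falls well below it. The median plateau and the 0.5 cut-off are illustrative assumptions, not part of the tool, and the data are made up.

```python
import numpy as np

def difference_curve(differences):
    """Differences from the target sorted by rank (smallest first)."""
    return np.sort(np.asarray(differences, dtype=float))

def count_unusually_similar(differences, fraction=0.5):
    """Count deletions lying on the steep section near the origin.

    The plateau is approximated by the median difference (assumption);
    `fraction` is a hypothetical cut-off relative to that plateau.
    """
    curve = difference_curve(differences)
    plateau = np.median(curve)
    return int(np.sum(curve < fraction * plateau))

# Steep start: a few deletions are much closer to the target than the rest.
steep = [0.1, 0.3, 0.6, 2.0, 2.1, 2.1, 2.2, 2.2, 2.3, 2.4]
# Flat start: no deletion stands out from the average difference.
flat = [2.0, 2.1, 2.1, 2.2, 2.2, 2.3, 2.4]

n_steep = count_unusually_similar(steep)
n_flat = count_unusually_similar(flat)
print(n_steep, n_flat)
```

A non-zero count corresponds to the steep-slope case, where the top-ranked deletions merit follow-up. A count of zero corresponds to the flat case, where the ranking carries little information.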
Data, source code & documentation for this instance of profilyzer are hosted on GitHub