Monday 15 December 2014

Label selections in the View Labels tab

The View Labels tab in Biodiverse is where you can interactively visualise the distribution of your species (or other) data in geographic space, on a tree and against a matrix of pairwise values.  Clicking (selecting) map or matrix cells, tree branches or rows in the label list highlights the distribution of these labels across each of the other panels.

This has been in Biodiverse since its first release, and is really useful because you can easily identify outliers, gappy distributions, or simply gain an understanding of how your data are distributed. See here for an overview from an earlier version.

As part of the development work towards version 1.0, the View Labels tab has been enhanced with a number of selection related features.  Some of these were already available in the 0.99_006 development release, while the rest will be in 0.99_007 (which, if there are no show stopping bugs, will be the last of the 0.99 series before 1.0 is released).  This follows other enhancements to this tab, such as the export menu, and the pan and zoom tools.

The features are listed here, with details below. 
  1. Selections can now be added to and removed from, rather than being a new set every time. Selections can also be switched (inverted). 
  2. Labels can be selected using text matching.
  3. Selected labels can be deleted.
  4. New basedata objects can be created from the selected set, or its complement.  
  5. Selected labels can be exported.
It is worth noting that these operations work on all groups (cells) containing the selected labels.  We don't yet have tools to work on selected labels across a subset of groups, although some combination of label selections and a definition query in the Run Exclusions dialogue could be used here (I need to blog about those updates separately). 

Selections can be added to or removed from, and switched

Previously in Biodiverse, selecting labels in any of the grid, tree or matrix panes would generate a new selection every time.  The only way to add or remove labels from the selection was to control-click on the rows in the label list at the top left. 

Now users can choose from three selection modes, "new", "add_to" and "remove_from".  These work exactly as named.  So, for example, one can select a clade in the tree, change the mode to remove_from, draw a box around a set of cells in the grid and any label in these cells will be removed from the selected set.

The switch selection simply inverts the selection, so any selected records become unselected while any unselected ones become selected.  This is most useful when you need all but a small number of records to be selected, and it is easier to select the small number first.
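The three modes and the switch operation behave like ordinary set operations. The following is a minimal Python sketch of that idea, not Biodiverse's own code (Biodiverse is not written in Python); the function names `apply_selection` and `switch_selection` are hypothetical.

```python
def apply_selection(current, picked, mode="new"):
    """Return the updated selection as a new set.

    Mode names follow the post ("new", "add_to", "remove_from");
    the function itself is a hypothetical illustration.
    """
    if mode == "new":
        return set(picked)
    if mode == "add_to":
        return set(current) | set(picked)
    if mode == "remove_from":
        return set(current) - set(picked)
    raise ValueError(f"unknown selection mode: {mode}")


def switch_selection(current, all_labels):
    """Invert the selection relative to the full label list."""
    return set(all_labels) - set(current)


all_labels = {"Genus:sp1", "Genus:sp2", "Genus:sp3", "Genus:sp4"}
selected = apply_selection(set(), {"Genus:sp1", "Genus:sp2"}, "new")
selected = apply_selection(selected, {"Genus:sp3"}, "add_to")
selected = apply_selection(selected, {"Genus:sp1"}, "remove_from")
inverted = switch_selection(selected, all_labels)
```

Here `selected` ends up holding Genus:sp2 and Genus:sp3, and `inverted` holds the other two labels.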

(These options are another of those feature sets which are already in many other software packages, but it is good to provide a user interface many people are already used to).

Users can now choose from three selection modes, as well as switch the selected set

Selections can use text matching

Biodiverse now also supports the ability to select labels using text matching.  This can use part of the word, or the whole word.  It also uses regular expressions, so you can build matches that are as complex as you need.

As an example, say you have records for species in the genera Acacia, Daviesia and Gastrolobium in a data set, with the genus name included in each label, but you are only interested in the distribution of Gastrolobium.  All you need do is set "Gastrolobium" as the match to select all records containing that name.  You can then see where Gastrolobium records are distributed across the map, tree and matrix.  If needed, you can also then delete or export these records, or make a new basedata (see the next few sections below for details).

The interface allows you to override the current selection mode if need be, but it defaults to whatever the current mode is (new, add_to or remove_from).  Choosing a full match will select only labels that match the text exactly, negating the selection will select any label that does not match, while case insensitive matching will ignore the case (so "cac" will match all of "Cactus", "cacophony" and "ICAC").
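The matching options can be illustrated with a short Python sketch using the standard `re` module. This is an illustration of the options described above, not the Biodiverse implementation; the function name `match_labels` and its parameter names are assumptions made for the example.

```python
import re


def match_labels(labels, text, full_match=False, negate=False, ignore_case=False):
    """Return the labels selected by a regular expression.

    full_match  : the whole label must match the expression
    negate      : select the labels that do NOT match
    ignore_case : case insensitive matching
    (Hypothetical helper; option names are assumptions.)
    """
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(text, flags)
    matcher = pattern.fullmatch if full_match else pattern.search
    return [label for label in labels if bool(matcher(label)) != negate]


labels = ["Acacia sp1", "Daviesia sp1", "Gastrolobium sp1", "Gastrolobium sp2"]
gastrolobium = match_labels(labels, "Gastrolobium")
others = match_labels(labels, "Gastrolobium", negate=True)
exact = match_labels(labels, "Acacia sp1", full_match=True)
any_case = match_labels(["Cactus", "cacophony", "ICAC"], "cac", ignore_case=True)
```

The label names are invented for the example. A partial match on "Gastrolobium" picks up both Gastrolobium labels, and negating it returns the Acacia and Daviesia labels instead.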


The text selection above will select any label containing the text sp1, so for the example data distributed with Biodiverse this will select Genus:sp1, Genus:sp11, and so forth.

Selected labels can be deleted

This is a feature people have been asking for for some time, so it is good to finally get it into the system.

If you have a data set with a variety of labels in it then you can select everything you don't want to keep and then delete them.  Simple as that.

There are two deletion approaches.  The default is to also delete any groups which have no remaining labels after the label deletions are completed; this is consistent with the Run Exclusions dialogue (look under the Basedata menu).  The other approach is to keep these groups.  This provides a convenient way of generating empty groups, as one can import a dummy label to create the relevant groups, and then delete the label while retaining the groups.
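The difference between the two approaches can be sketched in Python, treating a basedata as a simple mapping from group names to sets of labels. This is a simplified model for illustration only; the function `delete_labels`, its `keep_empty_groups` argument and the group names are hypothetical.

```python
def delete_labels(groups, to_delete, keep_empty_groups=False):
    """Delete labels from every group.

    groups is a dict mapping group name -> set of labels (a simplified
    stand-in for a basedata).  Returns the remaining groups and the
    list of groups that were deleted because they became empty.
    """
    to_delete = set(to_delete)
    remaining = {}
    deleted_groups = []
    for group, labels in groups.items():
        kept = set(labels) - to_delete
        if kept or keep_empty_groups:
            remaining[group] = kept
        else:
            deleted_groups.append(group)
    return remaining, deleted_groups


groups = {"1.5:1.5": {"sp1", "sp2"}, "2.5:1.5": {"sp2"}}
default_result, default_deleted = delete_labels(groups, {"sp2"})
kept_result, kept_deleted = delete_labels(groups, {"sp2"}, keep_empty_groups=True)
```

With the default, group 2.5:1.5 is dropped because its only label was deleted. With `keep_empty_groups=True` it is retained as an empty group.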

Any deleted groups are plotted in light grey to show where they were.  If groups are not deleted then they are not plotted in grey, as they are still part of the Basedata.

The key point to be aware of is that there is currently no undo support, so be careful when you do this.  While you do get a warning message allowing you time to change your mind, it is probably worth working on a copy of your Basedata just to be on the safe side. 

The other point is that deletions will not be applied to Basedatas which contain analysis outputs, e.g. Spatial, Cluster or RegionGrower analyses.  The system will throw an error if you try.  At the time of writing it waits until you try to delete the labels before complaining, but future versions might simply make the menu option insensitive (non-clickable).  The reason we don't at the moment is that we need to track additions and deletions of outputs in a Basedata when a view labels tab is open to make it work smoothly.

One can delete selected labels.  In this case, all labels in Tasmania have been selected.
Groups deleted because all their labels have been deleted are plotted in grey.  This helps keep track of where the deletions have occurred.    In this example Tasmania is now plotted as grey, but so are several other groups which contained only labels found in Tasmania.


New Basedata objects can be created from the selected labels

Sometimes you don't want to delete any labels, as you still need them for analyses.  In this case you can create a new Basedata object from the selected labels.

There is not much to say about this one.  All it does is create a new Basedata object where only the selected labels are used.  The groups in the new basedata will be only those which contain the selected labels by default, but there is the option to retain all groups.  

There is also the option to use only the non-selected labels.  You could achieve the same result by switching the selection before exporting selected records, but this way you can avoid a few button clicks.



Selected labels can be exported

In the same way that labels and groups can be exported using the Export menu, the selected labels and the groups which contain them can be directly exported using any of the supported formats.

This is another case of saving button clicks, as one could otherwise create a new Basedata and then export that, but that would become irritating if one needed to do it frequently.  (This is actually what the system does in the background, as it creates a temporary Basedata object, exports it, and then discards it.  Consequently it might not work well for very large Basedatas if system memory is in short supply). 

All the usual export options are available, but they will apply only to the selected set. 

Summary

To sum up, these additions represent a very useful set of features which allow the user finer control over the labels that are selected, visualised, and now exported or cleaned up.

Please give them a try and report any success or issues.  You can use the comments below, the mailing list or the issue tracker.



Shawn Laffan, 15-Dec-2014




For more details about Biodiverse, see http://purl.org/biodiverse

For the full list of changes in the 0.99 series (leading to version 1) see https://purl.org/biodiverse/wiki/ReleaseNotes#version-099

To see what else Biodiverse has been used for, see https://purl.org/biodiverse/wiki/PublicationsList

You can also join the Biodiverse-users mailing list at http://groups.google.com/group/Biodiverse-users




Thursday 20 November 2014

Do it yourself CANAPE

[[ Update 2022-10-25:  Biodiverse now calculates CANAPE for you, so steps 4-6 below are not needed to generate CANAPE maps and data.  See more details in this blog post. ]]


In the previous blog post I described some of the methods behind the CANAPE (Categorical Analysis of Neo and Palaeo Endemism) method.

The purpose of this post is to give more details about how to run it with your own data.  You can also use the Biodiverse pipeline, but these instructions focus on how to do it using the Biodiverse GUI (at least to the point of generating all the necessary data). 

These instructions assume you have already imported all your data, so both the species distribution data and the tree.  Instructions for how to do so are in the quick start guide.

The CANAPE analyses also need a version later than 0.19, which at the time of writing is the development version 0.99_005. [Update 2015-04-20 - version 1.0 has now been released http://purl.org/biodiverse/wiki/Downloads ]

Step 1.  View the data

This is a general step you should always do anyway, but open the View Labels tab so you can cross check the spatial data (the basedata object in Biodiverse parlance) and the tree.  This is accessed by the menu option Basedata->View Labels, the keyboard shortcut Shift-Control-V, or by double clicking on the basedata object in the outputs tab. 

As you hover over cells on the map (cells are called groups in Biodiverse) you should see paths on the tree being highlighted.  As you hover over branches on the tree you should see cells being highlighted.  If you click on cells or branches then species names (labels) in the list at the top left should be highlighted.

Hovering over a cell highlights any branches on the tree that are found in that cell. 


Hovering over a branch highlights the set of cells containing any of the named branches beneath it (usually just the terminals).  Clicking on the branch will also select the matching labels in the list at the top left, and colour all cells in the map based on how many of those labels are found in the associated group. 

If there is no highlighting then there is a mismatch between your tree and the basedata.  You will need to either rename the labels, for which there is an option in the basedata menu, or re-import the tree and specify a remap table when you do so.  Both approaches can use the same table, so it is up to you which data set should be the canonical source of names.  It does not matter which one is canonical for CANAPE, but for other analyses it can do.  It depends on what you want to do with the data.

Step 2.  Run the spatial analyses

The next step is to run the calculations for the observed data.  This uses a spatial analysis, and can be run using menu option Analyses->Spatial.  You should see something like the image below.  Make sure the tree you want to use is selected. 

The initial spatial analysis window with default spatial conditions. 

We now need to change the settings.  For this particular analysis we are only interested in each group (cell) in isolation, not collections of groups, so we need to delete the second spatial condition.

We also need to select the Phylogenetic Endemism calculation.  This can be accessed under the Phylogenetic Endemism category (or Phylogenetic Indices if you are using a version earlier than 0.99_005, but remember that CANAPE won't work on version 0.19 since it lacks the next set of calculations).  Then select the Relative Phylogenetic Endemism, type 2 calculation under the Phylogenetic Indices (relative) category.

Delete the second spatial condition and select the Phylogenetic Endemism calculation.


Also select the Relative Phylogenetic Endemism, type 2 calculation. 

When all is selected, hit the Go! button.  The example data set distributed with Biodiverse will take very little time to run.  The data set used in Mishler et al. (2014) takes approximately 5 seconds on a modern laptop.  (The code includes a number of optimisations such as caching of results for later re-use and binary searches where possible).

Step 2.  View the results

Now you get to bask in the glory of a set of results.

Screenshots of the example data are given in the previous blog post so won't be repeated here.  See the section "Step 1" at this URL:   http://biodiverse-analysis-software.blogspot.com.au/2014/11/canape-categorical-analysis-of-palaeo.html

It is time to stop basking and get back to work.

Step 3.  Run the randomisations


To assess which of the cells are candidates for palaeo- or neo-endemism we need to run a randomisation.  In this case we select the rand_structured option in Biodiverse, constraining the richness of each group (cell) to be equal across all randomisations.

CANAPE uses the rand_structured randomisation to ensure the richness of each cell is constant across all randomisation iterations/realisations. 

In the screen shot above you can see in the Setup section that we have chosen the rand_structured function for the randomisation.  We will run 999 iterations with all results being prefixed with "CANAPE".  See the Biodiverse help system for more details on the naming scheme.  The checkpoint save iterations is useful when one wishes to track a long running randomisation process to see if it has converged.  The basedata will be saved at any iteration ending in the value given, so a setting of 99 means iterations 99, 199, 299 etc. will be saved.  It is probably not needed here, so can be set to 999.  [[EDIT 2020-Feb-19:  The checkpoints were used when we had crashes due to memory leaks.  These were fixed in version 0.18.  Unless you want to see how the system is progressing, the checkpoint save option can be set to -1 so it never happens.]]

The Parameters section allows us to control other options.  In this case we will leave the trees alone (randomise_trees_by equals no_change), so each analysis across the random iterations will use the original tree (for other analyses one might choose to randomise the tree but not the spatial data).  We have no group properties set so can also leave randomise_group_props_by as no_change.  A richness_multiplier of 1 and richness_addition of 0 means we will replicate the exact richness scores, as the formula used is a linear function (max_random_richness = observed_richness * richness_multiplier + richness_addition).
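As a small worked example of the linear richness formula quoted above, here it is as a Python function. The function name simply mirrors the formula and is not a Biodiverse API call; the second set of values is invented to show what a non-default setting would do.

```python
def max_random_richness(observed_richness, richness_multiplier=1.0, richness_addition=0.0):
    """Richness target for a group in the randomisation.

    Direct transcription of the formula in the post:
    observed_richness * richness_multiplier + richness_addition
    """
    return observed_richness * richness_multiplier + richness_addition


# Settings used for CANAPE: the observed richness is replicated exactly.
canape_target = max_random_richness(10, richness_multiplier=1, richness_addition=0)

# Invented alternative settings, for comparison only.
relaxed_target = max_random_richness(10, richness_multiplier=1.5, richness_addition=2)
```

With a multiplier of 1 and an addition of 0 a group with 10 labels keeps a target of 10, whereas a multiplier of 1.5 and an addition of 2 would allow up to 17.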


Now we hit the Go! button again.  This might take a while, depending on how large your data set is.  For the example data it is about 3 minutes on a modern laptop, but it can scale to hours for data sets of the size used in Mishler et al. (2014).

Unsurprisingly, the progress bar displays how much progress has been made.


Step 4.  View the randomisation results 

This is another case of "the images are in the previous blog post".  Look for step 2 in http://biodiverse-analysis-software.blogspot.com.au/2014/11/canape-categorical-analysis-of-palaeo.html  (but note that those images use a randomisation name of Rand1 instead of CANAPE).


Step 5.  Export the results 


Exporting the results in Biodiverse version 0.99_005 and later can be done while viewing the spatial output (see this previous blog post), or from the outputs tab.  In earlier versions exporting can only be done from the outputs tab. 

The screenshot below shows the export options.  If you are using the Biodiverse pipeline then you need to export to delimited text format files.  If you are going to hand roll your classification then use whichever format is appropriate. 

Choose the export option most appropriate to your needs. 



Export the SPATIAL_RESULTS list.  This contains all the PE and RPE results.  This image is for the Delimited text method.  Some options will differ for other export methods. 

Export the randomisation results. 
The delimited text exports can be viewed in a spreadsheet program or imported into a stats package such as R. 
The results are in two tables (one for the spatial analyses, one for the randomisation results) so might need to be linked.  Use the Element field in each table for this, as it is the unique identifier for each group (in Biodiverse a group is just a special type of element, as is a label).  The Axis_0 and Axis_1 fields are the coordinates of the cell centroids.  There can be any number of axis columns in a basedata, but in this case we have only two.  Remember that the randomisation naming scheme is explained here.
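Linking the two tables on the Element field can be sketched with the Python standard library. The tables below are tiny invented stand-ins for the delimited text exports: the element names, the values and the exact randomisation column names are assumptions, so check the headers in your own files.

```python
import csv
import io

# Invented stand-ins for the two delimited text exports.
spatial_csv = """Element,Axis_0,Axis_1,PE_WE_P,PHYLO_RPE2
1.5:1.5,1.5,1.5,0.02,1.3
2.5:1.5,2.5,1.5,0.01,0.8
"""
rand_csv = """Element,Axis_0,Axis_1,C_PE_WE_P,C_PHYLO_RPE2
2.5:1.5,2.5,1.5,400,12
1.5:1.5,1.5,1.5,995,990
"""


def read_by_element(handle):
    """Read a delimited text table into a dict keyed by the Element field."""
    return {row["Element"]: row for row in csv.DictReader(handle)}


def join_on_element(spatial, rand):
    """Merge rows that share an Element, keeping the spatial columns first."""
    joined = {}
    for element, row in spatial.items():
        if element in rand:
            merged = dict(row)
            # Add only the columns the spatial table does not already have.
            merged.update({k: v for k, v in rand[element].items() if k not in merged})
            joined[element] = merged
    return joined


joined = join_on_element(
    read_by_element(io.StringIO(spatial_csv)),
    read_by_element(io.StringIO(rand_csv)),
)
```

For real exports, replace the `io.StringIO` objects with opened files. The row order in the two tables does not matter because the join is by Element.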

If you export to one of the raster formats (except the ER-Mapper format) then you will have one raster per list item, so one for PE_WE_P, one for RPE_NULL2, etc.  The ER-Mapper format packs all of one list into one multiband raster, with one band per list item.

Step 6.  Classify the data 

The next step is to classify the data into neo, palaeo and mixed-endemism. Blow-by-blow details of that will be left to another blog post, as this one is getting pretty long.  More of the pipeline also needs to be shaken out so it works with other data sets before I can write about that.

If you want to go ahead and do it yourself then it is not particularly complex.  Details of the classification system are given in the previous blog post.   Look for Step 3.  Use the indices PE_WE_P, PHYLO_RPE_NULL2 and PHYLO_RPE2 for PE_orig, PE_alt and RPE, respectively.


Shawn Laffan 20-Nov-2014




Wednesday 19 November 2014

CANAPE - Categorical Analysis of Palaeo and Neo Endemism


[[ Update 2022-10-25:  Biodiverse now calculates CANAPE for you.  See more details in this blog post. ]]


The purpose of this blog is to explain in a bit more detail how the CANAPE method works.  This was prompted by some very useful questions from Stu Marsden.


The paper describing the CANAPE method (Mishler et al. 2014) is at http://dx.doi.org/10.1038/ncomms5473


The CANAPE method is a three step process.  In Step 1 a set of three primary indices is calculated for each region in the data set.  In Step 2 a randomisation is run to identify regions with significant endemism.  In Step 3 these regions are classified into palaeo, neo, mixed or non-endemism.

Step 1.  Calculate the observed endemism 

The first step is to calculate a set of three observed endemism scores for each region in the data set:
  1. Phylogenetic Endemism (PE) calculated using a user specified tree.  This will be called PE_orig below.  
  2. PE calculated using an alternate tree.  This will be called PE_alt below.  
  3. Relative Phylogenetic Endemism (RPE) which is calculated as the ratio of PE_orig to PE_alt. 
In this case each region is a single cell, but it could be any collection of cells for which one is interested in running the calculations.


The formula for PE for a region "i" is:

$$PE_i = \sum_{\lambda \in \Lambda_i} L_\lambda \frac{r_\lambda}{R_\lambda}$$

where $\Lambda_i$ is the set of branches found in region i, $L_\lambda$ is the length of branch $\lambda$, $r_\lambda$ is the local range of branch $\lambda$ (the number of cells in region i in which it is found), and $R_\lambda$ is the global range of branch $\lambda$ (calculated as the number of cells in which it is found across the whole data set).  Put in words, PE for a region is the sum of the branch lengths found in that region, but where each branch is weighted by the fraction of its geographic range that is found in that region.  It is worth noting that PE is basically a range-weighted variant of PD (phylogenetic diversity), as the sum of PE scores across all cells will equal the PD for the set of branches found in those cells.

PE_orig and PE_alt are calculated in the same way, the difference is simply in the trees being used.  In Mishler et al. (2014) the alternate tree is one with the same topology as the original tree but where the non-zero branches are modified to be of equal length.  It should be noted that the per-branch range weighting is the same as for PE_orig, so each equalised branch receives the same range weighting as its counterpart in the original tree.

RPE for a region is simply the ratio of PE_orig and PE_alt for that region, so will be >1 when the original tree has longer range weighted branches (PE_orig is longer than on the alternate tree), and <1 when PE_alt has longer range weighted branches (PE_orig is shorter than on the alternate tree). This translates to determining if a region has a collection of longer or shorter range weighted branches.
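The calculation of PE_orig, PE_alt and RPE can be sketched in Python on a toy data set. This is an illustration under stated assumptions, not the Biodiverse code: the three-branch tree and its cell names are invented, and both PE values are expressed as a proportion of their tree's total length before taking the ratio (the plots in this post describe PE_orig as scaled this way; applying the same scaling to the alternate tree is an assumption here).

```python
def phylogenetic_endemism(branches, region_cells):
    """Sum of branch lengths, each weighted by local range / global range.

    branches maps a branch name to (length, set of cells where it occurs).
    """
    total = 0.0
    for length, cells in branches.values():
        local_range = len(cells & region_cells)
        if local_range:
            total += length * local_range / len(cells)
    return total


def equalise(branches):
    """Alternate tree: same branches and ranges, non-zero lengths made equal."""
    return {name: ((1.0 if length > 0 else 0.0), cells)
            for name, (length, cells) in branches.items()}


def total_length(branches):
    return sum(length for length, _ in branches.values())


# Toy tree (invented): two tips and their parent branch, over four cells.
all_cells = {"A", "B", "C", "D"}
branches = {
    "tip1": (2.0, {"A"}),        # long, narrow ranged branch
    "tip2": (1.0, all_cells),    # wide ranged branch
    "node1": (1.0, all_cells),   # internal branch, found wherever its tips are
}
alt_branches = equalise(branches)

region = {"A"}
pe_orig = phylogenetic_endemism(branches, region) / total_length(branches)
pe_alt = phylogenetic_endemism(alt_branches, region) / total_length(alt_branches)
rpe = pe_orig / pe_alt
```

In this toy case cell A has an RPE above 1, because its endemism is dominated by a branch that is longer on the original tree than on the equal-length tree. Summing the unscaled PE over the four single cells also returns the total branch length of 4, which is the PD property noted above.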







The following plots illustrate the calculation of PE_orig and PE_alt for the example data set that is distributed with Biodiverse.



PE_orig (scaled to be proportional to the total tree length).  Branches highlighted in blue are those found in the cell marked with a circle.  (The sum of these branch lengths is the PD of the cell.)  Grey branches are not found in the highlighted cell.  (See this blog post for more details about the tree plots.)



Same as above, but with the tree branches scaled to be proportional to their ranges.  PE for the highlighted cell is the sum of these range weighted branches.  It is clear from this plot that the PE in the highlighted cell is largely due to one narrow ranged branch as the other branches in that cell have been considerably downweighted.  (Note also that the branch weighting and thus the weighted lengths will change if collections of cells are used in the analysis, e.g. to calculate the PE of a region instead of a single cell).



The map on the left is PE_alt, with the alternate tree on the right (where branches are scaled to be of equal length).  The map colours are not directly comparable between plots due to differing numerical ranges (something to fix in a future version of Biodiverse), but the differing highs and lows are readily differentiated.





PE_alt, but now plotting the range weighted version of the equal branch length tree.  Note the similarity with the range weighted tree above for PE_orig, but that it has more detail in the internal branches.





Step 2.  Use a randomisation to identify regions with significant endemism


The randomisations are needed because we don't have a good basis to directly threshold the values of PE_orig, PE_alt and RPE.  The same value of PE can be obtained by different combinations of terminals (species) as it is a combination of branch lengths and their range weighting, e.g. two long but narrow range branches could be the same as 100 long but wide-ranged branches, so using some predetermined threshold is not likely to be useful.  The same applies to RPE, where the same ratio can arise from different inputs. 

The randomisation is done at the level of the species rather than the cell.  In each iteration, species (tree tips) are randomly allocated to the landscape, with the constraint that the richness of each cell is held constant and the same as in the original data set, and thus so is each species' range.  For each random realisation we then calculate PE_orig, PE_alt and RPE for each cell.

We repeat the randomisation 999 times (or more) and keep track of where the observed PE_orig, PE_alt and RPE for each cell plot against the PE_orig, PE_alt and RPE calculated using the 999 random realisations.  The significance of each cell for each index (PE_orig, PE_alt and RPE) is then the rank relative position of the original values against those of the random realisations, so anything in the top 5% is significantly high for a one tailed test, while anything in the upper 2.5% or lower 2.5% is significant for a two tailed test.
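The rank relative test can be sketched in Python as follows. It is a minimal illustration that assumes the observed value is compared against a plain list of values from the random realisations; the function names, the use of strict inequalities at the thresholds and the toy values are choices made for the example rather than a description of the Biodiverse internals.

```python
def rank_counts(observed, random_values):
    """Count the random realisations the observed value is above and below."""
    greater = sum(1 for value in random_values if observed > value)
    less = sum(1 for value in random_values if observed < value)
    return greater, less


def significance(observed, random_values, alpha=0.05, two_tailed=False):
    """Classify the observed value as "high", "low" or "not significant"."""
    n = len(random_values)
    greater, less = rank_counts(observed, random_values)
    tail = alpha / 2 if two_tailed else alpha
    if greater / n > 1 - tail:
        return "high"
    # Ties are counted in neither tally, so they cannot make a value "low".
    if two_tailed and less / n > 1 - tail:
        return "low"
    return "not significant"


# Toy example: 999 random values 0, 1, ..., 998 (invented).
random_values = [float(i) for i in range(999)]
high_pe = significance(994.5, random_values)                  # above 995 of 999
mid_pe = significance(500.5, random_values)
low_rpe = significance(10.5, random_values, two_tailed=True)  # below 988 of 999
```

An observed value that exceeds 995 of the 999 random values sits in the top 5% and is flagged as high. The two tailed call also flags values that fall below more than 97.5% of the random values.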


The randomisations are plotted below.  The C_ prefix in the index name means it is the count of times the observed value was greater than the randomised values, so 995 means 995 of the 999 randomisations.  The RPE test is two-tailed so we look for both high and low values (accounting for tied values in the lows, albeit there are none in this example).  These plots don’t have the thresholding applied to them because it is not yet supported in Biodiverse, but it can be done using other tools such as the Biodiverse pipeline or a GIS.

 
Plot of the number of times PE_orig was greater than the same index calculated using the 999 random realisations. 


Plot of the number of times PE_alt was greater than the same index calculated using the 999 random realisations.


Plot of the number of times RPE was greater than the same index calculated using the 999 random realisations.


 

Step 3.  Classify the results for each cell


The CANAPE test is then a process of checking the rank relative significance of the PE_orig, PE_alt and RPE values for each cell.  Indents are sub-branches in the decision process.

    1)    If either PE_orig or PE_alt are significantly high then we look for palaeo or neo endemism
        a)    If RPE is significantly high then we have palaeo-endemism (PE_orig is consistently higher than PE_alt across the random realisations)
        b)    Else if RPE is significantly low then we have neo-endemism (PE_orig is consistently lower than PE_alt across the random realisations)
        c)    Else we have mixed age endemism in which case
            i)    If both PE_orig and PE_alt are highly significant (p<0.01) then we have super endemism (high in both palaeo and neo)
            ii)    Else we have mixed (some mixture of palaeo, neo and non endemic)
    2)    Else if neither PE_orig nor PE_alt is significantly high then we have a non-endemic cell
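The decision process above translates directly into a short Python function. This is a sketch under assumptions: each input is taken to be the rank relative position of the observed value as a fraction between 0 and 1 (for example a count of 995 divided by 999 iterations), and the function name `canape_class` and the returned category strings are hypothetical.

```python
def canape_class(p_pe_orig, p_pe_alt, p_rpe):
    """Classify one cell from its rank relative positions (fractions in 0..1).

    Thresholds follow the post: one tailed 5% for the PE indices,
    two tailed 5% (2.5% in each tail) for RPE, and p < 0.01 on both
    PE indices for super endemism.
    """
    pe_high = p_pe_orig > 0.95 or p_pe_alt > 0.95
    if not pe_high:
        return "non-endemic"
    if p_rpe > 0.975:
        return "palaeo"
    if p_rpe < 0.025:
        return "neo"
    if p_pe_orig > 0.99 and p_pe_alt > 0.99:
        return "super"
    return "mixed"


# Example: observed PE_orig beat 995 of 999 randomisations, RPE beat 990.
example = canape_class(995 / 999, 500 / 999, 990 / 999)
```

The example cell is classified as palaeo-endemic, since PE_orig is significantly high and RPE is in the upper 2.5%. Tied values in the low tail are not handled separately in this sketch.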



    The bulk of the CANAPE process can be run using the development version of Biodiverse (https://purl.org/biodiverse/wiki/Downloads [Update 2015-04-20 - Version 1 has now been released]).  It is just the final classification which needs external processing, for which there is a pipeline at https://github.com/NunzioKnerr/biodiverse_pipeline but it needs tweaking for the plots to work with other data sets.  The pipeline can actually be used to run the whole thing, it just takes a bit of work to set up at the moment.


    Shawn Laffan
    19-Nov-2014




    Equations are all CodeCogs - An Open Source Scientific Library

    Wednesday 29 October 2014

    Pan, zoom and other functions are now more standard

    The pan and zoom functionality in Biodiverse has always been pretty simple.  For the zooming you simply click the button and the display zooms in.  The same applies to zooming out.  The problem here is that it is not consistent with other software.

    The other issue was that each panel (the map, tree and matrix) had its own pan and zoom tools, leading to repetition and needless use of screen real estate.  In some cases there was not enough space to render the control widgets so they simply disappeared. 

    This is all changed in version 1.  Now all of the pan and zoom functions are at the left of the window, with one set of tools controlling all the panels (much of the display functionality has also moved). This applies to all of the analysis displays, so the View Labels tab and all the Spatial, Cluster, Region Grower and Matrix analyses.

    Compare the new version with the older version in the screenshots below.  The interface is now much cleaner and less repetitive.  



    The new versions of Biodiverse have the pan and zoom tools at the left of the display, and one set controls all the panels. 


    The old approach had pan and zoom tools for each of the panels, so there were three different sets for the view labels tab. 


    The new approach also has keyboard shortcuts to switch from one mode to another: “z” to zoom in; “x” to zoom out; “c” to pan; “v” to zoom to fit and “b” to select.  There are tooltips for each button so you can get a reminder of the key to use by hovering the mouse over a tool. 

    The zoom-to-fit and zoom-out options happen immediately the key is pressed to speed up the interaction with the display.  Their keyboard shortcuts also don’t change the current mouse function, so if you are selecting then you can keep doing so. 

    The zoom-in option now supports a box zoom approach, where one can select a box and it becomes the focus of the display.  A single click will zoom in by a predefined amount (usually x1.5 or x2).  (The matrix in the View Labels tab needs some work before version 1 is released [Update 19-Nov-2014 - this has now been fixed]).

    To make life easier for the user the mouse also changes its icon depending on which mode it is in, so there is a visual cue for the user.

    To be honest, there is nothing particularly novel about this approach.  However, it can only be a good thing that the cognitive load on users is reduced by following a more standard approach for interaction with the plots.



    Shawn Laffan 
    29-Oct-2014 

    Monday 27 October 2014

    New export menu in analysis display tabs

    Exports in Biodiverse 0.19 and earlier all had to be done from the outputs tab.  This meant that users had to switch from an analysis tab displaying the results of some calculations back to the outputs tab, which is a few too many mouse clicks to be described as easy.  That’s now a thing of the past. 

    In Biodiverse 1.0 there is now an export menu at the left of the tab when the map is visible.  This is in all tabs which have a map: spatial analyses, spatial matrix plots, and cluster and region grower analyses. 

    Spatial tab


    Cluster tab


    Matrix tab


    As part of adding this functionality, the system now also defaults to the last list that was displayed if the export supports lists.  When you export your data from an open tab it will choose the one you are looking at as the default.  This applies to exporting from the Outputs tab as well, so you can still export the data you were looking at if you have closed the analysis tab or don’t want to navigate back to it.

    The default list is the last one displayed. 





    Shawn Laffan 
    27-Oct-2014 



    Friday 24 October 2014

    New tree plots in Biodiverse



    Biodiverse has always had the capacity to display a phylogeny at the same time as one views the spatial data.  In version 1 it now has the capacity to display a tree at the same time as a spatial analysis.

    Now when you run an analysis using a tree you can display the results and more easily see which branches of the tree contributed to the answer at that cell.  If your analysis did not use the tree then you can still see which branches occur in a cell (group) and its neighbourhood.

    The screenshot below is the example data distributed with Biodiverse, analysed using one cell radius neighbourhood to calculate the PhyloSørenson index of turnover between the branches in the central group (cell) and each group within a one cell radius.  Branches plotted in blue are found only in the central group (neighbour set 1, and the blue should probably be darker), those in red are found only in the neighbouring groups (neighbour set 2), while those in black are found in both neighbour sets.  Any branch not found in the neighbourhood is in grey to reduce its visual impact without hiding it.

    As with the View Labels plots which have been in Biodiverse since the beginning, the branch highlighting updates as you hover the mouse over the map.
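The colouring rule described above can be written as a small Python function. This is only a sketch of the rule as stated in this post; the function name, the set arguments and the branch names are hypothetical.

```python
def branch_colour(branch, set1_branches, set2_branches):
    """Colour for a branch given the branches in each neighbour set.

    blue  : only in neighbour set 1 (the central group)
    red   : only in neighbour set 2 (the neighbouring groups)
    black : in both neighbour sets
    grey  : not found in the neighbourhood at all
    """
    in_set1 = branch in set1_branches
    in_set2 = branch in set2_branches
    if in_set1 and in_set2:
        return "black"
    if in_set1:
        return "blue"
    if in_set2:
        return "red"
    return "grey"


# Invented branch names for illustration.
set1 = {"tip1", "node1"}
set2 = {"tip2", "node1"}
colours = {name: branch_colour(name, set1, set2)
           for name in ["tip1", "tip2", "node1", "tip3"]}
```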
     



    The new tree plot is not restricted to the spatial analyses.  You can also visualise matrices in the same way.  The next screenshot is the PhyloSørenson turnover surface for an index group plotted in grey (see Laffan 2011 for more details about how this process works and is interactive).  Now the branches highlighted are those in the index cell relative to a set in the cell that was hovered over when the screenshot was taken (perhaps we need to highlight that in a future version).  The interpretation is otherwise the same as the previous plot – blue branches are unique to the index group, red are unique to the neighbour group, and black are shared. 



    The other good news here is that you are not restricted to using only the tree used in the analysis.  Using the chooser at the bottom of the window you can select from the analysis tree, the currently selected project tree, or no tree at all.  As of version 0.99_005 you can also hide the panel if it gets in the way.

    We don’t yet plot extra trees in the cluster plots, but if there is sufficient need then we could be convinced to implement it.

    The tree plotting is still a work-in-progress, for example the blue could be darker, but you can try it now using the development version of the software.  https://purl.org/biodiverse/wiki/Downloads

    Current users of Biodiverse will also note that the display layout has changed.  That will be the topic of another post.




    Shawn Laffan
    24-Oct-2014