Frequently Asked Questions (FAQs):

  1. When should I use Wegan?
  2. What types of input does Wegan accept?
  3. Is the data I uploaded kept confidential?
  4. How are missing values dealt with?
  5. How can outliers be identified and dealt with?
  6. Why did some images not show up after I clicked the corresponding tab (PCA, PLS-DA)?
  7. Can I analyze unlabeled data?
  8. What is generalized logarithm transformation (glog)?
  9. How does "Auto/Pareto/Range scaling" work?
  10. When should I use Data editor or Data filter?

  1. When should I use Wegan?

    Wegan (Web-based Ecological Group Analysis) is a free, user-friendly tool for analyzing ecological data. It can help with exploratory data analysis; diversity, ordination, and taxonomic comparisons; and generating statistics and figures for use in publications. It is powered by R and is especially useful for those less familiar with R programming.

  2. What types of input does Wegan accept?

    Wegan was designed primarily for community abundance data, arranged in the commonly used table layout with sites or samples in rows and species or taxa in columns. If your data is arranged the other way around (i.e. species in rows and sites in columns), indicate this during data upload. The Plotting and Correlation modules can process categorical data along with numeric data in the same table, but all other modules require community abundance data. For sample files and format specifications, please see "Data Formats".
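
    As an illustration of the two layouts described above, the following minimal sketch transposes a small table held in memory. The function name and the sample values are hypothetical, and Wegan itself handles this step through the upload option, so this is only meant to show what "in reverse" means.

```python
def transpose_table(rows):
    """Swap rows and columns of a rectangular table given as a list of lists."""
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError("all rows must have the same number of cells")
    return [list(column) for column in zip(*rows)]


# Hypothetical table with species in rows and sites in columns.
species_by_site = [
    ["taxon", "site1", "site2"],
    ["sp_a", 3, 0],
    ["sp_b", 1, 5],
    ["sp_c", 0, 2],
]

# After transposing, each row is a site and each column is a species.
site_by_species = transpose_table(species_by_site)
for row in site_by_species:
    print(row)
```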

  3. Is the data I uploaded kept confidential?

    Yes. The data files you upload for analysis, as well as any analysis results, are not downloaded or examined in any way by the administrators unless required for system maintenance and troubleshooting. All files are deleted from the server after no more than 72 hours, and no archives or backups are kept. You are advised to download your results as a zip file immediately after performing an analysis.

  4. How are missing values dealt with?

    Missing values should be presented either as empty values or as NA without quotes. Any other symbol will be treated as a character string and will cause errors during data processing. Missing values must be dealt with during the data upload phase; this decision was made to streamline the user interface and the analysis methods provided. Be aware that, by default, missing values are replaced by half of the minimum positive value detected in the data, which may or may not be appropriate for your analysis goals. Users can also choose other methods to impute the missing values, such as replacement by the mean or median, Probabilistic PCA (PPCA), Bayesian PCA (BPCA), or Singular Value Decomposition (SVD) (Stacklies W. et al.).
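
    The default replacement described above can be sketched as follows. This is an illustration only: it assumes the minimum is taken over the whole table rather than per column, represents missing cells as None, and uses a hypothetical function name; Wegan's own implementation may differ.

```python
def impute_half_min(table):
    """Replace missing cells (None) with half of the smallest positive value.

    Assumption: the minimum is taken over the whole table, not per column.
    """
    positives = [v for row in table for v in row if v is not None and v > 0]
    if not positives:
        raise ValueError("no positive values available to derive a replacement")
    fill = min(positives) / 2
    return [[fill if v is None else v for v in row] for row in table]


# Hypothetical abundance table with two missing cells.
abundance = [
    [4.0, None, 2.0],
    [None, 6.0, 8.0],
]
imputed = impute_half_min(abundance)
print(imputed)
```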

  5. How can outliers be identified and dealt with?

    Potential outliers can be identified from PCA or PLS-DA plots. The scores plot can be used to identify sample outliers, while the loadings plot can be used to identify feature outliers. A potential outlier distinguishes itself by being located far away from the major clusters formed by the remaining samples or features.

    To deal with outliers, first verify that the sample measurements are correct. If the values cannot be corrected, they should be removed from the analysis. Wegan provides the Data editor to enable easy removal of sample or feature outliers. Please note that you may need to re-normalize the data after outlier removal.

  6. Why did some images not show up after I clicked the corresponding tab?

    This implies that Wegan failed to execute the command using the given parameters. Users should try adjusting the parameter values. We found that in most cases the problem is associated with sample size. In particular, if the sample size is very small (below 10), unpredictable errors may occur. For instance, by default PCA and PLS-DA will try to generate summary, classification, and permutation plots for the top 5 components; if the sample size is too small, this will fail.

  7. Can I analyze unlabeled data?

    There are several unsupervised methods (PCA, hierarchical clustering, SOM, K-means) that can be used to detect inherent patterns in unlabeled data. However, you need to provide dummy two-group labels so that Wegan accepts the data. In this case, results from feature selection or supervised classification methods will be meaningless.

  8. What is generalized logarithm transformation (glog)?

    Generalized logarithm (glog) is a simple variation of the ordinary log that can deal with zero or negative values in the data set. It has many desirable features (for details, see Durbin BP. et al.). Its formula includes a constant a with a default value of 1.
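
    The formula itself is not reproduced in this text. A commonly used form of glog that matches the description above (defined for zero and negative inputs, with a constant a defaulting to 1) is log2((x + sqrt(x^2 + a^2)) / 2). The sketch below assumes that form; it is not confirmed to be the exact expression Wegan uses, and the function name is hypothetical.

```python
import math


def glog2(x, a=1.0):
    """Generalized log, base 2, in the assumed form log2((x + sqrt(x^2 + a^2)) / 2)."""
    return math.log2((x + math.sqrt(x * x + a * a)) / 2)


# Unlike the ordinary log, this is defined at zero and for negative values,
# and it approaches log2(x) as x grows large relative to a.
for value in (-5.0, 0.0, 1.0, 1000.0):
    print(value, glog2(value))
```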

  9. How does "Auto/Pareto/Range scaling" work?

    Please see the summary table by van den Berg et al. Here Si is the standard deviation.
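
    As a rough illustration of the three methods named in the question, the sketch below applies the usual definitions to a single feature: each value is mean-centered and then divided by the standard deviation (auto scaling), by the square root of the standard deviation (Pareto scaling), or by the difference between the maximum and minimum (range scaling). The function name is hypothetical and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
import math


def scale_feature(values, method="auto"):
    """Mean-center one feature and divide by a method-specific scaling factor."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    # Sample standard deviation (n - 1 denominator); an assumption of this sketch.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if method == "auto":
        divisor = sd
    elif method == "pareto":
        divisor = math.sqrt(sd)
    elif method == "range":
        divisor = max(values) - min(values)
    else:
        raise ValueError("method must be 'auto', 'pareto', or 'range'")
    if divisor == 0:
        raise ValueError("a constant feature cannot be scaled")
    return [(v - mean) / divisor for v in values]


feature = [2.0, 4.0, 6.0]
for name in ("auto", "pareto", "range"):
    print(name, scale_feature(feature, name))
```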

  10. When should I use Data editor or Data filter?

    The purpose of the data editor and data filter is to help improve the quality of the data for better separation, prediction, or interpretation. In particular, users can use the data editor to remove outliers (which can be visually identified from PCA or PLS-DA scores plots), and the data filter to remove noisy or uninformative features (e.g. baseline noise or near-constant features). These features tend to dilute the signal and decrease the performance of most statistical procedures. By removing outliers and low-quality features, the resulting data will be more consistent and reliable.
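
    To make the idea of removing near-constant features concrete, here is a minimal sketch that drops features whose standard deviation falls below a threshold. The function name, the threshold value, and the column-dictionary layout are all hypothetical; the criteria Wegan's data filter actually applies are not specified here.

```python
import statistics


def drop_near_constant(columns, min_sd=1e-8):
    """Keep only features whose sample standard deviation exceeds min_sd."""
    return {
        name: values
        for name, values in columns.items()
        if statistics.stdev(values) > min_sd
    }


# Hypothetical features: sp_b never varies, so it carries no information.
features = {
    "sp_a": [1.0, 5.0, 3.0],
    "sp_b": [2.0, 2.0, 2.0],
}
filtered = drop_near_constant(features)
print(sorted(filtered))
```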
