Example overview
A professional guide to association analysis: an essential tutorial for mastering association analysis.
TABLE OF CONTENTS

INTRODUCTION
GETTING STARTED
1.1 INSTALLATION
1.1.1 WEB START
1.1.2 STAND-ALONE
1.1.3 OPEN SOURCE CODE
1.2 PANELS
2 DATA MODE
2.1 GDPC
2.2 LOAD
2.2.1 BLOB
2.2.2 HAPMAP
2.2.3 PLINK
2.2.4 FLAPJACK
2.2.5 POLYMORPHISM
2.2.6 PHYLIP
2.2.7 NUMERICAL DATA
2.2.8 SQUARE NUMERICAL MATRIX
2.2.9 GENETIC MAP
2.3 EXPORT
2.4 SITES
2.5 SITE NAMES
2.6 TAXA
2.7 TRAITS
2.8 IMPUTE SNPS
2.9 TRANSFORM
2.9.1 GENOTYPE NUMERICALIZATION
2.9.2 TRANSFORM AND/OR STANDARDIZE DATA
2.9.3 IMPUTE PHENOTYPE
2.9.4 PCA
2.10 SYNONYMIZE TAXA NAMES
2.11 UNION JOIN
2.12 INTERSECTION JOIN
3 ANALYSIS MODE
3.1 DIVERSITY
3.2 LINKAGE DISEQUILIBRIUM
3.3 CLADOGRAM
3.4 SNP EXTRACT
3.5 KINSHIP
3.6 GENERAL LINEAR MODEL
3.7 MIXED LINEAR MODEL
3.8 RIDGE REGRESSION
4 RESULT MODE
4.1 TABLE
4.2 TREE PLOT
4.3 2D PLOT
4.4 LD PLOT
4.5 CHART
5 MENUS
5.1 FILE MENU
5.1.1 SAVE DATA TREE
5.1.2 OPEN DATA TREE
5.1.3 SAVE DATA TREE AS
5.1.4 OPEN DATA TREE
5.1.5 SAVE SELECTED AS
5.2 CONTINGENCY TEST
5.3 PREFERENCES
6 TUTORIAL
6.1 MISSING PHENOTYPE IMPUTATION
6.2 PRINCIPAL COMPONENT ANALYSIS
6.3 ESTIMATION OF KINSHIP USING GENETIC MARKERS
6.4 ASSOCIATION ANALYSIS USING GLM
6.5 ASSOCIATION ANALYSIS USING MLM
6.6 IMPORTING DATA FROM A DATABASE (VIA GDPC)
6.6.1 CONNECTING WITH A DATABASE
6.6.2 DATA QUERY
6.6.3 IMPORTING GDPC DATA INTO TASSEL
6.6.4 SAVING GDPC QUERY RESULTS
7 APPENDIX
7.1 NUCLEOTIDE CODES (DERIVED FROM IUPAC)
7.2 TASSEL TUTORIAL DATA SETS
7.3 BIOGRAPHY OF TASSEL
7.4 FREQUENTLY ASKED QUESTIONS
1. WHAT DO I DO IF TASSEL MISBEHAVES?
2. WHERE DO I TURN FOR MORE INFORMATION?
3. HOW DO I JOIN THE FUN: TASSEL ON SOURCEFORGE?
4. HOW DO I CHANGE THE AMOUNT OF MEMORY USED? WHAT DO I DO WHEN THE EXCEPTION JAVA.LANG.OUTOFMEMORYERROR APPEARS?
5. WHEN I CLICK ON THE MOST CURRENT VERSION OF TASSEL WEB START, A PREVIOUS VERSION APPEARS. WHAT SHOULD I DO?
6. WHAT SHOULD I SUBSTITUTE FOR MISSING VALUES IN TASSEL?
7. IS IT POSSIBLE TO CHANGE DATA NAMES IN THE DATA TREE?
8. HOW CAN I CREATE A TASSEL ICON ON DESKTOP?
9. WHY DO I GET EMPTY SQUARES IN MLM ASSOCIATION ANALYSIS?
10. WHY SHOULD I EXCLUDE ONE COLUMN OF THE POPULATION STRUCTURE?
11. CAN KINSHIP REPLACE POPULATION STRUCTURE?
12. WHY DO TASSEL AND SPAGEDI GIVE DIFFERENT KINSHIP ESTIMATES?
13. CAN I GET MARKER R SQUARE USING SAS PROC MIXED OR TASSEL MLM?
14. DOES MLM FIND MORE ASSOCIATIONS THAN GLM?
15. DO I NEED MULTIPLE TEST CORRECTION FOR THE P VALUE FROM TASSEL?
16. CAN TASSEL HANDLE DIPLOID GENOTYPE DATA?
17. HOW TO CITE TASSEL?
REFERENCES
INDEX

INTRODUCTION

While TASSEL has changed considerably since its initial public release in 2001, its primary function continues to be providing tools to investigate the relationship between phenotypes and genotypes. As indicated by its title (Trait Analysis by aSSociation, Evolution and Linkage), TASSEL has multiple functions, including association study, evaluating evolutionary relationships, analysis of linkage disequilibrium, principal component analysis, cluster analysis, missing data imputation and data visualization.

One of the design elements driving TASSEL development has been the need to analyze ever larger sets of data. For example, the MLM (mixed linear model) function for association analysis originally used an EM (expectation-maximization) algorithm, which is a common method for solving mixed models but is relatively slow. Subsequently, developers implemented the EMMA algorithm to increase computing speed.
Model compression was then added to improve speed and statistical power for association studies. Another technique, which optimizes variance components once and then uses the estimates to test markers, now provides the ability to screen the large numbers of markers used in genome-wide association studies (GWAS). The method was independently described by Zhang et al. and Kang et al. in 2010. It was named P3D by Zhang et al. and EMMAX by Kang et al.

TASSEL was designed for a wide range of users, including those not expert in statistics or computer science. A GWAS using the mixed linear model method to incorporate information about population structure and cryptic relationships can be performed in a few steps by "clicking" on the proper choices in a graphic interface. All the processes necessary for the analysis are performed automatically, including importing phenotypic and genotypic data, imputing missing data (phenotype or genotype), filtering markers on minor allele frequency, generating principal components and a kinship matrix to represent population structure and cryptic relationships, optimizing compression level and performing GWAS.

The command-line version of TASSEL, called the Pipeline, provides users the ability to program tasks using a script instead of the graphic user interface (GUI). This feature allows researchers to define tasks using a few lines of code and provides the ability to use TASSEL as part of an analysis pipeline or to perform simulation studies.

Due to the increasing availability of open data sources, TASSEL utilizes a data browser from the Genomic Diversity and Phenotype Connection (GDPC) project to provide an interface to relational databases. As a result, TASSEL users can access any data source that provides a GDPC service. Using this middleware, which provides a common graphical interface, TASSEL users can avoid writing SQL queries to access data.
Currently, GDPC provides connections to Panzea, Gramene, Germinate, and GRIN (USDA's Germplasm Resources Information Network).

TASSEL is written in Java, thereby enabling its use with virtually any operating system. It can be installed using Java Web Start technology by simply clicking on a link at www.maizegenetics.net/tassel. A stand-alone version of TASSEL can also be downloaded to use in pipeline mode or in any situation where the user wishes to start the software from a command line.

Getting started

A quick way to get started using TASSEL is to load the tutorial data and try performing analyses. However, because some of the necessary steps may not be intuitive, we recommend that new users follow the tutorial at the end of this manual. The objective of this section is to provide the information necessary to install and start the TASSEL software and to give a brief overview of the interface.

Most functions are organized into three modes (Data, Analysis and Results), which correspond to the first three buttons on the TASSEL interface as shown below. Clicking one of these buttons changes the functions represented by the second row of buttons. Those three modes are described in detail in the subsequent sections of this manual.
The screen shot shows TASSEL after the tutorial files have been loaded.

[Screenshot: TASSEL (Trait Analysis by aSSociation, Evolution, and Linkage) 3.0.37 main window with the tutorial data sets loaded]

1.1 Installation

The graphic version of TASSEL can be installed in one of three ways: using Java Web Start, as a stand-alone application, or from the source code.

1.1.1 Web start

TASSEL can be installed using Java Web Start technology, which automatically checks for the most recent version of TASSEL each time the application is executed. In addition, Java Web Start will ensure that the correct version of the Java Runtime Environment is running, thus avoiding complicated installation and upgrade procedures. Users should use Web Start unless they have a specific reason to use one of the other installation methods.

To begin, Java Web Start (JWS) must be installed (prior to the installation of TASSEL). JWS is included as part of Java Runtime Environment (JRE) 5.0 and above. PCs and Macs will most likely have JWS already installed. If you need to install Java, the most recent version is available at http://www.java.com. The easiest way to tell if it is installed on your computer is to try running TASSEL from the following link: http://www.maizegenetics.net/tassel

If you will be using TASSEL frequently and would prefer to launch the application from your desktop rather than by revisiting the website, Java Web Start can be used to manually launch TASSEL each time and/or to create a shortcut.
Access the Java Application Cache Viewer by going to Start > Settings > Control Panel > Java. From the General tab, click on Settings in the Temporary Internet Files section and then click on View Applications, and the Java Application Cache Viewer will appear. (Another way of achieving this is by going to Start > Run and typing in javaws.) The TASSEL icon should now be visible and can be used to launch the application. Shortcuts can be created from the menu of the Java Application Cache Viewer: Application > Install Shortcuts.

1.1.2 Stand-alone

Downloading a "stand-alone" version is recommended for anyone who has a slow Internet connection. While Java Web Start is a very good way of deploying software, it does not ask the user before attempting to download updates. Thus, a slow Internet connection may start a download process that requires an unreasonable amount of time to complete. If you are not interested in disabling your network connection each time before starting TASSEL, we recommend downloading the stand-alone version, which does not attempt to update the program. However, given that TASSEL is a Java application, a Java Runtime Environment (version 1.6.0 or greater) is still required.

To get the stand-alone version, download tassel3.0_standalone.zip from the TASSEL web site. To run the stand-alone version, double-click on the JAR file (sTASSEL.jar). Alternatively, from a command prompt (in Windows go to Start > Run and type "cmd" or "command"), change into the tassel3.0_standalone directory and execute one of these commands:

start_tassel.bat (for Windows)
start_tassel.pl (for UNIX)

1.1.3 Open source code

Open source code for the TASSEL software package is available at: http://sourceforge.net/projects/tassel. The package uses a number of other libraries that are included in the TASSEL distribution.
These include a modified version of the PAL library (http://www.cebl.auckland.ac.nz/pal-project/), the Colt library (http://dsd.lbl.gov/hoschek/colt/), and JFreeChart (http://www.jfree.org/jfreechart/). GDPC middleware (http://www.maizegenetics.net/gdpc) provides database access.

1.2 Panels

TASSEL is organized into five main panels.
(1) The Control Panel at the top contains menus and buttons to control functions.
(2) The Data Tree Panel is located beneath the Control Panel on the left side. This panel organizes data sets and results. Data sets displayed in the Data Tree Panel must be selected before a desired function or analysis can be performed. To select multiple data sets, hold the Ctrl key while selecting them.
(3) The Report Panel is located below the Data Tree Panel. It displays information about the data set selected in the Data Tree Panel, such as the type of data and how it was created.
(4) The Progress Monitoring Panel below the Report Panel shows the progress of running tasks and has buttons that can be used to cancel tasks.
(5) The Main Panel occupies the right side of the viewing area. It displays the content of the data set selected in the Data Tree Panel.

Functions in TASSEL are accessed through buttons and menus on the Control Panel. The three buttons on the top left are the Mode Selectors (Data, Analysis and Results). The buttons below the Mode Selectors change when a new mode is selected. The modes are described in sections 2 to 4. To the right of the Mode Selectors are the Progress Bar and the Delete, Print, Save and Help buttons.

2 Data mode

Data mode serves the purpose of importing and managing data. It is the default mode when TASSEL starts. Click on the Data button to switch to this mode. TASSEL has two ways of importing data. One way is via GDPC, to import data from databases. The other way is via flat files formatted as genotypes (e.g. HapMap, Flapjack, and PLINK), phenotypes (trait data), population structure and kinship matrices. The preliminary data manipulations include filtering data by site or taxa, joining data and data transformation.

2.1 GDPC

Genotype and phenotype data generated from numerous genomic research projects are still valuable resources for the public, even after results are published. Some of these data have been migrated to several databases and can be accessed using the Genomic Diversity and Phenotype Connection (GDPC). GDPC is middleware that eliminates the need for end users of data to understand various database schemas and write SQL queries to extract data. Instead, the GDPC browser provides a single, easy-to-use interface which can extract genotype and phenotype data from a variety of sources [10].

Currently, GDPC has connections to the following databases:

Gramene diversity for maize, wheat and rice: http://www.gramene.org/db/diversity/diversityview
Panzea [15-17]: http://www.panzea.org
GRIN: http://www.ars-grin.gov

GDPC can be used within TASSEL or as a stand-alone application. To display GDPC in TASSEL, click on the GDPC button in Data mode.
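
The filtering of sites mentioned in the Data mode overview, combined with the minor allele frequency criterion listed in the introduction, can be illustrated outside TASSEL with a short Python sketch. This is not TASSEL code and does not use any TASSEL API. The function names, the dictionary layout (site name mapped to one call per taxon) and the default threshold of 0.05 are assumptions made for illustration. Only the standard single-letter IUPAC nucleotide codes are handled, and any other symbol (such as N) is treated as missing.

```python
from collections import Counter

# Standard IUPAC codes for unphased diploid calls, expanded to two alleles.
# Any symbol not listed here (e.g. "N") is treated as missing data.
IUPAC_TO_ALLELES = {
    "A": "AA", "C": "CC", "G": "GG", "T": "TT",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}


def minor_allele_frequency(calls):
    """Return the frequency of the second most common allele at one site.

    Returns 0.0 for monomorphic sites and for sites with no usable calls.
    """
    counts = Counter()
    for call in calls:
        # Each usable call contributes two alleles; missing calls add nothing.
        counts.update(IUPAC_TO_ALLELES.get(call.upper(), ""))
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    second_most_common_count = counts.most_common(2)[1][1]
    return second_most_common_count / total


def filter_sites(genotypes, min_maf=0.05):
    """Return the names of sites whose minor allele frequency is >= min_maf.

    genotypes: dict mapping a site name to a list of calls, one per taxon.
    """
    return [
        site
        for site, calls in genotypes.items()
        if minor_allele_frequency(calls) >= min_maf
    ]


# Hypothetical example data: three sites scored on four taxa.
example = {
    "s1": ["A", "A", "G", "N"],  # A x4, G x2, one missing call
    "s2": ["C", "C", "C", "C"],  # monomorphic
    "s3": ["T", "Y", "T", "T"],  # T x7, C x1 (Y is a C/T heterozygote)
}
print(filter_sites(example, min_maf=0.1))  # -> ['s1', 's3']
```

Defining the minor allele as the second most common allele is one common convention for sites with more than two alleles; a different convention would change the result only at such sites.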