# G2P [![](https://img.shields.io/badge/Issues-0%2B-brightgreen.svg)](https://github.com/XiaoleiLiuBio/G2P/issues) [![](https://img.shields.io/badge/Release-v1.0.1-blue.svg)](https://github.com/XiaoleiLiuBio/G2P/commits/master)
## A Genome-Wide-Association-Study Simulation Tool for [G](https://github.com/XiaoleiLiuBio/G2P)enotype Simulation, [P](https://github.com/XiaoleiLiuBio/G2P)henotype Simulation, and [P](https://github.com/XiaoleiLiuBio/G2P)ower Evaluation
<p align="center">
<a href="https://raw.githubusercontent.com/XiaoleiLiuBio/G2P/master/results/G2P_logo.png">
<img src="results/G2P_logo.png" height="250px" width="450px">
</a>
</p>
More simulation functions are available in our newly developed package [SIMER](https://github.com/xiaolei-lab/SIMER), a simulation tool for life science and breeding.
### Authors:
> You Tang and ***Xiaolei Liu***
### Contact:
> Xiaolei Liu ([xiaoleiliu@mail.hzau.edu.cn](mailto:xiaoleiliu@mail.hzau.edu.cn))
### Contents
<!-- TOC updateOnSave:false -->
- [Installation](#installation)
- [Environment Setup](#environment-setup)
- [Windows](#windows)
- [Mac](#mac)
- [Linux](#linux)
- [Data Preparation](#data-preparation)
- [ped](#ped)
- [map](#map)
- [pop](#pop)
- [qtn](#qtn)
- [Genotype Simulation](#genotype-simulation)
- [Single Population _ GUI](#single-population-_-gui)
- [Single Population _ Pipeline](#single-population-_-pipeline)
- [Multi Populations _ GUI](#multi-populations-_-gui)
- [Multi Populations _ Pipeline](#multi-populations-_-pipeline)
- [Random Simulation _ GUI](#random-simulation-_-gui)
- [Random Simulation _ Pipeline](#random-simulation-_-pipeline)
- [Phenotype Simulation](#phenotype-simulation)
- [Phenotype _ GUI](#phenotype-_-gui)
- [Phenotype _ Pipeline](#phenotype-_-pipeline)
- [Population Structure](#population-structure)
- [Population structure _ GUI](#population-structure-_-gui)
- [Population structure _ Pipeline](#population-structure-_-pipeline)
- [Quality Control](#quality-control)
- [Quality control _ GUI](#quality-control-_-gui)
- [Quality control _ Pipeline](#quality-control-_-pipeline)
- [GWAS](#gwas)
- [GWAS _ GUI](#gwas-_-gui)
- [GWAS _ Pipeline](#gwas-_-pipeline)
- [Method Evaluation](#method-evaluation)
- [Method Evaluation _ GUI](#method-evaluation-_-gui)
- [Method Evaluation _ Pipeline](#method-evaluation-_-pipeline)
- [FAQ and Hints](#faq-and-hints)
<!-- /TOC -->
---
# Installation
**[back to top](#contents)**
## Environment Setup
**[back to top](#contents)**
**JDK 1.8 must be installed and its environment variables configured before using G2P (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html)**
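
As a quick sanity check before launching G2P, the snippet below reports whether `java` is on the `PATH` and which major version it is. It is only a minimal sketch: the helper name `java_major` is made up for this example, and it assumes the usual `java -version` output, where the version string appears in double quotes on the first line.

```bash
# Hypothetical helper (not part of G2P): turn a Java version string
# into its major version, e.g. "1.8.0_281" -> 8 and "11.0.2" -> 11.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;   # legacy 1.x numbering
        *)   echo "$1" | cut -d. -f1 ;;   # newer numbering
    esac
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr; the version is the quoted field on line 1
    version=$(java -version 2>&1 | awk -F '"' 'NR == 1 { print $2 }')
    echo "Found Java $version (major version $(java_major "$version"))"
else
    echo "java not found on PATH; install JDK 1.8 and set the environment variables first"
fi
```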
## Windows
**[back to top](#contents)**
**GUI**
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/gG2P_win_64 and double-click the .jar file
**Pipeline**
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/kG2P_win_64
## Mac
**[back to top](#contents)**
**GUI**
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/gG2P_mac and double-click the .jar file
**Pipeline**
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/kG2P_mac
*permission setting*
```bash
$ chmod 777 gemma oldplink plink
```
## Linux
**[back to top](#contents)**
**GUI**
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/gG2P_linux_x86_64
and run
```bash
$ java -jar gG2P.jar
```
**Pipeline**
Download all files from https://github.com/XiaoleiLiuBio/G2P/tree/master/kG2P_linux_x86_64
*permission setting*
```bash
$ chmod 777 gemma oldplink plink
```
# Data Preparation
*All files should be prepared with the same prefix*
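
Because the inputs share one prefix, a small check can confirm that the expected files are in place before a run. The following is a minimal sketch under stated assumptions: `check_prefix` is a hypothetical helper, and it only looks for the `.ped` and `.map` files used in the pipeline examples; the extension list can be extended for other inputs.

```bash
# Hypothetical helper (not part of G2P): confirm that <prefix>.ped and
# <prefix>.map both exist. Prints each missing file and returns non-zero
# if anything is missing.
check_prefix() {
    local prefix ext status
    prefix=$1
    status=0
    for ext in ped map; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext"
            status=1
        fi
    done
    return $status
}
```

For example, `check_prefix /root/data/AG && echo "inputs look complete"` would test for `/root/data/AG.ped` and `/root/data/AG.map`.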
## ped
*For details, see http://zzz.bwh.harvard.edu/plink/data.shtml#ped*
**[back to top](#contents)**
|Family ID|Individual ID|Father ID|Mother ID|Sex|Trait|marker 1|marker 2|marker 3|marker 4|marker 5|marker 6|
| :---: | :---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |
|1|33-16| 0| 0| 0| 2| 0 0| A A| A A| A G| A G| A G|
|1|38-11| 0| 0| 0| 2| 0 0| A G| A G| A A| A G| A G|
|1|4226 | 0| 0| 0| 2| 0 0| A G| A A| A A| A G| A G|
|1|4722| 0| 0| 0| 2| 0 0| A G| A G| A A| A G| A G|
|1|A188 | 0| 0| 0| 2| 0 0| A A| A A| A A| A G| A G|
|1|A214N| 0| 0| 0| 2| 0 0| A G| A A| A G| A A| A G|
|1|A239 | 0| 0| 0| 2| 0 0| A A| A A| A G| A G| A A|

*The same genotypes with numeric allele codes:*

|Family ID|Individual ID|Father ID|Mother ID|Sex|Trait|marker 1|marker 2|marker 3|marker 4|marker 5|marker 6|
| :---: | :---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |:---: |
|1|33-16| 0| 0| 0| 2| 0 0| 1 1| 1 1| 1 3| 1 3| 1 3|
|1|38-11| 0| 0| 0| 2| 0 0| 1 3| 1 3| 1 1| 1 3| 1 3|
|1|4226 | 0| 0| 0| 2| 0 0| 1 3| 1 1| 1 1| 1 3| 1 3|
|1|4722| 0| 0| 0| 2| 0 0| 1 3| 1 3| 1 1| 1 3| 1 3|
|1|A188 | 0| 0| 0| 2| 0 0| 1 1| 1 1| 1 1| 1 3| 1 3|
|1|A214N| 0| 0| 0| 2| 0 0| 1 3| 1 1| 1 3| 1 1| 1 3|
|1|A239 | 0| 0| 0| 2| 0 0| 1 1| 1 1| 1 3| 1 3| 1 1|
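
The two tables above show the same genotypes with letter and numeric allele codes (A as 1, G as 3). The recoding can be sketched with `awk` as below. This is an illustration only: `ped_to_numeric` is a hypothetical helper, the mapping A=1, C=2, G=3, T=4 is an assumption (the tables only confirm A and G), and the ped file is assumed to be whitespace-delimited.

```bash
# Hypothetical helper (not part of G2P): recode allele letters in a ped
# file to numbers and print the result to standard output.
# Assumed mapping: A=1, C=2, G=3, T=4.
# Columns 1-6 (family, individual, father, mother, sex, trait) are left
# alone; 0 (missing) and any other symbol pass through unchanged.
ped_to_numeric() {
    awk 'BEGIN { code["A"] = 1; code["C"] = 2; code["G"] = 3; code["T"] = 4 }
         {
             for (i = 7; i <= NF; i++)
                 if ($i in code) $i = code[$i]
             print
         }' "$1"
}
```

Rows that are changed come out single-space delimited, since `awk` rebuilds a line when one of its fields is modified.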
## map
*For details, see http://zzz.bwh.harvard.edu/plink/data.shtml#map*
**[back to top](#contents)**
|Chromosome ID|Marker ID|Genetic Distance|Physical Distance|
| :---: | :---: |:---: |:---: |
|1| PZB00859.1| 0| 157104|
|1| PZA01271.1| 0| 1947984|
|1| PZA03613.2| 0| 2914066|
|1| PZA03613.1| 0| 2914171|
|1| PZA03614.2| 0| 2915078|
|1| PZA03614.1| 0| 2915242|
|1| PZA00258.3| 0| 2973508|
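
A ped row carries six leading columns plus two allele columns per marker, and the map file has one row per marker, so the two files can be cross-checked before simulation. The sketch below does this with `awk`; `check_ped_map` is a hypothetical helper, and it assumes both files are whitespace-delimited with no header line.

```bash
# Hypothetical helper (not part of G2P): verify that every ped row has
# 6 + 2 * (number of map rows) fields. Prints each offending line and
# returns non-zero if any row has the wrong width.
check_ped_map() {
    local ped map markers
    ped=$1
    map=$2
    # Count non-empty rows in the map file (one per marker)
    markers=$(awk 'NF > 0 { n++ } END { print n + 0 }' "$map")
    awk -v m="$markers" '
        NF != 6 + 2 * m {
            printf "line %d: %d fields, expected %d\n", NR, NF, 6 + 2 * m
            bad = 1
        }
        END { exit bad + 0 }
    ' "$ped"
}
```

A call such as `check_ped_map /root/data/AG.ped /root/data/AG.map` prints nothing and succeeds when the files agree.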
## pop
**[back to top](#contents)**
*New samples are generated from the samples within each sub-population*
|Sample ID|sub-Population ID|
| :---: | :---: |
|33-16| 1|
|38-11| 1|
|4226| 1|
|4722| 2|
|A188| 2|
|A214N| 2|
|A239| 2|
|A272| 2|
|A441-5| 2|
|A554| 3|
|A556| 3|
|A6| 3|
|A619| 3|
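
Since new samples are drawn within each sub-population, it can help to see how many samples each one contains. The sketch below counts them with `awk`; `pop_sizes` is a hypothetical helper, and it assumes a whitespace-delimited two-column file (sample ID, sub-population ID) with no header line and numeric sub-population IDs.

```bash
# Hypothetical helper (not part of G2P): count samples per sub-population
# in a two-column pop file and print "<sub-population ID> <count>" lines,
# sorted numerically by sub-population ID.
pop_sizes() {
    awk 'NF >= 2 { n[$2]++ } END { for (p in n) print p, n[p] }' "$1" | sort -n
}
```

For the example table above, this would report 3 samples in sub-population 1, 6 in sub-population 2, and 4 in sub-population 3.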
## qtn
**[back to top](#contents)**
*Each column lists the simulated QTNs for one phenotype*
|Phenotype 1|Phenotype 2|Phenotype 3|Phenotype 4|Phenotype 5|
| :---: | :---: | :---: | :---: | :---: |
|66 |67 |80 |83 |90|
|9 |15 |52 |59 |135|
|90 |96 |143 |147 |174|
|3 |3 |15 |58 |89|
|89 |118 |185 |203 |212|
|69 |72 |72 |84 |110|
|46 |59 |125 |204 |207|
|14 |15 |19 |29 |39|
|9 |23 |65 |111 |131|
|19 |52 |74 |179 |194|
# Genotype Simulation
## Single Population _ GUI
**[back to top](#contents)**
<p align="center">
<a href="https://raw.githubusercontent.com/XiaoleiLiuBio/G2P/master/results/Single Population.png">
<img src="results/Single Population.png" height="400px" width="460px">
</a>
</p>
```
Ped: ped file
Map: map file
Path for output Ped/Map: path for output ped and map file
Block: Yes or No; if "Yes", the whole genome is divided into blocks, which are shuffled to generate new samples
Number of SNPs in each block: block size, counted in SNPs
Mutation rate: the frequency of new mutations
Imputation: if TRUE, the major allele is used to impute missing values
Population size: simulated sample size
```
## Single Population _ Pipeline
**[back to top](#contents)**
### Windows
```
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100 --block 4 --impute
java -jar kG2P.jar --ped D:\data\AG.ped --map D:\data\AG.map --outgen D:\data\output --rn 100 --block 4 --mutation 0.0001 --impute
```
### Linux/Mac
```
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100 --block 4 --impute
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100 --mutation 0.0001
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100 --block 4
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100 --impute
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100
java -jar kG2P.jar --ped /root/data/AG.ped --map /root/data/AG.map --outgen /root/data/output --rn 100 --mutation 0.0001
```
```
jar: the executable jar file (kG2P.jar)
ped: ped file
map: map file
outgen: output path
block: number of SNPs in each block
rn: simulated sample size
impute: if 'impute' is added, maj