Post by djoser-xyyman on Mar 5, 2019 13:30:57 GMT -5
Admixture Abusir-Luxmanda-Malawi-Hora

Skoglund Ancient Africans dataset - Reich Labs!? Yeah!! Reich Lab. That is not a good starting point but that is all we got.
reich.hms.harvard.edu/datasets

Abusir dataset
The mapped BAM files for the 90 mitochondrial samples and three nuclear samples are deposited in the European Nucleotide Archive (http://www.ebi.ac.uk/ena) with the study ID ERP017224.
www.ebi.ac.uk/ena/data/search?query=ERP017224

Admixture for Abusir-Ancient Skoglund Africans
Binary to VCF - need to merge in VCFtools, not plink:
plink --bfile skoglund16africans --recode --out skoglund16africans
plink --ped skoglund16africans.ped --map skoglund16africans.map --recode vcf --out skoglund16africans
bgzip skoglund16africans.vcf
tabix -p vcf skoglund16africans.vcf.gz
bgzip abusir-bam.vcf
tabix -p vcf abusir-bam.vcf.gz
vcf-merge abusir-bam.vcf.gz skoglund16africans.vcf.gz | bgzip -c > skoglundAfricans-abusir.vcf.gz

Cleaning up:
plink --vcf skoglundAfricans-abusir.vcf.gz --const-fid 0 --make-bed --out skoglundAfricans-abusir
plink --bfile skoglundAfricans-abusir --geno 0.999 --make-bed --out skoglundAfricans-abusir-fix

ADMIXTURE process
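For anyone following along: PLINK's --geno 0.999 only drops variants whose missing call rate exceeds 0.999, i.e. variants that are missing in practically every sample. Here is a rough Python sketch of that idea on toy data. The function names and the toy genotypes are made up for illustration (None stands for a missing call); this is not how PLINK itself is implemented.

```python
def missing_rate(genotypes):
    """Fraction of calls that are missing (None) for one variant."""
    return sum(g is None for g in genotypes) / len(genotypes)

def filter_by_missingness(variants, max_missing):
    """Keep variants whose missing rate does not exceed max_missing.

    Mirrors the idea behind `plink --geno max_missing` on a toy dict of
    {variant_name: [genotype calls]}. Hypothetical helper, illustration only.
    """
    return {
        name: calls
        for name, calls in variants.items()
        if missing_rate(calls) <= max_missing
    }

# Toy genotypes for four samples; None marks a missing call.
toy = {
    "snp1": [0, 1, None, 2],           # 25% missing
    "snp2": [None, None, None, None],  # 100% missing
    "snp3": [None, None, None, 1],     # 75% missing
}

kept_loose = filter_by_missingness(toy, 0.999)   # very permissive threshold
kept_strict = filter_by_missingness(toy, 0.5)    # stricter threshold
print(sorted(kept_loose), sorted(kept_strict))
```

With a threshold this permissive almost everything survives, which is why the merged set can still be very sparse for the ancient samples.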
|
|
|
Post by djoser-xyyman on Mar 5, 2019 16:05:18 GMT -5
|
|
|
Post by djoser-xyyman on Mar 5, 2019 22:03:51 GMT -5
This is Abusir and Skoglund ancient Africans at K3. I am currently running up to K10.
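A minimal sketch of how the K2 to K10 runs could be queued up from Python. It assumes the ADMIXTURE binary is called admixture and that the input is the skoglundAfricans-abusir-fix.bed file made above; the helper name build_admixture_commands is made up. It only builds the command lines, it does not run anything.

```python
def build_admixture_commands(bed_file, k_min, k_max, cross_validate=True):
    """Return one ADMIXTURE argument list per K in [k_min, k_max].

    Hypothetical helper: it only assembles the commands so they can be
    inspected or handed to subprocess.run() one at a time.
    """
    commands = []
    for k in range(k_min, k_max + 1):
        cmd = ["admixture"]
        if cross_validate:
            cmd.append("--cv")  # ask ADMIXTURE to report a cross-validation error
        cmd.extend([bed_file, str(k)])
        commands.append(cmd)
    return commands

commands = build_admixture_commands("skoglundAfricans-abusir-fix.bed", 2, 10)
for cmd in commands:
    print(" ".join(cmd))
```

To actually launch the runs, each list could be passed to subprocess.run(cmd, check=True), assuming admixture is on the PATH.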
|
|
|
Post by djoser-xyyman on Mar 5, 2019 22:13:58 GMT -5
|
|
|
Post by djoser-xyyman on Mar 6, 2019 9:05:36 GMT -5
Abusir-Skoglund Ancient Africans at K2. I need to include outliers, like Great Britain and/or La Brana. I have a lot of work ahead. Plus I am still working on using LD in my analysis. As I become more proficient, any suggestions are welcome. I will lay down my scripts/commands as I go along.
|
|
|
Post by djoser-xyyman on Mar 6, 2019 12:52:59 GMT -5
Quote from Razib Khan: ------------------ "
I put the PCA function in the script, but to remove individuals you will want to run the PCA manually:
PCA plink --bfile testfile.bed --pca 10
To make use of the pairwise Fst you need the fst.R script. If everything is set up right, all you need to do is
type: source("fst.R")
But I gave you an R script. It’s RPCA.R.
You need to install some packages. First, open R or R studio. If you want to go command line at the terminal, type R. Then type:
install.packages("ggplot2")
install.packages("reshape2")
install.packages("plyr")
install.packages("ape")
install.packages("igraph")
install.packages("ggplot2")
Once those packages are loaded you can use the script: source("RPCA.R")
Then, to generate the plot at the top of this post: plinkPCA()
So now you have the PCA, Fst and admixture. What else? Well, there’s treemix.
Also, as you know treemix comes with R plotting functions.
Then:
>source("src/plotting_funcs.R")
>plot_tree("TreeMix")
But actually, you don’t need to do the above. I added a script to generate a .png file with the treemix plot in pairwise.perl. It’s called TreeMix.TreeMix.Tree.png.
OK, so that’s it.
----------------------
|
|
|
Post by djoser-xyyman on Mar 7, 2019 8:06:00 GMT -5
|
|
|
Post by Tukuler al~Takruri on Mar 7, 2019 11:09:47 GMT -5
"As I become more proficient any suggestions are welcome."

OK Mensamind

Of course all Africa study must always include Mt Hora Girl. One day there'll be a way to distinguish OoA stay at home ancestry from WHG, 'Taurus', and 'Zagros' ancestries. As it is now, you must add a WHG, Hotu equivalent, and anc Anatoli to test Eurasian ancestry. Otherwise what's the use?

Keep truly global referential samples. Can't assess Africa out flow nor extra Africa influx w/o America Oceania E/C_Asia N/NW Europe. Using only African reference samples means even if you run Nanook of the North you'll only get African ancestries.

I know it's resource expensive but if you're selecting K count instead of unsupervised you gotta ramp K up to like between 18 and 23.

Oh. What does K=10 rep in nature? K=2 reps StayedInAfrica vs OutOfAfrica. Yeah, there's OoA ancestry that never left.

I know you're just starting out but I also know you want to build a lab, not a cullert lab. Thx 4 all u r doing and making your work transparent 4 us all.
|
|
|
Post by djoser-xyyman on Mar 7, 2019 11:27:34 GMT -5
|
|
|
Post by djoser-xyyman on Mar 7, 2019 11:34:00 GMT -5
ADMIXTURE cannot resolve anything beyond K5. K6-K10 do not separate out into more populations; 5 is the highest differentiation into populations. I will include "outliers" soon, like modern Finns and some Bedouins and Mende. I don't want to include too many at this time. I want to see how the admixture separates out with each addition.
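One way to check whether the higher K values are supported at all is ADMIXTURE's cross-validation. When run with the --cv flag it prints a line like "CV error (K=5): 0.52" for each run, and the K with the lowest error is usually the one preferred. A minimal Python sketch for pulling those lines out of saved logs follows. The function names are made up, and the numbers in example_log are illustrative only, not results from the Abusir/Skoglund run.

```python
import re

# Matches the line ADMIXTURE prints when run with --cv,
# e.g. "CV error (K=5): 0.52305"
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")

def parse_cv_errors(log_text):
    """Return {K: cv_error} for every CV line found in the log text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}

def best_k(cv_errors):
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)

# Illustrative numbers only, NOT real output from this dataset.
example_log = """
CV error (K=2): 0.61
CV error (K=3): 0.58
CV error (K=4): 0.57
CV error (K=5): 0.59
"""

errors = parse_cv_errors(example_log)
print(errors)
print("lowest CV error at K =", best_k(errors))
```

If the error bottoms out at a low K and climbs after that, adding more K on the same reference set is not expected to show new structure.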
|
|
|
Post by djoser-xyyman on Mar 7, 2019 11:41:16 GMT -5
|
|
|
Post by djoser-xyyman on Mar 7, 2019 12:08:19 GMT -5
Shifted around and sorted at K5.
|
|
|
Post by Tukuler al~Takruri on Mar 7, 2019 16:31:53 GMT -5
A K=5 governor? Pretty useless. The current standard K max is 21, iirr. (Baker 2017 Human ancestry correlates ...)
So far, you're not using any but East African populations. Consequently, your reference set can't possibly tell if Levant, Euro, or other ancestries apply.
Running without at least an extreme Euro and/or an extreme Rock (Anatoli Caucasus Iran) sample, or pool, is stacking the deck against uncovering non-African contributions to the Abusir 3.
Since you don't seem to object to them I made reduxes for 2 3 4 & 5 anyway. Coming up soon.
To immediately get a feel for the Abusir 3, I'd run
1) at least 3 far away 'NigerCongo' speakers: Atlantic, BightOfBenin, Lakes, SE Afr
2) 3-5 NiloSaharan speakers: NW of Lakes and east-by-south of Lakes
3) the least mixed Rain Forest ethny
4) a Europe pool or select an extreme (Scandinavian)
5) ditto for Levant & Arabian Plate (Lebanon, Saudi)

Asio-African speakers are notoriously 'swirled'. Need closest to inbred taMazight speakers possible. Better yet Taforalt is available. Ditto for Chadic but dunno who to choose except NOT Hausa. Maybe the overlooked Tebu (central 'Trogodytes').
That's the most bang for your bucks, imnsho. If I could rent or lease a ready lab and 'employ' two assistants I'd be done with it already. Now may not be the time but no one will cooperate. Look at all the hands involved in aDNA reports. No go it alone rugged individualists. Cooperative teamwork is how it's done. Design Analysis Interpretation Text Editing
|
|
|
Post by djoser-xyyman on Mar 7, 2019 19:57:58 GMT -5
Keep in mind the opposing team is not transparent. It looks like there are only 135,000 common variants for the Abusirs. Meaning? It looks like there is nothing, in terms of overlap, beyond K5. Most experts agree that K3-K6 is most reliable. For the Abusir and Skoglund ancient African dataset there are no results beyond K5; it runs indefinitely.
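For what it's worth, the overlap count can be double-checked straight from the PLINK .bim files, which list one variant per line as chromosome, variant ID, cM position, base-pair position and the two alleles. A rough sketch, assuming matching on chromosome plus position is good enough; the helper names and the toy lines are made up for illustration.

```python
def bim_positions(lines):
    """Collect (chromosome, base-pair position) pairs from PLINK .bim lines."""
    positions = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, _variant_id, _cm, bp = fields[:4]
        positions.add((chrom, int(bp)))
    return positions

def shared_variants(bim_a, bim_b):
    """Return the (chromosome, position) pairs present in both .bim inputs."""
    return bim_positions(bim_a) & bim_positions(bim_b)

# Toy .bim lines, not real data.
toy_abusir = [
    "1 rs1 0 1000 A G",
    "1 rs2 0 2000 C T",
    "2 rs3 0 500 G A",
]
toy_skoglund = [
    "1 rsX 0 2000 C T",
    "2 rs3 0 500 G A",
    "3 rs9 0 42 T C",
]

overlap = shared_variants(toy_abusir, toy_skoglund)
print(len(overlap), sorted(overlap))
```

With real data, each argument can be an open file handle, e.g. open("skoglund16africans.bim"); the name of the Abusir .bim file would depend on how that set was converted.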
|
|
|
Post by Tukuler al~Takruri on Mar 7, 2019 20:26:55 GMT -5
Please listen to a dumbass like me. Don't reply until a 24 hour bake in relaxes you enough to accept input from a detached outsider before summarily explaining away or outright rejecting without due consideration.
Your East Africa limited reference set is why you're getting nothing beyond 5 ancestries. You're totally relying on 4 or 5 countries (South Africa, Malawi, Tanzania, Kenya, Ethiopia) forming a letter J from Ethiopia to South Africa. Considering the reference set no other ancestries than ones related to 'NG' NS and Click stand a ghost of a chance.

Wanna up your K's? Include Great Britain. Just that one. Just to see. I double dee dawg dare ya!

The K5 remix