In Brief: The document below was written by the geneticist conducting the DNA research on the Starchild Skull. It is the Abstract of a larger document explaining the pertinent details of what he has thus far discovered in the Starchild’s DNA. It is written as a general text for specialists, so non-specialists might find it difficult to comprehend.
We are posting it to illustrate that the Starchild’s DNA analysis is in extremely competent hands, and that as soon as we can secure the funding necessary for our geneticist to recover and sequence the entire genome, he will indeed make history with it as big as history can be made.
Abstract On The Genetic and Physical Analysis From The Starchild Skull
Starchild Project Geneticist, 2012
Introduction. Around 1930, two skeletons were found in an abandoned mine tunnel in Mexico. One was an apparently normal human, and the other was noticeably shorter and misshapen. The skulls of both skeletons and a detached piece of the broken upper jaw (maxilla) from the misshapen one eventually ended up in a private collection in El Paso.
Over the past decade, since 1999, both skulls were analyzed by dozens of scientists from various areas of anthropology, none of whom disputed their authenticity as human or human-like relics. Radiocarbon analysis (C-14) indicated both individuals died around 900 years ago. There are many morphological features that distinguish the misshapen skull from a normal human skull, all of which point to significant anatomical and physiological differences in the biological entity from which the misshapen skull originated:
1. Bone of the misshapen skull is less than one-third the thickness of the bone in a human skull, and it weighs correspondingly less; (Fig. 2)
2. The misshapen skull’s bone is much harder than human skull bone, as determined by its strong resistance to cutting and drilling, which likely is a consequence of biochemical differences and its internal architecture;
3. Its chemical makeup resembles human tooth enamel much more than the makeup of human skeletal bone;
4. The misshapen skull contained a brain of an apparently different shape (Fig. 3), which was approximately 30% larger than normal human brains, as estimated by measuring its cranial capacity;
5. Eye sockets of a normal adult human are 1.5 – 2.0 inches deep, while the misshapen skull’s eye sockets are barely 0.5 inch deep, which indicates the corresponding biological entity had very different eyes;
6. As deduced from the maxilla fragment, the entity’s mouth was infant-sized and had a flat roof, with none of the arching that is normal in humans, indicating either a very small or possibly missing tongue (Fig. 4);
7. The maxilla nevertheless shows a simultaneous presence of both well-worn adult teeth and unerupted teeth visible in x-ray images (Fig. 5);
8. CAT scan shows the misshapen skull has much larger inner ears (Fig. 6).
Rationale. The skull’s extremely unusual morphological features suggest that, when death occurred around 900 years ago, the individual differed from normal humans anatomically and physiologically to a degree that could place it in an unknown humanoid species. However, morphological evidence alone cannot substantiate this hypothesis.
A simple explanation is that the entity was a human with a very rare disorder, or a combination of disorders, that resulted in all of its observed differences. This can be established if the individual’s genetic makeup falls within the currently understood boundaries of human genetic diversity. However, there must be a genetic basis for the development of such strong and durable bone, which on its own will undoubtedly be of scientific, medical and practical significance. The accumulated knowledge of the genetics of human bone formation will be very helpful in the identification of similarities and differences between the genetic makeup of normal humans and the misshapen individual.
On the farthest end of the consideration range is the possibility that the entity was of extraterrestrial origin in a general sense, and was not a part of, or a result of, the evolution that connects all life forms on this planet. This scenario represents an incomparably more challenging task, since most techniques for studying life were developed using examples existing on Earth. Did the entity use DNA, RNA and proteins to store and utilize genetic information? How can one be confident even in the possibility of collecting interpretable data?
Our confidence is based on a number of critical prerequisites. First and foremost, the chemistry of the entity’s skull is similar to that of human bone. Available elemental analysis demonstrated the presence of the same elements as in human bone – carbon, oxygen, calcium, phosphorus, and others. Corroborating evidence is that the individual’s skull displays worn teeth, perhaps from chewing gritty local foods, implying that consumed food had at least some nutritional value and was metabolically relevant.
However, the most critical prerequisite for the collection of interpretable genetic evidence is that the biochemistry of the entity must be similar to human biochemistry at the more complex molecular and cellular levels. It must be based on the use of amino acids, proteins, nucleotides, and nucleic acids (DNA and RNA). Moreover, these building blocks of life must be interconnected by a similarly arranged flow of information and its encoding. Otherwise, the molecular and genetic approaches developed for the analysis of Earth’s existing life forms will be useless, similar to English grammar being of virtually no value for understanding a Chinese text. Recently, we obtained molecular genetic evidence strongly indicating that this prerequisite is also met.
The author of this Abstract obtained skull bone samples under the condition that DNA samples would be confined to one laboratory. DNA was extracted from bone samples according to published protocols of ancient DNA extraction, with all of the normal precautions taken against contamination with modern DNA. It was then sequenced using a 454 GS Jr instrument. Sequence reads of approximately 50 million nucleotides in total length were compared with human genome sequences using the accompanying software, and also analyzed by BLAST against the DNA sequence database maintained by NCBI/NIH (GenBank). The majority of reads did not match human sequences, or sequences of any species deposited in the NCBI database. However, a substantial number of reads did match human non-coding sequences, located in introns or regulatory regions, or in so-called junk DNA, which together occupy more than 90% of the human genome. Both of these findings were of low informative value. However, one fragment loosely matched exons 6 and 7 of the human FOXP2 gene. This fragment was analyzed further by manually aligning it to the FOXP2 mRNA sequence of humans and several other animals. The results of these multiple alignments are shown in the Appendix, Fig. 1.
Human FOXP2 is known as a “master switch” protein because it is produced very early in the embryonic period to regulate over 300 other genes that are responsible for the correct development of a human embryo. One of FOXP2’s functions in humans is to control speech development, which was acquired relatively recently in evolution, and separated humans from the great apes.
All higher species (mammals, birds, and fish) have species-specific variants of the FOXP2 gene, which despite certain interspecies differences is still highly conserved due to its extreme importance for embryonic development. In humans, the gene’s shown fragment has only one synonymous difference (A→G, indicated in the human sequence by R, the IUPAC designation for puRine, either A or G) found in the FOXP2 gene of a Wara South American Indian individual. Twenty other human individuals belonging to different ethnicities and language groups across the globe show exactly the same nucleotide sequence in the corresponding fragment.
The chart in the Appendix (Fig. 1) shows the entity’s FOXP2-like fragment in comparison with matched segments of FOXP2 from diverse animal species, from fish to humans. While there are no variations at the protein level, except for species-specific gaps or shortened variants, there are distinct species-specific variations (depicted in red) in the corresponding protein-coding DNA. This particular segment of FOXP2 is enriched in glutamine (Gln), which is encoded in DNA by only two nucleotide triplets, either CAG or CAA. Among the non-human species, only the dog (a predator) displays six amino acid changes in this segment, all clustered in one locus. The rhesus macaque, being quite distinct from humans, shows only one nucleotide difference in this fragment (shown in red), with no amino acid changes, and a deletion of exactly one triplet (one amino acid) as compared to the human sequence.
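The distinction between nucleotide-level and protein-level variation can be illustrated with a minimal Python sketch. The two short sequences are invented for illustration and are not taken from the Appendix, and the function names `translate` and `count_differences` are hypothetical. The sketch assumes gap-free, in-frame sequences of equal length and uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_differences(seq_a, seq_b):
    """Return (nucleotide differences, amino acid differences).

    Assumes both sequences are gap-free, in frame, and of equal length.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal-length multiples of three")
    nt_diff = sum(a != b for a, b in zip(seq_a, seq_b))
    aa_diff = sum(a != b for a, b in zip(translate(seq_a), translate(seq_b)))
    return nt_diff, aa_diff


# Invented glutamine-rich example: CAG and CAA both encode Gln (Q).
reference = "CAGCAGCAACAG"
variant = "CAACAGCAGCAG"
print(translate(reference), translate(variant))
print(count_differences(reference, variant))
```

In this toy case the two sequences differ at two nucleotide positions but translate to the same four glutamines, which is what a synonymous difference means.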
Among the 211 nucleotides of the entity’s FOXP2-like DNA fragment, we find a stunning 62 differences at the nucleotide level, and 18 amino acid differences (all shown in red). Apart from species-specific gaps, this fragment shows more differences from the corresponding human FOXP2 gene fragment than any species included in the comparison. Moreover, the obtained sequence of the entity’s FOXP2-like DNA fragment appears to represent a FOXP2-like pseudogene, since sequences found in exons 6 and 7 of the human FOXP2 gene are precisely spliced together in the entity’s genomic DNA. However, so far no FOXP2 pseudogene is known to exist in humans or other mammalian genomes. The stop codon interrupting the sequence (shown in blue) may or may not be present in the entity’s gene. It may result from cytosine deamination known to occur in the DNA of ancient bones. Such deamination results in the conversion of deoxycytidine to deoxyuridine, which is read as deoxythymidine in a DNA polymerase-catalyzed reaction, resulting in the observed CAG→TAG mutation. Because deamination events are random, such artifacts can be accounted for by increasing the depth of coverage during sequencing.
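Why deeper coverage helps against deamination artifacts can be sketched in a few lines of Python. This is a toy illustration only: the reads are invented, the function name `consensus` is hypothetical, and the reads are assumed to be already aligned and of equal length. Real ancient-DNA pipelines use damage-aware tools rather than a plain majority vote.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote consensus over aligned, equal-length reads.

    Ties are resolved in favor of the base seen first in that column.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must have equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


# Invented example: one of four reads carries a C-to-T change that turns
# the glutamine codon CAG into the stop codon TAG.
reads = ["CAGCAA", "CAGCAA", "TAGCAA", "CAGCAA"]
print(consensus(reads))       # the damaged base is outvoted
print(consensus(reads[2:3]))  # with a single read the artifact is retained
```

With only one read covering the position, a deaminated base cannot be told apart from a genuine T; with several independent reads, a random artifact is unlikely to appear in most of them.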
The probability that such a highly specific arrangement of changes in a small fragment could have occurred by accumulation of sequencing errors is extremely low, if not close to zero. Nor could this arrangement have occurred due to contamination with the DNA of any known species. From this evidence one may conclude that the underlying biochemistry of the entity’s life form must be either the same as, or highly similar to, humans or other species. Yet, the use of the genetic code, which still remains universal, is distinctively different, implying that this life form is very likely the result of a markedly variant and non-intersecting evolutionary process. This may be illustrated by comparing Macintosh OS and Windows OS, both of which run on the same Intel processor, or by comparing the grammar rules of English and French, both of which belong to the same language family (Indo-European). The most important point here is that in either case, despite the existing differences, the encoded information can be recovered and decoded.
We plan to approach this challenge by identifying and analyzing the entity’s collagen or collagen-like genes, which we believe should also constitute the organic component of the entity’s bone. In all humans and other mammals, collagen represents 25% to 35% of the body’s entire protein content. Collagen fibrils are formed by packing two molecules of collagen α1 and one molecule of collagen α2, which are encoded by genes COL1A1 (Chr17) and COL1A2 (Chr7) in humans. The length of the COL1A1 gene is 17.54 kb with 51 exons, which after splicing is translated to a precursor protein of 1464 amino acids. The length of the COL1A2 gene is 36.67 kb with 52 exons that encode a precursor protein of 1366 amino acids. Both proteins are enriched in Gly (glycine) and Pro (proline), with a characteristic repeat pattern of Gly-Xaa-Yaa.
In collagens, Gly is highly conserved, since in fibrils formed by α1 and α2 chains it is packed inside the triple helix with tight space constraints, which disallow amino acids with side chains. Thus, one may expect that the entity’s collagen or collagen-like proteins are arranged in a similar way and contain conserved Gly. Exons coding for fragments of these proteins can be identified by repeated GGN NNN NNN GG or GGN NNN NNN GGN NNN NNN GG patterns, with a large number of NN in these repeats being represented by CC.
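The repeated GGN NNN NNN GG motif can be expressed as a regular expression. The Python sketch below is one possible way to scan a sequence for it; the function names and the test sequence are invented for illustration. It searches the forward strand only and uses a lookahead so that overlapping candidates are also reported.

```python
import re


def gly_repeat_regex(n_repeats):
    """Regex for n_repeats units of GGN NNN NNN followed by a closing GG."""
    unit = "GG[ACGT]{7}"  # GG, then the third Gly base plus two more codons
    return re.compile("(?=(" + unit * n_repeats + "GG))")


def find_candidates(sequence, n_repeats=2):
    """Return (start, matched text) pairs for every candidate, overlaps included."""
    pattern = gly_repeat_regex(n_repeats)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Invented sequence: two Gly-Pro-rich units followed by the start of a third.
test_sequence = "ATGGTCCTCCAGGACCCGCTGGCAA"
print(find_candidates(test_sequence, n_repeats=2))
# → [(2, 'GGTCCTCCAGGACCCGCTGG')]
```

A search of real sequencing reads would also need to cover the reverse complement, since a read may originate from either strand.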
As the entity’s FOXP2-like fragment indicates, its genetic profile is based on the use of the same universal genetic code, with Gly encoded by GGN triplets and Pro encoded by CCN triplets. Targeting conserved collagen or collagen-like protein genes in the entity’s genome will enable identification of exons and mapping of exon-intron boundaries, thus enabling definition of grammar and syntax of the unknown genetic language. Decoding this information will enable reconstruction of protein sequences encoded by corresponding genes, which then can be assembled de novo and expressed in prokaryotic or eukaryotic hosts for studies of their processing and biological properties. Since each human collagen gene contains over 50 exons, deciphering the encoded information can provide sufficient evidence for understanding how different or similar the molecular genetics of the life form represented by the unknown biological entity is, which will aid in decoding other information stored within its genome.
We tested the feasibility of this approach using the earlier collected sequencing data set. Using the fuzznuc program, a part of the EMBOSS bioinformatics package, we searched the obtained data set for a pattern GGN(7)GGN(7)GGN(7)GGN(7)GG. Recovered hits were extracted from the data set using their unique IDs, and translated into protein using the universal genetic code. One of the fragments, shown in Fig. 2 in the Appendix, displayed the characteristic collagen coding pattern – 21 Gly-Xaa-Yaa repeats highly enriched in proline (Pro). Nucleotide and protein BLAST against the NCBI database indicated lack of homology to human or mammalian sequences both at the nucleotide and protein levels, indicating that the identified sequence is not the result of contamination with human or animal DNA. However, the fragment appeared loosely homologous to the α1 chains of human and animal collagens of type VIII and type III. The fact that the fragment encodes 21 Gly-Xaa-Yaa repeats suggests that it actually encodes a collagen VIII-like protein fragment. Genes of fibrillar collagens (such as type I or III) contain about 50 exons, each of 54 bp in length (that is, encoding 18 aa or six Gly-Xaa-Yaa repeats), or occasionally 45 bp in length (five repeats). In contrast, the COL8A1 gene has only 5 exons, and only exons 4 and 5 encode long protein fragments. As shown in Fig. 2, the identified fragment indeed aligns to a part of the longest exon 5 of the human COL8A1 gene. More precise identification of this collagen-like protein requires additional sequencing, but this example illustrates the feasibility of the proposed approach.
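The translation and repeat-counting step described above can be sketched in Python as follows. This is not the procedure actually used in the study, which relied on EMBOSS tools. The helper names and the example hit are invented, and the sketch assumes the hit is already in the correct reading frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def gly_xaa_yaa_repeats(protein):
    """Count consecutive Gly-Xaa-Yaa triplets from the start of the protein."""
    count = 0
    for i in range(0, len(protein) - 2, 3):
        if protein[i] != "G":
            break
        count += 1
    return count


def repeats_per_exon(exon_bp):
    """One Gly-Xaa-Yaa repeat spans three codons, that is, nine base pairs."""
    return exon_bp // 9


hit = "GGTCCTCCAGGACCCGCTGGCAAACCT"  # invented hit, already in frame
protein = translate(hit)
print(protein)                       # GPPGPAGKP
print(gly_xaa_yaa_repeats(protein))  # 3
print(protein.count("P") / len(protein))
print(repeats_per_exon(54), repeats_per_exon(45))  # 6 5
```

The last line reproduces the exon arithmetic in the text: a 54 bp exon encodes 18 amino acids, or six repeats, and a 45 bp exon encodes five.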
General discussion of other research in this area. DNA extraction from ancient bones has been extensively studied and used for genetic characterization of extinct animals and human relatives. This approach culminated in recovery of the Neanderthal genome and discovery of the previously unknown extinct relative of the Homo species, provisionally called Denisova hominin. The existing instrumentation for next generation sequencing technology includes genetic analyzers manufactured by Illumina, Life Technologies (formerly Applied Biosystems) and 454 Life Sciences. Currently, the most widely used platform is Illumina, which in a few days or a few weeks can produce from 5 Gb to 100 Gb of sequence data, and it is supported by rapidly developing bioinformatics tools. Since reliable sequence data require from 40X to 100X coverage, an average mammalian genome of 3-5 Gb should be sequenced many times over to cover all gaps and account for inevitable errors and polymorphisms. We believe this task is well within the limits and capabilities of modern molecular genetics.
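The scale of the sequencing task implied by these figures can be worked out with simple arithmetic, sketched below in Python. The function names are hypothetical, and the calculation assumes that every sequenced base belongs to the target genome, which is optimistic for ancient bone samples, where much of the recovered DNA may be of other origin.

```python
import math


def required_output_gb(genome_size_gb, coverage):
    """Raw sequence needed to reach a given average coverage."""
    return genome_size_gb * coverage


def runs_needed(total_gb, output_per_run_gb):
    """Whole instrument runs needed to produce total_gb of sequence."""
    return math.ceil(total_gb / output_per_run_gb)


# Ranges quoted in the text: 3-5 Gb genome, 40X-100X coverage,
# up to 100 Gb of sequence per run.
low = required_output_gb(3, 40)     # 120 Gb
high = required_output_gb(5, 100)   # 500 Gb
print(low, high)
print(runs_needed(low, 100), runs_needed(high, 100))  # 2 5
```

Under these assumptions the full genome would require between 120 Gb and 500 Gb of raw sequence, or roughly two to five runs at the upper end of the quoted per-run output.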
Human collagens and their genes are intensively studied both in basic and applied biomedical research, in tissue bioengineering, and in the therapy of injuries, such as wound healing. A vast body of literature, both reviews and original publications, describes large numbers of assays developed specifically for studies of the biological and biochemical properties of collagens and their use in modern therapies. This accumulated knowledge and expertise can be adapted to characterize biological properties of novel collagens and collagen-like proteins identified with the help of genetic studies proposed in this project.
- - - - - - - - - - - -